Document RAG Guide
Query documents using vector embeddings and semantic search
Document RAG (also called “basic RAG”, “naive RAG”, or simply “RAG”) is a retrieval-augmented generation approach that uses vector embeddings to find relevant document chunks and provides them as context to an LLM for generating responses.
What is Document RAG?
Document RAG works by:
- Chunking documents into smaller pieces
- Embedding each chunk as a vector
- Storing vectors in a vector database
- Retrieving similar chunks based on query embedding
- Generating responses using retrieved context
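The five steps above can be sketched end-to-end. This is a minimal illustration, not TrustGraph's actual implementation: it uses a toy bag-of-words "embedding" and brute-force cosine similarity in place of a real embedding model and vector database.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: bag-of-words term counts (stand-in for a real model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def chunk(text, size=40):
    """Split a document into fixed-size character chunks."""
    return [text[i:i + size] for i in range(0, len(text), size)]

# Steps 1-3: chunk the document, embed each chunk, store the vectors
document = "TrustGraph chunks documents. Each chunk is embedded as a vector. Vectors live in a vector store."
store = [(c, embed(c)) for c in chunk(document)]

# Step 4: retrieve the chunks most similar to the query embedding
query = embed("how are chunks embedded")
top = sorted(store, key=lambda item: cosine(query, item[1]), reverse=True)[:2]

# Step 5: the retrieved chunks would be passed as context to an LLM
context = "\n".join(c for c, _ in top)
print(context)
```

A real deployment swaps in a trained embedding model and a vector database index, but the retrieve-then-generate shape is the same.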
When to Use Document RAG
✅ Use Document RAG when:
- You need semantic search over documents
- Questions can be answered from isolated passages
- You want simple, fast implementation
- Document context is self-contained
⚠️ Consider alternatives when:
- You need to understand relationships between entities → Use Graph RAG
- You need structured schema-based extraction → Use Ontology RAG
- Answers require connecting information across documents → Use Graph RAG
Prerequisites
Before starting:
- ✅ TrustGraph deployed (Quick Start)
- ✅ Understanding of Core Concepts
- ✅ Documents ready to load
Step-by-Step Guide
Step 1: Prepare Your Documents
TrustGraph supports multiple document formats:
- PDF files (.pdf)
- Text files (.txt)
- Markdown (.md)
- HTML (.html)
Best practices:
- Keep documents focused on specific topics
- Use clear formatting and structure
- Remove unnecessary metadata or headers
- Ensure text is extractable (not scanned images)
Step 2: Configure Document Processing
Configure chunking parameters in your flow:
Chunk Size: Number of characters per chunk
- Small (500-800): Better precision, more chunks
- Medium (1000-1500): Balanced approach (recommended)
- Large (2000-3000): More context, fewer chunks
Chunk Overlap: Characters shared between consecutive chunks
- Typical: 50-100 characters
- Purpose: Ensures context continuity at boundaries
Example configuration:
chunker:
  type: recursive
  chunk_size: 1000
  overlap: 50
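The chunking behavior described above can be sketched as a simple character chunker; the recursive chunker in the configuration also splits on natural boundaries (paragraphs, sentences), but the size/overlap mechanics are the same. The parameter values mirror the example configuration.

```python
def chunk_text(text, chunk_size=1000, overlap=50):
    """Split text into chunks of chunk_size characters, where each chunk
    shares `overlap` trailing/leading characters with its neighbor."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

# Synthetic 2500-character document: chunks start at 0, 950, 1900
doc = "".join(chr(65 + i % 26) for i in range(2500))
chunks = chunk_text(doc, chunk_size=1000, overlap=50)
print(len(chunks))  # → 3
```

Note how the last 50 characters of one chunk repeat as the first 50 of the next; that repetition is what preserves context across chunk boundaries.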
Step 3: Load Documents
Using CLI
Load a single PDF:
tg-load-pdf my-document.pdf
Load from a directory:
for file in documents/*.pdf; do
tg-load-pdf "$file"
done
Load with specific collection:
tg-load-pdf --collection my-project document.pdf
Using the Workbench
- Navigate to the Library page at http://localhost:8888
- Click Upload or drag-and-drop documents
- Documents appear in the library
- Select documents and click Submit
- Choose a processing flow
- Click Submit to start processing
Step 4: Process Documents
Documents must be processed to create embeddings:
Using CLI:
# Check flow status
tg-show-flows
# Start the default flow
tg-start-flow default-flow
# Monitor processing
tg-show-processor-state
Using Workbench:
- Go to Library page
- Select unprocessed documents
- Click Submit in action bar
- Select processing flow
- Click Submit
Monitor in Grafana:
- Access http://localhost:3000
- Watch processing backlog
- Track chunk embeddings created
- Monitor LLM token usage
Step 5: Query Using Document RAG
CLI Method
Basic query:
tg-invoke-document-rag "What is the main topic of these documents?"
Query specific collection:
tg-invoke-document-rag --collection my-project "Summarize the key findings"
Adjust number of retrieved chunks:
tg-invoke-document-rag --limit 5 "What are the main conclusions?"
API Method
Endpoint: /api/document-rag
Request:
{
"query": "What is the main topic?",
"collection": "my-project",
"limit": 3
}
Response:
{
"answer": "The main topic is...",
"sources": [
{
"text": "Relevant chunk...",
"score": 0.85,
"document": "document-name.pdf"
}
]
}
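If you are calling the endpoint from application code, the request and response shapes above can be exercised with the standard library alone. The host, port, and path here are assumptions taken from the examples in this guide and may differ in your deployment.

```python
import json
from urllib import request

API_URL = "http://localhost:8888/api/document-rag"  # assumed gateway address

payload = {
    "query": "What is the main topic?",
    "collection": "my-project",
    "limit": 3,
}
body = json.dumps(payload).encode("utf-8")
req = request.Request(API_URL, data=body,
                      headers={"Content-Type": "application/json"})

def ask():
    """Send the query; run only against a live TrustGraph deployment."""
    with request.urlopen(req) as resp:
        result = json.loads(resp.read())
    print(result["answer"])
    for src in result.get("sources", []):
        print(f'{src["score"]:.2f}  {src["document"]}')
```

Calling `ask()` prints the generated answer followed by each source chunk's similarity score and origin document.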
Workbench Method
- Navigate to Document RAG tab
- Select collection (optional)
- Enter your question
- Click Submit
- View answer and source chunks
- Click sources to see context
Step 6: Verify and Refine
Check retrieval quality:
# View vector search results
tg-invoke-vector-search "your query term"
Tune parameters if needed:
- Increase chunk size if answers lack context
- Decrease chunk size if results are too broad
- Adjust overlap if context boundaries are poor
- Increase retrieval limit if missing relevant information
Understanding Document RAG Results
Source Attribution
Document RAG returns:
- Answer: LLM-generated response
- Sources: Retrieved chunks used for context
- Scores: Similarity scores for each chunk
- Documents: Origin documents for each chunk
Confidence Indicators
High confidence (score > 0.8):
- Query closely matches document content
- Retrieved chunks directly relevant
Medium confidence (score 0.6-0.8):
- Semantic similarity present
- May need broader context
Low confidence (score < 0.6):
- Weak match to query
- Consider query reformulation
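The confidence bands above are easy to apply programmatically when filtering or flagging sources. A small helper, using the thresholds from this guide:

```python
def confidence(score):
    """Map a similarity score to the confidence bands described above."""
    if score > 0.8:
        return "high"
    if score >= 0.6:
        return "medium"
    return "low"

# Example: annotate retrieved sources (scores here are illustrative)
sources = [
    {"document": "report.pdf", "score": 0.85},
    {"document": "notes.txt", "score": 0.72},
    {"document": "misc.md", "score": 0.41},
]
for s in sources:
    print(s["document"], confidence(s["score"]))
```

Low-confidence sources are a signal to reformulate the query or adjust chunking rather than to trust the generated answer less outright.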
Common Patterns
Multi-Document Search
Query across all documents:
tg-invoke-document-rag "What trends appear across all reports?"
Collection-Specific Queries
Query within a specific project:
tg-invoke-document-rag --collection project-2024 "What are the Q4 results?"
Iterative Refinement
Start broad, then narrow:
# Broad query
tg-invoke-document-rag "What topics are covered?"
# Focused follow-up
tg-invoke-document-rag "Explain the methodology in detail"
Troubleshooting
Poor Retrieval Quality
Problem: Irrelevant chunks retrieved
Solutions:
- Verify documents processed successfully: tg-show-processor-state
- Check embedding quality: tg-invoke-vector-search "test query"
- Adjust chunk size in flow configuration
- Reformulate query for better semantic match
Missing Context
Problem: Answers lack necessary context
Solutions:
- Increase chunk size (e.g., 1000 → 1500)
- Increase retrieval limit (more chunks)
- Increase chunk overlap (50 → 100)
- Use Graph RAG for relationship-based context
Slow Queries
Problem: Document RAG queries take too long
Solutions:
- Reduce number of documents in collection
- Optimize vector database configuration
- Use more powerful hardware
- Consider indexing strategies
Empty Results
Problem: No results returned
Solutions:
- Verify documents are processed: tg-show-processor-state
- Check collection name is correct
- Verify embeddings created: tg-show-graph
- Check for processing errors in logs
Advanced Configuration
Custom Embedding Models
Configure different embedding models in your flow:
embeddings:
  model: sentence-transformers/all-mpnet-base-v2
  dimension: 768
Popular choices:
- all-mpnet-base-v2: Balanced quality/speed (768d)
- all-MiniLM-L6-v2: Fast, smaller (384d)
- bge-large-en: High quality (1024d)
Retrieval Tuning
Adjust retrieval parameters:
# Get more context (more chunks)
tg-invoke-document-rag --limit 10 "query"
# Focus on top matches (fewer chunks)
tg-invoke-document-rag --limit 2 "query"
Collection Management
Create collection:
tg-set-collection my-project
List collections:
tg-list-collections
Delete collection:
tg-delete-collection my-project
Document RAG vs. Other Approaches
| Aspect | Document RAG | Graph RAG | Ontology RAG |
|---|---|---|---|
| Retrieval | Vector similarity | Graph relationships | Schema-based |
| Context | Isolated chunks | Connected entities | Structured data |
| Best for | Semantic search | Complex relationships | Typed extraction |
| Setup | Simple | Medium | Complex |
| Speed | Fast | Medium | Medium |
Use multiple approaches:
- Document RAG for quick semantic search
- Graph RAG when relationships matter
- Ontology RAG for structured extraction
Next Steps
Explore Other RAG Types
- Graph RAG - Leverage knowledge graph relationships
- Ontology RAG - Use structured schemas for extraction
Advanced Features
- Structured Processing - Extract typed objects
- Agent Extraction - AI-powered extraction workflows
- Object Extraction - Domain-specific extraction
API Integration
- Document RAG API - API reference
- CLI Reference - Command-line tools
- Examples - Code samples
Related Resources
- Core Concepts - Understanding embeddings and chunks
- Vector Search - How semantic search works
- Deployment - Scaling for production
- Troubleshooting - Common issues