Graph RAG: Relationship-Aware Search

Graph RAG enhances traditional search by modeling the relationships between concepts in your documents. It is best suited to questions about how ideas connect.

What is Graph RAG?

Traditional search finds documents based on keywords or semantic similarity. Graph RAG goes further by:

Understanding relationships between entities (people, companies, concepts)
Traversing connections to find related information
Answering relationship questions like "How are X and Y connected?"
Discovering insights through multi-hop reasoning

How It Works

When to Use Graph RAG

Perfect For

Relationship Questions:

"How are Person A and Person B connected?"
"What companies does Organization X work with?"
"Which projects involve both Technology A and Technology B?"

Multi-Hop Reasoning:

"What does Company X's partner do?"
"Who worked on projects related to the same technology?"
"Trace the connection between Concept A and Concept B"

Entity-Centric Queries:

"Show me everything about Product X"
"What are all the locations mentioned for Company Y?"
"List all people who worked at Organization Z"

Not Needed For

Simple keyword search:

"Find documents about topic X"
"Show me the latest updates"
For these, standard semantic similarity search works better

Single-concept queries:

"What is RAG?"
"Explain how authentication works"
For these, regular vector search is sufficient

Entity Types Detected

Automatically Identified

People:

Names and roles
Organizational affiliations
Mentions across documents

Organizations:

Companies and institutions
Teams and departments
Partners and subsidiaries

Locations:

Cities and countries
Offices and facilities
Geographic regions

Concepts:

Technologies and products
Projects and initiatives
Domain-specific terms

Dates & Events:

Temporal references
Project timelines
Historical events

Relationship Types

Common Connections

Professional Relationships:

WORKS_FOR: Person → Organization
PART_OF: Team → Department → Organization
REPORTS_TO: Person → Person

Location Relationships:

LOCATED_IN: Organization → Location
BASED_IN: Person → Location
OPERATES_IN: Company → Region

Project Relationships:

WORKS_ON: Person → Project
USES: Project → Technology
DEVELOPED_BY: Product → Organization

Content Relationships:

MENTIONS: Document → Entity
RELATED_TO: Entity → Entity
REFERENCES: Document → Document
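
The relationship types above can be read as typed, directed edges between entities. The following is a minimal sketch of that idea using an in-memory list; the `Relationship` class, its field names, and the sample values are hypothetical illustrations, not Scrapalot's actual schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relationship:
    """A directed, typed edge: source -[rel_type]-> target (hypothetical shape)."""
    source: str
    rel_type: str
    target: str


# Illustrative edges using relationship types from the lists above.
edges = [
    Relationship("John Doe", "WORKS_FOR", "Acme Corp"),
    Relationship("John Doe", "WORKS_ON", "Project X"),
    Relationship("Project X", "DEVELOPED_BY", "Acme Corp"),
    Relationship("Report.pdf", "MENTIONS", "John Doe"),
]


def outgoing(entity: str, rel_type: str) -> list[str]:
    """Return the targets reachable from `entity` via one edge of `rel_type`."""
    return [e.target for e in edges if e.source == entity and e.rel_type == rel_type]


print(outgoing("John Doe", "WORKS_ON"))
```

Keeping the relationship type on the edge, rather than on the entities, is what lets a query ask for one kind of connection at a time.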

Real-World Examples

Example 1: Finding Connections

Question: "How are John Doe and Acme Corp connected?"

Graph RAG Process:

Identifies "John Doe" (Person) and "Acme Corp" (Organization)
Searches for paths between them
Finds: John Doe → WORKS_ON → Project X → DEVELOPED_BY → Acme Corp
Retrieves relevant document chunks
Generates answer with full context

Answer: "John Doe works on Project X, which is developed by Acme Corp."
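
The path search in this example can be sketched as a breadth-first search over directed edges. This is an illustration under stated assumptions: the edge list, the `find_path` helper, and the `max_hops` parameter are hypothetical, and a Neo4j deployment would run the search in the database rather than in Python.

```python
from collections import deque

# Illustrative edges: (source, relationship, target).
EDGES = [
    ("John Doe", "WORKS_ON", "Project X"),
    ("Project X", "DEVELOPED_BY", "Acme Corp"),
    ("Jane Roe", "WORKS_FOR", "Acme Corp"),
]


def find_path(start, goal, edges, max_hops=3):
    """Breadth-first search for the shortest directed path within max_hops."""
    adjacency = {}
    for source, rel, target in edges:
        adjacency.setdefault(source, []).append((rel, target))

    queue = deque([(start, [start])])
    visited = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        # Each hop appends two items (relationship, entity) to the path.
        if (len(path) - 1) // 2 >= max_hops:
            continue
        for rel, target in adjacency.get(node, []):
            if target not in visited:
                visited.add(target)
                queue.append((target, path + [rel, target]))
    return None  # no path within the hop limit


path = find_path("John Doe", "Acme Corp", EDGES)
print(" → ".join(path))
```

Breadth-first order guarantees that the first path found uses the fewest hops, which keeps the generated answer short.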

Example 2: Multi-Hop Discovery

Question: "What products use the same technology as Project Alpha?"

Graph RAG Process:

Finds Project Alpha entity
Identifies technology it uses
Traverses to other projects using same technology
Collects information about those projects
Assembles comprehensive answer

Answer: Lists all related projects with context about each
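
The traversal in this example is a two-hop walk: from the project to its technologies, then to other projects that use them. A minimal sketch, assuming a plain dictionary of hypothetical `USES` edges; the project and technology names are invented for illustration.

```python
# Hypothetical USES edges: project -> set of technologies.
USES = {
    "Project Alpha": {"GraphQL", "PostgreSQL"},
    "Project Beta": {"GraphQL"},
    "Project Gamma": {"Redis"},
}


def projects_sharing_technology(project, uses):
    """Two-hop traversal: project -> technology -> other projects."""
    shared = {}
    for tech in uses.get(project, set()):
        for other, techs in uses.items():
            if other != project and tech in techs:
                shared.setdefault(other, set()).add(tech)
    return shared


print(projects_sharing_technology("Project Alpha", USES))
```

Returning the shared technologies alongside each project gives the answer generator the context it needs to explain why a project was included.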

Example 3: Temporal Queries

Question: "What happened in the company in 2024?"

Graph RAG Process:

Identifies "2024" date entity
Finds all events linked to that timeframe
Gathers related people, projects, decisions
Organizes chronologically
Provides timeline

Answer: Timeline of 2024 events with context
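
The filter-and-sort steps in this example can be sketched as follows. The event records and the `timeline_for_year` helper are hypothetical; the sketch only shows how date-linked events might be selected and ordered.

```python
from datetime import date

# Hypothetical events linked to date entities: (date, description).
EVENTS = [
    (date(2024, 9, 12), "Project X launched"),
    (date(2023, 11, 2), "Partnership with Acme Corp signed"),
    (date(2024, 3, 5), "New office opened"),
]


def timeline_for_year(year, events):
    """Keep events dated in `year` and order them chronologically."""
    return sorted((d, label) for d, label in events if d.year == year)


for when, label in timeline_for_year(2024, EVENTS):
    print(when.isoformat(), label)
```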

Setup & Configuration

Optional Neo4j Integration

Graph RAG is optional and requires a Neo4j database.

When to enable:

Your documents contain many interconnected entities
Relationship questions are common
Multi-hop reasoning is needed
The benefit justifies the additional infrastructure

When to skip:

Simple Q&A is sufficient
Minimal entity relationships
Resource constraints
Just getting started

Getting Started

Option 1: Managed Neo4j (Easiest)

Sign up at neo4j.com/aura (free tier available)
Create a new database
Copy the connection details
Add them to your Scrapalot configuration
Enable Graph RAG in settings

Option 2: Self-Hosted

Run Neo4j in Docker
Configure the connection
Enable Graph RAG
The system then builds the graph automatically

Automatic Processing:

Entities extracted during document processing
Relationships detected automatically
Graph built in background
No manual configuration needed

Performance Considerations

Graph Size

Small graphs (thousands of entities):

Very fast queries
Minimal resource usage
Works on modest hardware

Medium graphs (tens of thousands of entities):

Fast with proper indexing
Moderate resource usage
Recommended for most deployments

Large graphs (100,000+ entities):

Requires optimization
Higher resource needs
Consider managed Neo4j

Query Performance

Fast queries:

Direct entity lookups
1-2 hop relationships
Indexed properties

Slower queries:

Deep traversals (3+ hops)
Complex pattern matching
Full graph scans

Optimization:

Automatic indexing on common properties
Query depth limits
Result set size limits
Smart caching
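
Depth limits, result limits, and caching can be combined in one traversal routine. The sketch below is an assumption-laden illustration, not the product's implementation: the adjacency data, the `neighbors_within` helper, and its default limits are hypothetical, and `functools.lru_cache` stands in for whatever caching the system actually uses.

```python
from functools import lru_cache

# Hypothetical adjacency list (tuples keep the data immutable).
ADJACENCY = {
    "A": ("B", "C"),
    "B": ("D",),
    "C": ("D",),
    "D": ("E",),
    "E": (),
}


@lru_cache(maxsize=1024)
def neighbors_within(entity, max_depth=2, max_results=50):
    """Return up to max_results entities reachable within max_depth hops."""
    found = []
    seen = {entity}
    frontier = [entity]
    for _ in range(max_depth):
        next_frontier = []
        for node in frontier:
            for neighbor in ADJACENCY.get(node, ()):
                if neighbor not in seen:
                    seen.add(neighbor)
                    found.append(neighbor)
                    next_frontier.append(neighbor)
                    if len(found) >= max_results:
                        return tuple(found)  # result-size limit reached
        frontier = next_frontier
    return tuple(found)


print(neighbors_within("A"))        # first call walks the graph
print(neighbors_within("A"))        # repeat call is served from the cache
print(neighbors_within.cache_info())
```

A cached result goes stale when the graph changes, so a real system would need to invalidate the cache after new documents are processed.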

Combining with Vector Search

Tri-Modal Fusion

Graph search is one of three retrieval modes that work together:

Dense semantic search (vector embeddings)
Sparse keyword search (BM25)
Graph-based search (entity relationships)

Intelligent routing:

System chooses best search method(s)
Combines results when beneficial
Balances precision and recall

Example:

Question mentions specific entities → Use graph search
Question is conceptual → Use vector search
Question has exact terms → Use keyword search
Complex question → Use all three, fuse results
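
The routing rules above can be sketched as a small function, followed by a fusion step for the case where several methods run. The source does not say how results are fused; reciprocal rank fusion is used here as one common choice and is an assumption, as are the function names, the boolean flags, and the constant `k=60`.

```python
def choose_methods(has_entities, has_exact_terms, is_conceptual):
    """Map question traits to search methods, mirroring the routing example."""
    methods = []
    if has_entities:
        methods.append("graph")
    if is_conceptual:
        methods.append("vector")
    if has_exact_terms:
        methods.append("keyword")
    return methods or ["vector"]  # assumed fallback when no trait is detected


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists: each document scores sum(1 / (k + rank))."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest score first; ties broken by document id for stable output.
    return sorted(scores, key=lambda d: (-scores[d], d))


graph_hits = ["d1", "d2", "d3"]
vector_hits = ["d2", "d3"]
keyword_hits = ["d2", "d1"]
print(choose_methods(True, True, True))
print(reciprocal_rank_fusion([graph_hits, vector_hits, keyword_hits]))
```

Rank-based fusion avoids comparing raw scores across methods, which matters because graph, vector, and keyword scores are not on the same scale.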

Privacy & Data Sovereignty

Data Storage

What's stored in Neo4j:

Entity names and types
Relationship types
Document references
Confidence scores

What's NOT stored:

Full document content (in PostgreSQL)
Vector embeddings (in pgvector)
User data (in PostgreSQL)

Self-Hosting

Complete control:

Run Neo4j on your infrastructure
Data never leaves your network
Full audit trail
Custom backup strategy

Monitoring

Graph Health

Track graph metrics:

Total entities
Total relationships
Entity type distribution
Relationship type distribution
Query performance

Access via:

Admin dashboard
Neo4j browser
Query logs
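
The count and distribution metrics listed above are simple aggregations. A minimal sketch over in-memory data, assuming hypothetical entity and relationship records; the `graph_metrics` helper and its output keys are invented for illustration and are not the admin dashboard's actual fields.

```python
from collections import Counter

# Hypothetical records: (name, entity_type) and (source, rel_type, target).
ENTITIES = [
    ("John Doe", "Person"),
    ("Jane Roe", "Person"),
    ("Acme Corp", "Organization"),
    ("Project X", "Concept"),
]
RELATIONSHIPS = [
    ("John Doe", "WORKS_ON", "Project X"),
    ("Jane Roe", "WORKS_ON", "Project X"),
    ("Project X", "DEVELOPED_BY", "Acme Corp"),
]


def graph_metrics(entities, relationships):
    """Compute totals and per-type distributions for a graph snapshot."""
    return {
        "total_entities": len(entities),
        "total_relationships": len(relationships),
        "entity_types": dict(Counter(kind for _, kind in entities)),
        "relationship_types": dict(Counter(rel for _, rel, _ in relationships)),
    }


print(graph_metrics(ENTITIES, RELATIONSHIPS))
```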

Usage Patterns

Understand how Graph RAG helps:

Questions using graph search
Average traversal depth
Most common entity types
Popular relationship queries

Best Practices

Document Preparation

Maximize graph value:

Use clear entity names
Maintain consistent terminology
Include context about relationships
Structure content logically

Query Formulation

Get better results:

Name specific entities
Ask about relationships explicitly
Use "how" and "why" questions
Request connections and paths

Graph Maintenance

Keep graph healthy:

Monitor entity quality
Review relationship accuracy
Clean up duplicates
Update deprecated entities
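
Duplicate cleanup usually starts by grouping entity names that differ only in case, punctuation, or spacing. The sketch below shows one simple way to find such candidates; the `normalize` and `find_duplicates` helpers are hypothetical, and the resulting groups should be reviewed before merging, since names that normalize identically can still refer to different entities.

```python
import re
from collections import defaultdict


def normalize(name):
    """Lowercase, drop punctuation, and collapse whitespace."""
    cleaned = re.sub(r"[^\w\s]", "", name.lower())
    return " ".join(cleaned.split())


def find_duplicates(names):
    """Group names that share a normalized form; return groups of 2 or more."""
    groups = defaultdict(list)
    for name in names:
        groups[normalize(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


print(find_duplicates(["Acme Corp", "acme corp.", "ACME  Corp", "Globex"]))
```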

Troubleshooting

No Relationship Found

Common causes:

Entities not in same document context
Relationship type not detected
Traversal depth limit reached
Indirect connection too distant

Solutions:

Check entity names are correct
Review document content
Increase traversal depth
Try semantic search instead

Slow Graph Queries

Optimize:

Reduce traversal depth
Limit result set size
Use more specific entity names
Check graph size

Poor Entity Detection

Improve:

Use clearer entity names in documents
Add context around entities
Review detection confidence
Consider manual entity tagging

Related Documentation

RAG Strategy: How Graph RAG fits in
Context Expansion: Enhanced understanding
Model Management: Entity extraction models
Deployment Guide: Neo4j setup

Graph RAG is powerful but optional. Start with standard RAG, and add Graph RAG when you need relationship understanding.
