Intelligent Search Strategies

Scrapalot uses Retrieval-Augmented Generation (RAG) to find the most relevant information in your documents and deliver accurate, well-sourced answers. The system automatically selects a search approach suited to each question, so you do not have to tune retrieval yourself.

How It Works

When you ask a question, Scrapalot analyzes what you're looking for and chooses a search strategy automatically. You don't need to configure anything: the system adapts to each question in real time.

Intelligent Question Understanding

The system classifies your questions to find the best search method:

Question Types

  • Factual Questions - Direct questions with clear answers

    • Example: "What is the capital of France?"
    • Best for: Quick fact lookup
  • Conceptual Questions - Abstract ideas requiring deeper understanding

    • Example: "Explain quantum entanglement"
    • Best for: Learning new concepts
  • Relational Questions - Understanding how things connect

    • Example: "How are modules X and Y connected?"
    • Best for: Exploring relationships
  • Analytical Questions - Comparison and evaluation

    • Example: "Compare Jung and Freud's theories"
    • Best for: Understanding differences and similarities
  • Conversational Questions - Follow-up questions in context

    • Example: "What about the other approach?"
    • Best for: Natural conversation flow
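The five categories above can be illustrated with a small rule-based classifier. This is only a sketch: the source does not describe how Scrapalot classifies questions, and the names `QuestionType`, `_RULES`, and `classify`, as well as the keyword cues, are hypothetical. A production system would more likely use a trained model or an LLM prompt.

```python
import re
from enum import Enum


class QuestionType(Enum):
    FACTUAL = "factual"
    CONCEPTUAL = "conceptual"
    RELATIONAL = "relational"
    ANALYTICAL = "analytical"
    CONVERSATIONAL = "conversational"


# Hypothetical keyword cues, checked in order; the first match wins.
_RULES = [
    (QuestionType.CONVERSATIONAL, re.compile(r"^(what about|and what|how about)\b", re.I)),
    (QuestionType.ANALYTICAL, re.compile(r"\b(compare|versus|vs|differences? between)\b", re.I)),
    (QuestionType.RELATIONAL, re.compile(r"\b(connected|depends? on|related to|hierarchy)\b", re.I)),
    (QuestionType.CONCEPTUAL, re.compile(r"^(explain|describe|why)\b", re.I)),
]


def classify(question: str) -> QuestionType:
    """Return the first matching category, falling back to FACTUAL."""
    text = question.strip()
    for question_type, pattern in _RULES:
        if pattern.search(text):
            return question_type
    return QuestionType.FACTUAL
```

The rule order matters: a follow-up such as "What about the other approach?" is checked before the other categories so that it is not mistaken for a factual lookup.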

Complexity Assessment

The system evaluates question complexity to deliver appropriately detailed answers:

Complexity | Question Example | Answer Style
Simple | "What is the capital of France?" | Quick, direct fact
Moderate | "Papers from 2023 about AI ethics" | Filtered results with context
Multi-part | "Explain quantum entanglement" | Comprehensive explanation
Complex | "Compare Jung and Freud's theories" | In-depth analysis
Deep | "How did archetype theory influence modern psychology?" | Synthesized insights from multiple sources

When to Use Different Search Approaches

For Exact Matches

What you need: Specific error codes, version numbers, dates, or IDs

Examples:

  • "Show me error 221" (not 220 or 222)
  • "Changes in v2.1.3" (not v2.1.2)
  • "Reports from March 15, 2024"

How it works: Combines understanding your intent with precise keyword matching to find exactly what you specified.

Why it matters: Prevents getting similar but incorrect results—when you need error 221, you don't want error 220.
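The keyword half of this approach can be sketched as a strict token filter. The code below is an illustration under assumptions, not Scrapalot's implementation: the `EXACT_TOKEN` pattern and the function names are hypothetical, and the semantic half (understanding intent) is left out entirely.

```python
import re

# Hypothetical pattern: version strings such as v2.1.3, or bare integers such as 221.
EXACT_TOKEN = re.compile(r"\bv\d+(?:\.\d+)+\b|\b\d+\b")


def exact_tokens(query: str) -> list:
    """Pull out the tokens that must match literally."""
    return EXACT_TOKEN.findall(query)


def contains_token(text: str, token: str) -> bool:
    """True if token occurs in text and is not part of a longer number or version."""
    # The lookarounds stop "221" from matching inside "2210",
    # and "v2.1.3" from matching inside "v2.1.30" or "v2.1.3.1".
    pattern = r"(?<![\w.])" + re.escape(token) + r"(?!\w)(?!\.\d)"
    return re.search(pattern, text) is not None


def filter_exact(query: str, candidates: list) -> list:
    """Keep only candidates that contain every exact token from the query."""
    tokens = exact_tokens(query)
    return [c for c in candidates if all(contains_token(c, t) for t in tokens)]
```

With this filter, a query for "error 221" keeps a passage that mentions "Error 221" and drops passages that only mention 220, 222, or 2210, even if they are semantically similar.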

For Code and Technical Content

What you need: Specific syntax, commands, or technical patterns

Examples:

  • "Find queries using JOIN"
  • "Functions using async/await"
  • "YAML files with ports: 8080"

How it works: Searches for exact technical terms while understanding the broader context.

Why it matters: Technical terms mean specific things—"JOIN" in SQL is different from general discussions about "combining" or "merging."

For Understanding Connections

What you need: How entities, concepts, or components relate to each other

Examples:

  • "How are modules X and Y connected?"
  • "What depends on service Z?"
  • "Show the hierarchy from A to B"

How it works: Maps relationships between entities and traces connections through your knowledge base.

Why it matters: Finds hidden dependencies and relationships that keyword search would miss.
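A question such as "What depends on service Z?" amounts to a graph traversal. The sketch below assumes the relationships are already available as a plain dictionary mapping each component to the components it depends on. How Scrapalot extracts and stores entities is not described in the source, and `dependents_of` is a hypothetical name.

```python
from collections import deque


def dependents_of(graph: dict, target: str) -> set:
    """Return every component that depends on target, directly or transitively.

    graph maps a component name to the set of components it depends on.
    """
    # Invert the edges so we can walk from a dependency to its dependents.
    reverse = {}
    for node, dependencies in graph.items():
        for dependency in dependencies:
            reverse.setdefault(dependency, set()).add(node)

    seen = set()
    queue = deque([target])
    while queue:
        current = queue.popleft()
        for dependent in reverse.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)

    seen.discard(target)  # a dependency cycle could otherwise include the target itself
    return seen
```

The breadth-first walk is what surfaces indirect dependents: a component that never mentions the target by name still appears in the result if it depends on something that does.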

For Time-Specific Information

What you need: Documents from specific dates, versions, or time periods

Examples:

  • "Documents from Q3 2024"
  • "Code changes between v1.0 and v2.0"
  • "Latest research papers"

How it works: Filters by date metadata while understanding temporal expressions like "latest" or "recent."

Why it matters: Gets you the right version or time period without wading through outdated information.
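As an illustration of temporal filtering, the sketch below turns an expression like "Q3 2024" into a concrete date range and applies it to document metadata. It covers only quarter expressions; relative terms such as "latest" or "recent" would need separate handling. The function names and the `published` metadata key are assumptions, not part of Scrapalot's documented interface.

```python
import re
from datetime import date, timedelta
from typing import Optional, Tuple

QUARTER = re.compile(r"\bQ([1-4])\s+(\d{4})\b", re.IGNORECASE)


def quarter_range(text: str) -> Optional[Tuple[date, date]]:
    """Return (first day, last day) of the quarter named in text, or None."""
    match = QUARTER.search(text)
    if match is None:
        return None
    quarter, year = int(match.group(1)), int(match.group(2))
    start_month = 3 * (quarter - 1) + 1
    start = date(year, start_month, 1)
    # The quarter ends the day before the next quarter starts.
    if quarter == 4:
        next_start = date(year + 1, 1, 1)
    else:
        next_start = date(year, start_month + 3, 1)
    return start, next_start - timedelta(days=1)


def filter_by_date(docs: list, start: date, end: date) -> list:
    """Keep documents whose 'published' date falls within [start, end]."""
    return [doc for doc in docs if start <= doc["published"] <= end]
```

Resolving the expression to explicit bounds first keeps the filter itself trivial and makes the interpreted range easy to show to the user.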

For Technical Terms and Acronyms

What you need: Domain-specific jargon that must match exactly

Examples:

  • Finding "RAG" (not general retrieval discussion)
  • "API" documentation (not general interface concepts)
  • "CPU" specifications (distinct from "processor")

How it works: Balances exact keyword matching with semantic understanding.

Why it matters: Technical terms have precise meanings—substituting synonyms changes the meaning.
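One simple way to balance the two signals is a weighted sum of a keyword score and a semantic score. This is a sketch of that general idea; the source does not state how Scrapalot weights them, and `acronym_score`, `hybrid_score`, and the default weight are hypothetical. The semantic score is assumed to come from elsewhere (for example, an embedding similarity in the range 0 to 1).

```python
import re


def acronym_score(text: str, acronym: str) -> float:
    """1.0 if the acronym appears as a case-sensitive whole word, else 0.0."""
    return 1.0 if re.search(rf"\b{re.escape(acronym)}\b", text) else 0.0


def hybrid_score(keyword_score: float, semantic_score: float,
                 keyword_weight: float = 0.5) -> float:
    """Blend the two scores; a higher keyword_weight favors exact matches."""
    if not 0.0 <= keyword_weight <= 1.0:
        raise ValueError("keyword_weight must be between 0 and 1")
    return keyword_weight * keyword_score + (1.0 - keyword_weight) * semantic_score
```

Matching case-sensitively on word boundaries is what keeps "RAG" from matching "rag" or "DRAGON"; raising `keyword_weight` for acronym-heavy queries would push exact matches above passages that are merely on the same topic.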

Search Quality Benefits

Accuracy Improvements

  • 15-25% better results compared to generic search
  • Prevents wrong answers by matching exact requirements when needed
  • Finds hidden connections through relationship mapping
  • Adapts automatically to question complexity

Speed and Efficiency

  • Instant analysis of your question (typically under 1 second)
  • Cached results for frequently asked questions
  • Parallel processing for complex queries
  • Progressive refinement if initial results aren't sufficient
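Caching results for repeated questions can be sketched as a thin wrapper around whatever search function is in use. The `CachedSearch` class below is hypothetical and deliberately minimal: it has no eviction or expiry, and it assumes that two queries differing only in case or whitespace should share a result.

```python
from typing import Callable, Dict, List


class CachedSearch:
    """Wrap a search function and reuse results for repeated queries."""

    def __init__(self, search_fn: Callable[[str], List[str]]) -> None:
        self._search_fn = search_fn
        self._cache: Dict[str, List[str]] = {}
        self.misses = 0  # number of times the underlying search actually ran

    @staticmethod
    def _key(query: str) -> str:
        # Normalize case and whitespace so trivially different queries share an entry.
        return " ".join(query.lower().split())

    def search(self, query: str) -> List[str]:
        key = self._key(query)
        if key not in self._cache:
            self.misses += 1
            self._cache[key] = self._search_fn(query)
        return self._cache[key]
```

Lowercasing the key is a trade-off: it raises the hit rate, but a system that treats acronyms case-sensitively might prefer to normalize whitespace only.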

Use Cases

Research and Learning

Best for: Understanding complex topics, comparing theories, exploring concepts

Example workflow:

  1. Ask: "Explain the main differences between Jung and Freud's psychology"
  2. System analyzes as complex comparative question
  3. Retrieves relevant content from both theorists
  4. Synthesizes comprehensive comparison
  5. Provides citations to source material

Benefits: Deep understanding with properly sourced information

Technical Troubleshooting

Best for: Finding specific errors, configuration issues, code examples

Example workflow:

  1. Ask: "How to fix error code 404 in the authentication module?"
  2. System recognizes exact match need (error 404)
  3. Filters to authentication context
  4. Returns precise solution
  5. Includes related documentation

Benefits: Fast, accurate problem resolution

Document Discovery

Best for: Finding documents by date, author, topic, or type

Example workflow:

  1. Ask: "Show me all reports from Q2 2024 about user engagement"
  2. System applies temporal filter (Q2 2024)
  3. Filters by topic (user engagement)
  4. Filters by type (reports)
  5. Returns matched documents with previews

Benefits: Quick access to exactly what you need
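The filtering steps in this workflow can be pictured as successive conditions on document metadata. The sketch below assumes each document carries a publication date, a set of topics, and a type; the `Document` fields and the `discover` function are hypothetical and chosen only to mirror the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import FrozenSet, List


@dataclass(frozen=True)
class Document:
    title: str
    published: date
    topics: FrozenSet[str]
    doc_type: str


def discover(docs: List[Document], start: date, end: date,
             topic: str, doc_type: str) -> List[Document]:
    """Apply the temporal, topic, and type filters in one pass."""
    return [
        doc for doc in docs
        if start <= doc.published <= end   # temporal filter, e.g. Q2 2024
        and topic in doc.topics            # topic filter
        and doc.doc_type == doc_type       # type filter
    ]
```

For the example question, the call would be `discover(docs, date(2024, 4, 1), date(2024, 6, 30), "user engagement", "report")`.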

Exploratory Analysis

Best for: Understanding how concepts connect, discovering patterns

Example workflow:

  1. Ask: "How do different modules in the system interact?"
  2. System maps entity relationships
  3. Traces connections and dependencies
  4. Visualizes the network
  5. Highlights key interaction points

Benefits: Discover hidden patterns and dependencies

How Search Strategies Work Together

For complex questions, the system may combine multiple approaches:

Example: "How did Einstein's work influence modern quantum computing?"

  1. Factual search - Einstein's key discoveries
  2. Conceptual search - Quantum computing principles
  3. Relational search - How physics theories connect to computing
  4. Synthesis - Comprehensive answer showing the progression
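When several strategies each return a ranked list, the lists have to be merged before synthesis. One widely used technique for this is reciprocal rank fusion (RRF), shown below as an illustration. The source does not say which merging method Scrapalot uses, so treat this as one plausible option rather than a description of the product. The constant `k = 60` is a commonly used default.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Merge ranked lists of document IDs into a single ranking.

    Each document scores 1 / (k + rank) for every list it appears in,
    so documents ranked well by several strategies rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))
```

RRF uses only rank positions, so the factual, conceptual, and relational searches do not need comparable score scales to be combined.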

Performance Characteristics

Response Times

Search Complexity | Typical Response Time | What You Get
Simple fact lookup | < 1 second | Direct answer
Moderate complexity | 1-3 seconds | Detailed explanation
Complex analysis | 3-5 seconds | Comprehensive synthesis
Deep research | 5-10 seconds | Multi-source insights

Quality Metrics

Accuracy: The system prioritizes correct answers over speed

  • Validates information across sources
  • Provides citations for verification
  • Indicates confidence levels when uncertain

Relevance: Results match your actual intent

  • Understands context from conversation
  • Adapts to your knowledge level
  • Filters out tangentially related content

Completeness: Comprehensive answers without overload

  • Covers all aspects of your question
  • Provides appropriate detail level
  • Links to deeper information when available

Tips for Best Results

Write Clear Questions

Good: "What caused the authentication error on March 15th?"
Why: Specific, includes context, clear intent

Less effective: "Something broke"
Why: Vague, no context, unclear what you need

Use Natural Language

Good: "Show me recent papers about machine learning in healthcare"
Why: Natural phrasing, clear filters, conversational

Less effective: "papers AND (machine_learning OR ML) AND healthcare AND date>2023"
Why: Technical query syntax is unnecessary because the system understands natural language

Provide Context

Good: "In the authentication module, how does password reset work?"
Why: Specifies scope, making relevant information easier to find

Less effective: "How does password reset work?"
Why: Might search the entire codebase instead of a specific module

Ask Follow-up Questions

Good: "What about OAuth instead?"
Why: The system remembers you were discussing authentication

Less effective: Starting a completely new topic without context
Why: Loses the conversation thread

What Makes This Different

Traditional search:

  • Matches exact words only
  • Doesn't understand intent
  • Misses related concepts
  • Returns too many irrelevant results

Scrapalot:

  • Understands what you mean
  • Finds related concepts
  • Adapts to question complexity
  • Returns precisely what you need

Example: "How do I improve performance?"

Traditional search: Returns every document mentioning "improve" and "performance"

Scrapalot:

  1. Understands you want optimization techniques
  2. Considers your context (web app, database, etc.)
  3. Finds performance best practices
  4. Prioritizes actionable solutions
  5. Cites specific examples from your docs

Continuous Improvement

The search system learns from usage:

  • Adapts to your document collection
  • Improves relevance over time
  • Refines understanding of your domain
  • Optimizes for common question patterns

This intelligent search system ensures you spend less time searching and more time getting answers. The automatic strategy selection means you don't need to think about how search works—just ask your question naturally and get accurate, relevant results.

Released under the MIT License.