Quick Start Guide

Get Scrapalot Desktop up and running in under 15 minutes! This guide walks you through setting up the open-source desktop application locally.

What You'll Have

By the end of this guide:

  • Desktop - A fully functional Scrapalot Desktop running locally on your machine
  • Documents - Upload & chat with documents in PDF, EPUB, and text formats
  • AI - AI-powered answers with source citations
  • Web - A web interface at http://localhost:3000
  • Models - Cloud or local AI models - your choice
  • Privacy - Your data stays local, on your machine

Prerequisites

Requirement | Version | Check Command    | Purpose
----------- | ------- | ---------------- | -----------
Node.js     | 18+     | node --version   | Frontend
Python      | 3.12+   | python --version | Backend
PostgreSQL  | 16+     | psql --version   | Database
Git         | Any     | git --version    | Clone repos

Optional but Recommended

  • Docker - Easiest way to run PostgreSQL + pgvector
  • AI API Key - OpenAI, Anthropic, or use Ollama (local, free)
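The check commands in the table above can be run in one pass - a small sketch that prints each tool's version, or MISSING if it is not on your PATH:

```shell
# Verify each prerequisite; prints the version line or MISSING.
for cmd in node python psql git; do
  if command -v "$cmd" >/dev/null 2>&1; then
    printf '%-6s %s\n' "$cmd" "$("$cmd" --version 2>&1 | head -n 1)"
  else
    printf '%-6s MISSING\n' "$cmd"
  fi
done
```

On systems where Python 3 is installed as `python3`, substitute that name in the loop.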

Quick Start (4 Steps)

Step 1: Download Scrapalot Desktop

Download the latest release from our website or GitHub:

bash
# Option 1: Download from website
# Visit https://scrapalot.app/download

# Option 2: Clone from GitHub (when available)
# Desktop app repository will be announced soon
# UI and connector development kit are open source

Open Source Components

  • Desktop App Core - Full RAG capabilities (coming soon)
  • UI Components - React frontend (open source)
  • Connector Development Kit - Build custom connectors (open source)
  • Documentation - This site is public, code is for reference only

Step 2: Database Setup (Docker - Easiest)

Using Docker (Recommended):

bash
# PostgreSQL 16+ with pgvector
docker run -d \
  --name scrapalot-db \
  -e POSTGRES_PASSWORD=your_password \
  -e POSTGRES_DB=scrapalot \
  -p 5432:5432 \
  pgvector/pgvector:pg16

# Verify it's running
docker ps
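Once the container is up, you can enable and confirm the pgvector extension right away - a sketch assuming the container name from the command above and the image's default postgres superuser:

```shell
# Enable the pgvector extension inside the container (idempotent), then
# confirm it is registered.
docker exec scrapalot-db psql -U postgres -d scrapalot \
  -c "CREATE EXTENSION IF NOT EXISTS vector;"
docker exec scrapalot-db psql -U postgres -d scrapalot \
  -c "SELECT extversion FROM pg_extension WHERE extname = 'vector';"
```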

Alternative: Manual PostgreSQL Install

Ubuntu/Debian:

bash
sudo apt-get install postgresql-16 postgresql-16-pgvector
sudo -u postgres createdb scrapalot
sudo -u postgres psql -d scrapalot -c "CREATE EXTENSION vector;"

macOS:

bash
brew install postgresql@16 pgvector
createdb scrapalot
psql scrapalot -c "CREATE EXTENSION vector;"

Windows: Download PostgreSQL 16+ from postgresql.org and install pgvector extension.
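Whichever route you chose, note your connection settings for the app configuration - a hypothetical .env sketch using the variable names referenced in the troubleshooting section (adjust the values to match your install):

```shell
# Example database settings (.env) - values shown match the Docker command above.
POSTGRES_HOST=localhost
POSTGRES_PORT=5432
POSTGRES_DB=scrapalot
POSTGRES_USER=postgres
POSTGRES_PASSWORD=your_password
```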

Step 3: Install & Configure

Desktop App Coming Soon

The desktop application is currently in development. Installation instructions will be updated when the release is available.

For now, you can sign up for the free Researcher Plan on our cloud platform at scrapalot.app to start using Scrapalot immediately.

When the desktop app is released:

bash
# Install desktop app (instructions will be provided)
# Configure database connection
# Set up AI provider (OpenAI, Anthropic, Google, or local Ollama)

AI Model Options

  • Cloud providers: OpenAI, Anthropic, Google (API key required)
  • Local models: Ollama (free, runs on your machine)
  • Cost-effective: DeepSeek (10x cheaper than OpenAI)

Step 4: Launch Scrapalot

Once the desktop app is installed:

bash
# Launch the application
# Desktop app will start on http://localhost:3000

🎉 Open your browser to http://localhost:3000

You'll see the Scrapalot login/signup page!

First Steps in Scrapalot

1. Create Your Account

  1. Click "Sign Up" on the homepage
  2. Enter your email and password
  3. Click "Create Account"
  4. You'll be automatically logged in

2. Upload Your First Document

  1. Click the Upload button (top right)
  2. Select a PDF, Word doc, or text file
  3. Wait 10-30 seconds for processing
  4. You'll see a notification when it's ready

Supported Formats:

  • PDF (.pdf)
  • EPUB (.epub)
  • Word (.docx, .doc)
  • Text (.txt)
  • Markdown (.md)
  • CSV (.csv)

3. Ask Your First Question

  1. Click on your uploaded document
  2. Type a question in the chat box
  3. Press Enter or click Send
  4. Get an AI-powered answer with sources!

Example Questions:

  • "What is this document about?"
  • "Summarize the main points"
  • "What does it say about [specific topic]?"
  • "List all the key findings"

Configuration Options

Using Local Models (Ollama)

Install Ollama:

bash
# Install Ollama
curl -fsSL https://ollama.com/install.sh | sh

# Pull a model
ollama pull nemotron-3-nano:30b

# Start Ollama (runs on http://localhost:11434 by default)
ollama serve
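Before configuring the UI, it can help to confirm the Ollama server is actually reachable - a sketch using Ollama's /api/tags endpoint, which lists the models you have pulled, with a fallback message if the server is down:

```shell
# Check that Ollama answers on its default port; /api/tags lists pulled models.
curl -fsS http://localhost:11434/api/tags || echo "Ollama not reachable - is 'ollama serve' running?"
```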

Configure in Scrapalot UI:

  1. Open Settings → AI Providers tab (Remote AI Providers)
  2. Click + Add Provider button
  3. Select Ollama from the provider list
  4. The endpoint will be auto-filled with http://localhost:11434
  5. Click Test Connection to verify Ollama is running
  6. Click Fetch Models to see all your pulled Ollama models
  7. Select the models you want to use and click Add Provider

[Screenshot: AI Providers tab showing Ollama, vLLM, and LM Studio configuration]

No API keys needed - your models run completely offline on your machine!

Other Local Model Providers

You can also use vLLM and LM Studio the same way:

  • vLLM: High-performance inference server (requires custom endpoint URL)
  • LM Studio: User-friendly desktop app (default: http://localhost:1234/v1)

All three are configured in the AI Providers tab, not the Local AI tab (which is for server-side GGUF models).

Advanced Settings

Scrapalot includes many other configuration options accessible through the Settings UI:

  • General Settings - Theme, language, font preferences
  • Documents - Chunking strategies, embedding models
  • Prompts - Custom prompt templates
  • Workspaces - Team collaboration settings

[Screenshot: General settings for theme and appearance customization]

Verify Installation

Check that everything is working:

bash
# Backend health check
curl http://localhost:8090/health

# Should return:
# {"status":"healthy"}

# Frontend should be accessible at:
# http://localhost:3000
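Both checks can be wrapped in a small polling helper that waits for a service to come up - a sketch assuming curl is installed; `wait_for` is a hypothetical name, not part of Scrapalot:

```shell
# Poll a URL until it responds, up to a number of one-second tries.
wait_for() {
  url=$1; tries=${2:-30}
  i=0
  while [ "$i" -lt "$tries" ]; do
    if curl -fsS "$url" >/dev/null 2>&1; then
      echo "up: $url"
      return 0
    fi
    i=$((i + 1))
    sleep 1
  done
  echo "timed out waiting for $url"
  return 1
}

# Example: wait_for http://localhost:8090/health
```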

Troubleshooting

Backend Won't Start

Error: "Connection to database failed"

bash
# Check if PostgreSQL is running
docker ps  # if using Docker
# or
sudo systemctl status postgresql  # if local

# Check connection settings in .env
# Make sure POSTGRES_HOST, POSTGRES_PORT match your setup

Error: "ModuleNotFoundError"

bash
# Reinstall dependencies
pip install -r requirements.txt --force-reinstall

Frontend Won't Start

Error: "EADDRINUSE: address already in use"

bash
# Port 3000 is already in use, use a different port:
npm run dev -- --port 3001
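If you would rather free the port than move, you can first find out what is holding it - a sketch using lsof (on Linux, `ss -ltnp` gives similar output):

```shell
# Show the process listening on port 3000, or note that the port is free.
lsof -nP -iTCP:3000 -sTCP:LISTEN || echo "port 3000 is free"
```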

Error: "Cannot connect to backend"

bash
# Make sure backend is running on port 8090
curl http://localhost:8090/health

# Check VITE_API_URL in frontend
# Should be http://localhost:8090

Database Issues

Error: "relation does not exist"

bash
# Run migrations
cd scrapalot-chat
python -m alembic upgrade head

Error: "PGVector extension not found"

bash
# Install PGVector extension
# Connect to your database
psql -U scrapalot -d scrapalot

# Run:
CREATE EXTENSION IF NOT EXISTS vector;
\q

Upload Not Working

Documents stuck in "Processing"

bash
# Check backend logs for errors
# Make sure you have enough disk space
# Try a smaller document first (< 5MB)
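The disk-space check above can be done in one line - a sketch reporting free space on the current volume (document processing needs working room for chunks and embeddings):

```shell
# Print available space on the volume holding the current directory.
df -h . | awk 'NR==2 {print "available:", $4}'
```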

Next Steps

Now that you have Scrapalot running:

Learn More

Advanced Features

Deployment

Tips for Success

Performance Tips

  1. Use SSD storage for faster document processing
  2. Allocate 4GB+ RAM for optimal performance
  3. Use local models for privacy and cost savings
  4. Enable caching for frequently accessed documents

Best Practices

  1. Organize documents into collections by topic
  2. Use descriptive names for easy searching
  3. Ask specific questions for better answers
  4. Review sources to verify information
  5. Share collections with your team

Security Tips

  1. Change default passwords in .env
  2. Use strong passwords for user accounts
  3. Enable HTTPS in production
  4. Keep API keys secure (never commit to git)
  5. Back up your database regularly

Getting Help

Community Support

Documentation

You're Ready!

Congratulations! You now have Scrapalot running. Start uploading documents and asking questions!


Try Scrapalot Now

Desktop App: Coming soon - sign up for updates at scrapalot.app

Cloud (Free Researcher Plan): Start immediately at scrapalot.app

Need Team Features? Check out our Professional and Enterprise plans.

Released under the MIT License.