Local RAG Pipeline

Local-first hybrid retrieval system — BM25 + vector search over messy enterprise documents with ONNX embeddings, reranking, and resilient indexing. No API calls, no cloud dependency.

Private Repository

A local CLI tool for indexing and searching company documents with hybrid retrieval. It combines BM25 full-text search, semantic vector search, and optional reranking into one local-first pipeline for knowledge retrieval across messy internal files. Built with Bun and TypeScript and backed by SQLite with FTS5 and sqlite-vec, it makes no external API calls and has no cloud dependency: everything runs on the local machine.
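The description above does not say how the BM25 and vector result lists are merged into one ranking. One common option is reciprocal rank fusion (RRF), which uses only rank positions and so avoids comparing BM25 scores with vector distances. The sketch below is an illustration under that assumption, not the project's actual method; the function name, the chunk IDs, and the constant k = 60 are all hypothetical.

```typescript
// Reciprocal rank fusion (RRF): merge ranked ID lists using rank positions only.
// Assumption: each input list is sorted best-first and contains chunk IDs.
function reciprocalRankFusion(
  rankings: string[][],
  k = 60, // damping constant; 60 is a commonly used default, not a project setting
): { id: string; score: number }[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      // index is 0-based, so the 1-based rank is index + 1
      const contribution = 1 / (k + index + 1);
      scores.set(id, (scores.get(id) ?? 0) + contribution);
    });
  }
  // Highest fused score first
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Hypothetical result lists from the two retrievers.
const bm25Hits = ["chunk-a", "chunk-b", "chunk-c"];
const vectorHits = ["chunk-b", "chunk-d", "chunk-a"];
const fused = reciprocalRankFusion([bm25Hits, vectorHits]);
console.log(fused.map((r) => r.id));
```

Chunks that appear in both lists accumulate two contributions, so they tend to rise above chunks found by only one retriever. An optional reranker would then rescore just the top of this fused list.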

  • 117 TypeScript source files
  • 15 test files
  • Hybrid: BM25 + vector search
  • Local: ONNX embeddings
  • 12+ file formats supported
  • SQLite: FTS5 + sqlite-vec

What was built

  • Hybrid retrieval engine — BM25 full-text search via SQLite FTS5 and semantic vector search via sqlite-vec, combined into a single ranked result set with optional second-stage local reranking
  • Local embedding pipeline — ONNX-based embedding model running locally, converting document chunks into vectors without round-tripping to an API
  • Multi-format document ingestion — parsers for PDF, Word, PowerPoint, HTML, Excel, CSV, email (.eml), images, and plaintext, handling the messy reality of enterprise file systems
  • Resilient indexing pipeline — content hashing to skip unchanged files, crash isolation for heavy parsing loads, and queued processing to keep the system stable under scale
  • Live re-indexing — file watcher that detects changes and automatically re-indexes affected documents, keeping the search index current without manual intervention
  • Configurable chunking — document splitting with adjustable chunk size and overlap, feeding both the BM25 and vector indexes from a single chunking pass
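As a rough illustration of chunking with adjustable size and overlap, the sketch below splits text into fixed-size character windows. It is a minimal sketch under stated assumptions: sizes are counted in characters (the real pipeline may count tokens or respect sentence boundaries), and the names `Chunk` and `chunkText` are hypothetical.

```typescript
// Fixed-size sliding-window chunking with overlap.
// Assumption: sizes are measured in characters, not tokens.
interface Chunk {
  index: number; // position of the chunk in the document
  start: number; // character offset of the chunk in the source text
  text: string;
}

function chunkText(text: string, chunkSize: number, overlap: number): Chunk[] {
  if (chunkSize <= 0) throw new RangeError("chunkSize must be positive");
  if (overlap < 0 || overlap >= chunkSize) {
    throw new RangeError("overlap must be in [0, chunkSize)");
  }
  const step = chunkSize - overlap; // how far each window advances
  const chunks: Chunk[] = [];
  for (let start = 0; start < text.length; start += step) {
    chunks.push({
      index: chunks.length,
      start,
      text: text.slice(start, start + chunkSize),
    });
    // Stop once a window has reached the end, so no trailing chunk
    // consists only of already-covered overlap.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}

const chunks = chunkText("abcdefghij", 4, 1);
console.log(chunks.map((c) => c.text)); // [ "abcd", "defg", "ghij" ]
```

Because the same chunk objects can be handed to both the full-text index and the embedding step, a single pass like this keeps chunk IDs aligned across the two indexes.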

  • BM25 full-text search via SQLite FTS5
  • Semantic vector search via sqlite-vec
  • Optional second-stage local reranker
  • Local ONNX embedding model, no API calls
  • PDF, Word, PowerPoint, HTML, Excel, CSV, email, and image parsing
  • Content hashing: unchanged files are skipped on re-index
  • File watcher for automatic re-indexing on changes
  • Crash isolation for resilient parsing under heavy loads
  • Chunking pipeline with configurable overlap and size
  • CLI for indexing, searching, and configuration
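The content-hashing idea in the list above can be sketched as follows: hash each file's bytes and skip the file when the hash matches the one recorded at the last index run. This is an assumption-laden sketch, not the project's code. SHA-256, the in-memory `HashStore`, and the function names are hypothetical; a real index would presumably persist hashes in SQLite.

```typescript
import { createHash } from "node:crypto";

// Hypothetical store mapping file path -> content hash from the last index run.
type HashStore = Map<string, string>;

// Hex-encoded SHA-256 of the file content (the hash choice is an assumption).
function contentHash(content: string | Uint8Array): string {
  return createHash("sha256").update(content).digest("hex");
}

// Returns true when the file is new or its content changed, and records the
// new hash. Returns false when the content is identical to the last run.
function needsReindex(
  store: HashStore,
  path: string,
  content: string | Uint8Array,
): boolean {
  const hash = contentHash(content);
  if (store.get(path) === hash) return false; // unchanged: skip parse and embed
  store.set(path, hash);
  return true;
}

const store: HashStore = new Map();
console.log(needsReindex(store, "docs/report.txt", "v1")); // true  (new file)
console.log(needsReindex(store, "docs/report.txt", "v1")); // false (unchanged)
console.log(needsReindex(store, "docs/report.txt", "v2")); // true  (content changed)
```

For simplicity the sketch records the hash before any indexing happens. A more careful version would store it only after parsing and embedding succeed, so that a crash mid-index does not mark the file as up to date.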

Architecture

Codebase by Layer

117 total
Parsers (PDF, Word, PPTX, HTML, Excel, CSV, email, images): 24
Search / Retrieval / Reranker: 18
Indexer / Chunking / Embeddings: 22
CLI / Config / Watcher / DB: 38
Tests: 15

Tech stack

Bun · TypeScript · SQLite · FTS5 · sqlite-vec · ONNX Runtime · CLI

Development

Feb 2026: Repo created, core retrieval pipeline and parser architecture
Mar 2026: Hybrid search, reranker, file watcher, crash isolation, 12+ format support

Why this project matters

This is retrieval infrastructure, not a chatbot wrapper. It covers the full arc of a RAG system — document parsing, chunking, embedding, hybrid search, reranking — built to work on real enterprise files without depending on cloud services. It demonstrates search systems design, document intelligence, and the practical engineering of making AI retrieval reliable on messy, real-world data.