Guides
Build search you own.
Practical, self-hosted-first guides to RAG, AI search, and vector databases. No fluff, no vendor pitches — just how to run it yourself.
Self-Hosted RAG
Private AI knowledge bases — retrieval-augmented generation you run yourself.
The Best Local Embedding Models for RAG (2026)
A practical comparison of local, self-hostable embedding models for RAG — nomic-embed-text, mxbai-embed-large, bge, e5, gte — with dimensions, licenses, and how to pick.
Read guide
Build a Private RAG System on a VPS: A Step-by-Step Tutorial
A hands-on tutorial to build a private, self-hosted RAG system on a VPS: provision the box, run Ollama, stand up a vector store, build the pipeline, and ship a FastAPI app.
Read guide
Self-Hosted RAG vs OpenAI + Pinecone: A Real Cost Breakdown
An honest, itemized cost comparison of self-hosted RAG versus OpenAI embeddings plus Pinecone — compute, embeddings, storage, hidden costs, and when managed wins.
Read guide
Self-Hosted RAG: The Complete Guide to Private AI Knowledge Bases
Build a private, self-hosted RAG system you fully own. The reference stack, embedding and vector-store choices, VPS sizing, pitfalls, and when not to self-host.
Read guide
How to Evaluate a RAG System: Metrics, Golden Sets, and Regression Testing
Evaluate RAG properly: retrieval metrics (recall@k, MRR, nDCG), generation metrics (faithfulness, relevance), golden sets, RAGAS and LLM-as-judge, self-hosted.
Read guide
Production RAG: Taking Self-Hosted Retrieval From Demo to Reliable Service
Take self-hosted RAG to production: caching, observability, latency and cost control, access control, data freshness, eval in CI, and scaling the vector store.
Read guide
RAG Chunking Strategies: How to Split Documents for Better Retrieval
A practical guide to chunking strategies for RAG: fixed-size, recursive, semantic and structure-aware splitting, overlap, parent-document retrieval and sizing.
Read guide
RAG vs Fine-Tuning: Which One Do You Actually Need? (2026)
A clear decision guide to RAG vs fine-tuning — what each does, the cost, latency and maintenance tradeoffs, hallucinations, and when to combine both.
Read guide
RAG vs Long Context: Do You Still Need Retrieval in 2026?
An honest 2026 take on RAG vs long-context LLMs: cost, latency, accuracy, 'lost in the middle', when stuffing context wins, when retrieval wins, plus the hybrid approach.
Read guide
RAG Reranking: How a Two-Stage Retrieve-Then-Rerank Pipeline Beats Raw Top-K
Add a reranker to your RAG pipeline: why retrieve-then-rerank beats raw vector top-k, cross-encoders vs bi-encoders, self-hostable models, latency tradeoffs.
Read guide
Chat With Your Documents, Self-Hosted: Build a Private PDF Q&A Assistant
Build a private 'chat with your PDFs and docs' assistant you self-host: ingest, embed, store, retrieve and answer with a local LLM and a UI. Real commands.
Read guide
Open-Source AI Search
Self-hosted Perplexity alternatives and neural answer engines.
Open-Source Perplexity Alternatives: Self-Hosted AI Search (2026)
The best open-source, self-hosted Perplexity alternatives in 2026 — Vane (formerly Perplexica), Khoj, SurfSense and SearXNG compared, with setup and privacy notes.
Read guide
Self-Host SearXNG: Your Own Private Metasearch Engine (No Tracking)
How to self-host SearXNG for private, ad-free metasearch — Docker setup, configuration basics, privacy benefits, and when to add an Ollama LLM for AI answers.
Read guide
How to Self-Host Vane (formerly Perplexica): A Complete Guide
Step-by-step guide to self-hosting Vane (ex-Perplexica), the top open-source Perplexity alternative — Docker setup, Ollama or cloud LLMs, config, and privacy.
Read guide
How to Self-Host Khoj: Your Private AI Second Brain
A guide to self-hosting Khoj — the AGPL-3.0 open-source AI second brain with pgvector. Docker setup, connecting your docs and local LLMs, and Khoj vs Vane.
Read guide
Vector Databases
How vector search works and which engines to self-host.
Foundations
Semantic search, embeddings, and the concepts behind modern search.
What Are Embeddings? A Plain-English Guide for Developers
What embeddings are, how text and images become vectors, what dimensions and cosine similarity mean, and how they power semantic search and RAG systems.
Read guide
What Is Semantic Search? Embeddings, Keywords, and Hybrid Explained
A plain-English guide to semantic search: how it differs from keyword search, what embeddings are, how hybrid search works, and when to use each approach.
Read guide