Guides

Build search you own.

Practical, self-hosted-first guides to RAG, AI search, and vector databases. No fluff, no vendor pitches — just how to run it yourself.

Self-Hosted RAG

Private AI knowledge bases — retrieval-augmented generation you run yourself.

The Best Local Embedding Models for RAG (2026)

A practical comparison of local, self-hostable embedding models for RAG — nomic-embed-text, mxbai-embed-large, bge, e5, gte — with dimensions, licenses, and how to pick.

Build a Private RAG System on a VPS: A Step-by-Step Tutorial

A hands-on tutorial for building a private, self-hosted RAG system on a VPS: provision the box, run Ollama, stand up a vector store, build the pipeline, and ship a FastAPI service.

Self-Hosted RAG vs OpenAI + Pinecone: A Real Cost Breakdown

An honest, itemized cost comparison of self-hosted RAG versus OpenAI embeddings plus Pinecone — compute, embeddings, storage, hidden costs, and when managed wins.

Self-Hosted RAG: The Complete Guide to Private AI Knowledge Bases

Build a private, self-hosted RAG system you fully own. The reference stack, embedding and vector-store choices, VPS sizing, pitfalls, and when not to self-host.

How to Evaluate a RAG System: Metrics, Golden Sets, and Regression Testing

Evaluate RAG properly: retrieval metrics (recall@k, MRR, nDCG), generation metrics (faithfulness, relevance), golden sets, and RAGAS and LLM-as-judge scoring, all self-hosted.

Production RAG: Taking Self-Hosted Retrieval From Demo to Reliable Service

Take self-hosted RAG to production: caching, observability, latency and cost control, access control, data freshness, eval in CI, and scaling the vector store.

RAG Chunking Strategies: How to Split Documents for Better Retrieval

A practical guide to chunking strategies for RAG: fixed-size, recursive, semantic and structure-aware splitting, overlap, parent-document retrieval and sizing.

RAG vs Fine-Tuning: Which One Do You Actually Need? (2026)

A clear decision guide to RAG vs fine-tuning — what each does, the cost, latency and maintenance tradeoffs, hallucinations, and when to combine both.

RAG vs Long Context: Do You Still Need Retrieval in 2026?

An honest 2026 take on RAG vs long-context LLMs: cost, latency, accuracy, 'lost in the middle', when stuffing context wins, when retrieval wins, and the hybrid approach.

RAG Reranking: How a Two-Stage Retrieve-Then-Rerank Pipeline Beats Raw Top-K

Add a reranker to your RAG pipeline: why retrieve-then-rerank beats raw vector top-k, cross-encoders vs bi-encoders, self-hostable models, latency tradeoffs.

Chat With Your Documents, Self-Hosted: Build a Private PDF Q&A Assistant

Build a private 'chat with your PDFs and docs' assistant you self-host: ingest, embed, store, retrieve and answer with a local LLM and a UI. Real commands.

Open-Source AI Search

Self-hosted Perplexity alternatives and neural answer engines.

Open-Source Perplexity Alternatives: Self-Hosted AI Search (2026)

The best open-source, self-hosted Perplexity alternatives in 2026 — Vane (formerly Perplexica), Khoj, SurfSense and SearXNG compared, with setup and privacy notes.

Self-Host SearXNG: Your Own Private Metasearch Engine (No Tracking)

How to self-host SearXNG for private, ad-free metasearch — Docker setup, configuration basics, privacy benefits, and when to add an Ollama LLM for AI answers.

How to Self-Host Vane (formerly Perplexica): A Complete Guide

Step-by-step guide to self-hosting Vane (ex-Perplexica), the top open-source Perplexity alternative — Docker setup, Ollama or cloud LLMs, config, and privacy.

How to Self-Host Khoj: Your Private AI Second Brain

A guide to self-hosting Khoj — the AGPL-3.0 open-source AI second brain with pgvector. Docker setup, connecting your docs and local LLMs, and Khoj vs Vane.

Vector Databases

How vector search works and which engines to self-host.

What Is a Vector Database? ANN Indexes, Filtering, and Hybrid Search

What a vector database is and how it works — ANN indexes (HNSW, IVF), metadata filtering, hybrid search, and when you need one vs pgvector or FAISS.

Foundations

Semantic search, embeddings, and the concepts behind modern search.

What Are Embeddings? A Plain-English Guide for Developers

What embeddings are, how text and images become vectors, what dimensions and cosine similarity mean, and how they power semantic search and RAG systems.

What Is Semantic Search? Embeddings, Keywords, and Hybrid Explained

A plain-English guide to semantic search: how it differs from keyword search, what embeddings are, how hybrid search works, and when to use each approach.
