How to Self-Host Khoj: Your Private AI Second Brain

An open-source AI second brain you run yourself — chat with your own documents and the web, with local or cloud models, on your hardware.

By Aquila Team · Updated June 19, 2026

Khoj is an open-source, self-hostable “AI second brain”: a personal AI assistant that answers from both the web and your own documents, runs local or cloud LLMs, and is reachable from your browser, Obsidian, Emacs, desktop, and phone. Where Vane (formerly Perplexica) is a Perplexity-style engine pointed at the live web, Khoj is built around your stuff: your PDFs, Markdown notes, Word docs, Notion pages, and more. This guide covers what Khoj is, how to self-host it with Docker, how to connect your documents and a local model, and when to choose Khoj over Vane.

The canonical repository is github.com/khoj-ai/khoj. For the wider field of AI answer engines, see our roundup of open-source Perplexity alternatives.

What Khoj is

Khoj describes itself as an AI second brain. In practice that means it combines three things that hosted assistants usually make you choose between:

  1. Answers from your own documents. You point Khoj at your files — PDF, Markdown, Word, Notion, org-mode, even images — and it indexes them for semantic search so you can ask questions and get grounded, cited answers from your own knowledge.
  2. Answers from the web. It can also search online to answer questions that aren’t in your documents.
  3. Acts as an assistant. Beyond search, it supports custom agents, automations, and scheduled or “deep” research.

It reaches you wherever you work: a web app, an Obsidian plugin, an Emacs package, a desktop app, a phone app, and even WhatsApp. That multi-client reach is a defining feature — your second brain follows you across tools.

Khoj
What it is     | Self-hostable AI second brain / personal assistant
License        | AGPL-3.0
GitHub stars   | ~35.2k (as of June 2026)
Backing        | Y Combinator (W24)
Vector store   | pgvector (PostgreSQL + pgvector extension)
Document types | PDF, Markdown, Word, Notion, org-mode, images
Local LLMs     | Yes, via Ollama (llama3, qwen, gemma, mistral)
Cloud LLMs     | GPT, Claude, Gemini, DeepSeek
Clients        | Browser, Obsidian, Emacs, Desktop, Phone, WhatsApp

Two facts shape how you should think about Khoj. First, it’s AGPL-3.0 licensed. That is strong copyleft, and it matters if you intend to embed Khoj inside a product you distribute or to offer a modified version to others as a network service, since the AGPL’s source-sharing terms extend to network use (compare that to Vane’s permissive MIT license). For running it unmodified for yourself or your team, AGPL is no obstacle at all. Second, it uses pgvector, the PostgreSQL extension, as its vector store, which means the semantic search underneath your documents runs on plain, battle-tested Postgres rather than a separate vector database. If you want to understand that layer, see our rundown of self-hosted vector databases and what embeddings are.

How Khoj works under the hood

When you ask Khoj a question about your documents, it’s running retrieval-augmented generation (RAG) — though you never have to think in those terms. The pipeline:

  1. Ingestion. Your files are chunked and converted to embeddings — numerical vectors that capture meaning.
  2. Storage. Those vectors live in pgvector inside PostgreSQL.
  3. Retrieval. Your question is embedded the same way, and pgvector returns the stored chunks whose vectors are closest to it, which is what makes the search semantic rather than keyword-based.
  4. Generation. The retrieved chunks are handed to your chosen LLM, which writes a grounded answer with citations back to your source files.
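
To make those four steps concrete, here is a minimal Python sketch of the same loop. It is an illustration under loud assumptions, not Khoj's code: the bag-of-words embed function stands in for a real embedding model, a plain list stands in for pgvector, and every function name is made up for the example. In a real pgvector setup the ranking step would be a SQL query ordered by a distance operator such as <=> (cosine distance).

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 40) -> list[str]:
    """Step 1a: split text into consecutive chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Step 1b: toy embedding, a word-count vector (stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def build_index(docs: dict[str, str]) -> list[tuple[str, str, Counter]]:
    """Step 2: store (source, chunk, vector) rows; a list stands in for pgvector."""
    return [(src, c, embed(c)) for src, text in docs.items() for c in chunk(text)]


def retrieve(index: list[tuple[str, str, Counter]], question: str, k: int = 2) -> list[tuple[str, str]]:
    """Step 3: embed the question and return the k most similar chunks."""
    q = embed(question)
    ranked = sorted(index, key=lambda row: cosine(q, row[2]), reverse=True)
    return [(src, c) for src, c, _ in ranked[:k]]


def build_prompt(question: str, hits: list[tuple[str, str]]) -> str:
    """Step 4: hand the retrieved chunks, tagged with their source, to the LLM."""
    context = "\n".join(f"[{src}] {c}" for src, c in hits)
    return f"Answer using only the sources below and cite them.\n{context}\n\nQuestion: {question}"


# Hypothetical two-file knowledge base.
docs = {
    "lease.md": "The apartment lease renews every March and the rent is due on the first.",
    "garden.md": "Tomato seedlings should be planted after the last frost in spring.",
}
index = build_index(docs)
hits = retrieve(index, "When is the rent due?", k=1)
prompt = build_prompt("When is the rent due?", hits)
print(prompt)
```

The final prompt string is what would be passed to the model in step 4; the bracketed file names are what let the answer cite its sources.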

A typical self-hosted Khoj deployment therefore brings up a few services together — the Khoj app, a PostgreSQL + pgvector database, and supporting pieces (a SearXNG instance for web search and a sandboxed code-execution environment for some advanced features). Docker Compose wires these together so you don’t assemble them by hand.

Why self-host Khoj

You can use Khoj’s hosted cloud version, but self-hosting is the point if you care about ownership:

  • Your documents stay yours. With a local model, your files, their embeddings, and the answers generated from them never leave your infrastructure — nothing is sent to a third party, nothing trains someone else’s model.
  • Model choice. Run a local LLM through Ollama for full privacy, or wire in a cloud model (GPT, Claude, Gemini, DeepSeek) when you want more reasoning horsepower.
  • No subscription, no caps. Index as many documents as your hardware allows; there’s no per-seat or per-document billing on your own box.
  • It’s your assistant. Custom agents and automations run on your terms, on your schedule, against your data.

The tradeoff is the usual self-hosting one: you provision it, patch it, and keep Postgres healthy. Khoj is heavier to run than a bare metasearch engine because it carries a database and a document index — but that’s exactly what buys you private document Q&A.

Self-hosting Khoj: setup overview

This is the general shape of a Khoj deployment. Always follow the project’s own documentation and README for exact, current commands — Khoj ships a Docker Compose setup that it maintains and updates between releases.

1. Prerequisites

You need a host with Docker and Docker Compose. Because Khoj runs PostgreSQL plus a document index (and optionally a local LLM), give it more headroom than a metasearch engine — a 4 GB+ RAM machine is a sensible floor, and more if you index a large corpus or run local models on the same box. If you’ll do local inference, a GPU helps considerably.
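
For a rough sense of how corpus size translates into storage, the arithmetic below estimates the raw size of the stored vectors. It relies on pgvector's documented per-vector cost of 4 bytes per dimension plus 8 bytes. The corpus size, the chunks-per-note figure, and the 384-dimension embedding are hypothetical numbers picked for illustration, and the estimate deliberately ignores chunk text, indexes, and Postgres overhead.

```python
def estimate_vector_bytes(num_chunks: int, dimensions: int) -> int:
    """Raw vector storage only: 4 bytes per dimension plus 8 bytes per vector."""
    return num_chunks * (4 * dimensions + 8)


# Hypothetical corpus: 2,000 notes averaging 10 chunks each, 384-dim embeddings.
num_chunks = 2_000 * 10
size_mib = estimate_vector_bytes(num_chunks, 384) / (1024 ** 2)
print(f"{num_chunks} chunks -> about {size_mib:.1f} MiB of raw vectors")
```

Numbers like these suggest that the vectors for a personal corpus tend to be small; the RAM floor is driven far more by Postgres, the Khoj server, and any local model sharing the box.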

2. Get the Docker Compose setup

Khoj publishes a docker-compose.yml that defines the whole stack — the Khoj server, the PostgreSQL + pgvector database, and supporting services. The standard flow is to download that compose file (or clone the repo), set a few environment values, and bring everything up:

docker compose up -d

On first run, the database initializes and the Khoj server starts. Give it a minute, then open the web UI in your browser at the host port that the compose file publishes for the Khoj server.
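
Rather than guessing when "a minute" is up, you can poll the server until it answers. The helper below is a small sketch using only the Python standard library; the function name is made up for this guide, and the URL in the usage comment assumes Khoj is published on port 42110, so check your compose file and adjust.

```python
import http.client
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 120.0, interval: float = 2.0) -> bool:
    """Poll url until any HTTP response arrives or timeout seconds have passed."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval):
                return True
        except urllib.error.HTTPError:
            # The server answered, even if with an error status: it is up.
            return True
        except (OSError, http.client.HTTPException):
            # Connection refused, reset, or timed out: not ready yet.
            time.sleep(interval)
    return False


# Usage (assumed address, adjust to your compose file):
#   if wait_for_http("http://localhost:42110"):
#       print("Khoj is up")
```

Treating an HTTP error status as "up" is a deliberate choice here: the goal is only to learn that the server process is accepting connections, not that a particular page works.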

3. Create your account and choose a model

On first launch you set up an admin account through the web UI. Then you configure your LLM:

  • Local (Ollama): run Ollama, pull a model (ollama pull llama3.1), and point Khoj at your Ollama endpoint. This keeps generation fully on your hardware.
  • Cloud: add an API key for GPT, Claude, Gemini, or DeepSeek when you want frontier-grade answers.
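
Before pointing Khoj at Ollama, it helps to confirm the endpoint is reachable and the model you pulled is actually there. This sketch queries Ollama's GET /api/tags endpoint on its default port, 11434. The function names are made up for this guide and the sample payload is hand-written in the shape Ollama returns; if Ollama runs on another host or inside a container, the base URL will differ.

```python
import json
import urllib.request


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names from the JSON body returned by GET /api/tags."""
    return [m["name"] for m in tags_payload.get("models", [])]


def list_ollama_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask a running Ollama server which models it has pulled."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return model_names(json.load(resp))


# Hand-written sample payload, so this part runs without a live server.
sample = {"models": [{"name": "llama3.1:latest"}, {"name": "qwen2.5:7b"}]}
print(model_names(sample))  # ['llama3.1:latest', 'qwen2.5:7b']
```

If list_ollama_models() raises a connection error from the machine (or container) where Khoj runs, Khoj will not be able to reach that endpoint either, which is worth ruling out before debugging anything in the Khoj settings.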

4. Connect your documents

This is the part that makes Khoj a second brain rather than a web search box. You connect your knowledge sources:

  • Direct uploads of PDFs, Word, Markdown, and other files through the web UI.
  • The Obsidian or Emacs plugin, which syncs your existing notes vault into Khoj automatically.
  • Notion and other supported sources.

Khoj indexes what you connect into pgvector. From then on, your questions search across your own knowledge, and answers cite the source files they came from. The first index of a large vault takes a while (it has to embed everything); subsequent updates are incremental.
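
One common way to make updates incremental is to fingerprint each file and re-embed only what changed. The sketch below shows that idea in isolation; it is an illustration of the general technique, not a description of how Khoj implements it, and the function names and file contents are hypothetical.

```python
import hashlib


def fingerprint(content: bytes) -> str:
    """Stable content hash used to detect whether a file changed."""
    return hashlib.sha256(content).hexdigest()


def plan_reindex(seen: dict[str, str], files: dict[str, bytes]) -> tuple[list[str], list[str]]:
    """Return (paths to embed again, paths to drop from the index)."""
    to_embed = sorted(p for p, data in files.items() if seen.get(p) != fingerprint(data))
    to_drop = sorted(p for p in seen if p not in files)
    return to_embed, to_drop


# Fingerprints recorded at the last sync versus the vault as it looks now.
seen = {"a.md": fingerprint(b"alpha"), "b.md": fingerprint(b"beta")}
files = {"a.md": b"alpha", "b.md": b"beta v2", "c.md": b"gamma"}
print(plan_reindex(seen, files))  # (['b.md', 'c.md'], [])
```

Only the edited file and the new file are queued for embedding; the untouched one is skipped, which is why later syncs finish far faster than the first.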

5. Use it from your preferred client

Install whichever client fits your workflow — the browser app, the Obsidian plugin, the desktop or phone app — and they all talk to the same self-hosted server. Your index and your conversations stay on your hardware; the clients are just front-ends.

Connecting local models for full privacy

Like any self-hosted AI tool, Khoj presents a privacy-vs-quality dial in the model layer:

                     | Local (Ollama)                    | Cloud LLM
Where your data goes | Stays on your hardware            | Prompt + retrieved chunks sent to provider
Privacy              | Can be fully private              | Provider sees your document excerpts
Answer quality       | Good; bounded by hardware         | Frontier-grade
Cost                 | Hardware only                     | Per-token API fees
Best for             | Sensitive personal/work documents | Hardest questions, best phrasing

For a second brain, the privacy stakes are higher than for web search — you’re feeding it your actual notes, contracts, and research. If that’s sensitive, run everything local: Ollama for generation, local embedding models for indexing, pgvector on your own box. The whole loop stays inside your network. Reach for a cloud model only when a particular hard question justifies sending those excerpts out, and make that choice deliberately.
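
If you want that deliberate choice written down rather than remembered, it can be expressed as a tiny policy. The sketch below is purely hypothetical: the Excerpt type, the folder prefixes, and choose_backend are invented for illustration and are not Khoj settings. It defaults to the local model and allows the cloud only when you opt in and no retrieved excerpt comes from a folder you marked sensitive.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Excerpt:
    source: str  # path of the file the retrieved chunk came from
    text: str


# Hypothetical policy: chunks from these folders never leave the network.
SENSITIVE_PREFIXES = ("contracts/", "journal/")


def choose_backend(excerpts: list[Excerpt], allow_cloud: bool = False) -> str:
    """Return "local" unless cloud use was explicitly allowed and nothing is sensitive."""
    if not allow_cloud:
        return "local"
    if any(e.source.startswith(SENSITIVE_PREFIXES) for e in excerpts):
        return "local"
    return "cloud"


hits = [Excerpt("research/llm-notes.md", "..."), Excerpt("contracts/lease.pdf", "...")]
print(choose_backend(hits, allow_cloud=True))  # local
```

The design choice worth copying is the default: the cloud path has to be asked for, and a single sensitive excerpt is enough to veto it.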

When to choose Khoj vs. Vane

Both are excellent self-hosted AI tools, but they’re built for different jobs.

                  | Khoj                                               | Vane (ex-Perplexica)
Primary job       | Chat with your own documents                       | Perplexity-style web answers
Best metaphor     | AI second brain                                    | AI search engine
Your-docs Q&A     | Core feature (pgvector RAG)                        | Limited
Web search        | Yes                                                | Yes (core, via bundled SearXNG)
Clients           | Browser, Obsidian, Emacs, desktop, phone, WhatsApp | Web UI
License           | AGPL-3.0 (copyleft)                                | MIT (permissive)
Stars (June 2026) | ~35.2k                                             | ~35.4k

The decision is mostly about where your questions come from:

  • Choose Khoj when your goal is to search and converse with your own knowledge — notes, documents, a research library — across many devices, with web search as a bonus. It’s the strongest option here for a personal or team knowledge assistant.
  • Choose Vane when your goal is Perplexity-style answers from the live web, and document Q&A isn’t your priority.
  • Mind the license if you’re embedding it in a product you’ll distribute or serving a modified version to outside users: Vane’s MIT is permissive; Khoj’s AGPL-3.0 is copyleft. For running either one unmodified in-house, the difference doesn’t matter.
  • Run both if you want — many people use Vane for the open web and Khoj for their own files. They don’t conflict.

For the full landscape including SurfSense and SearXNG, see open-source Perplexity alternatives.

Privacy and data ownership

Khoj’s whole proposition is that your second brain belongs to you. Self-hosted, pointed at a local model, with pgvector on your own server, the entire loop (your documents, their embeddings, your questions, and the generated answers) stays inside your network. No third party indexes your files, no provider trains on your notes, and if you give up web search you can run it air-gapped.

The one place that changes is the model layer: if you choose a cloud LLM, the prompt and the retrieved document excerpts go to that provider for generation. Given that those excerpts come from your private files, weigh this carefully. Keep generation local for anything sensitive, and reserve cloud models for questions where the quality is worth the exposure. The same data-ownership logic threads through our self-hosted RAG guide.

FAQ

What is Khoj used for? Khoj is a self-hostable AI “second brain” — a personal assistant that answers questions from your own documents (PDF, Markdown, Word, Notion, org-mode, images) and from the web. It supports custom agents and scheduled research, and you can reach it from the browser, Obsidian, Emacs, desktop, phone, and WhatsApp.

Is Khoj free and open source? Yes. Khoj is open source under the AGPL-3.0 license and is fully self-hostable, so there’s no software cost. Note that AGPL is a copyleft license: it becomes relevant if you embed and distribute Khoj inside your own product or offer a modified version as a network service, not when you simply run it yourself.

Does Khoj work with local LLMs? Yes. Khoj works with local models through Ollama (llama3, qwen, gemma, mistral, and others), so you can run it entirely on your own hardware. It also supports cloud models (GPT, Claude, Gemini, DeepSeek) when you want more reasoning power.

What database does Khoj use? Khoj uses pgvector — the vector extension for PostgreSQL — as its vector store for semantic search over your documents. A self-hosted deployment runs PostgreSQL with pgvector as part of its Docker Compose stack.

Khoj vs. Vane — which is better? Neither is “better”; they do different jobs. Khoj is built for chatting with your own documents across many clients. Vane is built for Perplexity-style answers from the live web. Choose by whether your questions are about your own files or the open web — and run both if you need both.


Khoj is one of the most full-featured ways to self-host a private AI assistant over your own knowledge. From here, understand the RAG pipeline underneath it, compare it with Vane and other answer engines, or read up on the vector database layer it runs on. Aquila is the independent home for AI search you own. Own your search.
