How to Self-Host Khoj: Your Private AI Second Brain
An open-source AI second brain you run yourself — chat with your own documents and the web, with local or cloud models, on your hardware.
Khoj is an open-source, self-hostable “AI second brain”: a personal AI assistant that answers from both the web and your own documents, works with local or cloud LLMs, and is reachable from your browser, Obsidian, Emacs, desktop, and phone. Where Vane (formerly Perplexica) is a Perplexity-style engine pointed at the live web, Khoj is built around your stuff: your PDFs, Markdown notes, Word docs, Notion pages, and more. This guide covers what Khoj is, how to self-host it with Docker, how to connect your documents and a local model, and when to choose Khoj over Vane.
The canonical repository is github.com/khoj-ai/khoj. For the wider field of AI answer engines, see our roundup of open-source Perplexity alternatives.
What Khoj is
Khoj describes itself as an AI second brain. In practice that means it does three things hosted assistants make you choose between:
- Answers from your own documents. You point Khoj at your files — PDF, Markdown, Word, Notion, org-mode, even images — and it indexes them for semantic search so you can ask questions and get grounded, cited answers from your own knowledge.
- Answers from the web. It can also search online to answer questions that aren’t in your documents.
- Acts as an assistant. Beyond search, it supports custom agents, automations, and scheduled or “deep” research.
It reaches you wherever you work: a web app, an Obsidian plugin, an Emacs package, a desktop app, a phone app, and even WhatsApp. That multi-client reach is a defining feature — your second brain follows you across tools.
| | Khoj |
|---|---|
| What it is | Self-hostable AI second brain / personal assistant |
| License | AGPL-3.0 |
| GitHub stars | ~35.2k (as of June 2026) |
| Backing | Y Combinator (W24) |
| Vector store | pgvector (PostgreSQL + pgvector extension) |
| Document types | PDF, Markdown, Word, Notion, org-mode, images |
| Local LLMs | Yes — Ollama (llama3, qwen, gemma, mistral) |
| Cloud LLMs | gpt, claude, gemini, deepseek |
| Clients | Browser, Obsidian, Emacs, Desktop, Phone, WhatsApp |
Two facts shape how you should think about Khoj. First, it’s AGPL-3.0 licensed — strong copyleft, which matters if you intend to embed Khoj inside a product you distribute (compare that to Vane’s permissive MIT license). For running it for yourself or your team, AGPL is no obstacle at all. Second, it uses pgvector — the PostgreSQL extension — as its vector store, which means the semantic search underneath your documents runs on plain, battle-tested Postgres rather than a separate vector database. If you want to understand that layer, see our rundown of self-hosted vector databases and what embeddings are.
How Khoj works under the hood
When you ask Khoj a question about your documents, it’s running retrieval-augmented generation (RAG) — though you never have to think in those terms. The pipeline:
- Ingestion. Your files are chunked and converted to embeddings — numerical vectors that capture meaning.
- Storage. Those vectors live in pgvector inside PostgreSQL.
- Retrieval. Your question is embedded the same way, and pgvector returns the chunks whose vectors are most similar to it.
- Generation. The retrieved chunks are handed to your chosen LLM, which writes a grounded answer with citations back to your source files.
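The four steps above can be sketched in a few lines of Python. This is a toy illustration, not Khoj's implementation: the bag-of-words `embed` function stands in for a real embedding model, a plain list stands in for the pgvector table, and the file names and helper names are all hypothetical.

```python
import math
import re
from collections import Counter


def chunk(text: str, max_words: int = 40) -> list[str]:
    """Ingestion: split text into fixed-size word windows (real chunkers are smarter)."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def embed(text: str) -> Counter:
    """Toy 'embedding': a word-count vector standing in for a real embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, index: list, k: int = 2) -> list:
    """Retrieval: return the k stored chunks most similar to the question."""
    q = embed(question)
    return sorted(index, key=lambda row: cosine(q, row[2]), reverse=True)[:k]


def build_prompt(question: str, hits: list) -> str:
    """Generation input: the grounded prompt an LLM would receive, with source labels."""
    context = "\n".join(f"[{source}] {text}" for source, text, _ in hits)
    return (
        "Answer using only these excerpts and cite the source.\n"
        f"{context}\n\nQuestion: {question}"
    )


# Hypothetical documents; storage is a list of (source, chunk, vector) rows.
docs = {
    "lease.md": "The apartment lease renews every March and rent is due on the first.",
    "garden.md": "Tomato seedlings should be planted after the last frost in spring.",
}
index = [(name, c, embed(c)) for name, body in docs.items() for c in chunk(body)]

question = "When is rent due on the lease?"
hits = retrieve(question, index, k=1)
prompt = build_prompt(question, hits)
print(prompt)
```

In a real deployment the similarity ranking happens inside PostgreSQL, and the prompt goes to whichever LLM you configured.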
A typical self-hosted Khoj deployment therefore brings up a few services together — the Khoj app, a PostgreSQL + pgvector database, and supporting pieces (a SearXNG instance for web search and a sandboxed code-execution environment for some advanced features). Docker Compose wires these together so you don’t assemble them by hand.
Why self-host Khoj
You can use Khoj’s hosted cloud version, but self-hosting is the point if you care about ownership:
- Your documents stay yours. With a local model, your files, their embeddings, and the answers generated from them never leave your infrastructure — nothing is sent to a third party, nothing trains someone else’s model.
- Model choice. Run a local LLM through Ollama for full privacy, or wire in a cloud model (GPT, Claude, Gemini, DeepSeek) when you want more reasoning horsepower.
- No subscription, no caps. Index as many documents as your hardware allows; there’s no per-seat or per-document billing on your own box.
- It’s your assistant. Custom agents and automations run on your terms, on your schedule, against your data.
The tradeoff is the usual self-hosting one: you provision it, patch it, and keep Postgres healthy. Khoj is heavier to run than a bare metasearch engine because it carries a database and a document index — but that’s exactly what buys you private document Q&A.
Self-hosting Khoj: setup overview
This is the general shape of a Khoj deployment. Always follow the project’s own documentation and README for exact, current commands — Khoj ships a Docker Compose setup that it maintains and updates between releases.
1. Prerequisites
You need a host with Docker and Docker Compose. Because Khoj runs PostgreSQL plus a document index (and optionally a local LLM), give it more headroom than a metasearch engine — a 4 GB+ RAM machine is a sensible floor, and more if you index a large corpus or run local models on the same box. If you’ll do local inference, a GPU helps considerably.
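To get a feel for how much of that headroom the vectors themselves need, here is a rough back-of-the-envelope sketch. It relies on pgvector's documented storage cost for a `vector` value (4 bytes per dimension plus 8 bytes). The corpus size, chunk size, and the 384-dimension figure are assumptions chosen for illustration, since the real dimension depends on your embedding model. It also ignores the stored text, any index structures, and PostgreSQL's own overhead.

```python
def vector_bytes(dimensions: int) -> int:
    """Storage for one pgvector `vector` value: 4 bytes per dimension plus 8 bytes."""
    return 4 * dimensions + 8


def chunks_for_corpus(total_words: int, words_per_chunk: int) -> int:
    """Number of chunks needed to cover the corpus (ceiling division)."""
    return -(-total_words // words_per_chunk)


def estimate_vector_bytes(num_chunks: int, dimensions: int) -> int:
    """Raw size of all stored vectors, excluding text and index overhead."""
    return num_chunks * vector_bytes(dimensions)


# Hypothetical corpus: 2,000 notes averaging 1,500 words, split into 200-word chunks.
corpus_words = 2_000 * 1_500
chunks = chunks_for_corpus(corpus_words, words_per_chunk=200)
size = estimate_vector_bytes(chunks, dimensions=384)  # 384 is an assumed dimension
print(f"{chunks} chunks, about {size / 1024**2:.1f} MiB of raw vectors")
```

Under these assumptions the vectors are small next to the memory that PostgreSQL, the Khoj server, and especially a local LLM will want, so the model you plan to run usually sets the hardware floor.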
2. Get the Docker Compose setup
Khoj publishes a docker-compose.yml that defines the whole stack — the Khoj server, the PostgreSQL + pgvector database, and supporting services. The standard flow is to download that compose file (or clone the repo), set a few environment values, and bring everything up:
docker compose up -d
On first run, the database initializes and the Khoj server starts. Give it a minute, then open the web UI at the address and port mapped in the compose file (Khoj's default is http://localhost:42110).
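If you script the deployment, a small readiness poll saves you from guessing when the server is up. The sketch below uses only the Python standard library. The URL in the usage comment is an assumption (use whatever port your compose file maps), and `wait_until_ready` is a hypothetical helper, not part of Khoj.

```python
import time
import urllib.error
import urllib.request


def http_ok(url: str, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with an HTTP status below 500."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status < 500
    except urllib.error.HTTPError as err:
        # The server is reachable but returned an error status (for example 404).
        return err.code < 500
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, timeout: the server is not up yet.
        return False


def wait_until_ready(url, probe=http_ok, attempts=30, delay=2.0, sleep=time.sleep):
    """Poll until the probe succeeds; return the attempt number, or 0 if it never does."""
    for attempt in range(1, attempts + 1):
        if probe(url):
            return attempt
        sleep(delay)
    return 0


# Usage (assumed URL; adjust to the port in your compose file):
#   if wait_until_ready("http://localhost:42110"):
#       print("Khoj web UI is answering")
```

The `probe` and `sleep` parameters are injectable so the loop can be tested without a running server.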
3. Create your account and choose a model
On first launch you set up an admin account through the web UI. Then you configure your LLM:
- Local (Ollama): run Ollama, pull a model (ollama pull llama3.1), and point Khoj at your Ollama endpoint. This keeps generation fully on your hardware.
- Cloud: add an API key for GPT, Claude, Gemini, or DeepSeek when you want frontier-grade answers.
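Before pointing Khoj at Ollama, it can help to confirm that the model you pulled is actually available. Ollama's REST API lists local models at `GET /api/tags`. The sketch below assumes Ollama's default address of http://localhost:11434, and `has_model` is a hypothetical helper name. The sample payload mimics the shape of the response but is made up for illustration.

```python
import json
import urllib.request


def fetch_tags(base_url: str = "http://localhost:11434") -> dict:
    """Fetch the list of locally available models from a running Ollama server."""
    with urllib.request.urlopen(f"{base_url}/api/tags", timeout=5) as resp:
        return json.load(resp)


def model_names(tags_payload: dict) -> list[str]:
    """Extract model names such as 'llama3.1:latest' from an /api/tags response."""
    return [m["name"] for m in tags_payload.get("models", [])]


def has_model(tags_payload: dict, wanted: str) -> bool:
    """True if `wanted` is pulled; a name without ':tag' matches any tag of that model."""
    for name in model_names(tags_payload):
        if name == wanted:
            return True
        if ":" not in wanted and name.split(":", 1)[0] == wanted:
            return True
    return False


# Made-up sample in the shape of an /api/tags response, so this runs offline.
sample = {"models": [{"name": "llama3.1:latest"}, {"name": "qwen2.5:7b"}]}
print(has_model(sample, "llama3.1"))

# Against a live server (assumed default address):
#   print(has_model(fetch_tags(), "llama3.1"))
```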
4. Connect your documents
This is the part that makes Khoj a second brain rather than a web search box. You connect your knowledge sources:
- Direct uploads of PDFs, Word, Markdown, and other files through the web UI.
- The Obsidian or Emacs plugin, which syncs your existing notes vault into Khoj automatically.
- Notion and other supported sources.
Khoj indexes what you connect into pgvector. From then on, your questions search across your own knowledge, and answers cite the source files they came from. The first index of a large vault takes a while (it has to embed everything); subsequent updates are incremental.
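One common way to make re-indexing incremental is to fingerprint each file and re-embed only what changed. The sketch below illustrates that general idea with content hashes. It is an assumption about how such a sync can work, not a description of Khoj's internals, and `plan_sync` and the file names are hypothetical.

```python
import hashlib


def fingerprint(content: str) -> str:
    """Stable content hash used to detect whether a file changed."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest()


def plan_sync(files: dict[str, str], seen: dict[str, str]):
    """Compare current files with stored fingerprints.

    Returns (to_embed, to_delete, new_seen): paths that are new or changed,
    paths that disappeared, and the fingerprint map to store for next time.
    """
    current = {path: fingerprint(body) for path, body in files.items()}
    to_embed = sorted(p for p, h in current.items() if seen.get(p) != h)
    to_delete = sorted(p for p in seen if p not in current)
    return to_embed, to_delete, current


# First sync: everything is new, so everything gets embedded.
vault = {"notes/a.md": "alpha", "notes/b.md": "beta"}
first_embed, first_delete, state = plan_sync(vault, {})

# Later: one note edited, one removed, one added. Only the changed paths are touched.
vault = {"notes/b.md": "beta, revised", "notes/c.md": "gamma"}
next_embed, next_delete, state = plan_sync(vault, state)
print(next_embed, next_delete)
```

This is why the first pass over a large vault is slow while later syncs are quick: unchanged files are skipped before any embedding work happens.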
5. Use it from your preferred client
Install whichever client fits your workflow — the browser app, the Obsidian plugin, the desktop or phone app — and they all talk to the same self-hosted server. Your index and your conversations stay on your hardware; the clients are just front-ends.
Connecting local models for full privacy
Like any self-hosted AI tool, Khoj presents a privacy-vs-quality dial in the model layer:
| | Local (Ollama) | Cloud LLM |
|---|---|---|
| Where your data goes | Stays on your hardware | Prompt + retrieved chunks sent to provider |
| Privacy | Can be fully private | Provider sees your document excerpts |
| Answer quality | Good; bounded by hardware | Frontier-grade |
| Cost | Hardware only | Per-token API fees |
| Best for | Sensitive personal/work documents | Hardest questions, best phrasing |
For a second brain, the privacy stakes are higher than for web search — you’re feeding it your actual notes, contracts, and research. If that’s sensitive, run everything local: Ollama for generation, local embedding models for indexing, pgvector on your own box. The whole loop stays inside your network. Reach for a cloud model only when a particular hard question justifies sending those excerpts out, and make that choice deliberately.
When to choose Khoj vs. Vane
Both are excellent self-hosted AI tools, but they’re built for different jobs.
| | Khoj | Vane (ex-Perplexica) |
|---|---|---|
| Primary job | Chat with your own documents | Perplexity-style web answers |
| Best metaphor | AI second brain | AI search engine |
| Your-docs Q&A | Core feature (pgvector RAG) | Limited |
| Web search | Yes | Yes (core, via bundled SearXNG) |
| Clients | Browser, Obsidian, Emacs, desktop, phone, WhatsApp | Web UI |
| License | AGPL-3.0 (copyleft) | MIT (permissive) |
| Stars (June 2026) | ~35.2k | ~35.4k |
The decision is mostly about where your questions come from:
- Choose Khoj when your goal is to search and converse with your own knowledge — notes, documents, a research library — across many devices, with web search as a bonus. It’s the strongest option here for a personal or team knowledge assistant.
- Choose Vane when your goal is Perplexity-style answers from the live web, and document Q&A isn’t your priority.
- Mind the license if you’re embedding it in a product you’ll distribute: Vane’s MIT is permissive; Khoj’s AGPL-3.0 is copyleft. For internal use, neither matters.
- Run both if you want — many people use Vane for the open web and Khoj for their own files. They don’t conflict.
For the full landscape including SurfSense and SearXNG, see open-source Perplexity alternatives.
Privacy and data ownership
Khoj’s whole proposition is that your second brain belongs to you. Self-hosted, pointed at a local model, with pgvector on your own server, the entire loop (your documents, their embeddings, your questions, and the generated answers) stays inside your network. No third party indexes your files, no provider trains on your notes, and, if you forgo web search, you can run it air-gapped.
The one place that changes is the model layer: if you choose a cloud LLM, the prompt and the retrieved document excerpts go to that provider for generation. Given that those excerpts come from your private files, weigh this carefully. Keep generation local for anything sensitive, and reserve cloud models for questions where the quality is worth the exposure. The same data-ownership logic threads through our self-hosted RAG guide.
FAQ
What is Khoj used for? Khoj is a self-hostable AI “second brain” — a personal assistant that answers questions from your own documents (PDF, Markdown, Word, Notion, org-mode, images) and from the web. It supports custom agents and scheduled research, and you can reach it from the browser, Obsidian, Emacs, desktop, phone, and WhatsApp.
Is Khoj free and open source? Yes. Khoj is open source under the AGPL-3.0 license and is fully self-hostable, so there’s no software cost. Note that AGPL is a copyleft license: it matters mainly if you embed Khoj in a product you distribute or offer a modified version to others over a network, not if you simply run it for yourself.
Does Khoj work with local LLMs? Yes. Khoj works with local models through Ollama (llama3, qwen, gemma, mistral, and others), so you can run it entirely on your own hardware. It also supports cloud models (GPT, Claude, Gemini, DeepSeek) when you want more reasoning power.
What database does Khoj use? Khoj uses pgvector — the vector extension for PostgreSQL — as its vector store for semantic search over your documents. A self-hosted deployment runs PostgreSQL with pgvector as part of its Docker Compose stack.
Khoj vs. Vane — which is better? Neither is “better”; they do different jobs. Khoj is built for chatting with your own documents across many clients. Vane is built for Perplexity-style answers from the live web. Choose by whether your questions are about your own files or the open web — and run both if you need both.
Khoj is the most full-featured way to self-host a private AI assistant over your own knowledge. From here, understand the RAG pipeline underneath it, compare it with Vane and other answer engines, or read up on the vector database layer it runs on. Aquila is the independent home for AI search you own. Own your search.