Weaviate vs Milvus (Self-Hosted, 2026): Built-In Modules or Billion-Scale?
Two heavyweight open-source vector engines, compared for teams who run their own infrastructure: Weaviate's batteries-included platform vs Milvus's distributed scale.
For self-hosting, Weaviate is the batteries-included choice: BSD-3-Clause, written in Go, with built-in vectorizer modules and hybrid search, productive at small-to-mid scale and run on Kubernetes + Helm in production. Milvus is the scale specialist: Apache-2.0, a Go core with a C++ engine, purpose-built for billion-scale workloads via a distributed, multi-component architecture. If you want the database to handle vectorization and hybrid search out of the box without operating a large distributed system, pick Weaviate; if you’re genuinely heading for billions of vectors and can run a distributed deployment, pick Milvus. This head-to-head compares them through the lens of a team that runs its own infrastructure.
Both are genuinely open-source and both belong on any serious shortlist — this completes the matrix alongside our other head-to-heads. For the wider field, see the best self-hosted vector databases guide; for how each stacks up against the lean default, see Milvus vs Qdrant and Qdrant vs Weaviate.
Side-by-side comparison
| | Weaviate | Milvus |
|---|---|---|
| License | BSD-3-Clause | Apache-2.0 |
| Core language | Go | Go core + C++ engine |
| GitHub stars (June 2026) | 16.3k | 44.8k |
| Hybrid search | Built-in — vector + BM25 with fusion in one query | Yes — dense + sparse + full-text in one collection |
| Built-in vectorizer modules | Yes (module ecosystem) | No (you bring embeddings) |
| Self-host (small) | Docker / docker-compose | Milvus Lite (embedded, pip install); Standalone (Docker) |
| Self-host (production) | Kubernetes + Helm | Distributed (Kubernetes), multi-component |
| Designed scale | Small-to-large | Up to billion-scale |
| Managed cloud | Weaviate Cloud — free Sandbox; Flex from ~$45/mo | Zilliz Cloud — free tier; serverless; dedicated from ~$99/mo* |
Star counts are GitHub’s rounded figures as of June 2026 and drift over time; license, language, and the scale each is designed for are the stable facts to weight. *Zilliz dedicated entry price renders dynamically and is approximate.
License and language
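Both licenses are permissive with no copyleft strings (Weaviate is BSD-3-Clause, Milvus is Apache-2.0), so for downstream commercial use they’re close to equivalent. You can embed either in a closed-source product without any obligation to release your own source; both only require that you retain the copyright and license notices. The one substantive difference is that Apache-2.0 includes an explicit patent grant, which BSD-3-Clause does not.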
The language difference reflects each project’s design temperament. Weaviate is written in Go, the lingua franca of cloud-native infrastructure (it’s the language of Kubernetes itself), which suits its platform-and-modules ambitions. Milvus pairs a Go core with a C++ engine — Go for the distributed coordination and orchestration, C++ for the performance-critical vector indexing and search path. That split is a tell: Milvus is engineered for the demands of very large-scale search, where the hot path needs C++-grade control. Neither language matters for using either database via its API.
Performance and latency
The honest caveat first: vector-DB benchmarks are recall-, dataset-, and hardware-dependent, and are usually run by the vendor that wins them. Read them as directional, not gospel.
- Milvus claims (for Milvus 2.6) roughly 72% memory reduction with ~4× throughput on a 1M × 768-dim VectorDBBench run, and 3–4× (up to ~7×) higher full-text throughput vs Elasticsearch at equal recall. These are scale-and-cost-efficiency claims — the headline is doing more per unit of memory, which is exactly what matters when you’re running billions of vectors.
- Weaviate publishes an ANN benchmark reporting end-to-end p99 latency and QPS-vs-recall curves — interactive, with no single headline number. Arguably the more transparent presentation, since it shows the recall/latency trade-off rather than one cherry-picked figure.
Milvus markets itself on throughput and memory efficiency at scale; Weaviate presents the full latency-vs-recall picture. For most self-hosted workloads (well under a few million vectors), both will be comfortably fast on decent hardware, and your embedding model and chunking choices will affect end-to-end latency more than the database does. Milvus’s performance story only becomes decisive when you’re genuinely operating at the scale it’s built for. Benchmark on your data before treating performance as the deciding factor.
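To make "benchmark on your data" concrete, here is a minimal sketch of a harness that reports the same kinds of numbers discussed above: recall@k, p99 latency, and single-client QPS. The `search` callable is a hypothetical stand-in for whatever client call you wrap (it is not an API of either database), and the ground truth is assumed to come from an exact brute-force search over your own vectors.

```python
import math
import time
from typing import Callable, Dict, List, Sequence


def recall_at_k(retrieved: Sequence[int], relevant: Sequence[int], k: int) -> float:
    """Fraction of the true top-k neighbours that appear in the retrieved top-k."""
    truth = set(relevant[:k])
    if not truth:
        return 0.0
    return len(truth & set(retrieved[:k])) / len(truth)


def percentile(values: Sequence[float], pct: float) -> float:
    """Nearest-rank percentile, e.g. pct=99 for p99."""
    if not values:
        raise ValueError("percentile() needs at least one value")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def benchmark(
    search: Callable[[object, int], List[int]],  # hypothetical wrapper around your client
    queries: Sequence[object],
    ground_truth: Sequence[Sequence[int]],       # exact neighbours per query
    k: int = 10,
) -> Dict[str, float]:
    """Run queries one at a time and summarise recall, p99 latency, and QPS."""
    latencies: List[float] = []
    recalls: List[float] = []
    for query, truth in zip(queries, ground_truth):
        start = time.perf_counter()
        ids = search(query, k)
        latencies.append(time.perf_counter() - start)
        recalls.append(recall_at_k(ids, truth, k))
    total = sum(latencies)
    return {
        "mean_recall": sum(recalls) / len(recalls),
        "p99_latency_s": percentile(latencies, 99),
        # Sequential, single-client throughput; concurrent load will differ.
        "qps": len(latencies) / total if total > 0 else float("inf"),
    }
```

Run it against both engines with the same queries, the same `k`, and index settings tuned to a comparable recall; latency figures compared at different recall levels are not meaningful.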
Hybrid search
Both support hybrid search — combining dense vector similarity with sparse/keyword matching — out of the box, so neither forces you to hand-roll fusion the way pgvector does.
- Weaviate offers built-in vector + BM25 keyword search with fusion ranking in a single query — a clean, well-documented hybrid implementation.
- Milvus does dense + sparse vectors plus full-text in a single collection, with strong full-text throughput claims (3–4×, up to ~7×, vs Elasticsearch at equal recall).
This is close to a tie on capability — both are genuine, first-class hybrid implementations. Milvus’s full-text throughput claims give it an edge specifically at high scale and high query volume; Weaviate’s BM25 hybrid is straightforward and battle-tested. Where Weaviate pulls ahead is one step earlier in the pipeline: its built-in vectorizer modules can generate the embeddings for you inside the database, so you don’t necessarily need a separate embedding service feeding the hybrid index. Milvus expects you to bring your own embeddings.
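For readers unfamiliar with what "fusion" means here, the sketch below shows reciprocal rank fusion (RRF), one common way to merge a dense-vector ranking with a keyword ranking. It is a generic illustration, not a claim about the exact scoring either engine uses internally; the constant `k=60` is a conventional default and an assumption on our part.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[Hashable]], k: int = 60
) -> List[Hashable]:
    """Merge several ranked lists of document ids into one fused ranking.

    Each document scores 1 / (k + rank) per list it appears in (rank is
    1-based), and the scores are summed across lists.
    """
    scores: Dict[Hashable, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id for a stable order.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], str(doc_id)))


# Illustrative ids only: one ranking from vector search, one from BM25.
dense_hits = ["a", "b", "c"]
keyword_hits = ["b", "d", "a"]
fused = reciprocal_rank_fusion([dense_hits, keyword_hits])
```

Documents that rank well in both lists rise to the top, which is the behaviour you get from a built-in hybrid query without having to write this merge step yourself.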
Self-hosting and operations
This is where the two diverge most, and where the choice usually gets made for self-hosters.
- Weaviate runs via Docker / docker-compose for development, with production deployments expected to use Kubernetes + Helm. Its module ecosystem (vectorizers, rerankers, generative modules) can generate embeddings inside the database — genuine convenience, since you may not need a separate embedding service, but also more moving parts to understand and operate.
- Milvus is the most operationally flexible at the ends and the heaviest in the middle-to-large range. It offers Milvus Lite (embedded, pip install) for prototyping and Standalone (Docker) for small deployments, but its production distributed mode is a multi-component system (it relies on object storage, a message queue, and separate coordinator/worker components) designed for billion-scale and best run on Kubernetes. That distributed architecture is precisely what enables its scale, and precisely what makes it the heavier system to operate.
The trade-off in one sentence: Weaviate gives you more built into one platform; Milvus gives you more scale headroom at the cost of more components. If you want a productive, batteries-included vector store with vectorization handled for you and you’re comfortable with a Kubernetes deployment, Weaviate’s modules earn their keep. If you’re genuinely heading for billions of vectors and can staff a distributed system, Milvus’s architecture is built for exactly that. Note that Milvus Lite also makes it the easiest of the two to simply pip install and prototype with — the operational weight only arrives at the distributed end.
Cost and pricing
Self-hosting either one means the software is free and your cost is the infrastructure you run it on. A small-to-medium index on single-node Weaviate or Milvus Standalone sits comfortably on a ~$20–30/mo VPS (cheaper on Hetzner-class hosts). The picture changes at scale: Weaviate’s production Kubernetes posture implies a cluster baseline, and Milvus distributed at billion-scale wants real, well-provisioned hardware — that’s the tier where you should genuinely price out the deployment rather than assume a flat VPS bill.
For reference, their managed clouds (hosted prices, not self-host costs):
- Weaviate Cloud: free Serverless Sandbox; paid “Flex” from ~$45/mo, pay-as-you-go (vector storage from ~$0.00465 per 1M dimensions, varies).
- Milvus / Zilliz Cloud: free tier (~5 GB + 2.5M vCUs/mo); serverless pay-as-you-go ($4 per 1M vCUs; storage $0.04/GB-mo); dedicated clusters from ~$99/mo (the dedicated entry price renders dynamically and is approximate).
If a flat, predictable bill is the goal at small-to-medium scale, self-hosting either on your own VPS wins over usage-metered managed pricing — that’s the whole point of search you own. At billion-scale, Milvus’s cost story is about efficiency (its memory-reduction claims translate to fewer/smaller machines), which is its own kind of cost argument.
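The arithmetic behind that comparison can be sketched in a few lines. The rates below are the Zilliz serverless figures quoted above and should be treated as approximate; the calculation ignores the free tier, index overhead, replicas, and any other line items, and the function names are hypothetical helpers rather than part of any vendor tooling.

```python
def raw_vector_gib(num_vectors: int, dims: int, bytes_per_value: int = 4) -> float:
    """Raw size of uncompressed vectors in GiB (float32 by default, no index overhead)."""
    return num_vectors * dims * bytes_per_value / 2**30


def metered_monthly_usd(
    vcus: float,
    storage_gb: float,
    usd_per_million_vcus: float = 4.0,  # rate quoted in this article
    usd_per_gb_month: float = 0.04,     # rate quoted in this article
) -> float:
    """Usage-metered monthly bill: compute units plus storage."""
    return vcus / 1_000_000 * usd_per_million_vcus + storage_gb * usd_per_gb_month


def breakeven_vcus(
    flat_monthly_usd: float,
    storage_gb: float,
    usd_per_million_vcus: float = 4.0,
    usd_per_gb_month: float = 0.04,
) -> float:
    """Monthly vCUs at which the metered bill equals a flat VPS bill."""
    remaining = flat_monthly_usd - storage_gb * usd_per_gb_month
    return max(0.0, remaining) / usd_per_million_vcus * 1_000_000


# 1M vectors at 768 dimensions, the dataset shape cited in the benchmarks above.
footprint = raw_vector_gib(1_000_000, 768)
bill = metered_monthly_usd(vcus=10_000_000, storage_gb=50)
crossover = breakeven_vcus(flat_monthly_usd=30.0, storage_gb=50)
```

Under these assumptions, 1M × 768-dim float32 vectors occupy about 2.9 GiB before indexing, 10M vCUs plus 50 GB of storage comes to about $42/mo, and a $30/mo VPS breaks even with metered pricing at roughly 7M vCUs per month.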
When to pick which
Pick Weaviate if:
- You want vectorization built into the database via its module ecosystem, not a separate embedding service.
- Your scale is small-to-large and you want a productive, batteries-included platform.
- You’re comfortable running Kubernetes + Helm for production but don’t want a multi-component distributed system.
- Built-in BM25 hybrid search out of the box matters to you.
Pick Milvus if:
- You’re genuinely heading for billion-scale and need an architecture built for it.
- You can staff and operate a distributed, Kubernetes-based multi-component system.
- High full-text/hybrid throughput at scale is critical, and memory/cost efficiency at scale matters.
- You want to start tiny with Milvus Lite (pip install) and grow into distributed later.
Verdict
For most self-hosting teams below the very-large-scale tier, Weaviate is the more productive default — built-in vectorizer modules, clean BM25 hybrid, and a platform that does more for you out of the box, at the cost of a Kubernetes production posture. Milvus is the better fit when scale is the headline requirement — its distributed, C++-engine architecture is purpose-built for billion-scale and backs the strongest large-scale throughput and memory-efficiency claims, but it’s the heavier system to operate once you’re past Standalone. Both are excellent, genuinely open-source, and safe to embed; the decision comes down to whether you value a batteries-included platform at manageable scale (Weaviate) or maximum scale headroom you’re prepared to operate (Milvus). If neither extreme fits — you want a lean, fast engine in the productive middle — it’s worth comparing both against Qdrant before deciding.
FAQ
Is Weaviate or Milvus better for self-hosting? For small-to-mid-scale self-hosting, Weaviate is more productive — it has built-in vectorizer modules and hybrid search, and runs on Docker/compose for dev and Kubernetes + Helm for production. Milvus is the better choice when you’re genuinely heading for billion-scale and can operate its distributed, multi-component architecture.
Which scales larger, Weaviate or Milvus? Milvus is purpose-built for billion-scale via its distributed mode, with the strongest large-scale throughput and memory-efficiency claims (e.g. ~72% memory reduction with ~4× throughput on a 1M × 768-dim VectorDBBench run for Milvus 2.6). Weaviate scales well into the large range but Milvus is the dedicated scale specialist.
Do both Weaviate and Milvus support hybrid search? Yes. Weaviate offers built-in vector + BM25 keyword search with fusion ranking in one query. Milvus does dense + sparse vectors plus full-text in one collection, with strong full-text throughput claims. Weaviate additionally has built-in vectorizer modules that can generate the embeddings for you.
What licenses do Weaviate and Milvus use? Weaviate is BSD-3-Clause; Milvus is Apache-2.0. Both are permissive with no copyleft, so either is safe to embed in a commercial product.
Is Milvus hard to self-host? It depends on scale. Milvus Lite (pip install) and Standalone (Docker) are easy for prototyping and small deployments. Its production distributed mode is a multi-component system (object storage, message queue, coordinator/worker components) best run on Kubernetes; that’s the heavier setup, and it’s the price of billion-scale capability.
Aquila is the independent guide to private, self-hosted AI search — search you own instead of rent. See the full field in best self-hosted vector databases, compare each against the lean default in Milvus vs Qdrant and Qdrant vs Weaviate, or browse all comparisons. Own your search.