Embedded Vector Databases for Go in 2026: chromem-go vs sqlite-vec vs Bleve vs LanceDB

Embedded vector databases for Go in 2026 — chromem-go vs sqlite-vec vs Bleve vs LanceDB-go. Benchmark (latency, recall@10, memory), decision matrix, and a full RAG walkthrough.

TL;DR — if you just want the answer

< ~100k vectors, pure-Go, no CGO, want it to “just work”: use chromem-go.

Lexical + vector search in one engine, pure-Go: use Bleve.

Millions of vectors, OK with CGO, SQL filters: use sqlite-vec (via mattn/go-sqlite3) or DuckDB + VSS.

You can run a separate process: stop reading — use Qdrant or Chroma server. The whole point of this post is the single-binary constraint below.

The rest of the post explains how I got there.

If you are shipping a Go application that runs on a user’s machine and needs vector search, you face an awkward problem: most popular vector databases (Chroma, Qdrant, Weaviate, Milvus, Pinecone) run as separate servers. That means asking your users to install and operate extra infrastructure, which is a non-starter for a single-binary developer tool.

This post walks through the embeddable options I evaluated, the trade-offs, and a decision tree to help you pick one for your own project.

The constraint

The application must be distributed as a single Go binary. No Docker, no separate service, no manual setup. The vector database has to run in-process.

My specific scale target was 1 million documents at 1,024-dimensional embeddings. Latency was not a concern because it is a local developer tool, not a hot-path service.

The candidates

I evaluated seven options. Here is each one with its pros and cons.

1. HTTP-server vector databases (Chroma, Qdrant, Weaviate, Milvus)

Pros

Mature, feature-rich, well-documented
Excellent ANN indexing, filtering, and scaling
Strong ecosystem and community

Cons

Require a separate server process
Users must install and run extra infrastructure
Defeats the single-binary distribution model

Verdict: Eliminated. They violate the core constraint.

2. chromem-go

A pure-Go embeddable vector database inspired by Chroma.

Pros

Pure Go, no CGO
Simple API, easy to integrate
Cross-compiles trivially
Good for small datasets

Cons

Loads all vectors into RAM
At 1M x 1024 dimensions, needs ~4 GB+ memory
Not viable for end-user machines at scale

Verdict: Great for small datasets (under 100k vectors). Eliminated for our scale.
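
The ~4 GB figure above is just raw float32 arithmetic. A minimal sketch of that calculation follows; `rawVectorBytes` is a name I made up for illustration, and it deliberately ignores document text, metadata, and any allocator overhead, so real usage will be higher.

```go
package main

import "fmt"

// rawVectorBytes returns the bytes needed to hold n float32 vectors of the
// given dimension. It counts only the vector payload, not metadata or
// runtime overhead.
func rawVectorBytes(n, dim int64) int64 {
	const bytesPerFloat32 = 4
	return n * dim * bytesPerFloat32
}

func main() {
	for _, n := range []int64{100_000, 1_000_000} {
		b := rawVectorBytes(n, 1024)
		fmt.Printf("%d vectors x 1024 dims = %.2f GB raw\n", n, float64(b)/1e9)
	}
}
```

At 1M x 1024 this comes to 4,096,000,000 bytes before any overhead, which is where the "4 GB+" comes from.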

3. Bleve

A mature pure-Go full-text search library that recently added vector support.

Pros

Pure Go, well-maintained, battle-tested
Useful if you also need full-text search
File-based persistence

Cons

Vector search is a secondary feature
Not optimized for million-scale vector workloads
Heavier than needed for pure vector search

Verdict: Worth considering only if you need full-text and vector search together.

4. DuckDB with VSS extension

An embeddable analytical database with a vector similarity search extension.

Pros

Excellent for analytical SQL alongside vectors
HNSW indexing built-in
Columnar storage is efficient

Cons

CGO required
Go bindings less mature than Python
Overkill if you only need vector search

Verdict: A good fit if you need analytical queries on top of vectors. Otherwise, too heavy.

5. libsql-client-go (pure Go libSQL driver)

The pure-Go client for libSQL, Turso’s SQLite fork.

Pros

Pure Go, no CGO
Cross-compiles easily
Standard database/sql interface

Cons

HTTP client only, connects to a remote libSQL server
Cannot run libSQL embedded in your binary
Defeats the no-infrastructure requirement

Verdict: Eliminated. The “pure Go” advantage is misleading because it requires a remote server.

6. go-libsql (CGO-based libSQL driver)

The CGO-based libSQL driver that supports true embedded mode.

Pros

Native DiskANN ANN indexing
Built-in vector quantization (int8 etc.)
Millisecond query latency at million-vector scale
File-based, single-file deployment
Best raw performance of all candidates

Cons

CGO required
Currently supports only Linux amd64/arm64 and macOS amd64/arm64
No Windows support today
Younger ecosystem than mattn/go-sqlite3

Verdict: Best performance, but the lack of Windows support forces a fallback for a small subset of users.
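
To make "int8 quantization" concrete, here is a rough sketch of per-vector scalar quantization in plain Go. This is not libSQL's actual scheme, which I have not inspected; `quantizeInt8` and `dequantizeInt8` are hypothetical helpers that only illustrate the idea of storing one byte per dimension plus a scale factor instead of four bytes per dimension.

```go
package main

import (
	"fmt"
	"math"
)

// quantizeInt8 maps a float32 vector onto int8 values using a single scale
// for the whole vector, chosen so the largest magnitude lands on 127.
func quantizeInt8(v []float32) (q []int8, scale float32) {
	var maxAbs float32
	for _, x := range v {
		if a := float32(math.Abs(float64(x))); a > maxAbs {
			maxAbs = a
		}
	}
	q = make([]int8, len(v))
	if maxAbs == 0 {
		return q, 0 // all-zero vector: nothing to scale
	}
	scale = maxAbs / 127
	for i, x := range v {
		r := math.Round(float64(x / scale))
		// Clamp to guard against float rounding pushing past the int8 range.
		if r > 127 {
			r = 127
		} else if r < -127 {
			r = -127
		}
		q[i] = int8(r)
	}
	return q, scale
}

// dequantizeInt8 reverses quantizeInt8, up to the rounding error.
func dequantizeInt8(q []int8, scale float32) []float32 {
	out := make([]float32, len(q))
	for i, x := range q {
		out[i] = float32(x) * scale
	}
	return out
}

func main() {
	v := []float32{1, -0.5, 0.25, 0}
	q, scale := quantizeInt8(v)
	fmt.Println("quantized:", q, "scale:", scale)
	fmt.Println("restored: ", dequantizeInt8(q, scale))
}
```

The trade is a 4x smaller vector payload for a per-component error of at most about half a quantization step.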

7. sqlite-vec via mattn/go-sqlite3

A SQLite extension for vector search by Alex Garcia, used through the mainstream mattn Go SQLite driver.

Pros

Works on Linux, macOS, and Windows
Built on mature SQLite foundations
Full SQL for metadata filtering, transactions, durability
Single-file deployment
Active development with a known author
Single backend covers 100% of platforms

Cons

CGO required
Currently brute-force search only (ANN on the roadmap)
No native quantization
Higher disk footprint at scale (~4 GB at 1M x 1024)
Query latency scales linearly with dataset size

Verdict: The pragmatic winner when latency is not critical and full platform coverage matters.
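
Why brute force scales linearly is easiest to see in code. The sketch below is plain Go, not sqlite-vec's implementation; `cosineDistance` and `bruteForceTopK` are illustrative names. Every query touches every stored vector once, so doubling the dataset doubles the work.

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// cosineDistance returns 1 - cosine similarity. Zero-length vectors are
// treated as maximally distant from everything.
func cosineDistance(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	if na == 0 || nb == 0 {
		return 1
	}
	return 1 - dot/(math.Sqrt(na)*math.Sqrt(nb))
}

type hit struct {
	ID   int
	Dist float64
}

// bruteForceTopK scores the query against every vector, then sorts.
// Cost is O(n*d) for scoring plus O(n log n) for the sort.
func bruteForceTopK(vectors [][]float32, query []float32, k int) []hit {
	hits := make([]hit, len(vectors))
	for i, v := range vectors {
		hits[i] = hit{ID: i, Dist: cosineDistance(v, query)}
	}
	sort.Slice(hits, func(i, j int) bool { return hits[i].Dist < hits[j].Dist })
	if k > len(hits) {
		k = len(hits)
	}
	return hits[:k]
}

func main() {
	vectors := [][]float32{{1, 0}, {0, 1}, {1, 1}}
	for _, h := range bruteForceTopK(vectors, []float32{1, 0.1}, 2) {
		fmt.Printf("id=%d dist=%.3f\n", h.ID, h.Dist)
	}
}
```

An ANN index exists precisely to avoid that full scan, at the cost of exactness.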

Side-by-side comparison

Aspect	Chroma/Qdrant/etc.	chromem-go	Bleve	DuckDB+VSS	libsql-client-go	go-libsql	sqlite-vec+mattn
Embeddable	No	Yes	Yes	Yes	No	Yes	Yes
Pure Go	N/A	Yes	Yes	No (CGO)	Yes	No (CGO)	No (CGO)
Linux/macOS	N/A	Yes	Yes	Yes	Yes	Yes	Yes
Windows	N/A	Yes	Yes	Yes	Yes	No	Yes
1M vectors viable	Yes	No (RAM)	Marginal	Yes	N/A	Yes	Yes
ANN index	Yes	No	HNSW	HNSW	N/A	DiskANN	No (roadmap)
Quantization	Varies	No	No	Limited	N/A	Yes	No
Query latency at 1M	Fast	Fast (if RAM)	Slow	Fast	N/A	Milliseconds	Seconds
SQL filtering	Limited	Basic	Basic	Full SQL	Full SQL	Full SQL	Full SQL
Single binary	No	Yes	Yes	Yes	No	Yes	Yes

Decision tree

flowchart TD
    Start[Need vector search in your Go app] --> Q1{Can users run a separate server?}
    Q1 -->|Yes| Server[Use Chroma, Qdrant, Weaviate, or Milvus]
    Q1 -->|No, must be embedded| Q2{Dataset size?}

    Q2 -->|Under 100k vectors| Small{CGO acceptable?}
    Small -->|No| Chromem[Use chromem-go: pure Go, simple, fast for small data]
    Small -->|Yes| Q3

    Q2 -->|100k to 10M vectors| Q3{Latency critical?}

    Q3 -->|Yes, milliseconds matter| Q4{Need Windows support?}
    Q4 -->|No, Linux/macOS only| GoLibsql[Use go-libsql: DiskANN, quantization, fastest]
    Q4 -->|Yes, all platforms| Dual[Use go-libsql primary plus sqlite-vec fallback for Windows]

    Q3 -->|No, seconds are fine| Q5{Need full-text search too?}
    Q5 -->|Yes| Bleve[Use Bleve: combined full-text and vector]
    Q5 -->|No| Q6{Need analytical SQL?}
    Q6 -->|Yes| DuckDB[Use DuckDB with VSS extension]
    Q6 -->|No| SqliteVec[Use sqlite-vec with mattn/go-sqlite3: simple, all platforms]

    Q2 -->|Over 10M vectors| Reconsider[Reconsider embedded model: server-based likely better]
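
If you would rather read the tree as code, here is the same logic as a small Go function. The `Needs` struct and `Recommend` function are names I invented for this sketch; the branches mirror the flowchart one for one, including the fact that the CGO question is only asked for datasets under 100k vectors.

```go
package main

import "fmt"

// Needs captures the questions the flowchart asks.
type Needs struct {
	CanRunServer      bool
	Vectors           int
	CGOOK             bool
	LatencyCritical   bool
	NeedWindows       bool
	NeedFullText      bool
	NeedAnalyticalSQL bool
}

// Recommend walks the decision tree top to bottom.
func Recommend(n Needs) string {
	if n.CanRunServer {
		return "Chroma, Qdrant, Weaviate, or Milvus"
	}
	if n.Vectors > 10_000_000 {
		return "reconsider embedded model"
	}
	if n.Vectors < 100_000 && !n.CGOOK {
		return "chromem-go"
	}
	// Under 100k with CGO allowed, and 100k to 10M, share the same path.
	if n.LatencyCritical {
		if n.NeedWindows {
			return "go-libsql plus sqlite-vec fallback"
		}
		return "go-libsql"
	}
	if n.NeedFullText {
		return "Bleve"
	}
	if n.NeedAnalyticalSQL {
		return "DuckDB with VSS"
	}
	return "sqlite-vec with mattn/go-sqlite3"
}

func main() {
	// My own case: 1M vectors, CGO acceptable, Windows needed, no latency pressure.
	fmt.Println(Recommend(Needs{Vectors: 1_000_000, CGOOK: true, NeedWindows: true}))
}
```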

What I chose and why

For my use case, a local developer tool with up to 1M vectors and no latency pressure, I chose sqlite-vec via mattn/go-sqlite3.

The reasoning:

Single backend covers all target platforms (Linux, macOS, Windows). No abstraction layer to maintain.
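Brute-force search at 1M x 1024 takes seconds, not milliseconds (the 100k benchmark below measured 1.7 s at p50, and brute-force latency grows linearly with dataset size), which is fine for a CLI tool used occasionally.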
CGO complexity stays inside the build pipeline. End users still get a clean single binary per platform.
SQLite foundations give full SQL, transactions, and durability for free.
If latency ever becomes a real problem, migrating to go-libsql is mechanical because both are SQLite-compatible and both use database/sql.

This is YAGNI in action. The “better” option (go-libsql with DiskANN and quantization) costs real engineering hours to integrate alongside a fallback. Those hours only pay off if performance actually becomes a constraint, which it may never do.

Head-to-head benchmark — chromem-go vs sqlite-vec vs Bleve vs LanceDB-go

These numbers come from a single workload: 100,000 documents, 1,024-dimensional embeddings (Cohere embed-multilingual-v3), 1,000 queries, single-threaded. Hardware: M2 Pro, 32 GB RAM. Build flags: stock, Go 1.22.1.

The point of this benchmark is not to pick a “winner” — it’s to show the order-of-magnitude differences so you can map your own constraints onto the table.

Engine	Build	Indexed size on disk	RAM at query time	p50 latency	p95 latency	recall@10 vs brute force
chromem-go (in-memory, brute force)	pure Go	n/a (in-RAM only)	~1.3 GB	38 ms	71 ms	1.00 (reference)
sqlite-vec (`mattn/go-sqlite3`, brute force)	CGO	412 MB	~85 MB	1.7 s	2.4 s	1.00
Bleve (HNSW, default M=16)	pure Go	1.6 GB	~280 MB	22 ms	41 ms	0.93
LanceDB-go (IVF-PQ via `lance` Go bindings, CGO)	CGO	540 MB	~120 MB	6 ms	14 ms	0.96

What stands out:

chromem-go is fast but RAM-greedy. 100k × 1024 × float32 ≈ 410 MB just for raw vectors, plus overhead. Past ~250k vectors on a 4 GB-RAM laptop you’ll OOM.
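sqlite-vec is the slowest by one to two-plus orders of magnitude at p50 (about 45× behind chromem-go and about 280× behind LanceDB-go) because it’s brute force. Fine for offline / batch / local-CLI workloads. Not fine for a chat UI.
Bleve’s HNSW trades a bit of recall (0.93) for a roughly 77× speedup over sqlite-vec at p50 (1.7 s down to 22 ms).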
LanceDB-go is fastest because IVF-PQ is approximate and compresses vectors. In this run its recall (0.96) still came out above Bleve’s HNSW (0.93) at default settings; both are tunable.

A reasonable rule of thumb: budget your latency target first, then pick the engine that hits it without burning your RAM budget.
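
For readers who want to reproduce the recall column, recall@k is the fraction of the exact top-k results that the approximate engine also returned. The sketch below uses the standard definition; `recallAtK` is an illustrative helper, not the harness behind the table.

```go
package main

import "fmt"

// recallAtK returns the share of the exact top-k IDs that also appear in
// the approximate top-k. Both slices are assumed to be ordered best-first.
func recallAtK(exact, approx []int, k int) float64 {
	if k > len(exact) {
		k = len(exact)
	}
	if k <= 0 {
		return 0
	}
	truth := make(map[int]struct{}, k)
	for _, id := range exact[:k] {
		truth[id] = struct{}{}
	}
	n := k
	if len(approx) < n {
		n = len(approx)
	}
	found := 0
	for _, id := range approx[:n] {
		if _, ok := truth[id]; ok {
			found++
		}
	}
	return float64(found) / float64(k)
}

func main() {
	exact := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
	approx := []int{1, 2, 3, 4, 5, 6, 7, 8, 9, 42} // one miss
	fmt.Printf("recall@10 = %.2f\n", recallAtK(exact, approx, 10))
}
```

Averaging this over all queries, with a brute-force engine supplying `exact`, gives a single number comparable to the column above.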

Decision matrix — pick by constraint, not by hype

Your hardest constraint	Use this	Don’t use
No CGO (cross-compile to weird targets)	chromem-go (small) or Bleve (HNSW)	sqlite-vec, LanceDB-go
All three OSes incl. Windows	sqlite-vec, Bleve, chromem-go	go-libsql (no Windows yet)
<5 s build time, simplest code	chromem-go	DuckDB+VSS, LanceDB-go
<50 ms p95 query latency at 100k+ vectors	LanceDB-go or Bleve HNSW	sqlite-vec, chromem-go on slim hardware
Need full SQL filtering / joins / ACID	sqlite-vec (or DuckDB+VSS)	chromem-go, LanceDB-go
Need full-text search in the same engine	Bleve	everything else
Need ANN at million-scale + Linux/macOS only	go-libsql (DiskANN)	brute-force engines
App is shipped to end-users on a small laptop	sqlite-vec or chromem-go (under 100k)	LanceDB-go (heavier deps)

If two rows match your situation, pick the engine that wins the row you’d be most upset about losing. For most “build a Go-binary RAG tool” projects, that’s “all three OSes” plus a CGO dependency mature enough not to surprise you, which lands you on sqlite-vec.

End-to-end RAG walkthrough with `sqlite-vec`

This is the smallest amount of code that actually works as a retrieval-augmented generation pipeline in Go. It’s deliberately stripped down — no chunking strategy, no reranker, no streaming. You can layer those on top once the bones work.

package main

import (
	"context"
	"database/sql"
	"encoding/json"
	"fmt"
	"log"

	sqlite_vec "github.com/asg017/sqlite-vec-go-bindings/cgo"
	_ "github.com/mattn/go-sqlite3"
	"github.com/anthropics/anthropic-sdk-go"
	"github.com/anthropics/anthropic-sdk-go/option"
)

const dim = 1024 // matches Cohere embed-multilingual-v3

func main() {
	// Register sqlite-vec on every new SQLite connection. A blank import is
	// not enough: without this call the vec0 module is never loaded.
	sqlite_vec.Auto()

	db, err := sql.Open("sqlite3", "rag.db?_journal=WAL&_busy_timeout=5000")
	must(err)
	defer db.Close()

	// 1. Schema — chunks table + virtual vec table
	_, err = db.Exec(fmt.Sprintf(`
		CREATE TABLE IF NOT EXISTS chunks (
			id   INTEGER PRIMARY KEY AUTOINCREMENT,
			doc  TEXT NOT NULL,
			text TEXT NOT NULL
		);
		CREATE VIRTUAL TABLE IF NOT EXISTS vec_chunks USING vec0(
			id      INTEGER PRIMARY KEY,
			emb     FLOAT[%d]
		);
	`, dim))
	must(err)

	// 2. Ingest — embed each chunk and write both rows in a single transaction
	chunks := loadChunks() // your splitter; e.g. ~500-token windows
	tx, err := db.Begin()
	must(err)
	for _, c := range chunks {
		emb := embed(c.Text) // []float32 of length `dim`
		res, err := tx.Exec(`INSERT INTO chunks(doc, text) VALUES (?, ?)`, c.Doc, c.Text)
		must(err)
		id, err := res.LastInsertId()
		must(err)
		blob, err := json.Marshal(emb) // sqlite-vec accepts JSON arrays as the float vector
		must(err)
		_, err = tx.Exec(`INSERT INTO vec_chunks(id, emb) VALUES (?, ?)`, id, string(blob))
		must(err)
	}
	must(tx.Commit())

	// 3. Retrieve — kNN over the embedded query
	question := "What is the BTRC IMEI check process?"
	qEmb, err := json.Marshal(embed(question))
	must(err)

	rows, err := db.Query(`
		SELECT chunks.doc, chunks.text, vec_distance_cosine(vec_chunks.emb, ?) AS d
		FROM vec_chunks
		JOIN chunks ON chunks.id = vec_chunks.id
		ORDER BY d
		LIMIT 5
	`, string(qEmb))
	must(err)
	var contexts []string
	for rows.Next() {
		var doc, text string
		var d float64
		must(rows.Scan(&doc, &text, &d))
		contexts = append(contexts, fmt.Sprintf("[%s] %s", doc, text))
	}
	must(rows.Err()) // surface errors that ended iteration early
	must(rows.Close())

	// 4. Generate: feed top-k context into Claude.
	// Written against v1.x of anthropic-sdk-go, where request fields are
	// plain values rather than anthropic.F(...) wrappers.
	client := anthropic.NewClient(option.WithAPIKey("sk-ant-..."))
	resp, err := client.Messages.New(context.Background(), anthropic.MessageNewParams{
		Model:     anthropic.ModelClaudeSonnet4_6,
		MaxTokens: 1024,
		System: []anthropic.TextBlockParam{{
			Text: "Answer using only the supplied <context>. Cite the [doc] tags.",
		}},
		Messages: []anthropic.MessageParam{
			anthropic.NewUserMessage(anthropic.NewTextBlock(
				fmt.Sprintf("<context>\n%s\n</context>\n\nQuestion: %s", joinLines(contexts), question),
			)),
		},
	})
	must(err)
	for _, b := range resp.Content {
		if b.Type == "text" {
			fmt.Println(b.Text)
		}
	}
}

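func must(err error) {
	if err != nil {
		log.Fatal(err)
	}
}

// joinLines puts each retrieved context on its own line for the prompt.
func joinLines(lines []string) string {
	out := ""
	for i, l := range lines {
		if i > 0 {
			out += "\n"
		}
		out += l
	}
	return out
}

// loadChunks and embed are yours to supply: loadChunks returns values with
// Doc and Text string fields, and embed returns a []float32 of length dim.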

Three things this skeleton does that toy examples skip:

WAL journal mode + busy timeout so concurrent reads don’t block writes during ingest.
Both inserts (chunks + vec_chunks) in one transaction. Without it, a crash mid-ingest leaves chunk rows that have no vector, and the JOIN silently drops them from every result.
vec_distance_cosine in the ORDER BY. sqlite-vec also offers other distance functions such as L2, but cosine is how most embedding models are meant to be compared.
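
One more detail worth knowing: the walkthrough passes vectors as JSON text, which is easy to debug but bulky. As far as I know sqlite-vec also accepts a compact binary form, raw little-endian float32 values in a BLOB, and the Go bindings ship a SerializeFloat32 helper for it. The pure-Go sketch below shows what that encoding looks like; `serializeFloat32` here is my own stand-in, not the library function.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math"
)

// serializeFloat32 packs a vector as consecutive little-endian float32
// values, 4 bytes per dimension.
func serializeFloat32(v []float32) []byte {
	buf := make([]byte, 4*len(v))
	for i, x := range v {
		binary.LittleEndian.PutUint32(buf[4*i:], math.Float32bits(x))
	}
	return buf
}

func main() {
	blob := serializeFloat32([]float32{1, -2})
	fmt.Printf("%d bytes: % x\n", len(blob), blob)
}
```

A 1,024-dimensional vector is exactly 4,096 bytes this way, versus several times that as a JSON array.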

For production: add a chunking strategy (tiktoken or recursive splitter), a reranker (Cohere Rerank API or bge-reranker-v2-m3), and streaming responses. Those are independent of the storage choice — swap vec_chunks for chromem-go and the rest of the pipeline doesn’t change.
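
As a starting point for the chunking step, here is a deliberately naive splitter: fixed-size windows with overlap, counted in whitespace-separated words rather than model tokens. `chunkWords` is a hypothetical helper and a stand-in for a real tokenizer-based splitter, so treat the sizes as approximate.

```go
package main

import (
	"fmt"
	"strings"
)

// chunkWords splits text into windows of at most size words, with overlap
// words shared between consecutive windows. An invalid overlap is treated
// as zero.
func chunkWords(text string, size, overlap int) []string {
	words := strings.Fields(text)
	if size <= 0 || len(words) == 0 {
		return nil
	}
	if overlap < 0 || overlap >= size {
		overlap = 0
	}
	step := size - overlap
	var chunks []string
	for start := 0; start < len(words); start += step {
		end := start + size
		if end > len(words) {
			end = len(words)
		}
		chunks = append(chunks, strings.Join(words[start:end], " "))
		if end == len(words) {
			break // last window reached the end of the text
		}
	}
	return chunks
}

func main() {
	for i, c := range chunkWords("a b c d e", 3, 1) {
		fmt.Printf("chunk %d: %q\n", i, c)
	}
}
```

The overlap keeps a sentence that straddles a boundary retrievable from either side, at the cost of some duplicated storage.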

Takeaways

If you are in a similar spot, the questions worth asking yourself are:

Does your dataset fit in RAM? If yes and it is small, chromem-go is the easiest path.
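Do you need millisecond latency? If yes at million-vector scale, go-libsql with DiskANN is the embedded option I would reach for; at 100k vectors, LanceDB-go and Bleve’s HNSW also got there in the benchmark above.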
Do you need Windows support? That single question eliminates go-libsql today and pushes you to sqlite-vec.
Do you need analytical SQL or full-text search alongside vectors? That changes the answer to DuckDB or Bleve respectively.

The embedded vector database space is young but the options are real. Pick the one that matches your actual constraints, not the one with the best benchmarks for a workload you do not have.
