
If you are shipping a Go application that runs on a user's machine and needs vector search, you face an awkward problem: most popular vector databases run as separate servers (Chroma, Qdrant, Weaviate, Milvus) or hosted services (Pinecone). That means asking your users to install and operate extra infrastructure, which is a non-starter for a single-binary developer tool.

This post walks through the embeddable options I evaluated, the trade-offs, and a decision tree to help you pick one for your own project.

## The constraint

The application must be distributed as a single Go binary. No Docker, no separate service, no manual setup. The vector database has to run in-process.

My specific scale target was 1 million documents with 1,024-dimensional embeddings. Latency was not a concern because it is a local developer tool, not a hot-path service.
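That target is worth a quick back-of-envelope check, because the raw numbers drive most of the eliminations below. A minimal sketch:

```go
package main

import "fmt"

func main() {
	const (
		vectors     = 1_000_000 // scale target: 1M documents
		dims        = 1024      // embedding dimensions
		bytesPerDim = 4         // float32
	)
	total := vectors * dims * bytesPerDim
	fmt.Printf("raw vector payload: %d bytes (~%.1f GB)\n", total, float64(total)/1e9)
	// prints: raw vector payload: 4096000000 bytes (~4.1 GB)
}
```

Roughly 4 GB of raw float32 data before any index or metadata overhead. Keep that number in mind; it decides several verdicts below.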

## The candidates

I evaluated seven options. Here is each one with its pros and cons.

### 1. HTTP-server vector databases (Chroma, Qdrant, Weaviate, Milvus)

**Pros**
- Mature, feature-rich, well-documented
- Excellent ANN indexing, filtering, and scaling
- Strong ecosystem and community

**Cons**
- Require a separate server process
- Users must install and run extra infrastructure
- Defeats the single-binary distribution model

**Verdict:** Eliminated. They violate the core constraint.

### 2. chromem-go

A pure-Go embeddable vector database inspired by Chroma.

**Pros**
- Pure Go, no CGO
- Simple API, easy to integrate
- Cross-compiles trivially
- Good for small datasets

**Cons**
- Loads all vectors into RAM
- At 1M x 1024 dimensions, needs over 4 GB of memory
- Not viable for end-user machines at scale

**Verdict:** Great for small datasets (under 100k vectors). Eliminated at my scale.
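The RAM ceiling is a property of the design rather than a chromem-go bug: a fully in-memory store keeps every vector resident and scans all of them on every query. Here is a hand-rolled sketch of that model (illustrative only, not chromem-go's actual API):

```go
package main

import (
	"fmt"
	"math"
	"sort"
)

// bruteForceTopK scans every stored vector and returns the k nearest IDs
// by cosine similarity. This is the O(n*d) work any in-RAM store does,
// and why all n vectors must stay resident in memory.
func bruteForceTopK(store map[string][]float32, query []float32, k int) []string {
	type scored struct {
		id  string
		sim float64
	}
	var results []scored
	for id, vec := range store {
		results = append(results, scored{id, cosine(query, vec)})
	}
	sort.Slice(results, func(i, j int) bool { return results[i].sim > results[j].sim })
	if k > len(results) {
		k = len(results)
	}
	ids := make([]string, k)
	for i := range ids {
		ids[i] = results[i].id
	}
	return ids
}

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float32) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += float64(a[i]) * float64(b[i])
		na += float64(a[i]) * float64(a[i])
		nb += float64(b[i]) * float64(b[i])
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

func main() {
	store := map[string][]float32{
		"go-docs":   {0.9, 0.1, 0.0},
		"rust-docs": {0.1, 0.9, 0.0},
		"cooking":   {0.0, 0.1, 0.9},
	}
	fmt.Println(bruteForceTopK(store, []float32{1, 0, 0}, 2))
	// prints: [go-docs rust-docs]
}
```

With three tiny vectors this is instant; with 1M x 1024-dimensional vectors the store itself is the ~4 GB problem, regardless of how fast the scan is.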

### 3. Bleve

A mature pure-Go full-text search library that recently added vector support.

**Pros**
- Pure Go, well-maintained, battle-tested
- Useful if you also need full-text search
- File-based persistence

**Cons**
- Vector search is a secondary feature
- Not optimized for million-scale vector workloads
- Heavier than needed for pure vector search

**Verdict:** Worth considering only if you need full-text and vector search together.

### 4. DuckDB with VSS extension

An embeddable analytical database with a vector similarity search extension.

**Pros**
- Excellent for analytical SQL alongside vectors
- HNSW indexing built-in
- Columnar storage is efficient

**Cons**
- CGO required
- Go bindings are less mature than the Python ones
- Overkill if you only need vector search

**Verdict:** A good fit if you need analytical queries on top of vectors. Otherwise, too heavy.

### 5. libsql-client-go (pure Go libSQL driver)

The pure-Go client for libSQL, Turso's SQLite fork.

**Pros**
- Pure Go, no CGO
- Cross-compiles easily
- Standard database/sql interface

**Cons**
- HTTP client only, connects to a remote libSQL server
- Cannot run libSQL embedded in your binary
- Defeats the no-infrastructure requirement

**Verdict:** Eliminated. The "pure Go" label is misleading here: the driver still needs a remote libSQL server to talk to.

### 6. go-libsql (CGO-based libSQL driver)

The CGO-based libSQL driver that supports true embedded mode.

**Pros**
- Native DiskANN ANN indexing
- Built-in vector quantization (e.g. int8)
- Millisecond query latency at million-vector scale
- File-based, single-file deployment
- Best raw performance of all candidates

**Cons**
- CGO required
- Currently supports only Linux amd64/arm64 and macOS amd64/arm64
- No Windows support today
- Younger ecosystem than mattn/go-sqlite3

**Verdict:** Best performance, but the lack of Windows support forces a fallback for a small subset of users.
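Quantization is a big part of why go-libsql fits million-vector workloads on end-user machines. The library handles this natively; purely to illustrate the idea, here is a hand-rolled symmetric int8 quantizer (a sketch of the general technique, not go-libsql's actual scheme):

```go
package main

import (
	"fmt"
	"math"
)

// quantizeInt8 maps each float32 to an int8 by scaling against the
// vector's largest absolute value, so 4 bytes per dimension become 1.
// Reconstruction is approximate: original ~= float32(q[i]) * scale.
func quantizeInt8(v []float32) (q []int8, scale float32) {
	var maxAbs float32
	for _, x := range v {
		if a := float32(math.Abs(float64(x))); a > maxAbs {
			maxAbs = a
		}
	}
	if maxAbs == 0 {
		return make([]int8, len(v)), 0
	}
	scale = maxAbs / 127 // map [-maxAbs, maxAbs] onto [-127, 127]
	q = make([]int8, len(v))
	for i, x := range v {
		q[i] = int8(math.Round(float64(x / scale)))
	}
	return q, scale
}

func main() {
	q, _ := quantizeInt8([]float32{0.3, -1.0, 0.25, 0.0})
	fmt.Println(q)
	// Storage drops from 4 bytes to 1 byte per dimension:
	// 1M x 1024 dims shrinks from ~4.1 GB to ~1.0 GB.
}
```

The trade-off is a small loss of precision in exchange for a 4x cut in memory and disk, which is exactly the lever that makes million-vector embedded search practical.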

### 7. sqlite-vec via mattn/go-sqlite3

A SQLite extension for vector search by Alex Garcia, used through the mainstream mattn Go SQLite driver.

**Pros**
- Works on Linux, macOS, and Windows
- Built on mature SQLite foundations
- Full SQL for metadata filtering, transactions, durability
- Single-file deployment
- Active development with a known author
- Single backend covers 100% of platforms

**Cons**
- CGO required
- Currently brute-force search only (ANN on the roadmap)
- No native quantization
- Higher disk footprint at scale (~4 GB at 1M x 1024)
- Query latency scales linearly with dataset size

**Verdict:** The pragmatic winner when latency is not critical and full platform coverage matters.

## Side-by-side comparison

| Aspect | Chroma/Qdrant/etc. | chromem-go | Bleve | DuckDB+VSS | libsql-client-go | go-libsql | sqlite-vec+mattn |
|---|---|---|---|---|---|---|---|
| Embeddable | No | Yes | Yes | Yes | No | Yes | Yes |
| Pure Go | N/A | Yes | Yes | No (CGO) | Yes | No (CGO) | No (CGO) |
| Linux/macOS | N/A | Yes | Yes | Yes | Yes | Yes | Yes |
| Windows | N/A | Yes | Yes | Yes | Yes | No | Yes |
| 1M vectors viable | Yes | No (RAM) | Marginal | Yes | N/A | Yes | Yes |
| ANN index | Yes | No | HNSW | HNSW | N/A | DiskANN | No (roadmap) |
| Quantization | Varies | No | No | Limited | N/A | Yes | No |
| Query latency at 1M | Fast | Fast (if RAM) | Slow | Fast | N/A | Milliseconds | Seconds |
| SQL filtering | Limited | Basic | Basic | Full SQL | Full SQL | Full SQL | Full SQL |
| Single binary | No | Yes | Yes | Yes | No | Yes | Yes |

## Decision tree

```mermaid
flowchart TD
    Start[Need vector search in your Go app] --> Q1{Can users run a separate server?}
    Q1 -->|Yes| Server[Use Chroma, Qdrant, Weaviate, or Milvus]
    Q1 -->|No, must be embedded| Q2{Dataset size?}

    Q2 -->|Under 100k vectors| Small{CGO acceptable?}
    Small -->|No| Chromem[Use chromem-go: pure Go, simple, fast for small data]
    Small -->|Yes| Q3

    Q2 -->|100k to 10M vectors| Q3{Latency critical?}

    Q3 -->|Yes, milliseconds matter| Q4{Need Windows support?}
    Q4 -->|No, Linux/macOS only| GoLibsql[Use go-libsql: DiskANN, quantization, fastest]
    Q4 -->|Yes, all platforms| Dual[Use go-libsql primary plus sqlite-vec fallback for Windows]

    Q3 -->|No, seconds are fine| Q5{Need full-text search too?}
    Q5 -->|Yes| Bleve[Use Bleve: combined full-text and vector]
    Q5 -->|No| Q6{Need analytical SQL?}
    Q6 -->|Yes| DuckDB[Use DuckDB with VSS extension]
    Q6 -->|No| SqliteVec[Use sqlite-vec with mattn/go-sqlite3: simple, all platforms]

    Q2 -->|Over 10M vectors| Reconsider[Reconsider embedded model: server-based likely better]
```

## What I chose and why

For my use case, a local developer tool with up to 1M vectors and no latency pressure, I chose **sqlite-vec via mattn/go-sqlite3**.

The reasoning:

1. Single backend covers all target platforms (Linux, macOS, Windows). No abstraction layer to maintain.
2. Brute-force search at 1M x 1024 takes a few seconds, which is fine for a CLI tool used occasionally.
3. CGO complexity stays inside the build pipeline. End users still get a clean single binary per platform.
4. SQLite foundations give full SQL, transactions, and durability for free.
5. If latency ever becomes a real problem, migrating to go-libsql is mechanical because both are SQLite-compatible and both use database/sql.

This is YAGNI in action. The "better" option (go-libsql with DiskANN and quantization) costs real engineering hours to integrate alongside a fallback. Those hours only pay off if performance actually becomes a constraint, which may never happen.

## Takeaways

If you are in a similar spot, the questions worth asking yourself are:

- Does your dataset fit in RAM? If yes and it is small, chromem-go is the easiest path.
- Do you need millisecond latency? If yes, go-libsql is the only embedded option that delivers it.
- Do you need Windows support? That single question eliminates go-libsql today and pushes you to sqlite-vec.
- Do you need analytical SQL or full-text search alongside vectors? That changes the answer to DuckDB or Bleve respectively.

The embedded vector database space is young but the options are real. Pick the one that matches your actual constraints, not the one with the best benchmarks for a workload you do not have.
