Getting Started With pgvector in .NET for Simple Vector Search
Learn pgvector with .NET, Npgsql and EF Core to store embeddings and run simple vector search with cosine distance and HNSW indexes, step by step.
Imagine you walk into a big library in your town. You loved one story book, and you ask the librarian, "Please give me more books that feel like this one." A good librarian does not just match the title word by word. She thinks about the mood, the topic, and the style, and then hands you books that are close in meaning. That is exactly what vector search does for a computer. And pgvector is the small tool that gives this power to a normal PostgreSQL database.
In this guide we will learn, in simple steps, how to store these "meaning numbers" in Postgres and ask it to find the most similar items, all from a .NET app. We will keep things friendly and beginner-level. By the end you will have a working idea you can build on.
What is a vector, really?
A computer cannot understand the word "dog" directly. So an AI model turns text into a list of numbers, like [0.12, -0.45, 0.88, ...]. This list is called an embedding or a vector. The clever part is this: words and sentences that mean similar things get number lists that are close to each other.
Think of it like a giant map. Every word or sentence is a dot on this map. "Dog" and "puppy" sit near each other. "Dog" and "rocket" sit far apart. Vector search is just asking, "Which dots are nearest to my dot?"
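To make the map idea concrete, here is a minimal sketch in plain C#. The three-number vectors are invented for illustration (real models return hundreds or thousands of numbers), and `CosineSimilarity` is a hand-written helper, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Toy 3-number "embeddings", invented for illustration only.
var map = new Dictionary<string, float[]>
{
    ["dog"]    = new[] { 0.9f, 0.8f, 0.1f },
    ["puppy"]  = new[] { 0.85f, 0.9f, 0.15f },
    ["rocket"] = new[] { 0.1f, 0.05f, 0.95f },
};

// Cosine similarity: close to 1 means "pointing the same way",
// close to 0 means "unrelated".
static double CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += (double)a[i] * b[i];
        normA += (double)a[i] * a[i];
        normB += (double)b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// "Which dot is nearest to my dot?" asked for the word "dog".
string nearest = map
    .Where(pair => pair.Key != "dog")
    .OrderByDescending(pair => CosineSimilarity(map["dog"], pair.Value))
    .First().Key;

Console.WriteLine(nearest); // puppy
```

With only three dots you can check every one by hand. A database does the same comparison, just across many more rows.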
From text to a nearest match:
- Text: a word or sentence.
- Embedding model: turns the text into numbers.
- Vector: a list like [0.1, 0.9, ...].
- Postgres + pgvector: stores and compares vectors.
- Nearest items: the closest matches are returned.
Why pgvector and not a separate database?
There are many special "vector databases" out there. They are nice, but they add one more thing to run, pay for, and back up. If your app already uses PostgreSQL (and many .NET apps do), you can simply switch on pgvector. Your users' data and your vectors live in the same place. You can even join a vector search with your normal SQL WHERE filters.
Here is a simple comparison to help you decide.
| Option | Extra service to run? | Good for | Notes |
|---|---|---|---|
| pgvector in Postgres | No | Apps already using Postgres | Simple, one database, free |
| Dedicated vector DB | Yes | Very large scale, billions of vectors | More to manage and pay for |
| In-memory search | No, but no persistence | Tiny demos | Data is lost on restart |
For most small and medium .NET apps, pgvector is the calm, sensible first choice.
How the pieces fit together
Before writing code, let us picture the whole system. There are three actors: your .NET app, an embedding model (to make vectors), and PostgreSQL with pgvector (to store and search).
The embedding model can be a cloud service or a local model like one served by Ollama. For this beginner guide, the important point is: something gives you a list of floats. pgvector does the storing and the comparing.
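If you do not have a model wired up yet, a stand-in can help you test the database side first. The sketch below is a hypothetical `GetEmbeddingAsync` that hashes words into buckets. It is an assumption made for local experiments only: it captures shared words, not meaning, so swap it for a real model before judging search quality.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical stand-in for a real embedding model (local experiments only).
// It hashes each word into a bucket, so it reflects shared words, not meaning.
static Task<float[]> GetEmbeddingAsync(string text, int dimensions = 1536)
{
    var vector = new float[dimensions];
    var words = text.ToLowerInvariant()
        .Split(' ', StringSplitOptions.RemoveEmptyEntries);

    foreach (var word in words)
    {
        // FNV-1a hash: stable across runs, unlike string.GetHashCode().
        uint hash = 2166136261u;
        foreach (char c in word)
        {
            hash = unchecked((hash ^ c) * 16777619u);
        }
        vector[hash % (uint)dimensions] += 1f;
    }

    // Scale to unit length so different texts are comparable.
    double norm = 0;
    foreach (var value in vector) norm += (double)value * value;
    norm = Math.Sqrt(norm);
    if (norm > 0)
    {
        for (int i = 0; i < vector.Length; i++)
        {
            vector[i] = (float)(vector[i] / norm);
        }
    }

    return Task.FromResult(vector);
}

float[] sample = await GetEmbeddingAsync("the quick brown fox");
Console.WriteLine(sample.Length); // 1536
```

The default of 1536 is only there to line up with the vector(1536) column used later in this guide. A real model decides its own size.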
Step 1: Turn on pgvector in PostgreSQL
pgvector is an extension. You enable it once per database with a single SQL command. Most managed Postgres providers already have it ready to switch on.
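```csharp
// Run this once against your database.
// You can do it in psql, a migration, or a setup script.
// Here, dataSource is the NpgsqlDataSource built in Step 2.
await using var setup = dataSource.CreateCommand(
    "CREATE EXTENSION IF NOT EXISTS vector;");
await setup.ExecuteNonQueryAsync();
```

After this, Postgres understands a new column type called vector. You tell it how many numbers each vector has, for example vector(1536) if your model gives 1536 numbers. One thing to watch: if the extension is created after Npgsql has already opened a connection, Npgsql may not know the new type yet, and you may need to reload its types (for example with ReloadTypes() on the NpgsqlConnection) before using vectors.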
Step 2: Add the NuGet package and enable vectors in Npgsql
In .NET, the official Pgvector package teaches Npgsql how to read and write the vector type. Companion packages, Pgvector.Dapper and Pgvector.EntityFrameworkCore, do the same for Dapper and EF Core. Install the base package first.
```csharp
// dotnet add package Pgvector
// dotnet add package Npgsql
using Npgsql;
using Pgvector;

var connectionString = "Host=localhost;Username=postgres;Password=postgres;Database=demo";
var builder = new NpgsqlDataSourceBuilder(connectionString);
builder.UseVector(); // this line wires up the vector type
await using var dataSource = builder.Build();
```

That one call, UseVector(), is the magic glue. Now your C# code can pass a Vector object straight into a SQL parameter, and read one back, without any messy conversion.
Setup order:
- Install the packages: add Pgvector and Npgsql.
- Enable the extension: CREATE EXTENSION vector.
- Build the data source: call UseVector().
- Create the table: add a vector column.
Step 3: Create a table with a vector column
Let us store some simple documents. Each row has an id, the text, and its embedding. The embedding column is where the numbers live.
```csharp
await using var create = dataSource.CreateCommand(@"
    CREATE TABLE IF NOT EXISTS documents (
        id bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        content text NOT NULL,
        embedding vector(1536)
    );");
await create.ExecuteNonQueryAsync();
```

Notice vector(1536). The number in parentheses must match what your embedding model produces. If your model gives 768 numbers, write vector(768). Mixing sizes will cause errors, so keep them equal.
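One way to catch a size mismatch early is to check the array length in C# before it ever reaches the database. This is only a sketch: `EnsureDimensions` and `ExpectedDimensions` are hypothetical names introduced here, not part of Npgsql or Pgvector.

```csharp
using System;

const int ExpectedDimensions = 1536; // must equal N in vector(N)

// Hypothetical helper: fail early with a clear message,
// instead of waiting for the database to reject the row.
static float[] EnsureDimensions(float[] embedding, int expected)
{
    if (embedding.Length != expected)
    {
        throw new ArgumentException(
            $"Embedding has {embedding.Length} numbers, but the column expects {expected}.");
    }
    return embedding;
}

var ok = EnsureDimensions(new float[1536], ExpectedDimensions);
Console.WriteLine(ok.Length); // 1536

try
{
    EnsureDimensions(new float[768], ExpectedDimensions);
}
catch (ArgumentException ex)
{
    Console.WriteLine(ex.Message);
}
```

You could call a check like this right after the model returns, so a wrong model or setting shows up as one clear error.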
Step 4: Insert text with its embedding
When you save a document, you first ask the embedding model for its vector, then store both the text and the vector together. Below, assume GetEmbeddingAsync calls your model and returns a float[].
```csharp
async Task AddDocumentAsync(NpgsqlDataSource ds, string content)
{
    float[] numbers = await GetEmbeddingAsync(content); // from your model
    var embedding = new Vector(numbers);

    await using var cmd = ds.CreateCommand(
        "INSERT INTO documents (content, embedding) VALUES ($1, $2)");
    cmd.Parameters.AddWithValue(content);
    cmd.Parameters.AddWithValue(embedding); // Pgvector handles this type
    await cmd.ExecuteNonQueryAsync();
}
```

See how clean it is. Because we called UseVector(), passing a Vector as a parameter just works.
Step 5: Search for similar documents
Now the fun part. A user types a question. We turn that question into a vector too, then ask Postgres, "Give me the 5 rows whose embedding is closest to this." pgvector adds special operators for distance:
| Operator | Meaning | Common use |
|---|---|---|
| <-> | L2 (straight-line) distance | General numeric vectors |
| <=> | Cosine distance | Text embeddings (most common) |
| <#> | Negative inner product | When the model says so |
For most text search you will use <=>, cosine distance. Smaller distance means more similar.
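If you are curious what these operators work out, the sketch below writes the three formulas in plain C#. These helpers are hand-written for illustration and are not pgvector's own code. pgvector does the same maths inside Postgres.

```csharp
using System;

// Inner product (dot product) of two equal-length vectors.
static double InnerProduct(float[] a, float[] b)
{
    double dot = 0;
    for (int i = 0; i < a.Length; i++) dot += (double)a[i] * b[i];
    return dot;
}

// Mirrors <-> : straight-line distance between the two points.
static double L2Distance(float[] a, float[] b)
{
    double sum = 0;
    for (int i = 0; i < a.Length; i++)
    {
        double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return Math.Sqrt(sum);
}

// Mirrors <#> : negated so that "smaller is more similar" still holds.
static double NegativeInnerProduct(float[] a, float[] b)
    => -InnerProduct(a, b);

// Mirrors <=> : 1 minus cosine similarity, so it ignores vector length.
static double CosineDistance(float[] a, float[] b)
    => 1.0 - InnerProduct(a, b)
           / (Math.Sqrt(InnerProduct(a, a)) * Math.Sqrt(InnerProduct(b, b)));

var query = new[] { 1f, 0f };
var same = new[] { 2f, 0f };   // same direction, different length
var other = new[] { 0f, 1f };  // at a right angle to the query

Console.WriteLine(CosineDistance(query, same));       // 0
Console.WriteLine(CosineDistance(query, other));      // 1
Console.WriteLine(L2Distance(query, same));           // 1
Console.WriteLine(NegativeInnerProduct(query, same)); // -2
```

Notice that cosine distance treats `query` and `same` as identical even though one is twice as long. That is a common reason it suits text embeddings.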
```csharp
async Task<List<string>> SearchAsync(NpgsqlDataSource ds, string query)
{
    float[] numbers = await GetEmbeddingAsync(query);
    var queryVector = new Vector(numbers);

    await using var cmd = ds.CreateCommand(@"
        SELECT content
        FROM documents
        ORDER BY embedding <=> $1
        LIMIT 5;");
    cmd.Parameters.AddWithValue(queryVector);

    var results = new List<string>();
    await using var reader = await cmd.ExecuteReaderAsync();
    while (await reader.ReadAsync())
    {
        results.Add(reader.GetString(0));
    }
    return results;
}
```

The line ORDER BY embedding <=> $1 is the heart of vector search. It sorts every row by how close it is to the query, and LIMIT 5 keeps only the nearest five.
Step 6: Make it fast with an HNSW index
If you only have a handful of rows, Postgres can check every row and stay fast. This is called an exact search. But when you grow to thousands or millions of rows, checking every row becomes slow. This is where an index helps.
pgvector offers an HNSW index. The name stands for Hierarchical Navigable Small World. You do not need to memorise that. Just picture a smart shortcut map: instead of walking past every house to find the nearest shop, you hop through a few well-placed signposts and arrive quickly. HNSW trades a tiny bit of accuracy for a big jump in speed. This is called approximate nearest neighbour search.
```csharp
await using var index = dataSource.CreateCommand(@"
    CREATE INDEX IF NOT EXISTS documents_embedding_hnsw
    ON documents
    USING hnsw (embedding vector_cosine_ops);");
await index.ExecuteNonQueryAsync();
```

The important part is vector_cosine_ops. It must match the distance you search with. Since we search with <=> (cosine), we build the index with vector_cosine_ops. If you used L2 (<->), you would build it with vector_l2_ops instead. Mismatching them means the index will not be used.
Here is a simple way to choose.
| Situation | Use | Why |
|---|---|---|
| Under a few thousand rows | No index (exact) | Already fast, fully accurate |
| Thousands to millions | HNSW index | Stays fast, slight accuracy trade-off |
| Need 100% exact every time | No index | HNSW is approximate |
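To put a number on the "slight accuracy trade-off", you can compare the ids an indexed search returns with the ids an exact search returns for the same query. The share that overlaps is usually called recall. The sketch below assumes you already collected both lists (for example by running the query before and after the index exists). The ids are made up and `Recall` is a hypothetical helper.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Recall: the share of the true nearest rows that the
// approximate (indexed) search also returned.
static double Recall(IReadOnlyList<long> exactIds, IReadOnlyList<long> approximateIds)
{
    if (exactIds.Count == 0) return 1.0;
    int found = exactIds.Intersect(approximateIds).Count();
    return (double)found / exactIds.Count;
}

// Made-up ids: pretend these came from the same query,
// once without the index and once with it.
var exact = new long[] { 11, 42, 7, 93, 58 };
var approximate = new long[] { 11, 42, 7, 93, 64 };

// 4 of the 5 true neighbours were found.
Console.WriteLine(Recall(exact, approximate));
```

If the recall you measure is too low for your use case, that is a sign to look at the index tuning options in the pgvector docs, or to stay with exact search.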
Using EF Core instead of raw Npgsql
If your project already uses Entity Framework Core, you do not have to write raw SQL. The Pgvector.EntityFrameworkCore package lets you map a Vector property and even create the HNSW index in your model. A document entity might look like this.
```csharp
using Microsoft.EntityFrameworkCore;
using Pgvector;
using Pgvector.EntityFrameworkCore;

public class Document
{
    public long Id { get; set; }
    public string Content { get; set; } = "";
    public Vector? Embedding { get; set; }
}

// When configuring the DbContext, enable the vector mapping too:
// optionsBuilder.UseNpgsql(connectionString, o => o.UseVector());

// In OnModelCreating:
modelBuilder.HasPostgresExtension("vector");
modelBuilder.Entity<Document>()
    .Property(d => d.Embedding)
    .HasColumnType("vector(1536)");
modelBuilder.Entity<Document>()
    .HasIndex(d => d.Embedding)
    .HasMethod("hnsw")
    .HasOperators("vector_cosine_ops");
```

Then your LINQ query can order by distance using a helper:
```csharp
var queryVector = new Vector(await GetEmbeddingAsync(userText));
var matches = await db.Documents
    .OrderBy(d => d.Embedding!.CosineDistance(queryVector))
    .Take(5)
    .Select(d => d.Content)
    .ToListAsync();
```

EF Core turns CosineDistance into the <=> operator behind the scenes, so you get the same fast search with comfortable, typed C# code.
A few friendly tips
A few small habits will save you headaches later:
- Keep the vector size the same everywhere: the column, the model output, and the index must agree.
- Pick your distance once and stay with it. If the model docs say cosine, use cosine for both the query and the index.
- Add the HNSW index only when your data grows. For a small demo, exact search is simpler and perfectly fine.
- Store the original text next to the vector. You almost always want to show the real content back to the user, not the numbers.
- Embeddings can be large. Many small rows are easier to manage than a few huge ones, so chunk long documents into paragraphs.
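The last tip, chunking, can start very simply. The sketch below splits a document on blank lines so each paragraph can become its own row with its own embedding. `ChunkByParagraph` is a hypothetical helper, and real projects often add size limits or overlap on top of this.

```csharp
using System;
using System.Collections.Generic;

// Split on blank lines so each paragraph becomes one chunk.
static List<string> ChunkByParagraph(string document)
{
    var chunks = new List<string>();
    var paragraphs = document
        .Replace("\r\n", "\n") // normalise Windows line endings
        .Split("\n\n", StringSplitOptions.RemoveEmptyEntries);

    foreach (var paragraph in paragraphs)
    {
        var trimmed = paragraph.Trim();
        if (trimmed.Length > 0) chunks.Add(trimmed);
    }
    return chunks;
}

var text = "First paragraph about dogs.\n\nSecond paragraph about rockets.\n\n\nThird one.";
foreach (var chunk in ChunkByParagraph(text))
{
    Console.WriteLine(chunk);
}
```

Each chunk would then go through the same insert step as any other document.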
Putting it all together
Let us recap the flow as one clear picture, so the whole idea stays in your head.
The complete pgvector workflow:
- Enable the extension: once per database.
- Create the table: with a vector column.
- Store text + vector: on every insert.
- Add an HNSW index: when the data grows.
- Search by distance: on every query.
You now have the full shape of a working vector search feature. The data lives in the same Postgres you already trust. Your .NET code stays clean thanks to the Pgvector package. And when traffic grows, one HNSW index keeps things fast. That is a lot of value for very little extra plumbing.
The best way to learn is to try it. Spin up a Postgres container, enable the extension, store ten short sentences with their embeddings, and search. Watching it return the right neighbours for the first time is a small but happy moment, and it makes all these ideas feel real.
Quick recap
- A vector (embedding) is a list of numbers that captures the meaning of text. Similar meanings sit close together.
- pgvector is a free PostgreSQL extension. Enable it with CREATE EXTENSION IF NOT EXISTS vector;
- In .NET, install the Pgvector package and call UseVector() on your Npgsql data source.
- Store vectors in a vector(N) column where N matches your model's output size.
- Search with distance operators: <=> for cosine (most text), <-> for L2.
- ORDER BY embedding <=> $1 LIMIT 5 finds the five closest rows.
- Add an HNSW index with matching operators (vector_cosine_ops) to keep search fast on big data.
- EF Core users can map a Vector property and use CosineDistance in LINQ instead of raw SQL.