
Getting Started With pgvector in .NET for Simple Vector Search

Learn pgvector with .NET, Npgsql and EF Core to store embeddings and run simple vector search with cosine distance and HNSW indexes, step by step.

11 min read. Updated November 5, 2025.

Imagine you walk into a big library in your town. You loved one storybook, and you ask the librarian, "Please give me more books that feel like this one." A good librarian does not just match the title word by word. She thinks about the mood, the topic, and the style, and then hands you books that are close in meaning. Vector search does the same thing for a computer, and pgvector is the small extension that gives this ability to an ordinary PostgreSQL database.

In this guide we will learn, in simple steps, how to store these "meaning numbers" in Postgres and ask it to find the most similar items, all from a .NET app. We will keep things friendly and beginner-level. By the end you will have a working idea you can build on.

What is a vector, really?

A computer cannot understand the word "dog" directly. So an AI model turns text into a list of numbers, like [0.12, -0.45, 0.88, ...]. This list is called an embedding or a vector. The clever part is this: words and sentences that mean similar things get number lists that are close to each other.

Think of it like a giant map. Every word or sentence is a dot on this map. "Dog" and "puppy" sit near each other. "Dog" and "rocket" sit far apart. Vector search is just asking, "Which dots are nearest to my dot?"
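To make the map picture concrete, here is a tiny sketch with invented two-number vectors. The coordinates are made up for illustration only. Real embeddings come from a model and have hundreds or thousands of numbers, but the idea of "near" and "far" is the same.

```csharp
using System;

// Invented 2-D "embeddings" for illustration only; real ones come from a model.
float[] dog = { 0.90f, 0.80f };
float[] puppy = { 0.85f, 0.75f };
float[] rocket = { -0.70f, 0.20f };

// Straight-line (Euclidean) distance between two dots on the map.
double Distance(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same length.");

    double sum = 0;
    for (int i = 0; i < a.Length; i++)
    {
        double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return Math.Sqrt(sum);
}

// The first number printed is much smaller than the second.
Console.WriteLine($"dog to puppy:  {Distance(dog, puppy):F3}");
Console.WriteLine($"dog to rocket: {Distance(dog, rocket):F3}");
```

With these made-up numbers, "dog" sits right next to "puppy" and a long way from "rocket", which is the behaviour a real embedding model aims for.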

From text to a nearest match

1. Text: a word or sentence
2. Embedding model: turns text into numbers
3. Vector: a list like [0.1, 0.9, ...]
4. Postgres + pgvector: stores and compares vectors
5. Nearest items: closest matches returned

The full journey of one search, end to end.

Why pgvector and not a separate database?

There are many special "vector databases" out there. They are nice, but they add one more thing to run, pay for, and back up. If your app already uses PostgreSQL (and many .NET apps do), you can simply switch on pgvector. Your users' data and your vectors live in the same place. You can even combine a vector search with your normal SQL WHERE filters in a single query.
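As a rough sketch of that last point, the function below mixes an ordinary WHERE filter with a distance sort. It assumes a documents table like the one built later in this guide, plus a hypothetical category column that is not part of the guide's schema. It also assumes a data source built with UseVector(), as Step 2 describes.

```csharp
using System.Collections.Generic;
using System.Threading.Tasks;
using Npgsql;
using Pgvector;

// "category" is a hypothetical column added only for this example.
// $1 is the normal filter value, $2 is the query vector.
const string FilteredSearchSql = @"
    SELECT content
    FROM documents
    WHERE category = $1
    ORDER BY embedding <=> $2
    LIMIT 5;";

async Task<List<string>> SearchInCategoryAsync(
    NpgsqlDataSource ds, Vector queryVector, string category)
{
    await using var cmd = ds.CreateCommand(FilteredSearchSql);
    cmd.Parameters.AddWithValue(category);     // binds to $1
    cmd.Parameters.AddWithValue(queryVector);  // binds to $2

    var results = new List<string>();
    await using var reader = await cmd.ExecuteReaderAsync();
    while (await reader.ReadAsync())
    {
        results.Add(reader.GetString(0));
    }
    return results;
}
```

The filter and the similarity sort run in one statement against one database, so there is nothing to keep in sync between two systems.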

Here is a simple comparison to help you decide.

| Option | Extra service to run? | Good for | Notes |
| --- | --- | --- | --- |
| pgvector in Postgres | No | Apps already using Postgres | Simple, one database, free |
| Dedicated vector DB | Yes | Very large scale, billions of vectors | More to manage and pay for |
| In-memory search | No, but no persistence | Tiny demos | Data is lost on restart |

For most small and medium .NET apps, pgvector is the calm, sensible first choice.

How the pieces fit together

Before writing code, let us picture the whole system. There are three actors: your .NET app, an embedding model (to make vectors), and PostgreSQL with pgvector (to store and search).

Figure: The three main parts of a pgvector search system.

The embedding model can be a cloud service or a local model like one served by Ollama. For this beginner guide, the important point is: something gives you a list of floats. pgvector does the storing and the comparing.

Step 1: Turn on pgvector in PostgreSQL

pgvector is an extension. You enable it once per database with a single SQL command. Most managed Postgres providers already have it ready to switch on.

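// Run this once against your database.
// You can do it in psql, a migration, or a setup script.
// Here, dataSource is the NpgsqlDataSource built in Step 2.
await using var conn = await dataSource.OpenConnectionAsync();
await using (var setup = new NpgsqlCommand("CREATE EXTENSION IF NOT EXISTS vector;", conn))
{
    await setup.ExecuteNonQueryAsync();
}
// If the extension was just created, reload types so Npgsql sees the new vector type.
conn.ReloadTypes();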

After this, Postgres understands a new column type called vector. You tell it how many numbers each vector has, for example vector(1536) if your model gives 1536 numbers.

Step 2: Add the NuGet package and enable vectors in Npgsql

In .NET, the Pgvector package teaches Npgsql how to read and write the vector type. Companion packages, Pgvector.Dapper and Pgvector.EntityFrameworkCore, do the same for Dapper and EF Core. Install Pgvector and Npgsql first.

// dotnet add package Pgvector
// dotnet add package Npgsql
 
using Npgsql;
using Pgvector;
 
var connectionString = "Host=localhost;Username=postgres;Password=postgres;Database=demo";
 
var builder = new NpgsqlDataSourceBuilder(connectionString);
builder.UseVector(); // this line wires up the vector type
await using var dataSource = builder.Build();

That one call, UseVector(), is the magic glue. Now your C# code can pass a Vector object straight into a SQL parameter, and read one back, without any messy conversion.

Setup order

1. Install package: add Pgvector + Npgsql
2. Enable extension: CREATE EXTENSION vector
3. Build data source: call UseVector()
4. Create table: add a vector column

Do these once, in this order, before any search.

Step 3: Create a table with a vector column

Let us store some simple documents. Each row has an id, the text, and its embedding. The embedding column is where the numbers live.

await using var create = dataSource.CreateCommand(@"
    CREATE TABLE IF NOT EXISTS documents (
        id          bigint GENERATED ALWAYS AS IDENTITY PRIMARY KEY,
        content     text NOT NULL,
        embedding   vector(1536)
    );");
await create.ExecuteNonQueryAsync();

Notice vector(1536). The number in parentheses must match what your embedding model produces. If your model gives 768 numbers, write vector(768). Postgres rejects a vector whose length differs from the column's size, so keep the model output and the column equal.
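One way to catch a size mismatch early is a small guard in your own code, before the vector ever reaches the database. This is only a sketch: EnsureDimensions and ExpectedDimensions are names invented here, and 1536 is simply the example size used in this guide.

```csharp
using System;

// Must match the N in the vector(N) column. 1536 is this guide's example size.
const int ExpectedDimensions = 1536;

// Throws early, before the database sees a vector of the wrong length.
float[] EnsureDimensions(float[] embedding)
{
    if (embedding.Length != ExpectedDimensions)
    {
        throw new ArgumentException(
            $"Expected {ExpectedDimensions} numbers but got {embedding.Length}.");
    }
    return embedding;
}

// Usage: a correctly sized array passes through unchanged.
float[] ok = EnsureDimensions(new float[1536]);
Console.WriteLine(ok.Length);
```

A check like this gives you a clear message that names both sizes, which is handy if you ever switch embedding models.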

Step 4: Insert text with its embedding

When you save a document, you first ask the embedding model for its vector, then store both the text and the vector together. Below, assume GetEmbeddingAsync calls your model and returns a float[].

async Task AddDocumentAsync(NpgsqlDataSource ds, string content)
{
    float[] numbers = await GetEmbeddingAsync(content); // from your model
    var embedding = new Vector(numbers);
 
    await using var cmd = ds.CreateCommand(
        "INSERT INTO documents (content, embedding) VALUES ($1, $2)");
    cmd.Parameters.AddWithValue(content);
    cmd.Parameters.AddWithValue(embedding); // Pgvector handles this type
    await cmd.ExecuteNonQueryAsync();
}

See how clean it is. Because we called UseVector(), passing a Vector as a parameter just works.
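If you do not have an embedding model wired up yet, a toy stand-in lets you try the plumbing end to end. The sketch below is not a real model and is not part of any library: it hashes each word into a bucket and normalises the counts, so it captures shared words, not meaning. Swap it for a real model before judging search quality.

```csharp
using System;
using System.Threading.Tasks;

// Toy stand-in for a real embedding model, for local experiments only.
// It counts hashed words per bucket, so it reflects word overlap, not meaning.
Task<float[]> GetEmbeddingAsync(string text, int dimensions = 1536)
{
    var vector = new float[dimensions];
    var words = text.ToLowerInvariant().Split(
        new[] { ' ', '\t', '\r', '\n', '.', ',', '!', '?' },
        StringSplitOptions.RemoveEmptyEntries);

    foreach (var word in words)
    {
        // FNV-1a hash: stable across runs, unlike string.GetHashCode().
        uint hash = 2166136261;
        foreach (char c in word)
        {
            unchecked
            {
                hash ^= c;
                hash *= 16777619;
            }
        }
        vector[hash % (uint)dimensions] += 1f;
    }

    // Scale to length 1 so cosine distance behaves sensibly.
    double norm = 0;
    foreach (float x in vector) norm += x * x;
    norm = Math.Sqrt(norm);
    if (norm > 0)
    {
        for (int i = 0; i < vector.Length; i++)
            vector[i] = (float)(vector[i] / norm);
    }
    return Task.FromResult(vector);
}
```

Because the hash is deterministic, the same text always produces the same vector, which makes small experiments repeatable.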

Step 5: Search for similar documents

Now the fun part. A user types a question. We turn that question into a vector too, then ask Postgres, "Give me the 5 rows whose embedding is closest to this." pgvector adds special operators for distance:

| Operator | Meaning | Common use |
| --- | --- | --- |
| <-> | L2 (straight-line) distance | General numeric vectors |
| <=> | Cosine distance | Text embeddings (most common) |
| <#> | Negative inner product | When the model says so |

For most text search you will use <=>, cosine distance. Smaller distance means more similar.
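If you are curious what these operators work out, here is the same arithmetic in plain C#. This is a sketch for understanding only; in a real app pgvector does this inside Postgres. The function names are invented for this example.

```csharp
using System;

// Plain C# versions of what the three operators compute.
// Both arrays are assumed to have the same length.

// <->  L2 (straight-line) distance
double L2Distance(float[] a, float[] b)
{
    double sum = 0;
    for (int i = 0; i < a.Length; i++)
    {
        double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return Math.Sqrt(sum);
}

// <#>  negative inner product
double NegativeInnerProduct(float[] a, float[] b)
{
    double dot = 0;
    for (int i = 0; i < a.Length; i++)
        dot += (double)a[i] * b[i];
    return -dot;
}

// <=>  cosine distance = 1 - cosine similarity
double CosineDistance(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += (double)a[i] * b[i];
        normA += (double)a[i] * a[i];
        normB += (double)b[i] * b[i];
    }
    return 1 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

float[] x = { 1f, 0f };
float[] same = { 2f, 0f };      // same direction as x, but longer
float[] sideways = { 0f, 1f };  // at a right angle to x

Console.WriteLine(CosineDistance(x, same));        // 0: same direction
Console.WriteLine(CosineDistance(x, sideways));    // 1: unrelated direction
Console.WriteLine(L2Distance(x, same));            // 1
Console.WriteLine(NegativeInnerProduct(x, same));  // -2
```

Notice that cosine distance ignores how long a vector is and looks only at its direction. That is one reason it suits text embeddings, where direction carries the meaning.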

async Task<List<string>> SearchAsync(NpgsqlDataSource ds, string query)
{
    float[] numbers = await GetEmbeddingAsync(query);
    var queryVector = new Vector(numbers);
 
    await using var cmd = ds.CreateCommand(@"
        SELECT content
        FROM documents
        ORDER BY embedding <=> $1
        LIMIT 5;");
    cmd.Parameters.AddWithValue(queryVector);
 
    var results = new List<string>();
    await using var reader = await cmd.ExecuteReaderAsync();
    while (await reader.ReadAsync())
    {
        results.Add(reader.GetString(0));
    }
    return results;
}

The line ORDER BY embedding <=> $1 is the heart of vector search. It sorts every row by how close it is to the query, and LIMIT 5 keeps only the nearest five.

Figure: What happens during one search request.
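To see what ORDER BY ... LIMIT means without a database, here is a small in-memory imitation using LINQ. The three rows and their two-number vectors are invented for this sketch, and the limit is 2 instead of 5 to keep it short.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine distance, the same quantity the <=> operator returns.
double CosineDistance(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += (double)a[i] * b[i];
        normA += (double)a[i] * a[i];
        normB += (double)b[i] * b[i];
    }
    return 1 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// In-memory stand-in for the documents table; the vectors are invented.
var rows = new List<(string Content, float[] Embedding)>
{
    ("about dogs",    new[] { 0.9f, 0.1f }),
    ("about puppies", new[] { 0.8f, 0.2f }),
    ("about rockets", new[] { 0.1f, 0.9f }),
};
float[] query = { 1.0f, 0.0f };

// Same idea as: ORDER BY embedding <=> $1 LIMIT 2
List<string> nearest = rows
    .OrderBy(r => CosineDistance(r.Embedding, query))
    .Take(2)
    .Select(r => r.Content)
    .ToList();

Console.WriteLine(string.Join(", ", nearest)); // about dogs, about puppies
```

This checks every row, which is exactly what Postgres does when there is no vector index on the column.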

Step 6: Make it fast with an HNSW index

If you only have a handful of rows, Postgres can check every row and stay fast. This is called an exact search. But when you grow to thousands or millions of rows, checking every row becomes slow. This is where an index helps.

pgvector offers an HNSW index. The name stands for Hierarchical Navigable Small World. You do not need to memorise that. Just picture a smart shortcut map: instead of walking past every house to find the nearest shop, you hop through a few well-placed signposts and arrive quickly. HNSW trades a tiny bit of accuracy for a big jump in speed. This is called approximate nearest neighbour search.

await using var index = dataSource.CreateCommand(@"
    CREATE INDEX IF NOT EXISTS documents_embedding_hnsw
    ON documents
    USING hnsw (embedding vector_cosine_ops);");
await index.ExecuteNonQueryAsync();

The important part is vector_cosine_ops. It must match the distance you search with. Since we search with <=> (cosine), we build the index with vector_cosine_ops. If you used L2 (<->), you would build it with vector_l2_ops instead. Mismatching them means the index will not be used.
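One way to keep the operator and the index in step is to derive the operator class from the operator in code. The helper names below are invented for this sketch. The third mapping, vector_ip_ops for <#>, is pgvector's operator class for inner product.

```csharp
using System;

// Maps a pgvector distance operator to the operator class its index needs.
string OperatorClassFor(string distanceOperator) => distanceOperator switch
{
    "<->" => "vector_l2_ops",      // L2 distance
    "<=>" => "vector_cosine_ops",  // cosine distance
    "<#>" => "vector_ip_ops",      // negative inner product
    _ => throw new ArgumentException($"Unknown operator: {distanceOperator}")
};

// Builds the CREATE INDEX statement for a table, column, and operator.
// Names are inserted as-is, so pass only trusted identifiers, never user input.
string BuildHnswIndexSql(string table, string column, string distanceOperator) =>
    $"CREATE INDEX IF NOT EXISTS {table}_{column}_hnsw " +
    $"ON {table} USING hnsw ({column} {OperatorClassFor(distanceOperator)});";

Console.WriteLine(BuildHnswIndexSql("documents", "embedding", "<=>"));
```

If the search operator lives in one shared constant and the index is built from the same constant, the two cannot drift apart by accident.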

Figure: Exact search checks every row; HNSW jumps through a graph.

Here is a simple way to choose.

| Situation | Use | Why |
| --- | --- | --- |
| Under a few thousand rows | No index (exact) | Already fast, fully accurate |
| Thousands to millions | HNSW index | Stays fast, slight accuracy trade-off |
| Need 100% exact every time | No index | HNSW is approximate |

Using EF Core instead of raw Npgsql

If your project already uses Entity Framework Core, you do not have to write raw SQL. The Pgvector.EntityFrameworkCore package lets you map a Vector property and even create the HNSW index in your model. Switch it on when you configure the context, with UseNpgsql(connectionString, o => o.UseVector()). A document entity might look like this.

using Pgvector;
using Pgvector.EntityFrameworkCore;
 
public class Document
{
    public long Id { get; set; }
    public string Content { get; set; } = "";
    public Vector? Embedding { get; set; }
}
 
// In OnModelCreating:
modelBuilder.HasPostgresExtension("vector");
 
modelBuilder.Entity<Document>()
    .Property(d => d.Embedding)
    .HasColumnType("vector(1536)");
 
modelBuilder.Entity<Document>()
    .HasIndex(d => d.Embedding)
    .HasMethod("hnsw")
    .HasOperators("vector_cosine_ops");

Then your LINQ query can order by distance using a helper:

var queryVector = new Vector(await GetEmbeddingAsync(userText));
 
var matches = await db.Documents
    .OrderBy(d => d.Embedding!.CosineDistance(queryVector))
    .Take(5)
    .Select(d => d.Content)
    .ToListAsync();

EF Core turns CosineDistance into the <=> operator behind the scenes, so you get the same fast search with comfortable, typed C# code.

A few friendly tips

A few small habits will save you headaches later:

  • Keep the vector size the same everywhere: the column, the model output, and the index must agree.
  • Pick your distance once and stay with it. If the model docs say cosine, use cosine for both the query and the index.
  • Add the HNSW index only when your data grows. For a small demo, exact search is simpler and perfectly fine.
  • Store the original text next to the vector. You almost always want to show the real content back to the user, not the numbers.
  • Chunk long documents into paragraphs and store one row per chunk. Embedding models accept only a limited amount of text at once, and many small rows are easier to manage and match than a few huge ones.
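As a starting point for the chunking tip, here is a minimal sketch that splits text on blank lines. SplitIntoParagraphs is a name invented for this example, and real documents may need smarter rules, such as a maximum chunk length.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Splits text into paragraphs on blank lines and drops empty pieces.
List<string> SplitIntoParagraphs(string text)
{
    return text
        .Replace("\r\n", "\n")  // treat Windows line endings like Unix ones
        .Split("\n\n", StringSplitOptions.RemoveEmptyEntries)
        .Select(p => p.Trim())
        .Where(p => p.Length > 0)
        .ToList();
}

string article = "First paragraph about dogs.\n\nSecond paragraph about rockets.";
foreach (string chunk in SplitIntoParagraphs(article))
{
    Console.WriteLine(chunk);
}
```

Each chunk would then get its own embedding and its own row, so a search can point the user to the exact paragraph that matched.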

Putting it all together

Let us recap the flow as one clear picture, so the whole idea stays in your head.

The complete pgvector workflow

1. Enable extension: once per database
2. Create table: with a vector column
3. Store text + vector: on every insert
4. Add HNSW index: when data grows
5. Search by distance: on every query

Setup happens once; search happens on every user query.

You now have the full shape of a working vector search feature. The data lives in the same Postgres you already trust. Your .NET code stays clean thanks to the Pgvector package. And when traffic grows, one HNSW index keeps things fast. That is a lot of value for very little extra plumbing.

The best way to learn is to try it. Spin up a Postgres container, enable the extension, store ten short sentences with their embeddings, and search. Watching it return the right neighbours for the first time is a small but happy moment, and it makes all these ideas feel real.

Quick recap

  • A vector (embedding) is a list of numbers that captures the meaning of text. Similar meanings sit close together.
  • pgvector is a free PostgreSQL extension. Enable it with CREATE EXTENSION IF NOT EXISTS vector;.
  • In .NET, install the Pgvector package and call UseVector() on your Npgsql data source.
  • Store vectors in a vector(N) column where N matches your model's output size.
  • Search with distance operators: <=> for cosine (most text), <-> for L2.
  • ORDER BY embedding <=> $1 LIMIT 5 finds the five closest rows.
  • Add an HNSW index with matching operators (vector_cosine_ops) to keep search fast on big data.
  • EF Core users can map a Vector property and use CosineDistance in LINQ instead of raw SQL.
