SEMastery

What Is Vector Search? A Concise Guide for .NET Developers

A simple, friendly guide to vector search for .NET developers: embeddings, similarity, nearest neighbors, and how to build it with Microsoft.Extensions.VectorData.

11 min read · Updated April 29, 2026

Imagine you walk into a big library to find a book about "feeling sad after moving to a new city." You do not know the title. You do not know the author. You only know the feeling of what you want.

A normal search would ask, "Which words did you type?" If the book never uses the word "sad", you might miss it. But a kind librarian who understands you would say, "Ah, you want stories about loneliness and new beginnings," and walk you straight to the right shelf.

Vector search is that kind librarian, built into software. It does not match words. It matches meaning. This guide explains how it works, in plain language, and shows how to build it in .NET.

A simple everyday picture

Think about the owner of a kirana shop (a small neighborhood grocery store) who arranges items not in random spots, but by how similar they are. Sugar sits near salt and flour, because they are all kitchen staples. Soap sits near shampoo and toothpaste. When a customer asks for "something to wash my hair," the owner does not need the exact word. He walks to the bathroom-items corner, because similar things live close together.

Vector search puts every piece of data into a giant invisible shop like this. Similar things get placed close together. When you ask a question, the computer finds the nearest items on the shelf. That is the whole idea. Everything else is just the math that makes "close together" mean "similar in meaning."

Keyword search matches words; vector search matches meaning.

Step one: turning data into numbers

Computers are good at numbers, not feelings. So the first job is to turn each piece of text (or image, or sound) into a list of numbers. That list is called an embedding.

An embedding is just an array of decimal numbers, like [0.12, -0.84, 0.33, ...]. A real one might have 384, 768, or even 1536 numbers in it. Each position in the list is called a dimension. You do not read these numbers one by one. What matters is the direction the whole list points in.

The trick is this: a special AI model (an embedding model) is trained so that things with similar meaning get similar lists. The word "king" and the word "queen" end up close. The words "king" and "banana" end up far apart. You never write these rules by hand. The model learned them from huge amounts of text.

Here is the key mental model:

| Idea | Plain meaning |
| --- | --- |
| Embedding | A list of numbers that describes meaning |
| Dimension | One number inside that list |
| Embedding model | The AI that creates the list from your data |
| Vector space | The invisible shop where similar things sit close |

In .NET, you create embeddings using Microsoft.Extensions.AI. You give it text, and it gives you back a vector.

using Microsoft.Extensions.AI;
 
// embeddingGenerator comes from your AI provider (OpenAI, Azure, Ollama, etc.)
IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator = GetGenerator();
 
string text = "A friendly guide to vector search";
Embedding<float> embedding = await embeddingGenerator.GenerateAsync(text);
 
// The vector is just a list of floats
ReadOnlyMemory<float> vector = embedding.Vector;
Console.WriteLine($"This text became {vector.Length} numbers.");

Step two: measuring "how close"

Once everything is a list of numbers, "similar" becomes "close." But how do we measure closeness between two lists of numbers? We use a distance or similarity measure.

The most popular one is cosine similarity. Do not let the name scare you. It simply measures the angle between two arrows.

  • If two arrows point in almost the same direction, the angle is tiny, and they are very similar.
  • If they point in different directions, the angle is wide, and they are different.

Cosine similarity ignores how long the arrows are. It only cares about direction. This is helpful because a long document and a short sentence can still mean the same thing.

Two vectors pointing the same way are similar; a wide angle means different.

Here are the common measures you will meet:

| Measure | What it looks at | Good when |
| --- | --- | --- |
| Cosine similarity | Angle between vectors | Text meaning (most common) |
| Dot product | Angle and length together | When length carries meaning |
| Euclidean distance | Straight-line gap | Simple geometric closeness |
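To make the table concrete, here is a small sketch that runs all three measures on made-up two-dimensional vectors. The vectors and the helper names (Dot, Cosine, Euclidean) are illustrative assumptions, not output from a real embedding model or part of any library.

```csharp
using System;

// Toy 2-D vectors (made-up values, not from a real embedding model).
float[] a = { 1f, 0f };   // points right
float[] b = { 2f, 0f };   // same direction as a, twice as long
float[] c = { 0f, 1f };   // at a right angle to a

// Dot product: multiply matching positions and add them up.
float Dot(float[] x, float[] y)
{
    float sum = 0f;
    for (int i = 0; i < x.Length; i++) sum += x[i] * y[i];
    return sum;
}

// Cosine similarity: dot product divided by both lengths, so only direction counts.
float Cosine(float[] x, float[] y) =>
    Dot(x, y) / (MathF.Sqrt(Dot(x, x)) * MathF.Sqrt(Dot(y, y)));

// Euclidean distance: the straight-line gap between the two points.
float Euclidean(float[] x, float[] y)
{
    float sum = 0f;
    for (int i = 0; i < x.Length; i++)
    {
        float d = x[i] - y[i];
        sum += d * d;
    }
    return MathF.Sqrt(sum);
}

Console.WriteLine($"a vs b: cosine={Cosine(a, b)}, dot={Dot(a, b)}, euclidean={Euclidean(a, b)}");
Console.WriteLine($"a vs c: cosine={Cosine(a, c)}, dot={Dot(a, c)}, euclidean={Euclidean(a, c)}");
```

Vectors a and b point the same way, so cosine similarity treats them as a perfect match even though Euclidean distance still sees a gap between them. That difference is why the choice of measure matters.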

You can even write cosine similarity yourself to see it is not magic. It is school-level math.

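static float CosineSimilarity(ReadOnlyMemory<float> a, ReadOnlyMemory<float> b)
{
    // Vectors from different models (or sizes) cannot be compared
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same number of dimensions.");

    var x = a.Span;
    var y = b.Span;
 
    float dot = 0f, magA = 0f, magB = 0f;
    for (int i = 0; i < x.Length; i++)
    {
        dot += x[i] * y[i];
        magA += x[i] * x[i];
        magB += y[i] * y[i];
    }
 
    // An all-zero vector has no direction, so treat it as unrelated
    if (magA == 0f || magB == 0f) return 0f;
 
    // 1.0 = same direction, 0.0 = unrelated (right angle), -1.0 = opposite direction
    return dot / (MathF.Sqrt(magA) * MathF.Sqrt(magB));
}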
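One practical consequence of "only direction matters": if you scale every vector to length 1 first, a plain dot product gives the same answer as cosine similarity. The sketch below shows this with made-up numbers. The helper names Normalize and Dot are assumptions for illustration, and whether your vectors are already unit length depends on the embedding model you use.

```csharp
using System;

// Scale a vector to length 1 so only its direction remains.
float[] Normalize(float[] v)
{
    float sumSquares = 0f;
    foreach (float value in v) sumSquares += value * value;
    float length = MathF.Sqrt(sumSquares);
    if (length == 0f) throw new ArgumentException("Cannot normalize an all-zero vector.", nameof(v));

    var unit = new float[v.Length];
    for (int i = 0; i < v.Length; i++) unit[i] = v[i] / length;
    return unit;
}

float Dot(float[] x, float[] y)
{
    float sum = 0f;
    for (int i = 0; i < x.Length; i++) sum += x[i] * y[i];
    return sum;
}

// Made-up vectors: same direction, very different lengths.
float[] shortVec = { 3f, 4f };
float[] longVec = { 30f, 40f };

// After normalizing, the dot product is the cosine similarity.
float similarity = Dot(Normalize(shortVec), Normalize(longVec));
Console.WriteLine($"Similarity after normalizing: {similarity:F3}");
```

Both vectors collapse to the same unit arrow, so the similarity comes out at about 1 even though one vector is ten times longer than the other.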

Step three: finding the nearest neighbors

Now imagine you have one million records, each turned into a vector and stored. A user asks a question. You turn the question into a vector too. Then you ask: "Which stored vectors are closest to this question vector?"

The closest ones are called the nearest neighbors. Asking for the top few (say, the closest 5) is called k-nearest-neighbor search, where k is how many you want back.
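Here is a minimal sketch of k-nearest-neighbor search done the simple way: score every stored vector against the query, sort, and keep the top k. The two-dimensional vectors and the NearestNeighbors helper are made up for illustration. Real embeddings have hundreds of dimensions and normally live in a vector store, not a dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

float Cosine(float[] x, float[] y)
{
    float dot = 0f, magX = 0f, magY = 0f;
    for (int i = 0; i < x.Length; i++)
    {
        dot += x[i] * y[i];
        magX += x[i] * x[i];
        magY += y[i] * y[i];
    }
    return dot / (MathF.Sqrt(magX) * MathF.Sqrt(magY));
}

// Score every item against the target, sort best-first, keep the top k.
List<(string Id, float Score)> NearestNeighbors(
    Dictionary<string, float[]> items, float[] target, int k) =>
    items
        .Select(pair => (Id: pair.Key, Score: Cosine(pair.Value, target)))
        .OrderByDescending(hit => hit.Score)
        .Take(k)
        .ToList();

// Made-up 2-D vectors standing in for real embeddings.
var stored = new Dictionary<string, float[]>
{
    ["space adventure"] = new[] { 0.9f, 0.1f },
    ["galaxy journey"] = new[] { 0.8f, 0.3f },
    ["cooking show"] = new[] { 0.1f, 0.9f },
};

float[] query = { 1.0f, 0.2f };
var top2 = NearestNeighbors(stored, query, k: 2);

foreach (var (id, score) in top2)
{
    Console.WriteLine($"{id}: {score:F3}");
}
```

With these toy numbers the two space-themed items come back and the cooking show is left out, because its vector points in a very different direction from the query.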

How a single search request flows

  1. Query text: the user types a question.
  2. Query vector: the embedding model converts the question into a vector.
  3. Compare to stored vectors: measure closeness.
  4. Top matches: return the nearest neighbors.

From a typed question to the most similar results.

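Checking every single vector one by one is called an exact search. It is accurate but slow when you have millions of records. So vector stores build an index ahead of time, for example a graph that links neighboring vectors (HNSW) or clusters of similar vectors (IVF), so that a query only has to look at a small part of the data. This is called Approximate Nearest Neighbor (ANN) search.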

ANN trades a tiny bit of accuracy for a huge boost in speed. It might miss the absolute best match once in a while, but it returns great results in milliseconds instead of seconds. For almost every real app, this trade is worth it.

| Search type | Speed | Accuracy | Best for |
| --- | --- | --- | --- |
| Exact (brute force) | Slow on big data | Perfect | Small datasets, tests |
| Approximate (ANN) | Very fast | Almost perfect | Real apps, large data |
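To show where the speed-up comes from, here is a toy sketch of one ANN idea: group vectors into buckets ahead of time, then scan only the bucket closest to the query. It is a deliberately simplified cluster-style index with hand-picked centroids. It is not how any particular vector store implements ANN, and all names and values are made up.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

float Cosine(float[] x, float[] y)
{
    float dot = 0f, magX = 0f, magY = 0f;
    for (int i = 0; i < x.Length; i++)
    {
        dot += x[i] * y[i];
        magX += x[i] * x[i];
        magY += y[i] * y[i];
    }
    return dot / (MathF.Sqrt(magX) * MathF.Sqrt(magY));
}

// Index of the centroid that is most similar to v.
int ClosestCentroid(float[][] centers, float[] v)
{
    int bestIndex = 0;
    for (int i = 1; i < centers.Length; i++)
    {
        if (Cosine(centers[i], v) > Cosine(centers[bestIndex], v)) bestIndex = i;
    }
    return bestIndex;
}

// Hand-picked centroids (assumption). A real index would learn them from the data.
float[][] centroids = { new[] { 1f, 0f }, new[] { 0f, 1f } };

var vectors = new Dictionary<string, float[]>
{
    ["space adventure"] = new[] { 0.9f, 0.1f },
    ["galaxy journey"] = new[] { 0.8f, 0.3f },
    ["cooking show"] = new[] { 0.1f, 0.9f },
    ["baking contest"] = new[] { 0.2f, 0.8f },
};

// Index time: file each vector under its closest centroid.
var buckets = vectors
    .GroupBy(pair => ClosestCentroid(centroids, pair.Value))
    .ToDictionary(bucket => bucket.Key, bucket => bucket.ToList());

// Query time: scan only the bucket whose centroid is closest to the query.
float[] query = { 0.9f, 0.4f };
int probe = ClosestCentroid(centroids, query);
var nearest = buckets[probe]
    .OrderByDescending(pair => Cosine(pair.Value, query))
    .First();

Console.WriteLine($"Scanned {buckets[probe].Count} of {vectors.Count} vectors, best match: {nearest.Key}");
```

The query only touches half of the data here. The "approximate" part is that a true best match sitting just across a bucket border would be missed, which is why real indexes usually check more than one bucket or use smarter structures.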

Where the vectors live: vector stores

A place that holds your vectors and runs the nearest-neighbor search for you is called a vector store (or vector database). It does three main jobs:

  1. Store each record together with its vector.
  2. Build a fast index for ANN search.
  3. Answer "find the closest matches" quickly.

Popular options that work with .NET include Qdrant, Azure AI Search, Redis, MongoDB Atlas, and PostgreSQL with the pgvector extension. For learning and small apps, you can even keep everything in memory.

The good news is that Microsoft built a shared abstraction called Microsoft.Extensions.VectorData. It gives you one set of interfaces, so largely the same C# code works across many different vector databases. You learn it once and can switch providers with few changes to your logic, although each provider still has its own setup and its own limits on things like key types and index options.

One .NET abstraction sits over many different vector databases.

Putting it together in .NET

Let's see the shape of a small app. First, you describe your record. You mark which property holds the key, which holds the text, and which holds the vector. The VectorStoreVector attribute tells the store how many dimensions to expect.

using Microsoft.Extensions.VectorData;
 
public sealed class Movie
{
    [VectorStoreKey]
    public int Key { get; set; }
 
    [VectorStoreData]
    public string Title { get; set; } = "";
 
    [VectorStoreData]
    public string Description { get; set; } = "";
 
    // 384 must match the output size of the embedding model you choose
    [VectorStoreVector(Dimensions: 384)]
    public ReadOnlyMemory<float> Vector { get; set; }
}

Next, you create the collection, turn each description into a vector, and store the records. Then you run a search by turning the user's question into a vector and asking for the closest matches.

// Get a collection from your chosen vector store provider
var collection = vectorStore.GetCollection<int, Movie>("movies");
await collection.EnsureCollectionExistsAsync();
 
// Add records (each Description is embedded first)
foreach (var movie in movies)
{
    movie.Vector = (await embeddingGenerator
        .GenerateAsync(movie.Description)).Vector;
    await collection.UpsertAsync(movie);
}
 
// Search by meaning, not keywords
var queryVector = (await embeddingGenerator
    .GenerateAsync("a film about brave space travel")).Vector;
 
await foreach (var result in collection.SearchAsync(queryVector, top: 3))
{
    Console.WriteLine($"{result.Record.Title} (score: {result.Score})");
}

Notice that nowhere did we list keywords. We searched for "brave space travel" and the store can return a movie whose description says "a daring journey across the galaxy," because the meanings are close.

The two phases: indexing and querying

It helps to split a vector search system into two phases. The first phase happens once (or whenever data changes). The second phase happens every time a user searches.

Indexing phase vs query phase

Indexing phase:

  1. Raw data: your documents or items.
  2. Embed each item: create a vector per item.
  3. Store vectors: save them into the vector store.

Query phase:

  4. User question: someone searches.
  5. Embed question: turn it into a vector.
  6. Return matches: the nearest neighbors come back.

Build the searchable store first, then answer questions against it.

Keeping these phases clear in your head saves a lot of confusion. Indexing is the slow, batch part. Querying is the fast, live part. Most performance work focuses on making the query phase quick, which is exactly why ANN indexes exist.
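The split is easy to see in a tiny in-memory sketch. Note the big assumption: FakeEmbed is a stand-in that only counts words from a four-word vocabulary so the example runs without an AI provider. It does not capture meaning the way a real embedding model does. The point is only the shape: embed and store once, then reuse the stored vectors for every search.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// STAND-IN embedder (assumption): counts words from a tiny fixed vocabulary.
// A real embedding model captures meaning; this one only keeps the sketch runnable.
string[] vocabulary = { "space", "galaxy", "cooking", "recipe" };

float[] FakeEmbed(string text)
{
    var words = text.ToLowerInvariant().Split(' ', StringSplitOptions.RemoveEmptyEntries);
    return vocabulary.Select(term => (float)words.Count(w => w == term)).ToArray();
}

float Dot(float[] x, float[] y) => x.Zip(y, (p, q) => p * q).Sum();

// Indexing phase: runs once, or again whenever the data changes.
string[] documents =
{
    "a galaxy far away in space",
    "a quick cooking recipe",
    "gardening tips for beginners",
};
var index = documents.Select(doc => (Text: doc, Vector: FakeEmbed(doc))).ToList();

// Query phase: runs on every search and reuses the stored vectors.
string Search(string question)
{
    float[] queryVector = FakeEmbed(question);
    return index.OrderByDescending(entry => Dot(entry.Vector, queryVector)).First().Text;
}

Console.WriteLine(Search("space travel"));
Console.WriteLine(Search("easy recipe"));
```

Notice that the documents are embedded exactly once, while Search only embeds the short question. In a real app the indexing loop is the expensive batch job and Search is the part users wait on.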

Why developers care: RAG and beyond

The biggest reason vector search became popular is Retrieval-Augmented Generation (RAG). Large language models are smart but do not know your private documents. With RAG, you first use vector search to find the few documents most related to a user's question, then hand those documents to the language model so it can answer with real, grounded facts.

// A tiny RAG sketch
var question = "What is our refund policy?";
var qVector = (await embeddingGenerator.GenerateAsync(question)).Vector;
 
// 1. Retrieve: find the closest company documents
var context = new List<string>();
await foreach (var hit in collection.SearchAsync(qVector, top: 4))
{
    context.Add(hit.Record.Description);
}
 
// 2. Augment + Generate: send context to the chat model
string prompt = $"""
    Answer using only this context:
    {string.Join("\n---\n", context)}
 
    Question: {question}
    """;
 
var answer = await chatClient.GetResponseAsync(prompt);
Console.WriteLine(answer.Text);
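Retrieved documents can be long, and a chat model only accepts so much text. One simple way to handle that is to add chunks in ranked order until a size budget is used up. The sketch below uses a character budget to stay simple. BuildPrompt and maxContextChars are hypothetical names, the sample text is made up, and a real app would more likely count tokens.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper: add chunks in ranked order until the budget is used up.
string BuildPrompt(string userQuestion, IEnumerable<string> rankedChunks, int maxContextChars)
{
    var contextBuilder = new StringBuilder();
    foreach (string chunk in rankedChunks)
    {
        // Stop at the first chunk that would not fit (separators are not counted).
        if (contextBuilder.Length + chunk.Length > maxContextChars) break;
        if (contextBuilder.Length > 0) contextBuilder.Append("\n---\n");
        contextBuilder.Append(chunk);
    }

    return $"Answer using only this context:\n{contextBuilder}\n\nQuestion: {userQuestion}";
}

// Made-up sample chunks, already sorted best match first.
var chunks = new[]
{
    "Refunds are accepted within 30 days.",
    "Refunds go back to the original payment method.",
    "Our office is closed on public holidays.",
};

string prompt = BuildPrompt("What is our refund policy?", chunks, maxContextChars: 100);
Console.WriteLine(prompt);
```

With a budget of 100 characters, the first two chunks fit and the third is dropped. Because the chunks arrive sorted by similarity, the text that gets cut is the least relevant.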

Beyond RAG, vector search powers product recommendations ("people who liked this also liked..."), image search ("find pictures like this one"), duplicate detection, and smart support tools that find the right help article even when the user uses different words.

A few honest tips

  • Pick one embedding model and stick to it. Vectors from different models are not comparable. If you change models, you must re-embed all your data.
  • Match the dimensions. Your stored vector size must equal the model's output size. A mismatch usually surfaces as an error when you upsert or search, so check it early.
  • Clean your text first. Removing junk like HTML tags before embedding gives better matches.
  • Start small and in-memory. Learn the flow first, then move to a real vector database when you need scale.
  • Remember licensing. Some popular .NET libraries (for example, MediatR and MassTransit) are now commercially licensed. The core vector packages from Microsoft, like Microsoft.Extensions.VectorData, are free and open source, but always check the license of any third-party piece you add.
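Two of these tips are easy to turn into small helpers. The sketch below strips HTML before embedding and checks a vector's size before storing it. CleanText and EnsureDimensions are hypothetical helper names, and the regex approach is a rough shortcut. A proper HTML parser is safer for messy pages.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Rough HTML cleanup: drop tags, decode entities, collapse whitespace.
string CleanText(string html)
{
    string withoutTags = Regex.Replace(html, "<[^>]+>", " ");
    string decoded = WebUtility.HtmlDecode(withoutTags);
    return Regex.Replace(decoded, @"\s+", " ").Trim();
}

// Fail early if a vector does not match the size the collection expects.
void EnsureDimensions(ReadOnlyMemory<float> vector, int expected)
{
    if (vector.Length != expected)
    {
        throw new ArgumentException(
            $"Expected {expected} dimensions but got {vector.Length}.", nameof(vector));
    }
}

Console.WriteLine(CleanText("<p>Brave &amp; daring <b>space</b> travel</p>"));
EnsureDimensions(new float[384], expected: 384); // sizes match, so nothing is thrown
```

Calling EnsureDimensions right after you generate an embedding gives you a clear message at the point of the mistake, instead of a provider-specific error later on.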

Quick recap

  • Vector search finds results by meaning, not by exact keywords.
  • An embedding turns text (or images) into a list of numbers; similar meanings get similar lists.
  • We measure similarity with cosine similarity (the angle between vectors) and other distance measures.
  • Finding the closest stored vectors is nearest-neighbor search; ANN makes it fast on large data.
  • A vector store holds the vectors and runs the search; Microsoft.Extensions.VectorData gives .NET one common API across many databases.
  • The flow has two phases: indexing (embed and store) and querying (embed the question and return matches).
  • The most common use is RAG, which grounds language models in your own data.
