Building Semantic Search With Amazon S3 Vectors and Semantic Kernel
A beginner-friendly guide to building semantic search in .NET using Amazon S3 Vectors for cheap storage and Semantic Kernel for embeddings.
Think about a friendly librarian in your school. You walk up and say, "I want a book about a boy who flies on a broom and goes to a magic school." You never said the title. You never said "Harry Potter". But the librarian smiles and brings you the right book, because she understood the meaning of what you asked.
A normal computer search is not like that librarian. It looks for the exact words you typed. If you type "magic school broom boy" and the book description says "young wizard at Hogwarts", a word-matching search may find nothing.
Semantic search is how we teach our app to act like that kind librarian. It searches by meaning, not by exact words. In this article we will build semantic search in .NET using two tools that work nicely together:
- Amazon S3 Vectors to store the data cheaply.
- Semantic Kernel (and the .NET vector data libraries) to turn text into meaning and to run the search.
We will go slow, use simple words, and build it step by step.
What is an embedding?
Before we search by meaning, we need a way to measure meaning. Computers are good at numbers, not feelings. So we turn each piece of text into a long list of numbers. That list is called an embedding (or a vector).
The clever part is this: text with similar meaning gives numbers that are close together. Text with different meaning gives numbers that are far apart.
Imagine a giant map. "Cat" and "kitten" sit near each other. "Cat" and "rocket" sit far apart. An embedding is just the address of a word or sentence on that map.
A model that creates these numbers is called an embedding model. Examples are OpenAI's text-embedding-3-small or models you run with Amazon Bedrock or Ollama. Each model gives a fixed number of values, like 1536 numbers. That count is called the number of dimensions.
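To make "close together" concrete, here is a minimal sketch that measures closeness with cosine similarity. The three-number vectors are invented for illustration; a real embedding model produces hundreds or thousands of numbers, and the helper name `CosineSimilarity` is ours, not part of any library.

```csharp
using System;

// Cosine similarity: close to 1 means "pointing the same way", close to 0 means unrelated.
static double CosineSimilarity(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same number of dimensions.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// Made-up "embeddings" with only 3 dimensions, just to show the idea.
float[] cat = { 0.9f, 0.8f, 0.1f };
float[] kitten = { 0.85f, 0.75f, 0.2f };
float[] rocket = { 0.1f, 0.2f, 0.95f };

Console.WriteLine($"cat vs kitten: {CosineSimilarity(cat, kitten):F3}");
Console.WriteLine($"cat vs rocket: {CosineSimilarity(cat, rocket):F3}");
```

The first score comes out much higher than the second, which is the "near each other on the map" idea expressed as a number.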
What is Amazon S3 Vectors?
Once we have these number lists, we need a place to keep them. We could use a special vector database. But Amazon released something simpler and cheaper: Amazon S3 Vectors.
Amazon S3 is the famous storage service that holds files (called objects) in the cloud. S3 Vectors adds native support to store and query vectors right inside S3. It is built to be very cheap at huge scale. AWS says it can cut storage and query costs by up to 90 percent compared with running your own vector engine, and it can hold billions of vectors.
It works best when you have a lot of data but do not search every second. Think of a document archive, a media library, or a product catalog. You pay little to keep the vectors, and you pay a small amount each time you search.
Here are the main building blocks in S3 Vectors.
| Term | What it means | Everyday picture |
|---|---|---|
| Vector bucket | A special S3 bucket that holds vector indexes | A big cupboard |
| Vector index | A named place inside the bucket where vectors live | One drawer in the cupboard |
| Vector | One embedding plus a key and some metadata | One labelled card in the drawer |
| Query | A search for the nearest vectors to your input | Asking "which cards are most like this one?" |
The S3 Vectors API gives you a small set of actions. The ones we care about most are listed below.
| API action | What it does |
|---|---|
| CreateVectorBucket | Makes a new vector bucket in your AWS region |
| CreateIndex | Makes a new index and sets its dimensions and distance type |
| PutVectors | Adds up to 500 vectors at a time, each with a key and metadata |
| QueryVectors | Finds the nearest vectors to a query vector (the search step) |
| GetVectors | Reads vectors back by their keys |
| DeleteVectors | Removes vectors you no longer want |
What is Semantic Kernel?
Semantic Kernel is a Microsoft open-source SDK for .NET that helps you add AI features to your apps. For our task, two parts of the .NET AI stack matter:
- An embedding generator, which calls an embedding model and gives you the number list.
- The vector data abstractions in the Microsoft.Extensions.VectorData package, which give a common, provider-agnostic way to store and search vectors.
The nice thing about these abstractions is that you write your search code once. You can start with a free in-memory store on your laptop, then switch to a real service like Azure AI Search, Qdrant, or your own S3 Vectors store, with very little code change.
How the whole thing fits together
Let us look at the full picture before we write code. There are two journeys. The first is ingestion: we read our documents, turn them into vectors, and save them. The second is search: a user asks a question, we turn the question into a vector, and we find the closest stored vectors.
Ingestion pipeline
- Read: Load text from files or a database
- Chunk: Split long text into small pieces
- Embed: Turn each chunk into a vector
- Store: PutVectors into the S3 index
Search pipeline
- Ask: User types a natural question
- Embed: Turn the question into a query vector
- Query: QueryVectors finds nearest stored vectors
- Rank: Order results by closeness score
- Show: Return the matching documents
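Before we involve any cloud service, the Store, Query, and Rank steps can be sketched in plain C#. This is only a toy under stated assumptions: the vectors are invented, the `index` list stands in for the S3 index, and the brute-force scan in `Query` is our stand-in for what QueryVectors does for you on the service side.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Cosine distance: 0 means "same direction", larger means less alike.
static double CosineDistance(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return 1.0 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}

// "Store": keep each chunk's key next to its (made-up) vector.
var index = new List<(string Key, float[] Vector)>
{
    ("wizard-school", new[] { 0.9f, 0.1f, 0.0f }),
    ("space-rockets", new[] { 0.0f, 0.2f, 0.9f }),
    ("dragon-tales",  new[] { 0.7f, 0.4f, 0.1f }),
};

// "Query" and "Rank": measure every stored vector, smallest distance first.
List<(string Key, double Distance)> Query(float[] queryVector, int topK) =>
    index.Select(item => (item.Key, Distance: CosineDistance(queryVector, item.Vector)))
         .OrderBy(hit => hit.Distance)
         .Take(topK)
         .ToList();

// Pretend this vector came from embedding the user's question.
float[] question = { 0.8f, 0.2f, 0.05f };
foreach (var hit in Query(question, topK: 2))
    Console.WriteLine($"{hit.Key}: {hit.Distance:F3}");
```

With these numbers the wizard chunk ranks first and the rocket chunk is left out of the top two. Scanning every vector is fine for a handful of items; the point of a vector index is to avoid that scan when you have millions.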
Step 1: Add the packages
Start a new console app and add the libraries. We need the AWS SDK for S3 Vectors, the .NET vector data abstractions, and an embedding generator. Here we use OpenAI for the embeddings, but you can swap it for Bedrock or Ollama later.
// In your terminal, from the project folder:
// dotnet add package AWSSDK.S3Vectors
// dotnet add package Microsoft.Extensions.VectorData.Abstractions
// dotnet add package Microsoft.Extensions.AI
// dotnet add package Microsoft.Extensions.AI.OpenAI --prerelease
using Amazon.Runtime.Documents; // Document type used for vector metadata in Step 5
using Amazon.S3Vectors;
using Amazon.S3Vectors.Model;
using Microsoft.Extensions.AI;
using OpenAI; // OpenAIClient, used in Step 4
A small note for honesty: package names in the AI space change often because the tools are young and move fast. Always check the current name on NuGet before you copy. The ideas in this article stay the same even if a name changes.
Step 2: Define what a "document" looks like
We want to store more than just numbers. For each chunk of text we keep an id, the original text (so we can show it back), and the embedding. We will keep the human-readable fields in S3 Vectors metadata, and the embedding as the vector itself.
// A simple record for one searchable chunk.
public sealed record SearchDocument
{
    public required string Id { get; init; }         // unique key
    public required string Title { get; init; }      // shown to the user
    public required string Text { get; init; }       // the chunk content
    public required float[] Embedding { get; init; } // the vector
}
Why keep the original text? Because the vector is just numbers. When we find a match, we want to show the human the real sentence, not a list of floats.
Step 3: Create the bucket and index
Before storing anything, we make a vector bucket and an index inside it. The index needs two important settings: the dimension count (must match your embedding model) and the distance type (how we measure "closeness", usually cosine).
var s3v = new AmazonS3VectorsClient(); // uses your AWS credentials
await s3v.CreateVectorBucketAsync(new CreateVectorBucketRequest
{
    VectorBucketName = "library-search"
});
await s3v.CreateIndexAsync(new CreateIndexRequest
{
    VectorBucketName = "library-search",
    IndexName = "book-chunks",
    Dimension = 1536, // must match the embedding model
    DistanceMetric = DistanceMetric.Cosine,
    DataType = DataType.Float32
});
The dimension must match your model. If your model gives 1536 numbers but your index expects 384, the store will reject the data. Think of it like a key and a lock: the shape has to fit.
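One cheap way to catch a mismatch early is to check the vector length yourself before sending anything. The sketch below is our own helper, not part of the AWS SDK; the constant name `EmbeddingDimensions` and the function `EnsureDimensions` are made up for illustration.

```csharp
using System;

// One shared constant, so the index setting and the check cannot drift apart.
const int EmbeddingDimensions = 1536;

static void EnsureDimensions(float[] vector, int expected)
{
    if (vector.Length != expected)
        throw new ArgumentException(
            $"Vector has {vector.Length} dimensions but the index expects {expected}.");
}

// A vector of the right size passes silently.
EnsureDimensions(new float[EmbeddingDimensions], EmbeddingDimensions);

// A vector from a smaller model is caught before any network call.
try
{
    EnsureDimensions(new float[384], EmbeddingDimensions);
}
catch (ArgumentException ex)
{
    Console.WriteLine(ex.Message);
    // prints "Vector has 384 dimensions but the index expects 1536."
}
```

Failing on your own machine with a clear message is usually nicer than decoding a rejected request from the service.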
Step 4: Turn text into embeddings
Now we create the embedding generator and turn a piece of text into a vector. With Microsoft.Extensions.AI, the interface is small and clean.
// Assumption: the API key is read from an OPENAI_API_KEY environment variable.
string apiKey = Environment.GetEnvironmentVariable("OPENAI_API_KEY")
    ?? throw new InvalidOperationException("Set OPENAI_API_KEY first.");
IEmbeddingGenerator<string, Embedding<float>> embedder =
    new OpenAIClient(apiKey)
        .GetEmbeddingClient("text-embedding-3-small")
        .AsIEmbeddingGenerator();
// Turn one sentence into its vector.
Embedding<float> result =
    await embedder.GenerateAsync("A young wizard goes to a magic school.");
float[] vector = result.Vector.ToArray();
// vector.Length is 1536, matching our index dimension.
When you have many chunks, generate their embeddings in batches. That is faster and cheaper than one call per chunk, because each network round trip costs time.
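Here is a minimal sketch of the batching idea. To keep it runnable without an API key, `FakeEmbedBatchAsync` is a hypothetical stand-in for a real embedding call, and it simply counts how many "round trips" were made. In a real app you would pass in a delegate that calls your embedding generator with the whole batch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

// Hypothetical stand-in for a real embedding call: one "network round trip" per batch.
int roundTrips = 0;
Task<float[][]> FakeEmbedBatchAsync(IReadOnlyList<string> batchTexts)
{
    roundTrips++;
    // Not a real embedding: each "vector" is just the text length.
    return Task.FromResult(batchTexts.Select(t => new float[] { t.Length }).ToArray());
}

// Split the chunks into groups and embed one group per call.
async Task<List<float[]>> EmbedInBatchesAsync(
    IEnumerable<string> chunks,
    int batchSize,
    Func<IReadOnlyList<string>, Task<float[][]>> embedBatch)
{
    var vectors = new List<float[]>();
    foreach (string[] batch in chunks.Chunk(batchSize)) // Enumerable.Chunk needs .NET 6 or later
        vectors.AddRange(await embedBatch(batch));
    return vectors;
}

var chunkTexts = Enumerable.Range(1, 250).Select(i => $"chunk {i}").ToList();
var allVectors = await EmbedInBatchesAsync(chunkTexts, batchSize: 100, FakeEmbedBatchAsync);

Console.WriteLine($"{allVectors.Count} vectors in {roundTrips} round trips");
// prints "250 vectors in 3 round trips"
```

The batch size of 100 is an arbitrary choice for the demo. Check your embedding provider's limits before picking a real value.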
Step 5: Store the vectors (ingestion)
Now we push our vectors into the S3 Vectors index. The PutVectors action takes up to 500 vectors per call. Each vector carries a key, the float data, and optional metadata we can show later.
async Task StoreAsync(IEnumerable<SearchDocument> docs)
{
    var items = docs.Select(d => new PutInputVector
    {
        Key = d.Id,
        Data = new VectorData { Float32 = d.Embedding.ToList() },
        // Metadata should be a JSON object. Building it as a Document map
        // avoids hand-escaping quotes inside the title or text.
        Metadata = new Document(new Dictionary<string, Document>
        {
            ["title"] = new Document(d.Title),
            ["text"] = new Document(d.Text)
        })
    });
    // PutVectors takes at most 500 vectors per request, so send in batches.
    foreach (var batch in items.Chunk(500))
    {
        await s3v.PutVectorsAsync(new PutVectorsRequest
        {
            VectorBucketName = "library-search",
            IndexName = "book-chunks",
            Vectors = batch.ToList()
        });
    }
}
This is the "fill the drawer" step. Each card (vector) has a label (key), a meaning (the floats), and a sticky note (metadata).
Step 6: Search by meaning (query)
Finally, the fun part. A user types a question. We embed the question, then call QueryVectors to find the nearest stored vectors. We ask for the top few results and request the metadata so we can show the title and text.
async Task<IReadOnlyList<(string Key, double Distance)>> SearchAsync(string question)
{
    // 1. Turn the question into a query vector.
    var q = (await embedder.GenerateAsync(question)).Vector.ToArray();
    // 2. Ask S3 Vectors for the nearest matches.
    var response = await s3v.QueryVectorsAsync(new QueryVectorsRequest
    {
        VectorBucketName = "library-search",
        IndexName = "book-chunks",
        QueryVector = new VectorData { Float32 = q.ToList() },
        TopK = 5, // return the 5 closest
        ReturnDistance = true,
        ReturnMetadata = true
    });
    // 3. Keep the key and the distance (not a score: smaller is better).
    return response.Vectors
        .Select(v => (v.Key, v.Distance ?? 0d))
        .ToList();
}
A smaller distance means a closer match. With cosine distance, a value near 0 means "almost the same meaning" and a value near 1 means "quite different". So we usually sort by distance and keep the smallest values.
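If you prefer to show users a "bigger is better" number, you can turn cosine distance into cosine similarity with `1 - distance`. The sketch below also drops weak matches. The hit values are invented, and the `MaxDistance` cut-off is a hypothetical number you would tune against your own data.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hits shaped like the (Key, Distance) pairs a search returns; the values are invented.
var hits = new List<(string Key, double Distance)>
{
    ("space-rockets", 0.88),
    ("wizard-school", 0.03),
    ("dragon-tales",  0.21),
};

// Hypothetical cut-off: anything farther away than this is treated as "not a match".
const double MaxDistance = 0.5;

List<(string Key, double Similarity)> KeepGoodMatches(
    IEnumerable<(string Key, double Distance)> results, double maxDistance) =>
    results.Where(r => r.Distance <= maxDistance)
           .OrderBy(r => r.Distance)                            // closest first
           .Select(r => (r.Key, Similarity: 1.0 - r.Distance))  // cosine similarity = 1 - cosine distance
           .ToList();

foreach (var match in KeepGoodMatches(hits, MaxDistance))
    Console.WriteLine($"{match.Key}: similarity {match.Similarity:F2}");
```

Here the rocket chunk is filtered out, and the wizard chunk is listed before the dragon chunk.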
Using the Semantic Kernel vector abstractions
The AWS SDK code above talks to S3 directly. That is good for learning. But in a bigger app you may prefer the provider-agnostic style from Microsoft.Extensions.VectorData. You describe your model with attributes, and the same search code works across many stores.
using Microsoft.Extensions.VectorData;
public sealed class BookChunk
{
    [VectorStoreKey]
    public required string Id { get; set; }
    [VectorStoreData]
    public required string Title { get; set; }
    [VectorStoreData]
    public required string Text { get; set; }
    [VectorStoreVector(1536, DistanceFunction = DistanceFunction.CosineSimilarity)]
    public required ReadOnlyMemory<float> Embedding { get; set; }
}
With a connector that supports this model, you get a VectorStoreCollection and call simple methods like UpsertAsync to save and SearchAsync to find. You can even attach an embedding generator to the collection so it makes the vectors for you during upsert and search. The big win is that you can start with the free InMemoryVectorStore on your laptop, prove your idea works, then point at a real backing store when you scale up.
Why the abstraction helps
- Prototype: Use the in-memory store, no cloud needed
- Test: Same code, real embeddings
- Scale: Swap to S3 Vectors or another store
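The prototype-then-swap idea can be sketched without any packages. Note the assumptions: `IChunkStore` and `InMemoryChunkStore` are hypothetical names invented for this example, and the interface is far smaller than the real VectorStoreCollection. The point is only that the calling code depends on the interface, not on the store behind it.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// The search code only knows the interface, so the backing store can change freely.
IChunkStore store = new InMemoryChunkStore();
store.Upsert("wizard-school", new[] { 0.9f, 0.1f });
store.Upsert("space-rockets", new[] { 0.1f, 0.9f });

foreach (var key in store.Search(new[] { 0.8f, 0.2f }, top: 1))
    Console.WriteLine(key); // prints "wizard-school"

// Hypothetical contract, invented for this sketch.
interface IChunkStore
{
    void Upsert(string key, float[] vector);
    IReadOnlyList<string> Search(float[] query, int top);
}

// Laptop-friendly implementation. A cloud-backed class could implement the same interface.
sealed class InMemoryChunkStore : IChunkStore
{
    private readonly Dictionary<string, float[]> _vectors = new();

    public void Upsert(string key, float[] vector) => _vectors[key] = vector;

    public IReadOnlyList<string> Search(float[] query, int top) =>
        _vectors.OrderByDescending(pair => Dot(query, pair.Value))
                .Take(top)
                .Select(pair => pair.Key)
                .ToList();

    // Dot product as a simple closeness score; good enough for these toy vectors.
    private static double Dot(float[] a, float[] b) =>
        a.Zip(b, (x, y) => (double)x * y).Sum();
}
```

Swapping stores then means writing one new class that implements the interface and changing the single line that creates `store`.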
Tips for good results
Building the pipeline is only half the job. Here are simple habits that make your search feel smart instead of random.
- Chunk your text well. Do not embed a whole 50-page PDF as one vector. Split it into small, meaningful pieces, maybe a few sentences each. Small chunks give sharper matches.
- Keep dimensions matched. The model dimension and the index dimension must be equal. Write the number down in one place so you never mix it up.
- Store the original text. Always keep the real sentence in metadata so you can show it to the user.
- Batch your embeddings. Generate many at once to save money and time.
- Pick the right distance. Cosine is a safe default for text. Stay consistent between storing and searching.
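The chunking tip is the easiest one to try right away. Below is a minimal sketch that groups text into chunks of a few sentences. It assumes a sentence ends at a period, exclamation mark, or question mark followed by whitespace, which is a naive rule; abbreviations like "Dr." will fool it, so treat `ChunkBySentences` as a starting point and not as a finished chunker.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

// Split text into chunks of a few sentences each.
static List<string> ChunkBySentences(string text, int sentencesPerChunk)
{
    // Naive rule: split at whitespace that follows '.', '!' or '?'.
    string[] sentences = Regex.Split(text.Trim(), @"(?<=[.!?])\s+");

    return sentences
        .Where(s => s.Length > 0)
        .Chunk(sentencesPerChunk)                   // Enumerable.Chunk needs .NET 6 or later
        .Select(group => string.Join(" ", group))
        .ToList();
}

string page = "Harry gets a letter. He goes to a magic school. He learns to fly. " +
              "He makes friends. He wins a match.";

var chunks = ChunkBySentences(page, sentencesPerChunk: 2);
for (int i = 0; i < chunks.Count; i++)
    Console.WriteLine($"{i}: {chunks[i]}");
```

Five sentences at two per chunk give three chunks, with the last one holding the single leftover sentence. Each chunk would then get its own embedding and its own key.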
When is S3 Vectors the right choice?
S3 Vectors shines when you have lots of vectors but query them less often, and you care about cost. Document archives, support knowledge bases, media catalogs, and agent memory are great fits. AWS positions it for retrieval-augmented generation (RAG), agent memory, and semantic search at very large scale and low cost, with sub-second responses for occasional queries.
If you need extremely fast, high-volume queries every second, a purpose-built, in-memory vector engine may suit better. The good news: if you keep your search code behind the .NET vector abstractions, switching later is a small, contained change rather than a rewrite.
Quick recap
- Semantic search finds results by meaning, not by exact words, like a kind librarian who understands what you really want.
- An embedding turns text into a list of numbers. Similar meanings give numbers that are close together.
- Amazon S3 Vectors stores and queries vectors right in S3. It is cheap at large scale and great for archives, catalogs, and agent memory.
- Semantic Kernel and Microsoft.Extensions.VectorData give clean .NET tools to make embeddings and run searches with one common API.
- The flow has two journeys: ingest (read, chunk, embed, store) and search (ask, embed, query, rank, show).
- Match your dimensions, chunk your text well, keep the original text, batch your embeddings, and use cosine distance for text.
- Using the .NET vector abstractions means you can start small on your laptop and switch stores later with little code change.