Why build a cache by hand when .NET already has MemoryCache?

MemoryCache is great for most apps and you should usually use it. But building one by hand teaches you how caches really work: how keys are stored, how old items are removed, and how to stay safe when many threads run at once. It also gives you a tiny, dependency-free cache for hot paths where you want full control and zero extra packages.

What is an LRU cache?

LRU stands for Least Recently Used. When the cache is full and you need room for a new item, you throw away the item that has not been used for the longest time. The idea is simple: things you used recently are the things you will probably use again soon, so keep those and drop the rest.

How do I make my cache thread-safe?

Use a ConcurrentDictionary for fast key lookups, and protect the usage-order list with a single lock so two threads never reorder it at the same time. The dictionary handles concurrent reads and writes for you, and the short lock keeps the linked list correct. Keep the locked section tiny so threads do not wait long.

Is a hand-built cache faster than MemoryCache?

It can be for very specific hot paths, because you remove features you do not need and avoid extra allocations. Libraries like BitFaster.Caching show a well-tuned custom design can be several times faster than MemoryCache. But for general use, MemoryCache is safer and easier. Measure before you replace it.

Fundamentalsintermediate

How to Build a High-Performance Cache in C# Without External Libraries

Build a fast, thread-safe, size-limited LRU cache in C# using only the .NET base class library. Clear diagrams, code, and student-friendly explanations.

14 min readUpdated February 15, 2026

A school library with limited shelf space

Picture the small library in your school. There is one front shelf right next to the librarian's desk. It can hold only twenty books. Behind a locked door, far away, is the giant storeroom with every book the school owns.

When a student asks for a popular book, the librarian wants it on the front shelf so she can hand it over in one second. But the front shelf is tiny. So she follows a simple rule: keep the most recently borrowed books on the front shelf. When a new book becomes popular and the shelf is full, she removes the book that nobody has touched for the longest time and sends it back to the storeroom.

That front shelf is a cache. It is small and fast. The storeroom is your slow data source — a database, a web API, or a heavy calculation. The librarian's rule for deciding which book to remove is called LRU, short for Least Recently Used.

This article will build exactly that librarian, in C#, using only the tools that ship with .NET. No NuGet packages. By the end you will have a small, fast, thread-safe cache and you will understand every line.

What "high performance" really means here

A cache has two main jobs, and both must be fast:

Get — look up a key and return its value if present.
Put — store a key and value, and if the cache is full, remove the least recently used item.

We want both of these to run in roughly constant time — written as O(1). That means the time taken does not grow as the cache gets bigger. A cache with ten items and a cache with ten million items should both answer in about the same tiny amount of time.

Figure 1: The two operations a cache must do quickly — Get on a hit returns instantly, and Put may evict the oldest item when the cache is full.

To hit O(1), we need to pick the right data structures. This is the heart of the whole article, so let us go slowly.

The two data structures that make it work

No single built-in collection gives us everything we want, so we combine two.

1. A dictionary for instant lookups. A Dictionary (or its thread-safe cousin ConcurrentDictionary) finds a value by key in O(1). This solves the Get problem. But a dictionary does not remember the order in which items were used, so it cannot tell us which item is "least recently used."

2. A doubly linked list for usage order. A LinkedList<T> lets us add to one end and remove from either end in O(1), and — crucially — remove a node from the middle in O(1) if we already hold a reference to that node. We keep the most recently used item at the front and the least recently used item at the back.

The trick is to make them point at each other. The dictionary does not store the value directly; it stores a reference to the linked-list node that holds the value. That way, the moment we find a key, we also have the exact node we need to move.

Need	Plain `Dictionary`	Plain `LinkedList`	Combined design
Find value by key	O(1) ✅	O(n) ❌	O(1) ✅
Know usage order	No ❌	Yes ✅	Yes ✅
Move an item to "most recent"	No ❌	O(1) with node ✅	O(1) ✅
Find and evict the oldest	No ❌	O(1) (back) ✅	O(1) ✅

Here is the same idea as a picture. The dictionary on the left points into nodes of the list on the right.

Figure 2: The dictionary maps each key to its node in the usage list. Front of the list = most recently used; back of the list = next to be evicted.

When we read key B, we move node B to the front. Now A and C shift back one place. If the cache is full and we add a new key, C (at the back) gets evicted first because it has gone the longest without being touched.

Step 1: A simple single-thread LRU cache

Let us first build a version that works correctly when only one thread uses it. We will add thread safety in the next step. Starting simple makes the logic easy to see.

using System.Collections.Generic;
 
public sealed class SimpleLruCache<TKey, TValue> where TKey : notnull
{
    private readonly int _capacity;
    private readonly Dictionary<TKey, LinkedListNode<CacheItem>> _map;
    private readonly LinkedList<CacheItem> _usage;
 
    // Each node stores both key and value.
    // We need the key so eviction can remove it from the dictionary too.
    private sealed record CacheItem(TKey Key, TValue Value);
 
    public SimpleLruCache(int capacity)
    {
        if (capacity <= 0)
            throw new ArgumentOutOfRangeException(nameof(capacity));
 
        _capacity = capacity;
        _map = new Dictionary<TKey, LinkedListNode<CacheItem>>(capacity);
        _usage = new LinkedList<CacheItem>();
    }
 
    public bool TryGet(TKey key, out TValue value)
    {
        if (_map.TryGetValue(key, out var node))
        {
            // Touch it: move to the front (most recently used).
            _usage.Remove(node);
            _usage.AddFirst(node);
            value = node.Value.Value;
            return true;
        }
 
        value = default!;
        return false;
    }
 
    public void Put(TKey key, TValue value)
    {
        if (_map.TryGetValue(key, out var existing))
        {
            // Update the value and mark it as recently used.
            _usage.Remove(existing);
        }
        else if (_map.Count >= _capacity)
        {
            // Cache is full. Evict the least recently used (back of list).
            var lru = _usage.Last!;
            _usage.RemoveLast();
            _map.Remove(lru.Value.Key);
        }
 
        var node = new LinkedListNode<CacheItem>(new CacheItem(key, value));
        _usage.AddFirst(node);
        _map[key] = node;
    }
}

Read that slowly. Notice the one detail people miss: we store the key inside the node. Why? When we evict the oldest node, we have the node, but we also need to delete its entry from the dictionary. Without the key stored in the node, we would not know which dictionary entry to remove. That tiny choice is what keeps eviction at O(1).

How a Put flows when the cache is full

It helps to walk the eviction path as a sequence of clear steps.

Put into a full cache

Lookup

Full?

Evict

Insert

Steps

Lookup

Check if key already exists

Full?

Count has reached capacity

Evict

Remove last list node + its map entry

Insert

Add new node at the front

What happens, in order, when you add a new key but every slot is taken.

The order matters. We evict before we insert, so the cache never grows past its capacity even for a single moment. This keeps memory use predictable, which is one of the biggest reasons to build your own cache: a plain dictionary used as a cache can grow forever and crash your app with an out-of-memory error.

Step 2: Making it thread-safe

Real applications have many threads hitting the cache at once: web requests, background jobs, timers. If two threads change the linked list at the same time, the list can become corrupt — nodes point at the wrong neighbours, and your app crashes or returns garbage.

There are two common ways to add safety. We will use a practical mix that the well-known BitFaster.Caching library also leans on in spirit: a ConcurrentDictionary for the key lookups, plus a single short lock around the list reordering.

using System.Collections.Concurrent;
using System.Collections.Generic;
 
public sealed class LruCache<TKey, TValue> where TKey : notnull
{
    private readonly int _capacity;
    private readonly ConcurrentDictionary<TKey, LinkedListNode<CacheItem>> _map;
    private readonly LinkedList<CacheItem> _usage;
    private readonly object _listLock = new();
 
    private sealed record CacheItem(TKey Key, TValue Value);
 
    public LruCache(int capacity)
    {
        if (capacity <= 0)
            throw new ArgumentOutOfRangeException(nameof(capacity));
 
        _capacity = capacity;
        _map = new ConcurrentDictionary<TKey, LinkedListNode<CacheItem>>();
        _usage = new LinkedList<CacheItem>();
    }
 
    public bool TryGet(TKey key, out TValue value)
    {
        if (_map.TryGetValue(key, out var node))
        {
            // Reordering the list must be protected by the lock.
            lock (_listLock)
            {
                // The node may have been evicted between lookup and lock.
                if (node.List is not null)
                {
                    _usage.Remove(node);
                    _usage.AddFirst(node);
                }
            }
 
            value = node.Value.Value;
            return true;
        }
 
        value = default!;
        return false;
    }
 
    public void Put(TKey key, TValue value)
    {
        var newNode = new LinkedListNode<CacheItem>(new CacheItem(key, value));
 
        lock (_listLock)
        {
            if (_map.TryGetValue(key, out var existing))
            {
                _usage.Remove(existing);
                _map.TryRemove(key, out _);
            }
            else if (_map.Count >= _capacity)
            {
                var lru = _usage.Last;
                if (lru is not null)
                {
                    _usage.RemoveLast();
                    _map.TryRemove(lru.Value.Key, out _);
                }
            }
 
            _usage.AddFirst(newNode);
            _map[key] = newNode;
        }
    }
}

A few important notes:

The dictionary read in TryGet happens outside the lock. That is the fast path. Most cache calls are reads, and ConcurrentDictionary lets many threads read at the same time without blocking.
We only take the lock to reorder the list. The locked section is tiny — just a remove and an add. Short locks mean threads barely wait.
We check node.List is not null after taking the lock. Between our lookup and our lock, another thread could have evicted that node. Checking protects us from touching a node that is no longer in the list.

Figure 3: Two threads hitting the cache. Reads share the dictionary freely; only the brief list reorder is serialized by the lock.

Choosing a good capacity

The cache only helps if the right items stay in it. Two numbers guide your choice of capacity:

If capacity is...	What happens	When it is right
Too small	Items get evicted before they are reused, so you keep missing	Almost never on purpose
Just right	Hot items stay; cold items leave; high hit rate	The goal
Too large	You waste memory holding items nobody asks for	When memory is cheap and misses are very costly

A common starting point is to size the cache to hold your hot set — the small group of items that are asked for again and again. In many systems, a tiny fraction of keys gets the vast majority of requests. If you can fit that hot fraction, your hit rate climbs sharply and your slow data source gets touched far less.

Step 3: A handy "get or add" helper

In real code you rarely call TryGet and Put separately. You usually want: "give me this value; if it is missing, compute it once and remember it." This is the cache-aside pattern wrapped into one method.

GetOrAdd (cache-aside)

Try cache

Hit

Miss

Build + store

Steps

Try cache

Look up the key

Hit

Return the stored value

Miss

Run the factory function

Build + store

Put the new value, then return it

One method that returns a cached value or builds it on a miss.

public TValue GetOrAdd(TKey key, Func<TKey, TValue> factory)
{
    if (TryGet(key, out var existing))
        return existing;
 
    // Miss: build the value once and store it.
    var value = factory(key);
    Put(key, value);
    return value;
}

This version is simple and correct, but be honest about one limit: if two threads miss the same key at the same moment, both may run factory. For a cheap calculation that is fine. For an expensive one — like a database call — you would add stampede protection so only one thread builds the value while the others wait. The grown-up libraries do this for you, and it is the main reason people eventually reach for MemoryCache, HybridCache, or a tuned library instead of rolling their own forever.

When to build your own, and when not to

Building your own cache is a wonderful way to learn, and it is genuinely useful for tiny, hot, dependency-free paths. But be fair about the trade-offs.

Situation	Build your own	Use built-in / library
Learning how caches work	✅ Yes	—
A tiny hot path, zero dependencies wanted	✅ Yes	—
You need expiry by time, callbacks, size limits	—	✅ `MemoryCache`
Shared cache across many servers	—	✅ Distributed cache / Redis
You need top speed and stampede protection	—	✅ Tuned library (e.g. BitFaster)

The honest summary: our hand-built LruCache is fast, correct, and small. For most apps, the framework's Microsoft.Extensions.Caching.Memory.MemoryCache is the safe default because it adds time-based expiry, eviction callbacks, and size limits without any work from you. Reach for a specialized library only after you have measured a real bottleneck.

A quick mental model of the whole thing

Figure 4: The full life of a key — born on a miss, kept alive by reuse, and eventually evicted when it goes cold.

That is the entire idea. A key enters when it is first requested. Every time it is used, it jumps to the front and stays warm. If the rest of the world keeps asking for other keys, it slowly drifts to the back and gets evicted. Simple rules, big speed-up.

References and further reading

Caching in .NET — Microsoft Learn — the official overview of caching options in modern .NET.
ConcurrentDictionary Class — Microsoft Learn — the thread-safe dictionary we used for fast lookups.
LinkedList<T> Class — Microsoft Learn — the doubly linked list that gives us O(1) reordering.
BitFaster.Caching on GitHub — a high-performance, thread-safe caching library; its ConcurrentLru shows how a tuned design beats MemoryCache on speed.
ConcurrentLru wiki — BitFaster.Caching — explains the no-background-thread, low-overhead approach to LRU.

Quick recap

A cache is a small, fast store in front of a slow data source — like a librarian's front shelf in front of the storeroom.
LRU (Least Recently Used) keeps recently used items and evicts the one untouched for the longest time.
To hit O(1), combine a dictionary (fast key lookup) with a doubly linked list (usage order). The dictionary points at list nodes.
Store the key inside each node so eviction can remove it from both the list and the dictionary in O(1).
For thread safety, use a ConcurrentDictionary for lookups and a short lock only around the list reorder.
Pick a capacity that fits your hot set so your hit rate stays high without wasting memory.
A GetOrAdd helper gives you the clean cache-aside pattern; add stampede protection if the factory is expensive.
Build your own to learn or for tiny hot paths; reach for MemoryCache or a tuned library for time expiry, sharing across servers, and stampede protection — after you measure.

How to Build a High-Performance Cache in C# Without External Libraries

A school library with limited shelf space

What "high performance" really means here

The two data structures that make it work

Step 1: A simple single-thread LRU cache

How a Put flows when the cache is full

Put into a full cache

Step 2: Making it thread-safe

Choosing a good capacity

Step 3: A handy "get or add" helper

GetOrAdd (cache-aside)

When to build your own, and when not to

A quick mental model of the whole thing

References and further reading

Quick recap

Related Posts

Caching in ASP.NET Core: Make Your App Fast (The Easy Way)

Building High-Performance .NET Apps With C# Channels

5 Ways to Check for Duplicates in C# Collections

Scaling Monoliths: A Practical Guide for Growing .NET Systems

8 Tips to Write Clean Code in C# and .NET

SOLID Principles in C# and .NET: A Beginner-Friendly Guide