What is the difference between basic and advanced rate limiting?

Basic rate limiting puts one shared limit on your whole API. Advanced rate limiting gives each client their own limit (by user or IP), uses different limits for different plans, chains several rules together, sends a friendly 429 with a Retry-After header, and shares counts across many servers using something like Redis.

How do I give each user their own rate limit?

Use a partitioned limiter. You read a key from each request — like the logged-in user id or the client IP address — and ASP.NET Core keeps a separate counter for each key. So every user gets their own fair limit instead of sharing one big bucket.

Does ASP.NET Core rate limiting work across many servers?

Not on its own. The built-in limiters count in memory, so each server counts by itself. If you run several servers behind a load balancer and need one shared limit, store the counts in a shared place like Redis, using a community library such as RedisRateLimiting.AspNetCore.

What should a good 429 response contain?

A good 429 reply tells the client how long to wait. Add a Retry-After header and a small JSON body (ProblemDetails works well) explaining the limit. This turns a confusing error into a clear, polite message the client can act on.

Are MediatR and MassTransit still free for this?

MediatR and MassTransit have moved to commercial licensing for newer versions. You do not need either of them for rate limiting — ASP.NET Core has rate limiting built in. Just be aware of their licenses if you use them elsewhere in your project.


Advanced Rate Limiting Use Cases in .NET: A Friendly Deep Dive

Go beyond the basics of ASP.NET Core rate limiting: per-user limits, chained limiters, friendly 429 responses, Redis for many servers, and tier-based rules.

12 min read · Updated February 6, 2026

The water tank in an Indian home

In many Indian homes, water comes only for a few hours a day. So families store it in an overhead tank. A tap at the bottom lets water flow out at a steady speed. Even if everyone opens taps at once, the tank gives out water slowly and fairly, so it does not empty in one rush and nobody is left dry.

Your API server is like that tank. Requests are the water. If everyone pulls hard at the same time, the tank empties, the server slows down, and everyone suffers. Basic rate limiting is one tap for the whole house. But real homes need more care — a separate tap per family, a bigger share for the family that pays more, and a polite notice that says "water will come back in 10 minutes."

That extra care is what we call advanced rate limiting. In this guide we go past the simple one-size-fits-all limit and build rules that are fair, friendly, and ready for real traffic on .NET 10.

New to rate limiting? Start with the basics in Rate Limiting in ASP.NET Core: A Simple, Complete Guide and then come back here.

What "advanced" really means

A basic limiter says: "This whole API allows 100 requests per minute." That is a single shared bucket. It is easy, but it has problems. One noisy client can use up the whole limit and block everyone else. And a paying customer gets the same small share as a free, anonymous visitor.

Advanced rate limiting fixes this by adding five ideas. We will build each one step by step.

From Basic to Advanced Rate Limiting

Partition: one counter per user or IP
Tiers: bigger limits for paid plans
Chain: several rules together
Friendly 429: Retry-After + clear body
Distributed: shared counts via Redis

Each step makes your limits fairer and more production-ready.

Idea 1: Give every client its own limit (partitioning)

The most important upgrade is partitioning. Instead of one big bucket for everybody, you give each client their own bucket. The bucket is chosen by a key you read from the request — usually the logged-in user id, or the IP address for anonymous visitors.

Think of it like the water tank giving each family their own tap. One greedy family cannot drain the share of the others.

Figure 1: One incoming request is sorted into its own bucket by a key, so clients never steal each other's share.

In ASP.NET Core you build this with RateLimitPartition. You pick a key for each request, and the framework keeps a separate counter for each key automatically.

using System.Threading.RateLimiting;
 
builder.Services.AddRateLimiter(options =>
{
    // A global limiter that chooses a bucket per request.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(
        httpContext =>
        {
            // Logged-in users get their own bucket, keyed by user name.
            // Anonymous clients get one bucket per IP address.
            // Name can be null even for an authenticated user, so fall through safely.
            var userName = httpContext.User.Identity?.IsAuthenticated == true
                ? httpContext.User.Identity.Name
                : null;
            var key = userName
                ?? httpContext.Connection.RemoteIpAddress?.ToString()
                ?? "unknown";
 
            return RateLimitPartition.GetFixedWindowLimiter(
                partitionKey: key,
                factory: _ => new FixedWindowRateLimiterOptions
                {
                    PermitLimit = 100,
                    Window = TimeSpan.FromMinutes(1)
                });
        });
 
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});

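One safety note: always validate the key. If the key comes from something the client can freely choose (a header or a query value), an attacker can create endless buckets and waste memory. Prefer keys the client cannot make up, such as the authenticated user or the connection IP, and reject empty or very long values. Fall back to a safe value like "unknown" when the key is missing, as shown above.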
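The safety note above can be sketched as a small helper that cleans a key before it becomes a partition. This is only an illustration: the name NormalizePartitionKey, the 64-character cap, and the allowed character set are assumptions for this sketch, not part of ASP.NET Core.

```csharp
using System;
using System.Linq;

// Hypothetical helper: the name, the 64-character cap, and the allowed
// characters are assumptions for this sketch, not ASP.NET Core APIs.
static string NormalizePartitionKey(string? rawKey, int maxLength = 64)
{
    // Missing key: everyone without a key shares one fallback bucket.
    if (string.IsNullOrWhiteSpace(rawKey))
        return "unknown";

    var trimmed = rawKey.Trim();

    // Oversized key: refuse to store it as its own bucket.
    if (trimmed.Length > maxLength)
        return "unknown";

    // Allow only characters expected in user names and IP addresses.
    bool allowed = trimmed.All(c =>
        char.IsLetterOrDigit(c) || c is '.' or ':' or '-' or '_' or '@');

    // Lower-case so "Alice" and "alice" land in the same bucket.
    return allowed ? trimmed.ToLowerInvariant() : "unknown";
}

Console.WriteLine(NormalizePartitionKey("Alice@Example.com"));   // alice@example.com
Console.WriteLine(NormalizePartitionKey(""));                    // unknown
Console.WriteLine(NormalizePartitionKey(new string('x', 500)));  // unknown
```

A helper like this limits the size and shape of each key, but it does not limit how many different keys exist. That part still depends on taking the key from a source the client cannot freely invent.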

Idea 2: Different limits for different plans (tiers)

Real products have plans. A free user might get 60 requests a minute, while a premium user gets 5,000. This is easy once you have partitioning: you simply choose a different limit based on who the client is.

Here is a small table of the kind of plan you might offer.

Plan	Requests per minute	Burst allowed	Best for
Free	60	Small	Trying the API
Basic	600	Medium	Small apps
Premium	5,000	Large	Busy production apps
Internal	No limit	N/A	Your own services

You read the plan (often from a claim on the user, or an API key) and return a partition with the matching numbers.

options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
{
    var user = httpContext.User;
    var key = user.Identity?.Name ?? httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
 
    // Read the plan from a claim. Default to "free" if missing.
    var plan = user.FindFirst("plan")?.Value ?? "free";
 
    var permit = plan switch
    {
        "premium" => 5000,
        "basic"   => 600,
        _         => 60   // free
    };
 
    return RateLimitPartition.GetTokenBucketLimiter(
        partitionKey: $"{plan}:{key}",
        factory: _ => new TokenBucketRateLimiterOptions
        {
            TokenLimit = permit,
            TokensPerPeriod = permit,
            ReplenishmentPeriod = TimeSpan.FromMinutes(1),
            AutoReplenishment = true,
            QueueLimit = 0
        });
});

We used a token bucket here. It lets a client spend tokens quickly for a short burst, then refills them at a steady rate. That feels nicer than a hard cliff for paying customers who sometimes need a quick burst.
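To see the burst behaviour, and to cover the "Internal: no limit" row of the table, here is a small console sketch. It makes some assumptions: the resource is a (Plan, Key) tuple instead of HttpContext so it runs outside a web app, the TokenLimit of 3 is tiny on purpose, and the plan name "internal" and the client names are made up for the demo. In a console project, System.Threading.RateLimiting comes from the NuGet package of the same name.

```csharp
using System;
using System.Linq;
using System.Threading.RateLimiting;

// The resource is a (Plan, Key) tuple here instead of HttpContext, so the
// sketch runs in a console app. The tiny TokenLimit is only for the demo.
using var limiter = PartitionedRateLimiter.Create<(string Plan, string Key), string>(client =>
    client.Plan == "internal"
        // Internal services get a partition that never rejects.
        ? RateLimitPartition.GetNoLimiter($"internal:{client.Key}")
        : RateLimitPartition.GetTokenBucketLimiter(
            $"{client.Plan}:{client.Key}",
            _ => new TokenBucketRateLimiterOptions
            {
                TokenLimit = 3,                                // burst size
                TokensPerPeriod = 3,
                ReplenishmentPeriod = TimeSpan.FromMinutes(1),
                QueueLimit = 0                                 // reject instead of waiting
            }));

// Try to take one permit for this client and report whether it was granted.
bool TryRequest(string plan, string key)
{
    using var lease = limiter.AttemptAcquire((plan, key));
    return lease.IsAcquired;
}

// A free client can burst 3 requests, then is rejected until tokens refill.
for (var i = 1; i <= 4; i++)
    Console.WriteLine($"free request {i}: {TryRequest("free", "alice")}");

// An internal client is never rejected.
var internalAllowed = Enumerable.Range(0, 1000)
    .Count(_ => TryRequest("internal", "billing-service"));
Console.WriteLine($"internal allowed: {internalAllowed} of 1000");
```

Run back to back, the first three free requests succeed and the fourth is rejected, while all 1,000 internal requests pass. In a real app the same branch would sit inside the HttpContext partitioner shown earlier.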

Idea 3: Chain several rules together

Sometimes one rule is not enough. You might want a per-second rule to stop sudden spikes and a per-minute rule to keep the long-term average fair. ASP.NET Core lets you combine limiters with PartitionedRateLimiter.CreateChained. The request must pass every rule in the chain.

Figure 2: A chained limiter runs each rule in order. The request only passes if it clears all of them.

options.GlobalLimiter = PartitionedRateLimiter.CreateChained(
    // Rule 1: stop quick spikes — at most 10 requests a second.
    PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
        RateLimitPartition.GetFixedWindowLimiter(
            GetKey(ctx),
            _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 10,
                Window = TimeSpan.FromSeconds(1)
            })),
 
    // Rule 2: keep the long-term average fair — 200 a minute.
    PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
        RateLimitPartition.GetFixedWindowLimiter(
            GetKey(ctx),
            _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 200,
                Window = TimeSpan.FromMinutes(1)
            }))
);
 
// Helper to read the same key for every rule.
static string GetKey(HttpContext ctx) =>
    ctx.User.Identity?.Name
    ?? ctx.Connection.RemoteIpAddress?.ToString()
    ?? "unknown";

This pairing is a common and powerful pattern. The per-second rule caps short spikes; the per-minute rule protects the steady flow.

Idea 4: Make the 429 reply friendly

A bare 429 with no details is confusing. A good limiter tells the client how long to wait. You do this with the OnRejected callback. Add a Retry-After header and a small JSON body so the client knows exactly what happened.

options.OnRejected = async (context, cancellationToken) =>
{
    context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
 
    // If the limiter told us how long to wait, share it.
    // Round up so a sub-second wait is not reported as 0 seconds.
    if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
    {
        context.HttpContext.Response.Headers.RetryAfter =
            ((int)Math.Ceiling(retryAfter.TotalSeconds)).ToString();
    }
 
    await context.HttpContext.Response.WriteAsJsonAsync(new
    {
        type = "https://httpstatuses.com/429",
        title = "Too many requests",
        status = 429,
        detail = "You have sent too many requests. Please slow down and try again soon."
    }, cancellationToken);
};

The table below shows what separates a toy limiter from a production one.

Detail	Toy limiter	Production limiter
Status code	429 only	429 with reason
Retry-After header	Missing	Present
Response body	Empty	Clear JSON (ProblemDetails)
Logging	None	Structured log per rejection
Across servers	In-memory only	Shared store (Redis)

Always log rejections too. If you suddenly see thousands of 429s for one key, that is a useful signal — maybe an attack, maybe a buggy client, maybe a limit that is too tight.

Idea 5: Share counts across many servers (distributed)

Here is a trap that surprises many teams. The built-in limiters count in memory. If you run three copies of your app behind a load balancer, each copy counts on its own. A "100 per minute" limit quietly becomes up to "300 per minute" because each server allows 100.

Figure 3: With in-memory counters, three servers each allow the full limit, so the real total is triple what you wanted.
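The arithmetic behind this trap can be checked with a tiny simulation. It is a sketch with plain counters, not the real middleware, and it assumes the load balancer spreads one client's requests evenly (round-robin) across the servers.

```csharp
using System;

const int limitPerMinute = 100;
const int serverCount = 3;

// Each server keeps its own in-memory counter for the current window.
var countsPerServer = new int[serverCount];
var allowed = 0;

// 400 requests from one client, spread round-robin by a load balancer.
for (var request = 0; request < 400; request++)
{
    var server = request % serverCount;
    if (countsPerServer[server] < limitPerMinute)
    {
        countsPerServer[server]++;
        allowed++;
    }
}

Console.WriteLine($"Allowed with per-server counters: {allowed}");   // 300

// With one shared counter, the same traffic is capped at the intended limit.
var sharedCount = 0;
for (var request = 0; request < 400; request++)
{
    if (sharedCount < limitPerMinute)
        sharedCount++;
}

Console.WriteLine($"Allowed with a shared counter: {sharedCount}");  // 100
```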

The fix is a shared store that all servers read and write. Redis is the common choice. ASP.NET Core has no built-in Redis backplane, but the community package RedisRateLimiting.AspNetCore plugs into the same AddRateLimiter API and keeps the counts in Redis. Then all servers agree on one number.

Distributed Rate Limiting with Redis

Client: sends a request
Any server: asks Redis for the count
Redis: holds one shared counter
Decision: allow or 429 fairly

Every server checks the same shared counter, so the limit is correct no matter how many servers run.

// After: dotnet add package RedisRateLimiting.AspNetCore
using RedisRateLimiting;
using RedisRateLimiting.AspNetCore;
using StackExchange.Redis;
 
var redis = ConnectionMultiplexer.Connect("localhost:6379");
 
builder.Services.AddRateLimiter(options =>
{
    options.AddRedisFixedWindowLimiter("by-user", opt =>
    {
        opt.ConnectionMultiplexerFactory = () => redis;
        opt.PermitLimit = 100;
        opt.Window = TimeSpan.FromMinutes(1);
    });
});

A few honest trade-offs. Talking to Redis adds a tiny bit of delay to every request, and your limiter now depends on Redis being up. Plan for what happens if Redis is unreachable — usually you "fail open" (let requests through) so a Redis hiccup does not take down your whole API. For a deeper look at distributed counting, see the Microsoft Learn rate limiting docs.
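The "fail open" idea can be sketched as a small wrapper around the shared check. This is a generic illustration, not how RedisRateLimiting.AspNetCore handles outages: IsAllowedAsync, checkSharedLimit, and onStoreFailure are hypothetical names, and the delegates stand in for a real Redis call and a real logger.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical wrapper: checkSharedLimit stands in for whatever call asks the
// shared store (for example Redis) whether this request is within the limit.
static async Task<bool> IsAllowedAsync(
    Func<Task<bool>> checkSharedLimit,
    Action<Exception> onStoreFailure)
{
    try
    {
        return await checkSharedLimit();
    }
    catch (Exception ex) when (ex is not OperationCanceledException)
    {
        // Fail open: the store is unreachable, so let the request through
        // and report the failure so someone notices.
        onStoreFailure(ex);
        return true;
    }
}

var failures = 0;

// Store healthy and the client is over the limit: reject.
var overLimit = await IsAllowedAsync(
    () => Task.FromResult(false),
    _ => failures++);

// Store down: allow the request and count the failure.
var storeDown = await IsAllowedAsync(
    () => Task.FromException<bool>(new TimeoutException("store unreachable")),
    _ => failures++);

Console.WriteLine($"over limit -> allowed: {overLimit}");   // False
Console.WriteLine($"store down -> allowed: {storeDown}");   // True
Console.WriteLine($"store failures seen: {failures}");      // 1
```

Failing open keeps the API up during a store outage, at the cost of no limit while it lasts. For sensitive endpoints such as login, some teams prefer the opposite choice and reject when the store is down.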

Putting it together: the request journey

Let us trace one request through a full advanced setup, from arrival to response.

Figure 4: The full journey — pick a key, find the plan, check the shared count, then either run the API or send a friendly 429.

Picking the right algorithm for the job

Each built-in algorithm has a personality. Here is a quick guide for advanced setups.

Fixed window — simplest. Good for internal tools. Watch out for the "boundary burst" where a client doubles up across the reset moment.
Sliding window — fairer than fixed window. A great default for public APIs because it smooths out those boundary bursts.
Token bucket — best for paid tiers. Allows short bursts but keeps a steady average. Friendly for real apps.
Concurrency — limits how many requests run at the same time, not per minute. Use it for heavy endpoints like report generation.
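The "boundary burst" mentioned for fixed windows is easier to see with numbers. The sketch below is a simulation with made-up arrival times, not the built-in limiters. The sliding log used for comparison is a simplified stand-in: the built-in sliding window limiter works with segments rather than an exact log.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

const int limit = 100;
const int windowSeconds = 60;

// A client sends 100 requests at t = 59s and 100 more at t = 60s.
var arrivals = Enumerable.Repeat(59, 100)
    .Concat(Enumerable.Repeat(60, 100))
    .ToList();

// Fixed window: the counter resets whenever the window index changes.
int CountAllowedFixedWindow(IEnumerable<int> times)
{
    var allowedCount = 0;
    var currentWindow = -1;
    var used = 0;
    foreach (var t in times)
    {
        var window = t / windowSeconds;
        if (window != currentWindow) { currentWindow = window; used = 0; }
        if (used < limit) { used++; allowedCount++; }
    }
    return allowedCount;
}

// Sliding log: count the requests accepted in the last 60 seconds, exactly.
int CountAllowedSlidingLog(IEnumerable<int> times)
{
    var allowedCount = 0;
    var accepted = new Queue<int>();
    foreach (var t in times)
    {
        // Forget accepted requests that are older than one window.
        while (accepted.Count > 0 && accepted.Peek() <= t - windowSeconds)
            accepted.Dequeue();

        if (accepted.Count < limit) { accepted.Enqueue(t); allowedCount++; }
    }
    return allowedCount;
}

var fixedAllowed = CountAllowedFixedWindow(arrivals);
var slidingAllowed = CountAllowedSlidingLog(arrivals);

Console.WriteLine($"Fixed window allowed: {fixedAllowed}");   // 200
Console.WriteLine($"Sliding log allowed: {slidingAllowed}");  // 100
```

With the fixed window, the client gets 200 requests through in about one second because the counter resets at t = 60s. The sliding approach still sees the first 100 requests and rejects the rest.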

A common mix in production: a token bucket per user for general traffic, plus a concurrency limiter on a few expensive endpoints. You can attach named policies to specific routes so one slow endpoint cannot hog the whole server.

// A named concurrency policy for one heavy endpoint.
options.AddConcurrencyLimiter("heavy-reports", opt =>
{
    opt.PermitLimit = 5;     // only 5 reports build at once
    opt.QueueLimit = 10;     // 10 more may wait in line
    opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
});
 
// Then in your endpoint mapping:
// app.MapGet("/reports", BuildReport).RequireRateLimiting("heavy-reports");

A note on libraries and licensing

You do not need any extra library to do most of this — rate limiting is built into ASP.NET Core. For distributed counting you may add RedisRateLimiting.AspNetCore, which is open source and free.

One thing worth knowing for your wider project: MediatR and MassTransit have moved to commercial licensing for their newer versions. They are popular for messaging and request pipelines, but they are not required for rate limiting. If you already use them elsewhere, check their license terms. For rate limiting itself, the built-in tools and a Redis package are all you need.

Common mistakes to avoid

Forgetting many servers. With in-memory counts, the effective limit triples when you scale to three servers. Use Redis if you need one true limit.
Unbounded keys. Always validate the partition key. A missing or attacker-chosen key can flood your memory with buckets.
Silent 429s. Without Retry-After and a body, clients keep hammering blindly. Be clear and polite.
One limit for everyone. Mixing free and paying users in one bucket is unfair. Partition and tier.
No logging. Without logs you cannot tell an attack from a normal busy day.

Quick recap

Partition so each user or IP gets its own fair bucket instead of one shared limit.
Tier your limits so paying customers get a bigger, fairer share.
Chain a per-second rule with a per-minute rule to catch both spikes and steady overuse.
Be friendly on 429: add a Retry-After header and a clear JSON body, and log every rejection.
Go distributed with Redis when you run more than one server, or your real limit silently multiplies.
Pick the right algorithm: token bucket for tiers, sliding window for public APIs, concurrency for heavy endpoints.
You do not need MediatR or MassTransit for this; rate limiting is built into ASP.NET Core. (Both are now commercially licensed in newer versions.)
