Advanced Rate Limiting Use Cases in .NET: A Friendly Deep Dive

Go beyond the basics of ASP.NET Core rate limiting: per-user limits, chained limiters, friendly 429 responses, Redis for many servers, and tier-based rules.

12 min read · Updated February 6, 2026

The water tank in an Indian home

In many Indian homes, water comes only for a few hours a day. So families store it in an overhead tank. A tap at the bottom lets water flow out at a steady speed. Even if everyone opens taps at once, the tank gives out water slowly and fairly, so it does not empty in one rush and nobody is left dry.

Your API server is like that tank. Requests are the water. If everyone pulls hard at the same time, the tank empties, the server slows down, and everyone suffers. Basic rate limiting is one tap for the whole house. But real homes need more care — a separate tap per family, a bigger share for the family that pays more, and a polite notice that says "water will come back in 10 minutes."

That extra care is what we call advanced rate limiting. In this guide we go past the simple one-size-fits-all limit and build rules that are fair, friendly, and ready for real traffic on .NET 10.

New to rate limiting? Start with the basics in "Rate Limiting in ASP.NET Core: A Simple, Complete Guide", then come back here.

What "advanced" really means

A basic limiter says: "This whole API allows 100 requests per minute." That is a single shared bucket. It is easy, but it has problems. One noisy client can use up the whole limit and block everyone else. And a paying customer gets the same small share as a free, anonymous visitor.

Advanced rate limiting fixes this by adding five ideas. We will build each one step by step.

From Basic to Advanced Rate Limiting

  1. Partition: one counter per user or IP
  2. Tiers: bigger limits for paid plans
  3. Chain: several rules together
  4. Friendly 429: Retry-After + clear body
  5. Distributed: shared counts via Redis

Each step makes your limits fairer and more production-ready.

Idea 1: Give every client its own limit (partitioning)

The most important upgrade is partitioning. Instead of one big bucket for everybody, you give each client their own bucket. The bucket is chosen by a key you read from the request — usually the logged-in user id, or the IP address for anonymous visitors.

Think of it like the water tank giving each family their own tap. One greedy family cannot drain the share of the others.

Figure 1: One incoming request is sorted into its own bucket by a key, so clients never steal each other's share.

In ASP.NET Core you build this with RateLimitPartition. You pick a key for each request, and the framework keeps a separate counter for each key automatically.

using System.Threading.RateLimiting;
 
builder.Services.AddRateLimiter(options =>
{
    // A global limiter that chooses a bucket per request.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(
        httpContext =>
        {
            // Logged-in users get a bucket keyed by their user name.
            // Anonymous visitors get a bucket keyed by their IP address.
            // Name can be null even for an authenticated user, so fall through safely.
            var userName = httpContext.User.Identity?.IsAuthenticated == true
                ? httpContext.User.Identity.Name
                : null;
            var key = userName
                ?? httpContext.Connection.RemoteIpAddress?.ToString()
                ?? "unknown";
 
            return RateLimitPartition.GetFixedWindowLimiter(
                partitionKey: key,
                factory: _ => new FixedWindowRateLimiterOptions
                {
                    PermitLimit = 100,
                    Window = TimeSpan.FromMinutes(1)
                });
        });
 
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
});

One safety note: always validate the key. An empty or huge key can let an attacker create endless buckets and waste memory. Fall back to a safe value like "unknown" when the key is missing, as shown above.
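The validation described above could look something like the following. This is only a minimal sketch: the PartitionKeys class, the Normalize method, and the 64-character cap are hypothetical names and values chosen for illustration, not part of ASP.NET Core.

```csharp
using System;

public static class PartitionKeys
{
    // Hypothetical cap (an assumption); pick a value that fits your own user ids.
    private const int MaxKeyLength = 64;

    // Returns a non-empty key of bounded length.
    public static string Normalize(string? rawKey)
    {
        // Missing or blank keys all share the "unknown" bucket.
        if (string.IsNullOrWhiteSpace(rawKey))
            return "unknown";

        var trimmed = rawKey.Trim();

        // Oversized keys are cut down so a single key cannot use much memory.
        return trimmed.Length <= MaxKeyLength
            ? trimmed
            : trimmed[..MaxKeyLength];
    }
}
```

You would call PartitionKeys.Normalize(key) just before handing the key to RateLimitPartition. Note that this bounds the size of each key, not the number of distinct keys, so it works best when the key comes from a verified identity rather than from a value the caller can freely choose.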

Idea 2: Different limits for different plans (tiers)

Real products have plans. A free user might get 60 requests a minute, while a premium user gets 5,000. This is easy once you have partitioning: you simply choose a different limit based on who the client is.

Here is a small table of the kind of plan you might offer.

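Plan     | Requests per minute | Burst allowed | Best for
---------|---------------------|---------------|---------------------
Free     | 60                  | Small         | Trying the API
Basic    | 600                 | Medium        | Small apps
Premium  | 5,000               | Large         | Busy production apps
Internal | No limit            | N/A           | Your own services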

You read the plan (often from a claim on the user, or an API key) and return a partition with the matching numbers.

options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
{
    var user = httpContext.User;
    var key = user.Identity?.Name ?? httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
 
    // Read the plan from a claim. Default to "free" if missing.
    var plan = user.FindFirst("plan")?.Value ?? "free";
 
    var permit = plan switch
    {
        "premium" => 5000,
        "basic"   => 600,
        _         => 60   // free
    };
 
    return RateLimitPartition.GetTokenBucketLimiter(
        partitionKey: $"{plan}:{key}",
        factory: _ => new TokenBucketRateLimiterOptions
        {
            TokenLimit = permit,
            TokensPerPeriod = permit,
            ReplenishmentPeriod = TimeSpan.FromMinutes(1),
            AutoReplenishment = true,
            QueueLimit = 0
        });
});

We used a token bucket here. It lets a client spend tokens quickly for a short burst, then refills them at a steady rate. That feels nicer than a hard cliff for paying customers who sometimes need a quick burst.
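The plan table lists an Internal plan with no limit, which the snippet above does not handle. One way to cover it, sketched here under assumptions, is RateLimitPartition.GetNoLimiter. The PlanPartitions class and the "internal" claim value are hypothetical names for illustration, and the sketch assumes the plan claim comes from a source you trust.

```csharp
using System;
using System.Threading.RateLimiting;

public static class PlanPartitions
{
    // Maps a plan name and client key to a rate limit partition.
    public static RateLimitPartition<string> For(string plan, string key)
    {
        // Internal callers skip limiting entirely.
        // Assumption: the "internal" value cannot be set by outside clients.
        if (plan == "internal")
            return RateLimitPartition.GetNoLimiter($"internal:{key}");

        var permit = plan switch
        {
            "premium" => 5000,
            "basic"   => 600,
            _         => 60   // free, or any plan name we do not recognise
        };

        return RateLimitPartition.GetTokenBucketLimiter(
            partitionKey: $"{plan}:{key}",
            factory: _ => new TokenBucketRateLimiterOptions
            {
                TokenLimit = permit,
                TokensPerPeriod = permit,
                ReplenishmentPeriod = TimeSpan.FromMinutes(1),
                AutoReplenishment = true,
                QueueLimit = 0
            });
    }
}
```

Inside the GlobalLimiter callback you would then return PlanPartitions.For(plan, key). Keeping the mapping in one place also makes the plan numbers easy to unit test.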

Idea 3: Chain several rules together

Sometimes one rule is not enough. You might want a per-second rule to stop sudden spikes and a per-minute rule to keep the long-term average fair. ASP.NET Core lets you combine limiters with PartitionedRateLimiter.CreateChained. The request must pass every rule in the chain.

Figure 2: A chained limiter runs each rule in order. The request only passes if it clears all of them.

options.GlobalLimiter = PartitionedRateLimiter.CreateChained(
    // Rule 1: stop quick spikes — at most 10 requests a second.
    PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
        RateLimitPartition.GetFixedWindowLimiter(
            GetKey(ctx),
            _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 10,
                Window = TimeSpan.FromSeconds(1)
            })),
 
    // Rule 2: keep the long-term average fair — 200 a minute.
    PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
        RateLimitPartition.GetFixedWindowLimiter(
            GetKey(ctx),
            _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 200,
                Window = TimeSpan.FromMinutes(1)
            }))
);
 
// Helper to read the same key for every rule.
static string GetKey(HttpContext ctx) =>
    ctx.User.Identity?.Name
    ?? ctx.Connection.RemoteIpAddress?.ToString()
    ?? "unknown";

This pairing is a common and powerful pattern. The per-second rule caps short bursts; the per-minute rule protects the steady flow.
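To see the "must pass every rule" behaviour outside of a web app, here is a small sketch that chains two limiters over plain string keys. The ChainDemo class and the tiny limits (2 per second, 3 per minute) are assumptions picked so the effect shows up after three calls.

```csharp
using System;
using System.Threading.RateLimiting;

public static class ChainDemo
{
    // Makes three back-to-back requests for the same key and
    // reports which ones the chained limiter allowed.
    public static bool[] Run()
    {
        // Rule 1: at most 2 requests per second per key.
        using var perSecond = PartitionedRateLimiter.Create<string, string>(key =>
            RateLimitPartition.GetFixedWindowLimiter(key, _ =>
                new FixedWindowRateLimiterOptions
                {
                    PermitLimit = 2,
                    Window = TimeSpan.FromSeconds(1)
                }));

        // Rule 2: at most 3 requests per minute per key.
        using var perMinute = PartitionedRateLimiter.Create<string, string>(key =>
            RateLimitPartition.GetFixedWindowLimiter(key, _ =>
                new FixedWindowRateLimiterOptions
                {
                    PermitLimit = 3,
                    Window = TimeSpan.FromMinutes(1)
                }));

        using var chained = PartitionedRateLimiter.CreateChained(perSecond, perMinute);

        var allowed = new bool[3];
        for (var i = 0; i < allowed.Length; i++)
        {
            using var lease = chained.AttemptAcquire("alice");
            allowed[i] = lease.IsAcquired;
        }
        return allowed;
    }
}
```

As long as the three calls land inside the same one-second window, the first two should be allowed and the third rejected by the per-second rule, even though the per-minute rule still has room.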

Idea 4: Make the 429 reply friendly

A bare 429 with no details is confusing. A good limiter tells the client how long to wait. You do this with the OnRejected callback. Add a Retry-After header and a small JSON body so the client knows exactly what happened.

options.OnRejected = async (context, cancellationToken) =>
{
    context.HttpContext.Response.StatusCode = StatusCodes.Status429TooManyRequests;
 
    // If the limiter told us how long to wait, share it.
    // Round up so a sub-second wait is not reported as 0.
    if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
    {
        context.HttpContext.Response.Headers.RetryAfter =
            ((int)Math.Ceiling(retryAfter.TotalSeconds)).ToString();
    }
 
    await context.HttpContext.Response.WriteAsJsonAsync(new
    {
        type = "https://httpstatuses.com/429",
        title = "Too many requests",
        status = 429,
        detail = "You have sent too many requests. Please slow down and try again soon."
    }, cancellationToken);
};

The table below shows what separates a toy limiter from a production one.

Detail             | Toy limiter    | Production limiter
-------------------|----------------|------------------------------
Status code        | 429 only       | 429 with reason
Retry-After header | Missing        | Present
Response body      | Empty          | Clear JSON (ProblemDetails)
Logging            | None           | Structured log per rejection
Across servers     | In-memory only | Shared store (Redis)

Always log rejections too. If you suddenly see thousands of 429s for one key, that is a useful signal — maybe an attack, maybe a buggy client, maybe a limit that is too tight.
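One possible shape for that log line is sketched below. The RejectionLogging class, the "RateLimiting" logger category, and the message template are assumptions for illustration; the idea is simply to record who was rejected and on which route, using structured fields you can filter on later.

```csharp
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;

public static class RejectionLogging
{
    // Intended to be called at the top of OnRejected, before writing the response.
    public static void LogRejection(OnRejectedContext context)
    {
        var http = context.HttpContext;

        // Resolve a logger from the request's service provider.
        var logger = http.RequestServices
            .GetRequiredService<ILoggerFactory>()
            .CreateLogger("RateLimiting"); // category name is an assumption

        // Same key order as the limiter: user name, then IP, then a fallback.
        var client = http.User.Identity?.Name
            ?? http.Connection.RemoteIpAddress?.ToString()
            ?? "unknown";

        logger.LogWarning(
            "Rate limit rejected {Method} {Path} for client {Client}",
            http.Request.Method,
            http.Request.Path.Value,
            client);
    }
}
```

Because Method, Path, and Client are named placeholders, most log stores can group rejections by client, which is what makes the "thousands of 429s for one key" signal easy to spot.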

Idea 5: Share counts across many servers (distributed)

Here is a trap that surprises many teams. The built-in limiters count in memory. If you run three copies of your app behind a load balancer, each copy counts on its own. A "100 per minute" limit quietly becomes "300 per minute" because each server allows 100.

Figure 3: With in-memory counters, three servers each allow the full limit, so the real total is triple what you wanted.

The fix is a shared store that all servers read and write. Redis is the common choice. ASP.NET Core has no built-in Redis backplane, but the community package RedisRateLimiting.AspNetCore plugs into the same AddRateLimiter API and keeps the counts in Redis. Then all servers agree on one number.

Distributed Rate Limiting with Redis

  1. Client: sends a request
  2. Any server: asks Redis for the count
  3. Redis: holds one shared counter
  4. Decision: allow or 429 fairly

Every server checks the same shared counter, so the limit is correct no matter how many servers run.

// After: dotnet add package RedisRateLimiting.AspNetCore
using RedisRateLimiting;
using RedisRateLimiting.AspNetCore;
using StackExchange.Redis;
 
var redis = ConnectionMultiplexer.Connect("localhost:6379");
 
builder.Services.AddRateLimiter(options =>
{
    options.AddRedisFixedWindowLimiter("by-user", opt =>
    {
        opt.ConnectionMultiplexerFactory = () => redis;
        opt.PermitLimit = 100;
        opt.Window = TimeSpan.FromMinutes(1);
    });
});

A few honest trade-offs. Talking to Redis adds a tiny bit of delay to every request, and your limiter now depends on Redis being up. Plan for what happens if Redis is unreachable — usually you "fail open" (let requests through) so a Redis hiccup does not take down your whole API. For a deeper look at distributed counting, see the Microsoft Learn rate limiting docs.
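The "fail open" idea can be sketched independently of any Redis package. In the example below, FailOpen, IsAllowedAsync, and the checkSharedLimit delegate are all hypothetical names: the delegate stands in for whatever call asks the shared store whether the request is within its limit. Whether a given Redis limiter package already behaves this way is something to check in its own documentation.

```csharp
using System;
using System.Threading.Tasks;

public static class FailOpen
{
    // Returns the store's decision, or true (allow) when the store cannot be reached.
    public static async Task<bool> IsAllowedAsync(
        Func<Task<bool>> checkSharedLimit,
        Action<Exception>? onStoreError = null)
    {
        try
        {
            return await checkSharedLimit();
        }
        catch (Exception ex)
        {
            // Report the failure so the outage is visible, then let the request through.
            onStoreError?.Invoke(ex);
            return true;
        }
    }
}
```

The opposite choice, failing closed, returns false in the catch block. That protects the backend more strictly but turns a Redis outage into an outage of the whole API, which is the trade-off described above.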

Putting it together: the request journey

Let us trace one request through a full advanced setup, from arrival to response.

Figure 4: The full journey — pick a key, find the plan, check the shared count, then either run the API or send a friendly 429.
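None of this journey happens unless the rate limiting middleware is actually in the request pipeline. Registering the services with AddRateLimiter is only half of the setup; app.UseRateLimiter() is the other half. The following is a minimal Program.cs sketch, with a simple per-IP limit standing in for the fuller rules from earlier sections.

```csharp
using System;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.Builder;
using Microsoft.AspNetCore.Http;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.Extensions.DependencyInjection;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Placeholder rule: 100 requests per minute per IP address.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(ctx =>
        RateLimitPartition.GetFixedWindowLimiter(
            ctx.Connection.RemoteIpAddress?.ToString() ?? "unknown",
            _ => new FixedWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1)
            }));
});

var app = builder.Build();

// If the key or plan comes from the signed-in user, authentication
// has to run before the limiter so that HttpContext.User is populated:
// app.UseAuthentication();

// Turns the limiter on for incoming requests.
app.UseRateLimiter();

app.MapGet("/", () => "ok");

app.Run();
```

Order matters here: a limiter that runs before authentication sees every caller as anonymous, so every request would fall back to the IP-based key.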

Picking the right algorithm for the job

Each built-in algorithm has a personality. Here is a quick guide for advanced setups.

  • Fixed window — simplest. Good for internal tools. Watch out for the "boundary burst" where a client doubles up across the reset moment.
  • Sliding window — fairer than fixed window. A great default for public APIs because it smooths out those boundary bursts.
  • Token bucket — best for paid tiers. Allows short bursts but keeps a steady average. Friendly for real apps.
  • Concurrency — limits how many requests run at the same time, not per minute. Use it for heavy endpoints like report generation.
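The earlier examples used fixed window and token bucket limiters. For completeness, here is a sketch of a sliding window partition. The SlidingWindowPartitions class and the numbers are assumptions for illustration; SegmentsPerWindow controls how finely the window is sliced.

```csharp
using System;
using System.Threading.RateLimiting;

public static class SlidingWindowPartitions
{
    // 100 requests per minute per key, tracked in six 10-second segments.
    public static RateLimitPartition<string> For(string key) =>
        RateLimitPartition.GetSlidingWindowLimiter(
            partitionKey: key,
            factory: _ => new SlidingWindowRateLimiterOptions
            {
                PermitLimit = 100,
                Window = TimeSpan.FromMinutes(1),
                SegmentsPerWindow = 6,
                QueueLimit = 0
            });
}
```

More segments give a smoother limit at the cost of a little more bookkeeping per key. With a single segment the limiter behaves much like a fixed window.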

A common mix in production: a token bucket per user for general traffic, plus a concurrency limiter on a few expensive endpoints. You can attach named policies to specific routes so one slow endpoint cannot hog the whole server.

// A named concurrency policy for one heavy endpoint.
options.AddConcurrencyLimiter("heavy-reports", opt =>
{
    opt.PermitLimit = 5;     // only 5 reports build at once
    opt.QueueLimit = 10;     // 10 more may wait in line
    opt.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
});
 
// Then in your endpoint mapping:
// app.MapGet("/reports", BuildReport).RequireRateLimiting("heavy-reports");

A note on libraries and licensing

You do not need any extra library to do most of this — rate limiting is built into ASP.NET Core. For distributed counting you may add RedisRateLimiting.AspNetCore, which is open source and free.

One thing worth knowing for your wider project: MediatR and MassTransit have moved to commercial licensing for their newer versions. They are popular for messaging and request pipelines, but they are not required for rate limiting. If you already use them elsewhere, check their license terms. For rate limiting itself, the built-in tools and a Redis package are all you need.

Common mistakes to avoid

  • Forgetting many servers. With in-memory counters, the effective limit triples when you scale to three servers, because each server counts on its own. Use Redis if you need one true limit.
  • Unbounded keys. Always validate the partition key. A missing or attacker-chosen key can flood your memory with buckets.
  • Silent 429s. Without Retry-After and a body, clients keep hammering blindly. Be clear and polite.
  • One limit for everyone. Mixing free and paying users in one bucket is unfair. Partition and tier.
  • No logging. Without logs you cannot tell an attack from a normal busy day.

Quick recap

  • Partition so each user or IP gets its own fair bucket instead of one shared limit.
  • Tier your limits so paying customers get a bigger, fairer share.
  • Chain a per-second rule with a per-minute rule to catch both spikes and steady overuse.
  • Be friendly on 429: add a Retry-After header and a clear JSON body, and log every rejection.
  • Go distributed with Redis when you run more than one server, or your real limit silently multiplies.
  • Pick the right algorithm: token bucket for tiers, sliding window for public APIs, concurrency for heavy endpoints.
  • You do not need MediatR or MassTransit for this; rate limiting is built into ASP.NET Core. (Both are now commercially licensed in newer versions.)
