How to Scale Long-Running API Requests in .NET: A Beginner's Guide
Learn how to handle slow, long-running API requests in .NET using the 202 Accepted pattern, background services, channels, and status polling.
Imagine you walk into a busy sweet shop to order a big box of fresh jalebis for a wedding. The shopkeeper cannot make 500 jalebis while you stand at the counter. If he tried, the whole queue behind you would be stuck for an hour. Nobody could buy even a single laddu.
So a smart shopkeeper does something else. He takes your order, gives you a token number, and says "come back in an hour, your box will be ready." You walk away free. He keeps serving other customers at the counter. In the back kitchen, his team slowly fries your jalebis. When you return and show your token, your hot box is waiting.
A long-running API request is exactly this problem. Some work is just too slow to finish while the caller waits. In this guide you will learn how to build the "token system" for your API in .NET 10, so slow jobs never block your server. We will go step by step, in plain language.
What is a long-running request?
Most API calls are fast. You ask for a user's profile, the server reads one row, and replies in a few milliseconds. Easy.
But some requests ask for slow work:
- Generating a 200-page PDF report.
- Resizing or converting a large video.
- Sending 50,000 emails.
- Calling a slow third-party service many times.
If your API tries to do this slow work while the caller waits, three bad things happen.
- The caller's connection may time out before you finish.
- The server thread is stuck holding that one request, so it cannot serve others.
- If the caller retries, you might do the same heavy work twice.
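The difference is easy to state: a fast request is answered in milliseconds, while a slow one holds the caller's connection and a server thread until the work is done, which can take minutes.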
The goal is simple. We want the slow path to behave like the fast path: reply quickly, then do the heavy work somewhere else.
The big idea: accept now, work later
The trick is the same as the sweet shop. Do not do the slow work inside the request. Instead:
- Accept the request and save what needs to be done.
- Return a job id and an HTTP 202 Accepted status right away.
- A background worker picks up the job and does the slow work.
- The caller polls a status endpoint to check progress.
- When the job is done, the caller fetches the result.
This is called the Asynchronous Request-Reply pattern. Microsoft documents it in the Azure Architecture Center.
Notice the API never waits for the slow work. It hands the job to a queue and returns a token (the job id). This keeps your server fast and free.
The job lifecycle
- Accepted: API saves the job, returns 202 + id
- Queued: job waits in the queue
- Running: worker does the slow work
- Completed: result is ready to fetch
Step 1: Accept the request and return 202
Let us start with the endpoint the caller hits. It should be tiny. It only validates the input, creates a job id, drops the job on a queue, and returns.
```csharp
app.MapPost("/reports", async (
    ReportRequest request,
    IBackgroundQueue queue,
    IJobStore jobs) =>
{
    // 1. Make a job id and remember it.
    var jobId = Guid.NewGuid();
    await jobs.CreateAsync(jobId, JobStatus.Queued);

    // 2. Put the work on the queue. Do NOT run it here.
    await queue.EnqueueAsync(new ReportJob(jobId, request.CustomerId));

    // 3. Tell the caller where to check progress.
    return Results.Accepted($"/jobs/{jobId}", new { id = jobId });
});
```

Three small things happen here. We create a job record, we enqueue the work, and we return 202 Accepted. The Results.Accepted helper also sets a Location header pointing at the status endpoint, so the caller knows exactly where to look next.
This endpoint finishes in a few milliseconds, even though the real work might take two minutes.
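The endpoint above leans on a few types that are used but never defined here: ReportRequest, ReportJob, JobStatus, IJobStore, and IBackgroundQueue. The sketch below is one possible shape for them, inferred only from how the snippets call them. The property types, the JobRecord record, and the InMemoryJobStore class are assumptions made for illustration, and an in-memory store loses everything on restart.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Quick usage check: create a job, complete it, read it back.
IJobStore store = new InMemoryJobStore();
var id = Guid.NewGuid();
await store.CreateAsync(id, JobStatus.Queued);
await store.CompleteAsync(id, "/reports/files/demo.pdf");
var saved = await store.GetAsync(id);
Console.WriteLine($"{saved!.Status} -> {saved.ResultUrl}");

// Assumed shapes: the article only shows how these types are used.
public enum JobStatus { Queued, Running, Completed, Failed }
public sealed record ReportRequest(string CustomerId);
public sealed record ReportJob(Guid Id, string CustomerId);
public sealed record JobRecord(
    Guid Id, JobStatus Status, string? ResultUrl = null, string? Error = null);

public interface IBackgroundQueue
{
    ValueTask EnqueueAsync(ReportJob job);
    IAsyncEnumerable<ReportJob> ReadAllAsync(CancellationToken ct);
}

public interface IJobStore
{
    Task CreateAsync(Guid id, JobStatus status);
    Task UpdateAsync(Guid id, JobStatus status);
    Task CompleteAsync(Guid id, string resultUrl);
    Task FailAsync(Guid id, string error);
    Task<JobRecord?> GetAsync(Guid id);
}

// Hypothetical store for local experiments only; data is lost on restart.
public sealed class InMemoryJobStore : IJobStore
{
    private readonly ConcurrentDictionary<Guid, JobRecord> _jobs = new();

    public Task CreateAsync(Guid id, JobStatus status)
    {
        _jobs[id] = new JobRecord(id, status);
        return Task.CompletedTask;
    }

    public Task UpdateAsync(Guid id, JobStatus status) =>
        Change(id, job => job with { Status = status });

    public Task CompleteAsync(Guid id, string resultUrl) =>
        Change(id, job => job with { Status = JobStatus.Completed, ResultUrl = resultUrl });

    public Task FailAsync(Guid id, string error) =>
        Change(id, job => job with { Status = JobStatus.Failed, Error = error });

    public Task<JobRecord?> GetAsync(Guid id) =>
        Task.FromResult<JobRecord?>(_jobs.TryGetValue(id, out var job) ? job : null);

    // Replaces the stored record for an existing job; unknown ids are an error.
    private Task Change(Guid id, Func<JobRecord, JobRecord> update)
    {
        _jobs.AddOrUpdate(
            id,
            _ => throw new KeyNotFoundException($"Job {id} does not exist."),
            (_, job) => update(job));
        return Task.CompletedTask;
    }
}
```

To use a store like this with the endpoint, it would be registered as a singleton, for example `builder.Services.AddSingleton<IJobStore, InMemoryJobStore>();`.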
Step 2: A queue inside your app with Channels
We need a place to hold jobs between "accepted" and "running". For work that lives inside a single app, .NET gives us a perfect tool: System.Threading.Channels. A channel is like a safe pipe. One side writes jobs in, the other side reads them out. It is async-friendly and fast.
We will use a bounded channel. Bounded means it has a maximum size. This gives us backpressure: if the queue is full, new work waits or is rejected instead of piling up forever and crashing the server.
```csharp
using System.Threading.Channels;

public sealed class BackgroundQueue : IBackgroundQueue
{
    // Bounded: holds at most 100 jobs at a time.
    private readonly Channel<ReportJob> _channel =
        Channel.CreateBounded<ReportJob>(new BoundedChannelOptions(100)
        {
            FullMode = BoundedChannelFullMode.Wait
        });

    // Completes immediately when there is room, otherwise waits for a free slot.
    public ValueTask EnqueueAsync(ReportJob job) =>
        _channel.Writer.WriteAsync(job);

    public IAsyncEnumerable<ReportJob> ReadAllAsync(CancellationToken ct) =>
        _channel.Reader.ReadAllAsync(ct);
}
```

The number 100 is the capacity. With FullMode = Wait, if 100 jobs are already waiting, the next enqueue pauses until there is room. This protects your memory during a traffic spike.
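To see the "waits or is rejected" behaviour of a bounded channel in isolation, here is a small console sketch. It uses a capacity of 2 instead of 100 and plain integers instead of ReportJob, both chosen only to keep the demo short.

```csharp
using System;
using System.Threading.Channels;
using System.Threading.Tasks;

// Tiny capacity so the "full" state is easy to reach.
var channel = Channel.CreateBounded<int>(new BoundedChannelOptions(2)
{
    FullMode = BoundedChannelFullMode.Wait
});

await channel.Writer.WriteAsync(1);
await channel.Writer.WriteAsync(2);

// The channel is full: TryWrite refuses instead of waiting.
bool acceptedWhenFull = channel.Writer.TryWrite(3);

// WriteAsync does not refuse; its task stays pending until a slot frees up.
Task pendingWrite = channel.Writer.WriteAsync(3).AsTask();
bool waitingForRoom = !pendingWrite.IsCompleted;

// Reading one item frees a slot, so the pending write can finish.
int first = await channel.Reader.ReadAsync();
await pendingWrite;

Console.WriteLine($"TryWrite when full: {acceptedWhenFull}");   // False
Console.WriteLine($"WriteAsync was waiting: {waitingForRoom}"); // True
Console.WriteLine($"First item: {first}, still queued: {channel.Reader.Count}"); // 1 and 2
```

One design choice follows from this. With WriteAsync, a full queue makes the POST request itself wait. If you would rather reject new work during a spike, one option is to call TryWrite in the endpoint and return an error status such as 429 or 503 when it comes back false.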
Why bounded queues protect you
- Spike: many requests arrive at once
- Bounded Queue: only N jobs held, rest wait
- Steady Work: worker drains at a safe rate
- Safe Server: memory stays under control
Step 3: A background worker that does the slow work
Now we need something that reads jobs from the channel and actually does them. In ASP.NET Core this is a BackgroundService, a class that runs quietly in the background for the whole life of the app.
```csharp
public sealed class ReportWorker(
    IBackgroundQueue queue,
    IJobStore jobs,
    IServiceProvider services) : BackgroundService
{
    protected override async Task ExecuteAsync(CancellationToken stoppingToken)
    {
        await foreach (var job in queue.ReadAllAsync(stoppingToken))
        {
            await jobs.UpdateAsync(job.Id, JobStatus.Running);
            try
            {
                // Scoped services (for example a DbContext) need a scope per job.
                await using var scope = services.CreateAsyncScope();
                var maker = scope.ServiceProvider.GetRequiredService<IReportMaker>();
                var url = await maker.BuildAsync(job, stoppingToken);
                await jobs.CompleteAsync(job.Id, url);
            }
            catch (OperationCanceledException) when (stoppingToken.IsCancellationRequested)
            {
                // The app is shutting down; this is not a job failure.
                break;
            }
            catch (Exception ex)
            {
                await jobs.FailAsync(job.Id, ex.Message);
            }
        }
    }
}
```

The worker loops for as long as the app runs. Each time a job arrives, it marks the job Running, does the heavy work, and then stores the result link. If something breaks, it marks the job Failed with a message. Shutdown is the one exception: a cancellation caused by the app stopping is not recorded as a failure, so that job stays marked Running in the store. Notice we create a scope per job so the worker can safely use scoped services like a database context.
Register both pieces in Program.cs:

```csharp
builder.Services.AddSingleton<IBackgroundQueue, BackgroundQueue>();
builder.Services.AddHostedService<ReportWorker>();
```

Step 4: Let the caller check progress
Returning 202 is only half the story. The caller now holds a job id but does not know when the work is done. So we add a status endpoint.
```csharp
app.MapGet("/jobs/{id:guid}", async (Guid id, IJobStore jobs) =>
{
    var job = await jobs.GetAsync(id);
    if (job is null)
        return Results.NotFound();

    return job.Status switch
    {
        JobStatus.Completed => Results.Redirect(job.ResultUrl!), // 302 to result
        JobStatus.Failed => Results.Problem(job.Error),
        _ => Results.Ok(new { id, status = job.Status.ToString() })
    };
});
```

The caller polls GET /jobs/{id} every few seconds. While the job is Queued or Running, it gets 200 OK with the status. When the job is Completed, it gets redirected to the finished result. Clean and simple.
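From the caller's side, polling can be sketched as a small loop. The example below is an illustration, not part of the API: PollAsync and FakeJobApi are hypothetical names, and the fake handler stands in for the real server (two "still working" replies, then a 302) so the sketch runs on its own.

```csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

using var client = new HttpClient(new FakeJobApi()) { BaseAddress = new Uri("http://localhost") };

Uri? result = await PollAsync(
    client, "/jobs/123", TimeSpan.FromMilliseconds(10), 10, CancellationToken.None);
Console.WriteLine(result);

// Polls the status URL until the server redirects to the result or attempts run out.
static async Task<Uri?> PollAsync(
    HttpClient http, string statusUrl, TimeSpan delay, int maxAttempts, CancellationToken ct)
{
    for (var attempt = 0; attempt < maxAttempts; attempt++)
    {
        using var response = await http.GetAsync(statusUrl, ct);

        // 302 means the job is done; Location points at the result.
        if (response.StatusCode == HttpStatusCode.Found)
            return response.Headers.Location;

        // 404 or a problem response: there is nothing more to wait for.
        response.EnsureSuccessStatusCode();

        await Task.Delay(delay, ct);
    }
    return null; // gave up waiting
}

// Hypothetical stand-in for the API: 200, 200, then 302 with a Location header.
sealed class FakeJobApi : HttpMessageHandler
{
    private int _calls;

    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken ct)
    {
        var done = ++_calls >= 3;
        var response = new HttpResponseMessage(done ? HttpStatusCode.Found : HttpStatusCode.OK);
        if (done)
            response.Headers.Location = new Uri("/reports/files/123.pdf", UriKind.Relative);
        return Task.FromResult(response);
    }
}
```

Against a real server, keep in mind that HttpClient's default handler follows redirects automatically, so the caller would see the final result response rather than the 302 unless AllowAutoRedirect is turned off on the handler.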
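On the server, a job follows a small state machine: it starts as Queued, moves to Running when the worker picks it up, and ends as either Completed or Failed.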
Polling vs pushing: how callers get updates
Polling is the easiest way for a caller to learn that a job is done, but it is not the only way. Here is how the common options compare.
| Method | How it works | Best for |
|---|---|---|
| Polling | Caller asks GET /jobs/{id} on a timer | Simple clients, public APIs |
| Webhook | Server calls the caller's URL when done | Server-to-server systems |
| SignalR / WebSocket | Server pushes a live message | Dashboards, live progress bars |
Start with polling. It works everywhere and needs no special setup. Move to webhooks or SignalR only when you truly need instant updates. Keep your first version boring and reliable.
When in-process is not enough
The channel-and-worker setup is great, but it has one weakness. If your server restarts, every job sitting in memory is lost. For a "generate report" feature that may be fine, because the caller can just ask again. For "charge the customer" it is not fine at all.
When jobs must survive restarts or run across many machines, you need a durable queue. This means the jobs are stored in a database, Redis, or a cloud queue, not just in memory. A popular .NET choice is Hangfire, which stores jobs in SQL Server, PostgreSQL, or Redis and can run workers on several servers at once.
With a shared store, you can add more API servers to accept work and more workers to process it, each scaling on its own. This is how you handle real load.
A quick honesty note about licensing, because it matters when you choose tools. Some well-known libraries changed their terms recently. MediatR and MassTransit have moved to commercial licensing for newer versions, so check the license before adding them to a company project. Hangfire has a free open-source core with a paid Pro edition for advanced features. Always read the license page first.
Here is a simple way to compare your options.
| Option | Survives restart? | Scales across machines? | Setup effort |
|---|---|---|---|
| Channel + BackgroundService | No | No | Very low |
| Hangfire + database | Yes | Yes | Medium |
| Cloud queue + workers | Yes | Yes | Medium to high |
Three rules that keep you safe
As your jobs grow, three habits will save you a lot of pain.
Make jobs idempotent. Idempotent is a big word for a simple idea: running the same job twice should not cause double trouble. Use the job id as an idempotency key. Before charging a customer, check "did I already finish job 42?" If yes, skip it. Networks retry, workers crash and restart, and the same job can arrive twice. Safe jobs survive that.
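A minimal sketch of that "did I already finish this job?" check is below. The in-memory dictionary and the RunOnceAsync helper are hypothetical; in a real system the record of finished jobs would live in durable storage next to the work itself.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

// Hypothetical record of finished job ids (in memory only for this demo).
var finished = new ConcurrentDictionary<Guid, bool>();
var charges = 0;

async Task RunOnceAsync(Guid id, Func<Task> work)
{
    // "Did I already finish this job?" If yes, skip it.
    if (finished.ContainsKey(id))
        return;

    await work();

    // Mark it finished only after the work succeeded.
    finished[id] = true;
}

var jobId = Guid.NewGuid();
Func<Task> chargeCustomer = () => { charges++; return Task.CompletedTask; };

await RunOnceAsync(jobId, chargeCustomer); // first delivery
await RunOnceAsync(jobId, chargeCustomer); // duplicate delivery, skipped

Console.WriteLine($"Charges made: {charges}"); // Charges made: 1
```

This check-then-run approach is only safe with a single worker. With several workers handling the same job at once, two of them could both pass the check, so you would need an atomic claim such as a unique database constraint on the job id.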
Always set a timeout and cancellation. A slow job should not run forever. Pass the CancellationToken into every async call so the work stops cleanly when the app shuts down or a deadline passes.
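One way to combine a deadline with shutdown is a linked CancellationTokenSource, sketched below. The 100 ms deadline and the Task.Delay standing in for real work are assumptions for the demo; in the worker, the shutdown token would be stoppingToken.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Stand-in for the app's shutdown token (stoppingToken in the worker).
using var shutdown = new CancellationTokenSource();

// Linked source: cancels on shutdown OR when the deadline passes.
using var deadline = CancellationTokenSource.CreateLinkedTokenSource(shutdown.Token);
deadline.CancelAfter(TimeSpan.FromMilliseconds(100));

string outcome;
try
{
    // Pretend job that would take 10 seconds; it observes the token.
    await Task.Delay(TimeSpan.FromSeconds(10), deadline.Token);
    outcome = "completed";
}
catch (OperationCanceledException) when (!shutdown.IsCancellationRequested)
{
    // The deadline fired, not shutdown: treat it as a failed job.
    outcome = "timed out";
}

Console.WriteLine(outcome); // timed out
```

The exception filter keeps the two cases apart: a timeout is handled as a job failure, while a cancellation caused by shutdown keeps propagating.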
Watch your queue depth. If the number of waiting jobs keeps climbing, your workers cannot keep up. That is an early warning. Track queue depth, failure rate, and job duration, and set an alert before users feel the slowdown.
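For the in-process channel, queue depth can be read straight from the reader. The sketch below also exposes it through System.Diagnostics.Metrics. The meter name "Reports.Jobs" and the instrument name "jobs.queue.depth" are made up for illustration, and string job names stand in for ReportJob.

```csharp
using System;
using System.Diagnostics.Metrics;
using System.Threading.Channels;

var channel = Channel.CreateBounded<string>(100);

// Hypothetical meter and instrument names; a metrics exporter would call
// the callback whenever it collects measurements.
using var meter = new Meter("Reports.Jobs");
meter.CreateObservableGauge<int>("jobs.queue.depth", () => channel.Reader.Count);

channel.Writer.TryWrite("job-1");
channel.Writer.TryWrite("job-2");
channel.Writer.TryWrite("job-3");
channel.Reader.TryRead(out _); // the worker takes one job

Console.WriteLine($"Waiting jobs: {channel.Reader.Count}"); // Waiting jobs: 2
```

An alert on this value climbing for several minutes is the early warning described above.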
A safe job, step by step
- Check idempotency: skip if already done
- Run with timeout: honor cancellation token
- Record result: store status and output
- Emit metrics: track depth and failures
Putting it together
Let us retell the whole journey in plain words, the sweet shop way.
A caller asks for slow work. Your API takes the order, writes a job record, drops it on a bounded queue, and hands back a token with 202 Accepted. Your server stays fast and free for everyone else. A background worker quietly takes the job, does the heavy lifting, and saves the result. The caller checks the status endpoint now and then. When the job is done, they collect the finished result.
That is the entire pattern. The same shape works whether you handle ten jobs a day or ten thousand. You only swap the in-memory channel for a durable store when jobs must not be lost.
Quick recap
- Never do slow work while the caller waits. It times out and blocks your server.
- Accept the request, save a job, and return HTTP 202 Accepted with a job id.
- Use System.Threading.Channels plus a BackgroundService for in-process work.
- Use a bounded channel for backpressure so spikes cannot flood your memory.
- Give callers a status endpoint to poll, like GET /jobs/{id}.
- Move to a durable queue (Hangfire, Redis, or a cloud queue) when jobs must survive restarts or scale across machines.
- Make jobs idempotent, add timeouts and cancellation, and monitor queue depth.
- Check tool licenses: MediatR and MassTransit are now commercially licensed for newer versions.
References and further reading
- Asynchronous Request-Reply Pattern (Azure Architecture Center)
- Background tasks with hosted services in ASP.NET Core (Microsoft Learn)
- System.Threading.Channels (Microsoft Learn)
- How to Scale Long-Running API Requests (Milan Jovanovic)
- Hangfire Documentation: Background Processing