Why does a simple outbox processor get slow at high volume?

A basic processor reads one small batch, publishes one message at a time, and marks them done one by one. At low volume this is fine. But as messages pile up, the query to find unprocessed rows scans more and more data, each round-trip to the broker is slow, and a single worker becomes a bottleneck. The fixes are a good index, larger batches, batched publishing, and more than one worker.

How do I run more than one outbox worker without sending the same message twice?

Use a claiming step so each worker grabs its own rows. In PostgreSQL you use SELECT ... FOR UPDATE SKIP LOCKED, which lets a worker lock a batch of rows and skip rows another worker already locked. Each worker then processes a different set of messages, so they never collide.

Does scaling the outbox break the at-least-once guarantee?

No. Scaling changes how fast and how parallel the processing is, not the delivery promise. You still get at-least-once delivery, which means a message can be sent more than once after a crash. Your consumers must stay idempotent, usually by pairing the outbox with the inbox pattern.

Should I add an outbox worker to every API instance?

Usually no. Tying a dispatcher to every web or API pod often backfires because too many readers fight over the same outbox table and create lock contention. It is better to run a small, fixed number of dedicated worker instances that you scale on purpose.

Scaling the Outbox Pattern in .NET: From Hundreds to Billions of Messages

Scale the Outbox Pattern in .NET to billions of messages a day with batching, indexes, SKIP LOCKED, and parallel workers — explained simply with diagrams.

13 min read · Updated April 30, 2026

A busy post office at festival time

Think about a small post office in your town. On a normal day, one clerk stands at the counter. People drop letters in the box, and the clerk calmly picks them up, stamps them, and sends them out. Everything flows.

Now imagine Diwali. Suddenly everyone is sending cards and gifts. The letter box overflows. That one clerk cannot keep up. Letters sit for hours. People get upset.

What does a smart post office do? A few simple things. They put up a clear sign so clerks find the right pile fast. They handle letters in bundles instead of one at a time. They open more counters with more clerks. And they make sure two clerks never grab the same bundle by mistake.

The Outbox Pattern is exactly like that post office. The basic version uses one clerk. It works well until the festival rush. Scaling the outbox means teaching the post office to handle the rush: better signs (indexes), bundles (batching), more counters (parallel workers), and rules so clerks never clash (row locking).

If you have not read the basic Outbox Pattern yet, start with Implementing the Outbox Pattern. This guide assumes you already have a working outbox and now want it to go fast.

Where we start: the simple outbox

A simple outbox has two halves. First, your app saves business data and a message into one database transaction. Second, a background worker reads unprocessed messages and publishes them to a broker.

The basic outbox: one worker reads and publishes one batch at a time.

Here is the kind of worker most people write first. It is correct, but it is slow under load.

public class SimpleOutboxProcessor(AppDbContext db, IBus bus)
{
    public async Task ProcessAsync(CancellationToken ct)
    {
        // Read a small batch of unprocessed messages.
        var messages = await db.OutboxMessages
            .Where(m => m.ProcessedOnUtc == null)
            .OrderBy(m => m.OccurredOnUtc)
            .Take(20)
            .ToListAsync(ct);
 
        foreach (var message in messages)
        {
            // Publish one message, wait, then move on.
            await bus.Publish(message.ToEvent(), ct);
 
            message.ProcessedOnUtc = DateTime.UtcNow;
            await db.SaveChangesAsync(ct); // one save per message
        }
    }
}

Look closely and you can already see three slow spots. The Where scans the table. The foreach publishes one message at a time and waits for each one. And it saves to the database once per message. At 50 messages a day, none of this matters. At 50 million, all of it matters.

Why it slows down

Let me name the bottlenecks clearly, because each one has a clean fix.

Bottleneck	What goes wrong	The fix
Slow query	Finding unprocessed rows scans more and more data as the table grows	A filtered, covering index
One-by-one publish	Each broker call has network latency; doing them in series wastes time	Publish in batches
One save per message	Many tiny database writes add up	One save per batch
Single worker	One thread can only do so much	Multiple workers in parallel
Workers colliding	Two workers grab the same rows and double-publish	Row locking with SKIP LOCKED

We will fix them one at a time, from cheapest to boldest. You do not need all of them on day one. Apply the first few, measure, and only add more if you still need more speed.

The scaling ladder

1. Index: make the read query fast
2. Batch read: take bigger chunks
3. Batch publish: send many at once
4. Parallel workers: more clerks at the counter
5. SKIP LOCKED: stop workers clashing

Climb only as high as your message volume needs.

Fix 1: A good index (the clear sign)

Your worker always asks the same question: "Give me the oldest messages that are not processed yet." If the database has to scan the whole table to answer that, it gets slower every day as the table grows.

The fix is a filtered, covering index. Filtered means it only includes rows that still need work. Covering means the index already holds the columns the query needs, so the database does not have to jump back to the main table.

-- PostgreSQL: index only the rows that still need processing.
CREATE INDEX ix_outbox_unprocessed
ON outbox_messages (occurred_on_utc)
INCLUDE (id, type, content)
WHERE processed_on_utc IS NULL;

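Because of the WHERE processed_on_utc IS NULL clause, the index stays small. It only tracks the backlog, not the millions of messages you already sent. Once a message is marked processed, it drops out of the index. This single change often gives the biggest win for the least effort. One caution: INCLUDE copies content into the index, and PostgreSQL caps the size of a B-tree index entry (about 2.7 kB with the default 8 kB page), so leave content out of the INCLUDE list if your payloads can be large.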

A quick warning. If your outbox table keeps every message forever, it will grow without limit. Add a cleanup job that deletes or archives old processed messages. A small, tidy table is a fast table.
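The cleanup job can be sketched as a loop that deletes old processed rows in small chunks, so no single delete holds locks for long. This is a minimal sketch under stated assumptions: OutboxCleanup, PurgeAsync, and the deleteChunk delegate are hypothetical names, the 7-day retention is only an example, and the in-memory list stands in for the real table so the snippet runs on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Demo data: 2,500 rows processed a month ago, 500 rows processed an hour ago.
var now = new DateTime(2026, 1, 31, 0, 0, 0, DateTimeKind.Utc);
var table = new List<DateTime>();
table.AddRange(Enumerable.Range(0, 2500).Select(i => now.AddDays(-30).AddMinutes(i)));
table.AddRange(Enumerable.Range(0, 500).Select(i => now.AddHours(-1).AddSeconds(i)));

// Stand-in for the database call. Against PostgreSQL, one chunk could be:
//   DELETE FROM outbox_messages
//   WHERE id IN (SELECT id FROM outbox_messages
//                WHERE processed_on_utc < @cutoff
//                LIMIT @limit);
Task<int> DeleteChunkInMemory(DateTime cutoff, int limit, CancellationToken ct)
{
    var doomed = table.Where(t => t < cutoff).Take(limit).ToList();
    foreach (var t in doomed) table.Remove(t);
    return Task.FromResult(doomed.Count);
}

int purged = await OutboxCleanup.PurgeAsync(
    DeleteChunkInMemory, cutoffUtc: now.AddDays(-7), chunkSize: 1000);

Console.WriteLine($"purged={purged}, remaining={table.Count}");
// prints "purged=2500, remaining=500"

public static class OutboxCleanup
{
    // Deletes processed rows older than cutoffUtc, one chunk at a time.
    // Returns the total number of rows removed.
    public static async Task<int> PurgeAsync(
        Func<DateTime, int, CancellationToken, Task<int>> deleteChunk,
        DateTime cutoffUtc,
        int chunkSize,
        CancellationToken ct = default)
    {
        int total = 0;
        while (!ct.IsCancellationRequested)
        {
            int deleted = await deleteChunk(cutoffUtc, chunkSize, ct);
            total += deleted;

            // A short chunk means nothing older than the cutoff is left.
            if (deleted < chunkSize)
                break;
        }
        return total;
    }
}
```

Run it on a schedule (for example, once an hour) from the same dedicated worker deployment, and pick the retention period to match how long you need processed messages for debugging.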

Fix 2: Bigger batches (handle bundles, not single letters)

Reading 20 rows at a time means many round-trips. Reading 1,000 rows at a time means fewer round-trips and more useful work per trip. Increasing the batch size is a simple lever.

But do not make the batch enormous. A batch of 100,000 inside one transaction holds locks for a long time, uses lots of memory, and delays cleanup (in PostgreSQL, a long-running transaction keeps vacuum from reclaiming dead rows). A long-running transaction is its own kind of trouble. Pick a size in the hundreds to low thousands and tune by measuring.

const int BatchSize = 1000;
 
var messages = await db.OutboxMessages
    .Where(m => m.ProcessedOnUtc == null)
    .OrderBy(m => m.OccurredOnUtc)
    .Take(BatchSize)
    .ToListAsync(ct);

Fix 3: Publish in batches and save once

Publishing one message, waiting, then publishing the next wastes time on network latency. Most brokers and client libraries let you publish many messages together. And instead of saving the database once per message, mark the whole batch done in a single write.

public async Task<int> ProcessBatchAsync(CancellationToken ct)
{
    var messages = await db.OutboxMessages
        .Where(m => m.ProcessedOnUtc == null)
        .OrderBy(m => m.OccurredOnUtc)
        .Take(1000)
        .ToListAsync(ct);

    if (messages.Count == 0)
        return 0;

    // Publish the whole batch together, not one at a time.
    var events = messages.Select(m => m.ToEvent()).ToList();
    await bus.PublishBatch(events, ct);

    // Mark them all done in one round-trip.
    var now = DateTime.UtcNow;
    foreach (var message in messages)
        message.ProcessedOnUtc = now;

    await db.SaveChangesAsync(ct); // one save for the whole batch

    // Report how many messages were handled so the caller can back off when idle.
    return messages.Count;
}

This is the same logic as before, but it spends far less time waiting. Fewer round-trips to the broker and one save instead of a thousand.
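To see why the round-trips matter, here is a rough back-of-the-envelope sketch. The latencies (2 ms per broker call, 1 ms per database save) are assumptions for illustration, not measurements, and RoundTripCost is a hypothetical helper. The model also pretends a batch call costs the same as a single call, which flatters batching a little; real batches cost somewhat more per call.

```csharp
using System;

// Assumed, illustrative latencies: measure your own broker and database.
const int PublishMs = 2; // one broker round-trip (assumption)
const int SaveMs = 1;    // one database write round-trip (assumption)

int oneByOne = RoundTripCost.EstimateMs(
    messages: 1000, batchSize: 1, publishMs: PublishMs, saveMs: SaveMs);
int batched = RoundTripCost.EstimateMs(
    messages: 1000, batchSize: 1000, publishMs: PublishMs, saveMs: SaveMs);

Console.WriteLine($"one-by-one: {oneByOne} ms, batched: {batched} ms");
// prints "one-by-one: 3000 ms, batched: 3 ms"

public static class RoundTripCost
{
    // Time spent waiting if every batch costs one publish round-trip
    // plus one save round-trip.
    public static int EstimateMs(int messages, int batchSize, int publishMs, int saveMs)
    {
        int batches = (messages + batchSize - 1) / batchSize; // ceiling division
        return batches * (publishMs + saveMs);
    }
}
```

The exact numbers will differ on your system, but the shape holds: waiting time grows with the number of round-trips, not with the number of messages.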

Fix 4: More workers (open more counters)

A single worker is still one thread doing one batch at a time. To go faster, run several workers at once. This is the Competing Consumers pattern, as described in the Azure Architecture Center: many workers pull from the same source so the total throughput goes up.

But here is the trap. If two workers both run the same query, they both read the same unprocessed rows. They both publish them. Now every message goes out twice as a matter of routine, not just after a rare crash, which is worse than just slow. We need a way for each worker to claim its own private set of rows.

Many workers competing for messages — but they must not grab the same rows.

Fix 5: SKIP LOCKED (no two clerks on the same bundle)

PostgreSQL has a beautiful tool for this: FOR UPDATE SKIP LOCKED. When a worker runs a select with these words, it locks the rows it reads, and any other worker that runs the same query at the same time skips the locked rows and picks the next free ones.

So Worker 1 grabs rows 1 to 1000. Worker 2, running at the same instant, sees those are locked, skips them, and grabs rows 1001 to 2000. They never fight. They never double-publish. Each does honest, separate work.

How SKIP LOCKED hands each worker a different bundle of rows.

In EF Core you reach for raw SQL here, because SKIP LOCKED is a database feature, not something LINQ expresses directly.

public async Task<List<OutboxMessage>> ClaimBatchAsync(int size, CancellationToken ct)
{
    // Each worker locks its own rows; others skip them.
    return await db.OutboxMessages
        .FromSqlRaw(
            """
            SELECT * FROM outbox_messages
            WHERE processed_on_utc IS NULL
            ORDER BY occurred_on_utc
            LIMIT {0}
            FOR UPDATE SKIP LOCKED
            """, size)
        .ToListAsync(ct);
}

You run this select inside a transaction. The lock is held until you commit. So the flow is: open a transaction, claim a batch with SKIP LOCKED, publish the batch, mark the rows done, commit. When you commit, the locks release and the rows are now marked processed, so no one ever touches them again.
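The flow above can be tried out without a database. The sketch below is a simulation, not the real thing: FakeOutboxTable is a hypothetical in-memory stand-in whose ClaimBatch mimics what FOR UPDATE SKIP LOCKED does (take the oldest rows nobody else holds) and whose MarkDoneAndCommit mimics marking rows processed and committing. With EF Core, the matching calls would typically be db.Database.BeginTransactionAsync, the raw SQL claim, SaveChangesAsync, and CommitAsync on the transaction.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

var table = new FakeOutboxTable(rowCount: 10_000);
var published = new ConcurrentBag<int>();

// Four workers run the same loop at the same time.
var workers = Enumerable.Range(0, 4).Select(_ => Task.Run(async () =>
{
    while (true)
    {
        var batch = table.ClaimBatch(500);            // begin tx + claim
        if (batch.Count == 0) break;                  // nothing free to claim

        foreach (var id in batch) published.Add(id);  // publish (simulated)
        await Task.Yield();                           // let other workers interleave

        table.MarkDoneAndCommit(batch);               // mark done + commit
    }
})).ToArray();

await Task.WhenAll(workers);

Console.WriteLine(
    $"published={published.Count}, distinct={published.Distinct().Count()}");
// prints "published=10000, distinct=10000"

// Hypothetical in-memory model of the outbox table and its row locks.
public sealed class FakeOutboxTable
{
    private readonly object _gate = new();
    private readonly SortedSet<int> _unprocessed;
    private readonly HashSet<int> _locked = new();

    public FakeOutboxTable(int rowCount) =>
        _unprocessed = new SortedSet<int>(Enumerable.Range(1, rowCount));

    // Like SELECT ... FOR UPDATE SKIP LOCKED: take the oldest unprocessed
    // rows that no other worker currently holds, and lock them.
    public List<int> ClaimBatch(int size)
    {
        lock (_gate)
        {
            var batch = _unprocessed
                .Where(id => !_locked.Contains(id))
                .Take(size)
                .ToList();
            foreach (var id in batch) _locked.Add(id);
            return batch;
        }
    }

    // Like setting processed_on_utc and committing: the rows leave the
    // backlog and their locks are released.
    public void MarkDoneAndCommit(IEnumerable<int> batch)
    {
        lock (_gate)
        {
            foreach (var id in batch)
            {
                _unprocessed.Remove(id);
                _locked.Remove(id);
            }
        }
    }
}
```

Every row is published exactly once here because nothing fails. In the real loop, a crash after publishing but before the commit rolls the transaction back, the rows become claimable again, and they are published a second time. That is the at-least-once guarantee at work, and it is why consumers must tolerate duplicates.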

One worker's safe loop

1. Begin tx: start a transaction
2. Claim batch: FOR UPDATE SKIP LOCKED
3. Publish: send the batch to the broker
4. Mark done: set processed_on_utc
5. Commit: release the locks

Each worker repeats this loop forever, never clashing with the others.

A word on ordering

There is a price for going parallel. With several workers each grabbing different batches, messages can arrive at the broker out of order. Worker 2 might finish before Worker 1. For many systems this is fine. An email service does not care which order two unrelated emails go out.

But sometimes order matters within a group. For example, all events for one order should arrive in sequence. The clean fix is not to force the whole outbox into one slow line. Instead:

Give related messages a partition key (like the order id) and keep messages with the same key on the same worker or the same broker partition. Kafka does this naturally with partitions.
On the consumer side, use the inbox pattern to buffer and reorder, and make consumers idempotent so duplicates are harmless.

This way you keep most of the speed of parallel processing and still protect the few places where order truly matters.

Choice	You get	You give up
Single worker, ordered	Strict global order	Throughput; one worker is a ceiling
Parallel, no key	Highest throughput	Global ordering
Parallel, partition key	High throughput, per-key order	Slightly more setup
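One way to keep messages with the same key on the same worker is to hash the partition key and take the remainder by the worker count. The sketch below assumes each outbox row carries a partition key (a hypothetical column, such as the order id) and that each worker only claims rows that map to its own index. OutboxPartitioner is a hypothetical helper. It uses the FNV-1a hash because string.GetHashCode() is randomized per process in modern .NET and would send the same key to different workers after a restart.

```csharp
using System;
using System.Text;

string[] keys = { "order-1001", "order-1002", "order-1001" };
foreach (var key in keys)
{
    // The same key always lands on the same worker index.
    Console.WriteLine($"{key} -> worker {OutboxPartitioner.WorkerFor(key, workerCount: 4)}");
}

public static class OutboxPartitioner
{
    // FNV-1a (32-bit): a small hash that gives the same result on every
    // machine and in every process.
    public static uint StableHash(string key)
    {
        const uint offsetBasis = 2166136261;
        const uint prime = 16777619;

        uint hash = offsetBasis;
        foreach (byte b in Encoding.UTF8.GetBytes(key))
        {
            unchecked
            {
                hash ^= b;
                hash *= prime; // wraps around on overflow by design
            }
        }
        return hash;
    }

    // Maps a partition key to a worker index in the range [0, workerCount).
    public static int WorkerFor(string partitionKey, int workerCount)
    {
        if (workerCount <= 0)
            throw new ArgumentOutOfRangeException(nameof(workerCount));

        return (int)(StableHash(partitionKey) % (uint)workerCount);
    }
}
```

Keep in mind that changing the worker count remaps most keys to different workers, so it is safest to drain the backlog before resizing. If your broker already partitions by key, as Kafka does, passing the same key to the broker is usually the simpler route.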

Putting it together: the worker host

Here is how the dedicated worker process ties the loop together with several workers and a polling delay. Keep this as a small, separate deployment — not bolted onto every API pod.

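using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;
using Microsoft.Extensions.Logging;

public class OutboxBackgroundService(
    IServiceProvider services,
    ILogger<OutboxBackgroundService> logger) : BackgroundService
{
    private const int Workers = 4; // a small, fixed number you scale on purpose

    protected override async Task ExecuteAsync(CancellationToken ct)
    {
        var loops = Enumerable.Range(0, Workers)
            .Select(_ => RunWorkerLoop(ct));

        await Task.WhenAll(loops);
    }

    private async Task RunWorkerLoop(CancellationToken ct)
    {
        while (!ct.IsCancellationRequested)
        {
            try
            {
                using var scope = services.CreateScope();
                var processor = scope.ServiceProvider
                    .GetRequiredService<OutboxProcessor>();

                int handled = await processor.ProcessBatchAsync(ct);

                // If the batch was empty, wait a bit before polling again.
                if (handled == 0)
                    await Task.Delay(TimeSpan.FromSeconds(1), ct);
            }
            catch (OperationCanceledException) when (ct.IsCancellationRequested)
            {
                break; // normal shutdown
            }
            catch (Exception ex)
            {
                // One failed batch must not silently kill this worker loop.
                logger.LogError(ex, "Outbox batch failed; retrying shortly.");
                await Task.Delay(TimeSpan.FromSeconds(5), ct);
            }
        }
    }
}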

Notice the small idle delay when there is nothing to do. Without it, an empty outbox would make your workers spin in a tight loop and waste database trips. When work exists, the loop keeps pulling batches as fast as it can.

How far can this go?

With these pieces (a filtered covering index, batch reads, batched publishing, a handful of workers, and SKIP LOCKED), a single outbox can reach tens of thousands of messages per second. Milan Jovanović documents reaching past 30,000 messages per second with exactly this toolkit. Sustained over the 86,400 seconds in a day, that is roughly 2.6 billion messages. You almost certainly do not need that much. The point is that the same five fixes carry you from a few hundred to a few billion without changing the core idea.

Things that quietly bite you

A few honest warnings from production.

Do not put a worker in every API pod. Too many readers create lock contention on the outbox table and can slow everyone down. Use a small, dedicated worker deployment.
Always clean up old rows. A table that grows forever gets slow even with a perfect index. Archive or delete processed messages on a schedule.
Keep batches reasonable. Huge batches hold locks too long and block cleanup. Hundreds to low thousands is the sweet zone.
Stay idempotent at the consumer. Scaling does not change the at-least-once promise. Duplicates will happen on retries and crashes. Pair with the inbox pattern and the idempotent consumer pattern.
Watch the backlog metric. Track how many unprocessed rows exist. If it keeps growing, you need more workers, bigger batches, or a faster broker — and you want to know before customers do.
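The "stay idempotent" warning can be sketched in a few lines. This is a minimal, in-memory illustration: IdempotentConsumer is a hypothetical class, and the HashSet stands in for an inbox table. In a real system you would record the message id in the database, in the same transaction as the handler's own changes.

```csharp
using System;
using System.Collections.Generic;

var consumer = new IdempotentConsumer();
var messageId = Guid.NewGuid();

// The broker delivers the same message twice (for example, after a worker crash).
consumer.Handle(messageId, () => Console.WriteLine("charging customer"));
consumer.Handle(messageId, () => Console.WriteLine("charging customer"));

Console.WriteLine($"handled={consumer.HandledCount}");
// prints "charging customer" once, then "handled=1"

public sealed class IdempotentConsumer
{
    // Ids of messages that were already handled (stand-in for an inbox table).
    private readonly HashSet<Guid> _seen = new();

    public int HandledCount { get; private set; }

    // Runs the handler only the first time a message id is seen.
    // Returns false when the message is a duplicate and was skipped.
    public bool Handle(Guid messageId, Action handler)
    {
        if (_seen.Contains(messageId))
            return false;

        handler();

        // Record the id only after the handler succeeds, so a failed
        // attempt can be retried.
        _seen.Add(messageId);
        HandledCount++;
        return true;
    }
}
```

The important design choice is the message id: it must be assigned once, when the outbox row is written, so that every redelivery carries the same id.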

A note on libraries and licensing

You do not have to hand-build all of this. Several .NET libraries ship outbox support. MassTransit, which includes a transactional outbox, moved to commercial licensing in its newer versions, and so did MediatR, which many teams use alongside an outbox. Check the license terms and pricing before you adopt them for a production system. The open-source CAP library from the .NET Foundation provides an outbox-style event bus and remains free. Whatever you pick, the scaling ideas in this guide still apply underneath: the library is just doing the index, batching, and locking work for you.

Quick recap

The basic outbox uses one worker. It is correct but slows down under heavy load, like one clerk at a festival.
Index first. A filtered, covering index on unprocessed rows is the biggest, cheapest win.
Batch reads and publishes. Bigger chunks and grouped broker calls cut wasted waiting. Save the whole batch in one write.
Run several workers. Competing consumers raise total throughput.
Use FOR UPDATE SKIP LOCKED so each worker claims its own rows and no message goes out twice by accident.
Mind ordering. Parallel work can reorder messages. Use partition keys for per-group order and keep consumers idempotent.
Operate it well. Dedicated workers, cleanup jobs, sane batch sizes, and backlog monitoring keep it healthy at scale.
These five fixes take you from hundreds of messages to billions without changing the core pattern.


Related Patterns

The Outbox Pattern in .NET: Never Lose a Message Again

The Inbox Pattern in .NET: Handle Each Message Exactly Once

Idempotent Consumer: Handling Duplicate Messages in .NET
