How to Set Up Production-Ready Monitoring With ASP.NET Core Health Checks
A friendly, step-by-step guide to production-ready monitoring with ASP.NET Core health checks: liveness, readiness, dependency checks, a UI, and probes.
A night watchman doing his rounds
Picture a big housing society in your city. At night, one watchman walks around with a torch. He does not inspect every single flat. He does a quick round: is the main gate locked, is the water pump running, are the stairwell lights on, is the lift working? In two minutes he knows if the society is fine, if something small is off, or if there is a real problem that needs the manager right now.
A health check in ASP.NET Core is that night watchman, but for your web app. Instead of a person, a robot does the round — usually Kubernetes, a load balancer, or an uptime monitor. Every few seconds it visits a special URL and asks one simple question: "Are you okay?"
Your app runs a few quick checks — can it reach the database, is Redis awake, is there enough disk space — and answers with one of three words: Healthy, Degraded, or Unhealthy. The robot then decides what to do next: keep sending people to your app, slow down, or restart it.
A single /health URL is a good start, but it is not enough for real production. The watchman needs a proper routine, a logbook, and a way to call the manager. This guide shows you how to build that full routine step by step, in plain language.
What "production-ready" really adds
Many tutorials stop after one endpoint. Real systems need more. Here is the difference between a toy setup and a production setup.
| Concern | Toy setup | Production-ready setup |
|---|---|---|
| Endpoints | One /health URL | Separate liveness and readiness URLs |
| Checks | Always returns "OK" | Real database, cache, and disk checks |
| Speed | No timeout | Per-check timeout, fast liveness |
| History | None | Stored results you can look back at |
| Safety | Public detailed output | Open probes, protected details |
| Alerts | Someone notices later | Automatic alert when red |
The goal is honesty. A health check that always says "Healthy" is worse than none, because it gives false comfort. Your checks must tell the truth even when it hurts.
Step 1: Add the built-in health check service
The core health check API ships inside ASP.NET Core, so you do not need any extra package to begin. You register the service and map an endpoint.
var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks();
var app = builder.Build();
app.MapHealthChecks("/healthz");
app.Run();
Now visit /healthz in a browser. With no checks added yet, it returns the plain text Healthy and HTTP status 200. That is the watchman saying "I exist." Useful, but he has not looked at anything yet.
The three possible results are worth remembering. They map to HTTP status codes by default like this.
| Status | Meaning | Default HTTP code |
|---|---|---|
| Healthy | Everything works | 200 |
| Degraded | Works, but not great | 200 |
| Unhealthy | A required part failed | 503 |
Notice that Degraded still returns 200. That is on purpose. A degraded app should keep serving people while you investigate. Only Unhealthy pulls it out of rotation.
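The default mapping in the table can be changed per endpoint through HealthCheckOptions.ResultStatusCodes. The following is a minimal sketch, assuming a web project with implicit usings and assuming one consumer wants Degraded treated as a failure; the /healthz/strict path is a made-up name for illustration.

```csharp
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks();
var app = builder.Build();

// Default mapping: Healthy and Degraded answer 200, Unhealthy answers 503.
app.MapHealthChecks("/healthz");

// Hypothetical stricter endpoint: Degraded also answers 503.
app.MapHealthChecks("/healthz/strict", new HealthCheckOptions
{
    ResultStatusCodes =
    {
        [HealthStatus.Healthy] = StatusCodes.Status200OK,
        [HealthStatus.Degraded] = StatusCodes.Status503ServiceUnavailable,
        [HealthStatus.Unhealthy] = StatusCodes.Status503ServiceUnavailable
    }
});

app.Run();
```

For most apps the default mapping is the better choice, since it keeps a degraded instance in rotation while you investigate.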
Step 2: Split liveness from readiness
This is the most important idea in the whole guide. Two questions sound similar but are very different.
- Liveness: "Is the app alive, or is it stuck and needs a restart?"
- Readiness: "Is the app ready to take requests right now?"
An app can be alive but not ready. Think of a shop where the shutter is up and the lights are on (alive), but the staff are still counting the cash register and have not opened the counter yet (not ready). You would not send customers in yet, but you also would not knock the building down.
Kubernetes treats these very differently. If liveness fails, it restarts the pod. If readiness fails, it stops sending traffic but leaves the pod running so it can recover.
Liveness vs readiness decisions
- Live fails: restart the pod
- Ready fails: stop traffic, keep the pod
- Both pass: send traffic normally
We separate them using tags. Every check gets a tag like live or ready, and each endpoint runs only the checks with the matching tag.
using Microsoft.AspNetCore.Diagnostics.HealthChecks; // HealthCheckOptions
using Microsoft.Extensions.Diagnostics.HealthChecks; // HealthCheckResult
builder.Services.AddHealthChecks()
    // A tiny check that proves the app loop is alive.
    .AddCheck("self", () => HealthCheckResult.Healthy(), tags: ["live"])
    // A placeholder readiness check; Step 3 adds real dependency checks under the same tag.
    .AddCheck("ready-gate", () => HealthCheckResult.Healthy(), tags: ["ready"]);
app.MapHealthChecks("/healthz/live", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("live")
});
app.MapHealthChecks("/healthz/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("ready")
});
The Predicate is the filter. The live endpoint runs only live-tagged checks, so it stays tiny and fast. The ready endpoint runs the heavier dependency checks. Keep liveness almost empty, often just a "self" check that returns Healthy. If liveness does a database query and the database hiccups, Kubernetes can restart a perfectly fine pod for no reason.
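The "alive but not ready" state can be made concrete with a readiness check that stays Unhealthy until warm-up code flips a flag. This is a sketch under assumptions: the class name StartupGateHealthCheck and its StartupCompleted property are illustrative, not part of the framework.

```csharp
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical readiness gate: reports Unhealthy until warm-up code sets the flag.
public sealed class StartupGateHealthCheck : IHealthCheck
{
    // volatile so a write from the warm-up thread is visible to the probe thread.
    private volatile bool _isReady;

    public bool StartupCompleted
    {
        get => _isReady;
        set => _isReady = value;
    }

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        return Task.FromResult(_isReady
            ? HealthCheckResult.Healthy("Warm-up finished.")
            : HealthCheckResult.Unhealthy("Warm-up still running."));
    }
}
```

To use it, register the class as a singleton with builder.Services.AddSingleton<StartupGateHealthCheck>(), add it with .AddCheck<StartupGateHealthCheck>("ready-gate", tags: ["ready"]), and set StartupCompleted = true from whatever code finishes your warm-up. The liveness endpoint is unaffected, so the pod is not restarted while it warms up.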
Step 3: Check your real dependencies
A health check that only returns Healthy is dishonest. The watchman must actually look at the water pump. For common dependencies, the community Xabaril project gives you ready-made checks so you do not write the same code again. Install the packages you need.
dotnet add package AspNetCore.HealthChecks.SqlServer
dotnet add package AspNetCore.HealthChecks.Npgsql
dotnet add package AspNetCore.HealthChecks.Redis
Then wire them up. Give each one the ready tag so it runs on the readiness endpoint, not on liveness.
var sql = builder.Configuration.GetConnectionString("Sql")!;
var redis = builder.Configuration.GetConnectionString("Redis")!;
builder.Services.AddHealthChecks()
    .AddCheck("self", () => HealthCheckResult.Healthy(), tags: ["live"])
    .AddSqlServer(
        connectionString: sql,
        name: "sql-database",
        tags: ["ready"],
        timeout: TimeSpan.FromSeconds(3))
    .AddRedis(
        redisConnectionString: redis,
        name: "redis-cache",
        tags: ["ready"],
        timeout: TimeSpan.FromSeconds(2));
Two things matter here. First, every dependency check has a timeout. Without one, a frozen database could make your readiness response hang forever, which is as bad as the app being down. Second, the checks are tagged ready, so a slow cache will only stop new traffic, not trigger a restart.
Writing your own custom check
Sometimes you need to check something specific, like whether a downstream payment API answers. You write a small class that implements IHealthCheck.
public sealed class PaymentApiHealthCheck(IHttpClientFactory factory) : IHealthCheck
{
    public async Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        try
        {
            var client = factory.CreateClient("payments");
            using var response = await client.GetAsync("/ping", cancellationToken);
            if (response.IsSuccessStatusCode)
                return HealthCheckResult.Healthy("Payment API responded.");
            // Still up, but the dependency is unhappy: warn, do not kill.
            return HealthCheckResult.Degraded(
                $"Payment API returned {(int)response.StatusCode}.");
        }
        catch (Exception ex)
        {
            return HealthCheckResult.Unhealthy("Payment API unreachable.", ex);
        }
    }
}
Register it with a tag so it joins the readiness round.
builder.Services.AddHealthChecks()
    .AddCheck<PaymentApiHealthCheck>("payment-api", tags: ["ready"]);
Notice the three return paths. A clean success is Healthy. An odd status code is Degraded: the app still works, but you want to see the warning. A thrown exception is Unhealthy. Choosing the right level is the real skill. Mark something Unhealthy only if the app truly cannot do its job without it.
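The check above calls factory.CreateClient("payments") and requests the relative path /ping, so it only works if a named client with a base address is registered. A possible registration is sketched below; the base address is a placeholder, the timeout values are assumptions, and PaymentApiHealthCheck refers to the class defined in this section.

```csharp
var builder = WebApplication.CreateBuilder(args);

// Placeholder base address: "/ping" in the check is resolved against it.
builder.Services.AddHttpClient("payments", client =>
{
    client.BaseAddress = new Uri("https://payments.example.internal");
    // Keep the HTTP call itself short so the check cannot stall the readiness response.
    client.Timeout = TimeSpan.FromSeconds(2);
});

builder.Services.AddHealthChecks()
    .AddCheck<PaymentApiHealthCheck>(
        "payment-api",
        tags: ["ready"],
        // Upper bound enforced by the health check framework for this one check.
        timeout: TimeSpan.FromSeconds(3));
```

Setting both timeouts is a deliberate belt-and-braces choice: the client timeout bounds the HTTP request, and the check timeout bounds everything the check does.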
Choosing the right status
- Works fully: return Healthy
- Works partly: return Degraded
- Cannot work: return Unhealthy
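A disk space check is a simple way to see all three levels in one place. The sketch below is illustrative only: the class name and the 5 GiB and 1 GiB thresholds are assumptions, and you would tune them to your own workload.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Hypothetical check: name and thresholds are illustrative, not from a library.
public sealed class DiskSpaceHealthCheck : IHealthCheck
{
    private const long DegradedBelowBytes = 5L * 1024 * 1024 * 1024;  // 5 GiB
    private const long UnhealthyBelowBytes = 1L * 1024 * 1024 * 1024; // 1 GiB

    // Pure mapping from free space to a status, kept separate so it is easy to test.
    public static HealthStatus Classify(long freeBytes) =>
        freeBytes < UnhealthyBelowBytes ? HealthStatus.Unhealthy
        : freeBytes < DegradedBelowBytes ? HealthStatus.Degraded
        : HealthStatus.Healthy;

    public Task<HealthCheckResult> CheckHealthAsync(
        HealthCheckContext context,
        CancellationToken cancellationToken = default)
    {
        // Inspect the drive that holds the application files.
        var root = Path.GetPathRoot(AppContext.BaseDirectory) ?? "/";
        var freeBytes = new DriveInfo(root).AvailableFreeSpace;
        var description = $"{freeBytes / (1024 * 1024)} MiB free on {root}";
        return Task.FromResult(new HealthCheckResult(Classify(freeBytes), description));
    }
}
```

It can be registered like any other custom check, for example with .AddCheck<DiskSpaceHealthCheck>("disk-space", tags: ["ready"]).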
Step 4: Return useful JSON, but keep it safe
The default response is the single word Healthy. Robots are happy with that, but humans debugging an incident want detail: which check failed, how long it took, and why. You can shape the response with a custom writer.
app.MapHealthChecks("/healthz/ready", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("ready"),
    ResponseWriter = async (context, report) =>
    {
        context.Response.ContentType = "application/json";
        var payload = new
        {
            status = report.Status.ToString(),
            totalDurationMs = report.TotalDuration.TotalMilliseconds,
            checks = report.Entries.Select(e => new
            {
                name = e.Key,
                status = e.Value.Status.ToString(),
                durationMs = e.Value.Duration.TotalMilliseconds,
                description = e.Value.Description
            })
        };
        await context.Response.WriteAsJsonAsync(payload);
    }
});
Now the readiness URL returns a tidy JSON object that a person can read during an incident. But here is the safety rule: this detailed output is also useful to attackers. It can leak server names, dependency versions, and your internal structure.
So follow this pattern:
- Keep /healthz/live and /healthz/ready simple and open. Kubernetes HTTP probes send no credentials unless you configure custom headers, so locking these endpoints down tends to break the probes.
- Put the rich JSON and any dashboard on a separate, protected endpoint: behind authorization, or on an internal-only port that the public cannot reach.
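One way to apply this pattern is to leave the probe endpoint bare and serve the detailed report only on a separate port. The sketch below assumes a web project with implicit usings; the /healthz/details path and port 5001 are made-up choices, and the app must actually be configured to listen on that port (for example through ASPNETCORE_URLS).

```csharp
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks()
    .AddCheck("self", () => HealthCheckResult.Healthy(), tags: ["live"]);

var app = builder.Build();

// Open, minimal probe endpoint for Kubernetes.
app.MapHealthChecks("/healthz/live", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("live")
});

// Detailed report, matched only for requests that arrive on port 5001 (assumed internal).
app.MapHealthChecks("/healthz/details", new HealthCheckOptions
{
    ResponseWriter = async (context, report) =>
        await context.Response.WriteAsJsonAsync(new
        {
            status = report.Status.ToString(),
            checks = report.Entries.Select(e => new
            {
                name = e.Key,
                status = e.Value.Status.ToString()
            })
        })
}).RequireHost("*:5001");

app.Run();
```

RequireHost matches on the request's Host header, so it only protects the endpoint if the port really is unreachable from outside. If your app already has authentication configured, chaining .RequireAuthorization() on the detailed endpoint is the other common option.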
Step 5: Add a dashboard and store history
A single check tells you "now." Production needs "what happened." The Xabaril UI package gives you a small web dashboard that polls your endpoints and draws a history.
dotnet add package AspNetCore.HealthChecks.UI
dotnet add package AspNetCore.HealthChecks.UI.Client
dotnet add package AspNetCore.HealthChecks.UI.InMemory.Storage
builder.Services
    .AddHealthChecksUI(setup =>
    {
        setup.AddHealthCheckEndpoint("ready", "/healthz/ready");
        setup.SetEvaluationTimeInSeconds(15);
    })
    .AddInMemoryStorage();
app.MapHealthChecksUI(options => options.UIPath = "/health-ui");
The dashboard can only parse the JSON shape produced by UIResponseWriter.WriteHealthCheckUIResponse from the UI.Client package, so the endpoint it polls must use that response writer rather than a hand-written one.
One warning about storage. AddInMemoryStorage is easy, but it forgets everything when the app restarts. If you want history that survives restarts and crashes, which is exactly when you most want it, use a database-backed store instead, such as AspNetCore.HealthChecks.UI.SqlServer.Storage or the PostgreSQL equivalent. With persistent storage, after an outage you can look back and see precisely when the database started failing.
Protect the /health-ui path the same way you protect the detailed endpoint. It shows your internal map, so it is not for the public.
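The dashboard reads a specific JSON format, which the UI.Client package writes through UIResponseWriter.WriteHealthCheckUIResponse. A minimal sketch of a dedicated data endpoint for the dashboard follows; the /healthz/ui-data path is an assumed name, and the single "self" check stands in for your real readiness checks.

```csharp
using HealthChecks.UI.Client;
using Microsoft.AspNetCore.Diagnostics.HealthChecks;
using Microsoft.Extensions.Diagnostics.HealthChecks;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks()
    .AddCheck("self", () => HealthCheckResult.Healthy(), tags: ["ready"]);
builder.Services
    .AddHealthChecksUI(setup =>
    {
        // Assumed path: the dashboard polls this endpoint, not the probe endpoints.
        setup.AddHealthCheckEndpoint("ready", "/healthz/ui-data");
        setup.SetEvaluationTimeInSeconds(15);
    })
    .AddInMemoryStorage();

var app = builder.Build();

// Same readiness checks, but written in the format the dashboard understands.
app.MapHealthChecks("/healthz/ui-data", new HealthCheckOptions
{
    Predicate = check => check.Tags.Contains("ready"),
    ResponseWriter = UIResponseWriter.WriteHealthCheckUIResponse
});
app.MapHealthChecksUI(options => options.UIPath = "/health-ui");

app.Run();
```

Keeping the dashboard feed on its own path lets the Kubernetes probe endpoints stay minimal while the detailed feed is restricted separately.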
Step 6: Push results instead of waiting
Everything so far is pull-based: a robot calls your URL. ASP.NET Core also supports push-based monitoring through IHealthCheckPublisher. When you register a publisher, the framework runs your checks on a timer and hands the result to your code. You can then push that result anywhere — a metrics system, a logging pipeline, or a Slack alert.
builder.Services.Configure<HealthCheckPublisherOptions>(options =>
{
    options.Delay = TimeSpan.FromSeconds(5);
    options.Period = TimeSpan.FromSeconds(30);
    options.Predicate = check => check.Tags.Contains("ready");
});
builder.Services.AddSingleton<IHealthCheckPublisher, SlackAlertPublisher>();
Here SlackAlertPublisher stands for a class you write that implements IHealthCheckPublisher. This is how you turn "someone will notice eventually" into "someone gets paged within about 30 seconds." The publisher runs even when no one is calling your endpoints, so it works well with dashboards like Grafana or with an on-call tool.
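The registration above names a SlackAlertPublisher class without showing it. One possible shape is sketched here, with loud assumptions: the Slack:WebhookUrl configuration key is hypothetical, and the alert-on-change rule is a design choice of this sketch rather than framework behaviour.

```csharp
using System.Net.Http;
using System.Net.Http.Json;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.Diagnostics.HealthChecks;

// Sketch only: the "Slack:WebhookUrl" configuration key is an assumption.
public sealed class SlackAlertPublisher(IHttpClientFactory factory, IConfiguration config)
    : IHealthCheckPublisher
{
    private HealthStatus? _lastStatus;

    // Alert when the overall status changes; stay quiet if the first report is Healthy.
    public static bool ShouldAlert(HealthStatus? previous, HealthStatus current) =>
        previous != current && (previous is not null || current != HealthStatus.Healthy);

    public async Task PublishAsync(HealthReport report, CancellationToken cancellationToken)
    {
        var previous = _lastStatus;
        _lastStatus = report.Status;
        if (!ShouldAlert(previous, report.Status))
            return;

        var client = factory.CreateClient();
        // Slack incoming webhooks accept a JSON body with a "text" field.
        using var response = await client.PostAsJsonAsync(
            config["Slack:WebhookUrl"],
            new { text = $"Health status is now {report.Status}." },
            cancellationToken);
        response.EnsureSuccessStatusCode();
    }
}
```

This needs builder.Services.AddHttpClient() so that IHttpClientFactory can be injected. Alerting only on a change of status keeps a long outage from posting a message every 30 seconds, and it also sends a recovery notice when the status returns to Healthy.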
Step 7: Wire the probes into Kubernetes
Finally, tell Kubernetes which URL is which. This is where all the earlier work pays off. The liveness probe points at the tiny live endpoint; the readiness probe points at the heavier ready endpoint.
livenessProbe:
  httpGet:
    path: /healthz/live
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 10
readinessProbe:
  httpGet:
    path: /healthz/ready
    port: 8080
  initialDelaySeconds: 5
  periodSeconds: 5
If your app needs a long warm-up (loading a big cache, running migrations), add a startup probe as well. It gives the app extra time to boot before the liveness probe starts judging it, so a slow start does not get mistaken for a crash.
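A startup probe sits next to the other two probes in the container spec. The values below are assumptions chosen for illustration: with periodSeconds set to 5 and failureThreshold set to 30, the app gets up to 150 seconds to boot, and Kubernetes holds back the liveness and readiness probes until the startup probe has succeeded once.

```yaml
startupProbe:
  httpGet:
    path: /healthz/live
    port: 8080
  periodSeconds: 5
  failureThreshold: 30
```

Reusing the live endpoint here is one reasonable choice; an app with a real warm-up gate could point the startup probe at an endpoint that reports that gate instead.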
Probe lifecycle on deploy
- Startup: wait for boot to finish
- Readiness: open traffic when ready
- Liveness: restart if stuck later
Common mistakes to avoid
- Heavy liveness checks. Querying the database in liveness causes pointless restarts when the database blips. Keep liveness to a self-check.
- No timeouts. A frozen dependency without a timeout makes the whole response hang, which looks like a crash.
- Auth on probes. Kubernetes probes send no credentials by default. Locking the probe endpoints down makes liveness fail, which causes restarts, and makes readiness fail, which cuts off traffic.
- Leaking details publicly. Rich JSON helps attackers. Keep it behind protection.
- Forgetting history. In-memory storage loses data on restart — the worst moment to lose it.
- Always-green checks. A check that never fails is worse than no check, because you trust a lie.
Quick recap
- A health check is a quick "are you okay?" round that robots run against your app.
- Split liveness (restart if stuck) from readiness (stop traffic if not ready) using tags and a Predicate.
- Keep liveness tiny and fast; put real database, cache, and API checks on readiness with a timeout.
- Return three honest levels: Healthy, Degraded (still 200), and Unhealthy (503).
- Keep simple probe endpoints open; protect rich JSON and the UI dashboard.
- Use persistent storage so check history survives restarts.
- Add an IHealthCheckPublisher to push alerts instead of waiting to be asked.
- Map livenessProbe and readinessProbe (and a startupProbe for slow boots) in Kubernetes.
References and further reading
- Health checks in ASP.NET Core — Microsoft Learn
- Xabaril AspNetCore.Diagnostics.HealthChecks (GitHub)
- Configure Liveness, Readiness and Startup Probes — Kubernetes
- Health monitoring — .NET microservices architecture (Microsoft Learn)
- Adding health checks with liveness, readiness and startup probes — Andrew Lock