Building Generative AI Apps With GitHub Models and .NET Aspire
A beginner-friendly guide to building generative AI apps in .NET using GitHub Models, .NET Aspire, and Microsoft.Extensions.AI with clean code examples.
Ordering food from one app, many kitchens
Think about a food delivery app like the ones we all use in India. You open one app. Inside, there are hundreds of restaurants. One cooks biryani, another makes dosa, another bakes pizza. You do not call each kitchen yourself. You do not need their phone numbers. The app holds one account for you, takes your order, and sends it to the right kitchen. You just say what you want, and food arrives.
GitHub Models works in the same friendly way for AI. There are many AI "kitchens" out there: OpenAI's GPT models, Microsoft's Phi models, Meta's Llama, DeepSeek, and more. Instead of signing up for each one, you use one door: GitHub Models. One token, many models.
And .NET Aspire is like the delivery app's system that remembers your address and payment once, then fills it in for every order. You set up your AI model in one place, and every part of your app can use it safely.
In this guide we will build a small AI chat feature in .NET, step by step, in simple language. By the end you will understand how GitHub Models, .NET Aspire, and Microsoft.Extensions.AI fit together like a neat little team.
Meet the three friends
Before we touch code, let us meet our three helpers. Each one has a clear job.
| Friend | What it is | Its job in our app |
|---|---|---|
| GitHub Models | A service that hosts many AI models behind one API | Gives us the actual brain that writes answers |
| .NET Aspire | An orchestration tool for .NET apps | Holds the API key once and wires services together |
| Microsoft.Extensions.AI | A set of .NET libraries (MEAI) | Gives us one clean interface, IChatClient, to talk to any model |
Here is the simple idea. Your web app talks to a common interface. The interface talks to GitHub Models. Aspire makes sure the key and the address are passed along quietly in the background.
The big win here is that your app code never sees the messy details. It just asks a question and gets an answer.
What you need first
To follow along you need a few simple things. None of them are hard.
- .NET 10 SDK. This is the current LTS (long-term support) release, so it is a safe base for new projects. C# 14 ships with it.
- The .NET Aspire workload or templates, which you can add with the `dotnet` CLI.
- A GitHub account and a personal access token with the `models:read` permission. This token is the key that lets your app call the models. Treat it like a password. Never paste it into your code.
Once you have a token, you are ready. The free tier of GitHub Models has rate limits, but it is more than enough to build and test what we are doing here.
Step 1: Create the Aspire app
An Aspire solution has a special project called the AppHost. Think of the AppHost as the manager of a kitchen. It does not cook. It decides who does what, hands out keys, and makes sure everyone can talk to each other.
We will add the GitHub Models hosting package to the AppHost. The package name is Aspire.Hosting.GitHub.Models.

```csharp
// AppHost.cs: the manager of the whole app
var builder = DistributedApplication.CreateBuilder(args);

// Declare the AI model as a resource, just like a database.
// "chat" is the name we will use to reference it.
// "openai/gpt-4o-mini" is the model we picked from the catalog.
var chat = builder.AddGitHubModel("chat", "openai/gpt-4o-mini");

// Give our web project a reference to that model.
// Now the web app can find and use "chat".
builder.AddProject<Projects.MyChatWeb>("web")
    .WithReference(chat);

builder.Build().Run();
```

Notice how clean this is. We declared the model in one line, the same way we would declare a Redis cache or a PostgreSQL database. That is the Aspire style: everything your app depends on lives in the AppHost.
When you call AddGitHubModel, Aspire automatically creates a secret parameter for your token. Its name follows the pattern {resourceName}-gh-apikey. For our chat resource, that is chat-gh-apikey. You store your GitHub token in that parameter, and Aspire keeps it safe.
Step 2: Store the token safely
You do not put your token in code. You put it in the AppHost's user secrets, where it stays out of source control. You set it once by running a command like this in the AppHost project folder:

```shell
dotnet user-secrets set "Parameters:chat-gh-apikey" "YOUR_GITHUB_TOKEN"
```

Aspire reads this secret at startup and injects it into any service that references the "chat" model. Your web app never hard-codes the token. It just asks for "chat" and it works.

This is one of the nicest parts of using Aspire. The token lives in exactly one place. If it changes, you update one secret, not ten files. This is a big safety and maintenance win, and it is the kind of thing that quietly prevents leaked keys.
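The naming rule from Step 1 and the secrets key used in Step 2 fit together in a simple way. The small sketch below spells that out. The two helper functions, ApiKeyParameterName and UserSecretsKey, are made up for illustration and are not Aspire APIs. The sketch assumes only what this guide states: the parameter is named {resourceName}-gh-apikey and is stored under the Parameters section.

```csharp
using System;

// Hypothetical helpers for illustration only. Aspire does this naming for you.
static string ApiKeyParameterName(string resourceName) =>
    $"{resourceName}-gh-apikey";

static string UserSecretsKey(string resourceName) =>
    $"Parameters:{ApiKeyParameterName(resourceName)}";

// For the "chat" resource from Step 1:
Console.WriteLine(ApiKeyParameterName("chat")); // chat-gh-apikey
Console.WriteLine(UserSecretsKey("chat"));      // Parameters:chat-gh-apikey
```

If you rename the resource in the AppHost, the secret key changes with it, so the two must be updated together.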
How the token travels
| Step | What happens |
|---|---|
| Secret | Token saved in user secrets |
| AppHost | Reads secret as a parameter |
| Web app | Receives it via reference |
| GitHub Models | Authenticates the call |
Step 3: Connect from the web app
Now we move to the consuming side: the web app that actually talks to the model. For the best compatibility with GitHub Models, we use the Azure AI Inference client integration. The package is Aspire.Azure.AI.Inference. It speaks the same protocol that GitHub Models expects, and it gives us an IChatClient from Microsoft.Extensions.AI.

```csharp
// Program.cs in the web app
var builder = WebApplication.CreateBuilder(args);

// "chat" must match the name we used in the AppHost.
// AddChatClient() turns the connection into an IChatClient
// that we can inject anywhere.
builder.AddAzureChatCompletionsClient("chat")
    .AddChatClient();

var app = builder.Build();
```

That is the whole connection. No URLs typed by hand. No keys in the file. The name "chat" ties everything back to the AppHost, and Aspire fills in the rest. This is the payoff for the small setup we did earlier.
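To make "Aspire fills in the rest" a little less magical, here is a rough sketch of the kind of value a reference can carry. It assumes the "chat" reference reaches the web app as a connection string shaped like Endpoint=...;Key=...;Model=... The exact keys and the endpoint URL are assumptions here, so check the Aspire docs for your version. The sample string and the placeholder key are invented for the demo.

```csharp
using System;
using System.Data.Common;

// Assumed shape of the value passed for the "chat" reference.
// The key is a placeholder, never a real token.
const string sample =
    "Endpoint=https://models.github.ai/inference;Key=PLACEHOLDER;Model=openai/gpt-4o-mini";

// DbConnectionStringBuilder splits "name=value;name=value" strings for us.
var parts = new DbConnectionStringBuilder { ConnectionString = sample };

string endpoint = (string)parts["Endpoint"];
string model = (string)parts["Model"];
bool hasKey = parts.ContainsKey("Key");

Console.WriteLine($"Endpoint: {endpoint}");
Console.WriteLine($"Model:    {model}");
Console.WriteLine($"Has key:  {hasKey}"); // report presence only, never print the key
```

You never write this parsing yourself in the real app. The client integration does the equivalent work when you pass it the name "chat".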
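The flow of one chat request is short. The user presses send, the web app calls IChatClient, the client calls GitHub Models with the token Aspire passed along, and the answer comes back the same way.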
Step 4: Ask the model a question
With an IChatClient injected, sending a message is short and readable. You build a list of messages, give each one a role, and call GetResponseAsync.

```csharp
using Microsoft.Extensions.AI;

public class JokeService(IChatClient chatClient)
{
    public async Task<string> TellJokeAsync(string topic)
    {
        // Messages carry a role: System sets the tone,
        // User is the human asking.
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, "You are a friendly assistant for kids."),
            new(ChatRole.User, $"Tell me a short, clean joke about {topic}.")
        };

        ChatResponse response = await chatClient.GetResponseAsync(messages);
        return response.Text; // The model's answer as plain text.
    }
}
```

The System message is like giving an actor their character before the scene. It shapes how the model behaves. The User message is the actual request. The model reads both and replies.
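Because JokeService depends only on the IChatClient interface, you can try it without a token or a network call. The sketch below uses a hand-written stub, CannedChatClient, which is a hypothetical class invented for this example. It returns a fixed reply and remembers the messages it received, so you can see the System and User roles that were sent. It assumes the Microsoft.Extensions.AI abstractions package is referenced.

```csharp
using System;
using System.Collections.Generic;
using System.Runtime.CompilerServices;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.AI;

// Exercise JokeService with the stub instead of a real model.
var stub = new CannedChatClient("Why did the cat sit on the laptop? To watch the mouse.");
var service = new JokeService(stub);

string joke = await service.TellJokeAsync("cats");
Console.WriteLine(joke);

// The stub recorded what the service sent: one System and one User message.
foreach (ChatMessage message in stub.LastMessages)
{
    Console.WriteLine($"{message.Role}: {message.Text}");
}

// Same service as in the guide, repeated so this block compiles on its own.
public class JokeService(IChatClient chatClient)
{
    public async Task<string> TellJokeAsync(string topic)
    {
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, "You are a friendly assistant for kids."),
            new(ChatRole.User, $"Tell me a short, clean joke about {topic}.")
        };
        ChatResponse response = await chatClient.GetResponseAsync(messages);
        return response.Text;
    }
}

// Hypothetical stub: always answers with the same text.
public sealed class CannedChatClient(string reply) : IChatClient
{
    public List<ChatMessage> LastMessages { get; } = new();

    public Task<ChatResponse> GetResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        CancellationToken cancellationToken = default)
    {
        LastMessages.Clear();
        LastMessages.AddRange(messages);
        return Task.FromResult(new ChatResponse(new ChatMessage(ChatRole.Assistant, reply)));
    }

    public async IAsyncEnumerable<ChatResponseUpdate> GetStreamingResponseAsync(
        IEnumerable<ChatMessage> messages,
        ChatOptions? options = null,
        [EnumeratorCancellation] CancellationToken cancellationToken = default)
    {
        // Stream the canned reply one word at a time.
        foreach (string word in reply.Split(' '))
        {
            await Task.Yield();
            yield return new ChatResponseUpdate(ChatRole.Assistant, word + " ");
        }
    }

    public object? GetService(Type serviceType, object? serviceKey = null) =>
        serviceType.IsInstanceOfType(this) ? this : null;

    public void Dispose() { }
}
```

A stub like this is also handy in unit tests, where you want the same answer every time and no calls against your rate limit.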
Streaming the answer word by word
Sometimes you do not want to wait for the whole answer. You want words to appear live, the way a person types. IChatClient supports this with GetStreamingResponseAsync, which returns an async stream of small updates.

```csharp
public async IAsyncEnumerable<string> StreamAnswerAsync(string question)
{
    var messages = new List<ChatMessage>
    {
        new(ChatRole.User, question)
    };

    // Each update is a small chunk of text.
    // We yield it right away so the UI can show it live.
    await foreach (var update in chatClient.GetStreamingResponseAsync(messages))
    {
        if (!string.IsNullOrEmpty(update.Text))
        {
            yield return update.Text;
        }
    }
}
```

Streaming makes an app feel fast and alive, even when the full answer takes a few seconds. This is exactly how popular chat apps show text flowing in.
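On the calling side, a streamed answer is consumed with await foreach. The sketch below shows one way to do that, using only the base class library. FakeStreamAsync is a hypothetical stand-in for StreamAnswerAsync so the example runs without a model. It shows each chunk as it arrives and also collects the full text.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

// Hypothetical stand-in for StreamAnswerAsync: yields fixed chunks with a short delay.
static async IAsyncEnumerable<string> FakeStreamAsync()
{
    foreach (string chunk in new[] { "Hello", ", ", "world", "!" })
    {
        await Task.Delay(10); // pretend the model is still generating
        yield return chunk;
    }
}

var fullAnswer = new StringBuilder();

await foreach (string chunk in FakeStreamAsync())
{
    Console.Write(chunk);     // show each piece as soon as it arrives
    fullAnswer.Append(chunk); // keep the whole answer for later use
}

Console.WriteLine();
Console.WriteLine($"Characters received: {fullAnswer.Length}");
```

Collecting the chunks matters when you want to store the finished answer, for example to add it to the chat history for the next question.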
Step 5: Add superpowers with the middleware pipeline
Here is where Microsoft.Extensions.AI really shines. An IChatClient can be wrapped with extra behaviors, like layers of an onion. Each layer adds a feature without changing your core code. This is called a middleware pipeline.
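
```csharp
// Build a smarter client by stacking features.
// Chain the wrappers onto the registration from Step 3 (this replaces
// the shorter version). Resolving IChatClient inside its own
// AddChatClient factory would make the client depend on itself.
builder.AddAzureChatCompletionsClient("chat")
    .AddChatClient()
    .UseFunctionInvocation() // Let the model call your C# functions.
    .UseOpenTelemetry()      // Trace every call for observability.
    .UseLogging();           // Write logs for each request.
```

Each Use... call adds one ring to the onion. Let us see what these rings do.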
| Wrapper | What it adds | Why you want it |
|---|---|---|
| `UseFunctionInvocation` | Lets the model call your own C# methods | The model can fetch live data, like a weather lookup |
| `UseOpenTelemetry` | Emits traces and metrics | You can watch AI calls in your dashboard |
| `UseLogging` | Logs each request and response | Easy debugging when something looks wrong |
| `UseDistributedCache` | Caches repeated answers | Saves cost and time on the same question |
Think of the onion this way: your call passes through each layer on the way out and again on the way back.
The beauty is that your JokeService from earlier does not change at all. It still asks for IChatClient and calls GetResponseAsync. The extra features come for free, decided at startup.
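The onion idea can be shown without any AI library at all. The sketch below models each layer as a plain delegate that wraps the next one and records when the call passes through. The Wrap helper, the layer names, and the fake model are all invented for illustration. It also assumes the first wrapper you add ends up outermost, which is worth confirming in the Microsoft.Extensions.AI docs.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

var trace = new List<string>();

// Innermost layer: stands in for the real model call.
Func<string, Task<string>> model = prompt =>
{
    trace.Add("model");
    return Task.FromResult($"Answer to: {prompt}");
};

// Wraps an inner handler with a named layer that records the way out and back.
Func<string, Task<string>> Wrap(string name, Func<string, Task<string>> inner) =>
    async prompt =>
    {
        trace.Add($"{name} (out)");
        string answer = await inner(prompt);
        trace.Add($"{name} (back)");
        return answer;
    };

// Assumed ordering: the first wrapper listed is the outermost ring.
var pipeline = Wrap("function invocation", Wrap("telemetry", Wrap("logging", model)));

string result = await pipeline("Tell me a joke");

Console.WriteLine(result);
Console.WriteLine(string.Join(" -> ", trace));
```

The caller only ever sees one delegate, just as JokeService only ever sees one IChatClient. The layers stay invisible to it.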
Step 6: Watch it all in the Aspire dashboard
When you run the AppHost, Aspire opens a dashboard in your browser. This is your control room. You see every service, whether it is healthy, and a live stream of logs. If you added UseOpenTelemetry, you also see traces: a timeline of each AI call, how long it took, and which model answered.
From run to insight
| Step | What happens |
|---|---|
| Run AppHost | dotnet run starts everything |
| Dashboard opens | One screen for all services |
| Send a chat | User asks a question |
| See traces | Timing and logs appear live |
This single dashboard is a huge help when learning. You do not guess what happened. You watch it happen. If a call is slow or fails, the trace tells you exactly where.
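If you are curious where a trace comes from, .NET builds tracing on ActivitySource and Activity from System.Diagnostics. The sketch below is not what UseOpenTelemetry emits. It is a small stand-alone demo of the same mechanism. The source name "Demo.Chat", the activity name, and the fake delay are assumptions made for the example. The tag name follows the OpenTelemetry generative AI conventions as I understand them.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Threading.Tasks;

var finished = new List<string>();

// Listen to the demo source, much like a telemetry exporter would.
using var listener = new ActivityListener
{
    ShouldListenTo = source => source.Name == "Demo.Chat",
    Sample = (ref ActivityCreationOptions<ActivityContext> options) =>
        ActivitySamplingResult.AllDataAndRecorded,
    ActivityStopped = activity => finished.Add(
        $"{activity.OperationName} model={activity.GetTagItem("gen_ai.request.model")} " +
        $"took {activity.Duration.TotalMilliseconds:F0} ms")
};
ActivitySource.AddActivityListener(listener);

// Hypothetical source name for this demo.
using var source = new ActivitySource("Demo.Chat");

using (Activity? activity = source.StartActivity("chat"))
{
    activity?.SetTag("gen_ai.request.model", "openai/gpt-4o-mini");
    await Task.Delay(50); // stands in for the real model call
}

foreach (string line in finished)
{
    Console.WriteLine(line);
}
```

Each finished activity carries a name, tags, and a duration. That is the raw material the dashboard turns into a timeline.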
A quick word on cost and the free tier
GitHub Models gives you a free tier with rate limits. This is perfect for learning, demos, and small side projects. You can build the whole app in this guide without paying anything.
When your app grows and you need more requests per minute, you move to paid, pay-per-use access. The good news: your code barely changes. Because everything goes through IChatClient and Aspire, switching tiers or even switching models is a small config change, not a rewrite. You could move from openai/gpt-4o-mini to a different model by editing one line in the AppHost.
A note on libraries and licensing
If you read older tutorials, you may see messaging libraries like MediatR and MassTransit used to pass requests around. Be aware that both of these are now commercially licensed for many uses. They are still fine tools, but check their license before adding them to a real project. For the small AI app in this guide you do not need them at all. Plain dependency injection and IChatClient are enough.
Also worth knowing for the future: C# 15, with its new union types, is currently in .NET 11 preview. For today, .NET 10 LTS and C# 14 are the stable, recommended base for shipping AI apps.
Putting the whole picture together
Let us zoom out and see the full shape of what we built. The AppHost is the manager. The web app is the worker. GitHub Models is the brain. MEAI is the shared language between them.
Every piece has one clear job, and the connections between them are handled for you. That is what makes this stack pleasant to work with, even for a beginner.
Common mistakes to avoid
A few small traps catch new builders. Here is a short table so you can dodge them.
| Mistake | What goes wrong | The fix |
|---|---|---|
| Token in source code | Your secret can leak on GitHub | Use user secrets and the chat-gh-apikey parameter |
| Name mismatch | App cannot find the model | Use the exact same name, like "chat", in AppHost and web app |
| Wrong token scope | Calls get rejected | Make sure the token has models:read permission |
| Ignoring rate limits | Free tier blocks extra calls | Add caching or slow down during testing |
Most "it does not work" moments come from one of these four. Check them first and you will save a lot of time.
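For the rate limit trap, the "add caching" fix can be sketched in a few lines. The example below is a simplified stand-in, not the UseDistributedCache wrapper. It keeps answers in an in-memory dictionary keyed by the exact prompt text, and a counter shows how many calls would really reach the model. AskWithCacheAsync and the fake model delegate are hypothetical names for this demo.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

int modelCalls = 0;

// Stands in for a real model call, which would count against the rate limit.
Func<string, Task<string>> askModel = prompt =>
{
    modelCalls++;
    return Task.FromResult($"Answer to: {prompt}");
};

// Hypothetical in-memory cache keyed by the exact prompt text.
var cache = new Dictionary<string, string>(StringComparer.Ordinal);

async Task<string> AskWithCacheAsync(string prompt)
{
    if (cache.TryGetValue(prompt, out string? cached))
    {
        return cached; // repeated question: no model call
    }

    string answer = await askModel(prompt);
    cache[prompt] = answer;
    return answer;
}

await AskWithCacheAsync("What is .NET Aspire?");
await AskWithCacheAsync("What is .NET Aspire?"); // served from the cache
await AskWithCacheAsync("What is GitHub Models?");

Console.WriteLine($"Questions asked: 3, model calls: {modelCalls}");
```

This version is not thread-safe and never expires entries, so treat it as a testing aid. For a real app, the caching wrapper from the table in Step 5 is the better fit.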
Quick recap
- GitHub Models is one door to many AI models, with a free tier for learning. You need a token with the `models:read` permission.
- .NET Aspire holds your token once in the AppHost and injects it into every service, so you never hard-code secrets.
- Declare the model with `AddGitHubModel("chat", "openai/gpt-4o-mini")` in the AppHost, then `WithReference` it into your web app.
- In the web app, `AddAzureChatCompletionsClient("chat").AddChatClient()` gives you a ready-to-use `IChatClient`.
- Use `GetResponseAsync` for a full answer and `GetStreamingResponseAsync` to stream words live.
- Wrap `IChatClient` with `UseFunctionInvocation`, `UseOpenTelemetry`, and `UseLogging` to add features without touching your core code.
- The Aspire dashboard lets you watch logs and traces in one place.
- Stick to .NET 10 LTS and C# 14 for stable apps. C# 15 union types are still in .NET 11 preview.
- Watch out for token leaks, name mismatches, wrong token scope, and rate limits.