What are GitHub Models?

GitHub Models is a service from GitHub that lets you call many AI models, like OpenAI GPT, Microsoft Phi, DeepSeek, and Meta Llama, through a single API. It has a free tier with rate limits that is great for learning and prototyping. To call it from code you need a GitHub personal access token with the models:read permission. When you are ready for production you can switch to paid, pay-per-use access without changing much code.

Why use .NET Aspire for an AI app?

Aspire lets you declare your AI model in one place, called the AppHost, just like you declare a database or a cache. It then injects the connection details and the API key into every service that needs them. This means you never hard-code an API key inside your web app. You get one secret, stored once, shared safely. Aspire also gives you a dashboard so you can see logs, traces, and metrics for your AI calls in one screen.

What is Microsoft.Extensions.AI?

Microsoft.Extensions.AI (often shortened to MEAI) is a set of .NET libraries that give you one common interface, IChatClient, for talking to any AI model. Whether you use GitHub Models, Azure OpenAI, or a local model with Ollama, your app code stays the same. You can also wrap the client with extra features like logging, OpenTelemetry tracing, and automatic function calling, all without changing your core logic.

Do I need to pay to start with GitHub Models?

No. GitHub Models has a free tier that is perfect for learning and small experiments. It has rate limits, so it is not meant for heavy production traffic, but it is enough to build and test a real app. When you outgrow the free tier you move to paid usage that is billed per token, and your code barely changes.

What is IChatClient?

IChatClient is the main interface in Microsoft.Extensions.AI for sending messages to a chat model and getting a reply. It has a GetResponseAsync method for a single full answer, and a GetStreamingResponseAsync method that streams the answer in small chunks so the user sees text appear live. Because it is an interface, you can swap the model behind it without rewriting your app.

Building Generative AI Apps With GitHub Models and .NET Aspire

A beginner-friendly guide to building generative AI apps in .NET using GitHub Models, .NET Aspire, and Microsoft.Extensions.AI with clean code examples.

13 min read · Updated May 16, 2026

Ordering food from one app, many kitchens

Think about a food delivery app like the ones we all use in India. You open one app. Inside, there are hundreds of restaurants. One cooks biryani, another makes dosa, another bakes pizza. You do not call each kitchen yourself. You do not need their phone numbers. The app holds one account for you, takes your order, and sends it to the right kitchen. You just say what you want, and food arrives.

GitHub Models works in the same friendly way for AI. There are many AI "kitchens" out there: OpenAI's GPT models, Microsoft's Phi models, Meta's Llama, DeepSeek, and more. Instead of signing up for each one, you use one door: GitHub Models. One token, many models.

And .NET Aspire is like the delivery app's system that remembers your address and payment once, then fills it in for every order. You set up your AI model in one place, and every part of your app can use it safely.

In this guide we will build a small AI chat feature in .NET, step by step, in simple language. By the end you will understand how GitHub Models, .NET Aspire, and Microsoft.Extensions.AI fit together like a neat little team.

Meet the three friends

Before we touch code, let us meet our three helpers. Each one has a clear job.

Friend	What it is	Its job in our app
GitHub Models	A service that hosts many AI models behind one API	Gives us the actual brain that writes answers
.NET Aspire	An orchestration tool for .NET apps	Holds the API key once and wires services together
Microsoft.Extensions.AI	A set of .NET libraries (MEAI)	Gives us one clean interface, `IChatClient`, to talk to any model

Here is the simple idea in one picture. Your web app talks to a common interface. The interface talks to GitHub Models. Aspire makes sure the key and the address are passed along quietly in the background.

Figure 1: Your app uses one interface. Aspire passes the key. GitHub Models does the thinking.

The big win here is that your app code never sees the messy details. It just asks a question and gets an answer.

What you need first

To follow along you need a few simple things. None of them are hard.

.NET 10 SDK. This is the current LTS (long-term support) release, so it is a safe base for new projects. C# 14 ships with it.
The Aspire project templates, which you can add with the dotnet CLI. Recent Aspire versions no longer need the separate workload.
A GitHub account and a personal access token with the models:read permission. This token is the key that lets your app call the models. Treat it like a password. Never paste it into your code.

Once you have a token, you are ready. The free tier of GitHub Models has rate limits, but it is more than enough to build and test what we are doing here.

Step 1: Create the Aspire app

An Aspire solution has a special project called the AppHost. Think of the AppHost as the manager of a kitchen. It does not cook. It decides who does what, hands out keys, and makes sure everyone can talk to each other.

We will add the GitHub Models hosting package to the AppHost. The package name is Aspire.Hosting.GitHub.Models.

// AppHost.cs — the manager of the whole app
var builder = DistributedApplication.CreateBuilder(args);
 
// Declare the AI model as a resource, just like a database.
// "chat" is the name we will use to reference it.
// "openai/gpt-4o-mini" is the model we picked from the catalog.
var chat = builder.AddGitHubModel("chat", "openai/gpt-4o-mini");
 
// Give our web project a reference to that model.
// Now the web app can find and use "chat".
builder.AddProject<Projects.MyChatWeb>("web")
       .WithReference(chat);
 
builder.Build().Run();

Notice how clean this is. We declared the model in one line, the same way we would declare a Redis cache or a PostgreSQL database. That is the Aspire style: everything your app depends on lives in the AppHost.

When you call AddGitHubModel, Aspire automatically creates a secret parameter for your token. Its name follows the pattern {resourceName}-gh-apikey. For our chat resource, that is chat-gh-apikey. You store your GitHub token in that parameter, and Aspire keeps it safe.

Step 2: Store the token safely

You do not put your token in code. You put it in the AppHost's user secrets, where it stays out of source control. You set it once with a command like this:

# Run this in the AppHost project folder:
dotnet user-secrets set "Parameters:chat-gh-apikey" "YOUR_GITHUB_TOKEN"

# Aspire reads this secret at startup and injects it into any
# service that references the "chat" model. Your web app never
# hard-codes the token. It just asks for "chat" and it works.

This is one of the nicest parts of using Aspire. The token lives in exactly one place. If it changes, you update one secret, not ten files. This is a big safety and maintenance win, and it is the kind of thing that quietly prevents leaked keys.

How the token travels

Stage	What happens
Secret	Token saved in user secrets
AppHost	Reads secret as a parameter
Web app	Receives it via reference
GitHub Models	Authenticates the call

The token is stored once and flows to every service that needs it.

Step 3: Connect from the web app

Now we move to the consuming side: the web app that actually talks to the model. For the best compatibility with GitHub Models, we use the Azure AI Inference client integration. The package is Aspire.Azure.AI.Inference. It speaks the same protocol that GitHub Models expects, and it gives us an IChatClient from Microsoft.Extensions.AI.

// Program.cs in the web app
var builder = WebApplication.CreateBuilder(args);
 
// "chat" must match the name we used in the AppHost.
// AddChatClient() turns the connection into an IChatClient
// that we can inject anywhere.
builder.AddAzureChatCompletionsClient("chat")
       .AddChatClient();
 
var app = builder.Build();

That is the whole connection. No URLs typed by hand. No keys in the file. The name "chat" ties everything back to the AppHost, and Aspire fills in the rest. This is the payoff for the small setup we did earlier.
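
If you are curious what Aspire actually passes to the web app, here is a rough sketch. WithReference hands the web app a connection string under the configuration name ConnectionStrings:chat, and the client integration reads it for you. The code below only illustrates that parsing step. The ChatConnection class is made up for this guide, and the Endpoint, Key, and Model field names are an assumption about the connection string's shape, so check the web app's environment variables in the dashboard before relying on them.

```csharp
using System;
using System.Data.Common;

// Illustration only: Aspire's client integration does this parsing for you.
// ChatConnection is a hypothetical helper, and the field names are assumed.
public static class ChatConnection
{
    // Splits a "Name=Value;Name=Value" connection string into its parts.
    public static (string Endpoint, string Key, string Model) Parse(string connectionString)
    {
        var parts = new DbConnectionStringBuilder { ConnectionString = connectionString };
        return (Required(parts, "Endpoint"), Required(parts, "Key"), Required(parts, "Model"));
    }

    // Returns the named value or fails loudly if it is missing or empty.
    private static string Required(DbConnectionStringBuilder parts, string name) =>
        parts.TryGetValue(name, out var value) && value is string text && text.Length > 0
            ? text
            : throw new FormatException($"Connection string is missing '{name}'.");
}
```

You would never need this in the real app. It is only here to show that the name "chat" is a lookup key, and that the address and the token travel together as one string.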

Here is the flow of one chat request, from the user pressing send to the answer coming back.

Figure 2: One round trip of a chat message through the system.

Step 4: Ask the model a question

With an IChatClient injected, sending a message is short and readable. You build a list of messages, give each one a role, and call GetResponseAsync.

using Microsoft.Extensions.AI;
 
public class JokeService(IChatClient chatClient)
{
    public async Task<string> TellJokeAsync(string topic)
    {
        // Messages carry a role: System sets the tone,
        // User is the human asking.
        var messages = new List<ChatMessage>
        {
            new(ChatRole.System, "You are a friendly assistant for kids."),
            new(ChatRole.User, $"Tell me a short, clean joke about {topic}.")
        };
 
        ChatResponse response = await chatClient.GetResponseAsync(messages);
        return response.Text; // The model's answer as plain text.
    }
}

The System message is like giving an actor their character before the scene. It shapes how the model behaves. The User message is the actual request. The model reads both and replies.

Streaming the answer word by word

Sometimes you do not want to wait for the whole answer. You want words to appear live, the way a person types. IChatClient supports this with GetStreamingResponseAsync, which returns an async stream of small updates.

// Place this inside a class that has IChatClient injected, like JokeService.
public async IAsyncEnumerable<string> StreamAnswerAsync(string question)
{
    var messages = new List<ChatMessage>
    {
        new(ChatRole.User, question)
    };
 
    // Each update is a small chunk of text.
    // We yield it right away so the UI can show it live.
    await foreach (var update in chatClient.GetStreamingResponseAsync(messages))
    {
        if (!string.IsNullOrEmpty(update.Text))
        {
            yield return update.Text;
        }
    }
}

Streaming makes an app feel fast and alive, even when the full answer takes a few seconds. This is exactly how popular chat apps show text flowing in.
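
To see what the calling side of a stream can look like, here is a small sketch that needs no model at all. FakeStreamAsync is a made-up stand-in for StreamAnswerAsync that yields canned chunks, and RenderAsync is a hypothetical helper that shows each chunk as it arrives while also building the full answer. Notice that chunks do not have to line up with whole words.

```csharp
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

public static class StreamingDemo
{
    // Hypothetical stand-in for StreamAnswerAsync: yields canned chunks
    // instead of calling a real model.
    public static async IAsyncEnumerable<string> FakeStreamAsync()
    {
        foreach (var chunk in new[] { "Hel", "lo, ", "wor", "ld!" })
        {
            await Task.Delay(10); // Simulates the gap between chunks arriving.
            yield return chunk;
        }
    }

    // Shows each chunk as it arrives and returns the assembled answer.
    public static async Task<string> RenderAsync(
        IAsyncEnumerable<string> chunks, Action<string> show)
    {
        var fullText = new StringBuilder();
        await foreach (var chunk in chunks)
        {
            show(chunk);            // For example, append to the page or console.
            fullText.Append(chunk); // Keep the whole answer for chat history.
        }
        return fullText.ToString();
    }
}
```

In a console app you could call StreamingDemo.RenderAsync(StreamingDemo.FakeStreamAsync(), Console.Write) and watch the text appear piece by piece. Swapping the fake stream for the real StreamAnswerAsync should not require changing RenderAsync.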

Step 5: Add superpowers with the middleware pipeline

Here is where Microsoft.Extensions.AI really shines. An IChatClient can be wrapped with extra behaviors, like layers of an onion. Each layer adds a feature without changing your core code. This is called a middleware pipeline.

// Build a smarter client by stacking features.
// This replaces the plain registration from Step 3: AddChatClient()
// returns a builder, so the Use... calls chain straight onto it.
builder.AddAzureChatCompletionsClient("chat")
       .AddChatClient()
       .UseFunctionInvocation()   // Let the model call your C# functions.
       .UseOpenTelemetry()        // Trace every call for observability.
       .UseLogging();             // Write logs for each request.

Each Use... call adds one ring to the onion. Let us see what these rings do.

Wrapper	What it adds	Why you want it
`UseFunctionInvocation`	Lets the model call your own C# methods	The model can fetch live data, like a weather lookup
`UseOpenTelemetry`	Emits traces and metrics	You can watch AI calls in your dashboard
`UseLogging`	Logs each request and response	Easy debugging when something looks wrong
`UseDistributedCache`	Caches repeated answers	Saves cost and time on the same question

Here is the onion idea as a picture. Your call passes through each layer on the way out and on the way back.

Figure 3: The middleware pipeline. Each layer wraps the next.

The beauty is that your JokeService from earlier does not change at all. It still asks for IChatClient and calls GetResponseAsync. The extra features come for free, decided at startup.
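
If the onion idea still feels abstract, this tiny sketch shows the wrapping pattern with nothing but plain C#. It does not use Microsoft.Extensions.AI. The ChatCall delegate, WithLayer, and Build are all made up for illustration, and the "model" simply echoes the prompt. The point is only to show a call passing through each layer on the way in and again on the way out.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-in for a chat call: prompt in, answer out.
public delegate Task<string> ChatCall(string prompt);

public static class MiniPipeline
{
    // Wraps an inner call with a layer that records when it is entered and left.
    public static ChatCall WithLayer(this ChatCall inner, string name, List<string> log) =>
        async prompt =>
        {
            log.Add($"{name}: in");
            var answer = await inner(prompt);
            log.Add($"{name}: out");
            return answer;
        };

    // Wraps a fake model with telemetry, then logging.
    // In this sketch the layer added last ends up outermost.
    public static ChatCall Build(List<string> log)
    {
        ChatCall model = prompt => Task.FromResult($"echo: {prompt}");
        return model.WithLayer("telemetry", log).WithLayer("logging", log);
    }
}
```

Calling the result of Build with a prompt fills the log in this order: logging in, telemetry in, telemetry out, logging out. The real ChatClientBuilder decides its own ordering, so treat this only as a picture of the wrapping idea, not as a description of how MEAI is implemented.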

Step 6: Watch it all in the Aspire dashboard

When you run the AppHost, Aspire opens a dashboard in your browser. This is your control room. You see every service, whether it is healthy, and a live stream of logs. If you added UseOpenTelemetry, you also see traces: a timeline of each AI call, how long it took, and which model answered.

From run to insight

Step	What happens
Run AppHost	dotnet run starts everything
Dashboard opens	One screen for all services
Send a chat	User asks a question
See traces	Timing and logs appear live

Running the AppHost gives you a live view of your AI app's health.

This single dashboard is a huge help when learning. You do not guess what happened. You watch it happen. If a call is slow or fails, the trace tells you exactly where.

A quick word on cost and the free tier

GitHub Models gives you a free tier with rate limits. This is perfect for learning, demos, and small side projects. You can build the whole app in this guide without paying anything.

When your app grows and you need more requests per minute, you move to paid, pay-per-use access. The good news: your code barely changes. Because everything goes through IChatClient and Aspire, switching tiers or even switching models is a small config change, not a rewrite. You could move from openai/gpt-4o-mini to a different model by editing one line in the AppHost.

A note on libraries and licensing

If you read older tutorials, you may see messaging libraries like MediatR and MassTransit used to pass requests around. Be aware that both of these are now commercially licensed for many uses. They are still fine tools, but check their license before adding them to a real project. For the small AI app in this guide you do not need them at all. Plain dependency injection and IChatClient are enough.

Also worth knowing for the future: C# 15, with its new union types, is currently in .NET 11 preview. For today, .NET 10 LTS and C# 14 are the stable, recommended base for shipping AI apps.

Putting the whole picture together

Let us zoom out and see the full shape of what we built. The AppHost is the manager. The web app is the worker. GitHub Models is the brain. MEAI is the shared language between them.

Figure 4: The complete app. Aspire wires the parts, MEAI gives a common interface, GitHub Models supplies the model.

Every piece has one clear job, and the connections between them are handled for you. That is what makes this stack pleasant to work with, even for a beginner.

Common mistakes to avoid

A few small traps catch new builders. Here is a short table so you can dodge them.

Mistake	What goes wrong	The fix
Token in source code	Your secret can leak on GitHub	Use user secrets and the `chat-gh-apikey` parameter
Name mismatch	App cannot find the model	Use the exact same name, like `"chat"`, in AppHost and web app
Wrong token scope	Calls get rejected	Make sure the token has `models:read` permission
Ignoring rate limits	Free tier blocks extra calls	Add caching or slow down during testing

Most "it does not work" moments come from one of these four. Check them first and you will save a lot of time.
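
For the rate limit row in particular, "slow down" can be as simple as waiting and trying again. Here is a minimal sketch of retry with exponential backoff, assuming you can tell a rate limit failure apart from other errors. RateLimitRetry and its parameters are made up for this guide. Which exception signals a rate limit depends on the client library you use, so that check is passed in as a function instead of being hard-coded.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical helper for illustration; not part of Aspire or MEAI.
public static class RateLimitRetry
{
    // Runs the operation and retries with a doubling delay when isRateLimited
    // says the failure was a rate limit. Other exceptions propagate at once,
    // and so does the rate limit error on the final attempt.
    public static async Task<T> RunAsync<T>(
        Func<Task<T>> operation,
        Func<Exception, bool> isRateLimited,
        int maxAttempts = 4,
        TimeSpan? initialDelay = null)
    {
        var delay = initialDelay ?? TimeSpan.FromSeconds(1);
        for (var attempt = 1; ; attempt++)
        {
            try
            {
                return await operation();
            }
            catch (Exception ex) when (attempt < maxAttempts && isRateLimited(ex))
            {
                await Task.Delay(delay);
                delay = TimeSpan.FromTicks(delay.Ticks * 2); // 1s, 2s, 4s, ...
            }
        }
    }
}
```

You could wrap a call like TellJokeAsync in RunAsync during testing. Keep maxAttempts small so a real outage still fails quickly instead of hanging.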

Quick recap

GitHub Models is one door to many AI models, with a free tier for learning. You need a token with the models:read permission.
.NET Aspire holds your token once in the AppHost and injects it into every service, so you never hard-code secrets.
Declare the model with AddGitHubModel("chat", "openai/gpt-4o-mini") in the AppHost, then WithReference it into your web app.
In the web app, AddAzureChatCompletionsClient("chat").AddChatClient() gives you a ready-to-use IChatClient.
Use GetResponseAsync for a full answer and GetStreamingResponseAsync to stream words live.
Wrap IChatClient with UseFunctionInvocation, UseOpenTelemetry, and UseLogging to add features without touching your core code.
The Aspire dashboard lets you watch logs and traces in one place.
Stick to .NET 10 LTS and C# 14 for stable apps. C# 15 union types are still in .NET 11 preview.
Watch out for token leaks, name mismatches, wrong token scope, and rate limits.