# Using GitHub Models with ai-sdk
GitHub Models is a relatively new way to get free programmatic access to a range of AI models; however, Vercel's AI SDK doesn't have a dedicated provider for it yet.

Luckily, GitHub Models exposes an OpenAI-compatible API, which is easy to integrate with the AI SDK.
## Setup
- Install the AI SDK and the OpenAI-compatible provider:

  ```sh
  bun add ai @ai-sdk/openai-compatible
  ```
- Configure the provider. We'll also type the provider with a list of available models, which makes it easier to use:

  ```ts
  // ai.ts
  import { createOpenAICompatible } from "@ai-sdk/openai-compatible";

  /**
   * List of supported models
   *
   * @remarks
   * This is a small subset, there are more available
   */
  type Models =
    | "openai/gpt-4.1"
    | "openai/gpt-4.1-mini"
    | "openai/gpt-4.1-nano"
    | "openai/gpt-5"
    | "openai/gpt-5-mini"
    | "openai/gpt-5-nano";

  /**
   * Currently there isn't an official provider for GitHub Models,
   * but it does provide an OpenAI-compatible endpoint
   */
  export const githubModels = createOpenAICompatible<
    Models,
    Models,
    Models,
    ""
  >({
    name: "github-models",
    baseURL: "https://models.github.ai/inference",
    apiKey: process.env.GITHUB_TOKEN,
  });
  ```
- Generate a GitHub token with the `models:read` permission, then copy the token into your `.env` file:

  ```sh
  # .env
  GITHUB_TOKEN=<your-github-token>
  ```
- Call a model:

  ```ts
  // my-app.ts
  import { generateText } from "ai";
  import { githubModels } from "@/lib/ai";

  const { text } = await generateText({
    model: githubModels("openai/gpt-4.1-mini"),
    prompt: "What is React?",
  });

  console.log(text);
  ```
- Run it:

  ```sh
  bun run my-app.ts
  ```

After running that, you should see the model's response printed in your terminal.
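If you'd rather stream the output as it's generated, the same provider instance works with the AI SDK's `streamText`. A minimal sketch, assuming the same `@/lib/ai` setup as above:

```ts
import { streamText } from "ai";
import { githubModels } from "@/lib/ai";

const { textStream } = streamText({
  model: githubModels("openai/gpt-4.1-mini"),
  prompt: "What is React?",
});

// Print each chunk as it arrives instead of waiting for the full response
for await (const chunk of textStream) {
  process.stdout.write(chunk);
}
```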
## Rate Limits
GitHub Models has a range of rate limits depending on which model you use.

Many of the models only allow 1 request per minute and 8 requests per day in total, which makes them mostly unusable. The "Low" rate limit tier is far more practical though, allowing 15 requests per minute and 150 requests per day.
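When you do hit a limit, the request fails with an HTTP 429. The AI SDK already retries failed calls (twice by default), and you can catch whatever is left over. A minimal sketch — the `maxRetries` value and the `isRateLimit` helper are just illustrative choices, not anything GitHub Models requires:

```ts
import { generateText, APICallError, RetryError } from "ai";
import { githubModels } from "@/lib/ai";

function isRateLimit(error: unknown): boolean {
  // When the SDK exhausts its retries it wraps the last error in a RetryError
  if (RetryError.isInstance(error)) return isRateLimit(error.lastError);
  // A 429 response means you've hit the per-minute or per-day limit
  return APICallError.isInstance(error) && error.statusCode === 429;
}

try {
  const { text } = await generateText({
    model: githubModels("openai/gpt-4.1-mini"),
    prompt: "What is React?",
    maxRetries: 3, // illustrative; the default is 2
  });
  console.log(text);
} catch (error) {
  if (isRateLimit(error)) {
    console.error("Rate limited by GitHub Models, try again later");
  } else {
    throw error;
  }
}
```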
GitHub's documentation doesn't make it easy to see which models are in the "Low" rate limit tier; you have to manually click into each model in the model catalog to see its limits. For quick reference, the ones I've found in the Low tier are:
- `openai/gpt-4.1`
- `openai/gpt-4.1-mini`
- `openai/gpt-4.1-nano`
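If you want the type system to keep you on these models, one option is a second provider instance typed with only the Low-tier IDs. A sketch reusing the setup from earlier (the `githubModelsLowTier` name is hypothetical):

```ts
import { createOpenAICompatible } from "@ai-sdk/openai-compatible";

// Only the models from the Low rate limit tier listed above
type LowTierModels =
  | "openai/gpt-4.1"
  | "openai/gpt-4.1-mini"
  | "openai/gpt-4.1-nano";

// Same endpoint as before, but model IDs outside the Low tier
// now fail to type-check instead of burning an 8-per-day quota
export const githubModelsLowTier = createOpenAICompatible<
  LowTierModels,
  LowTierModels,
  LowTierModels,
  ""
>({
  name: "github-models",
  baseURL: "https://models.github.ai/inference",
  apiKey: process.env.GITHUB_TOKEN,
});
```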