Using GitHub Models with ai-sdk

GitHub Models is a relatively new way to get free programmatic access to a range of AI models; however, Vercel's AI SDK doesn't have a dedicated provider for it yet.

Luckily, GitHub Models exposes an OpenAI-compatible API, which integrates easily with the AI SDK.
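
To see what that compatibility means in practice, here's a minimal sketch that calls the endpoint directly with fetch, no SDK involved. It assumes the standard OpenAI-style /chat/completions path under the base URL used in the setup below:

    // Plain OpenAI-style chat completion request against GitHub Models
    const response = await fetch(
      "https://models.github.ai/inference/chat/completions",
      {
        method: "POST",
        headers: {
          "Content-Type": "application/json",
          // The same GitHub token the SDK setup below uses
          Authorization: `Bearer ${process.env.GITHUB_TOKEN}`,
        },
        body: JSON.stringify({
          model: "openai/gpt-4.1-mini",
          messages: [{ role: "user", content: "What is React?" }],
        }),
      },
    );

    const data = await response.json();
    console.log(data.choices[0].message.content);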

Setup

  1. Install the AI SDK and the OpenAI-compatible provider

    bun add ai @ai-sdk/openai-compatible
  2. Configure the provider. We'll also type it with a list of available models, which makes it easier to use.

    ai.ts
    import { createOpenAICompatible } from "@ai-sdk/openai-compatible";
     
    /**
     * List of supported models
     *
     * @remarks
     * This is a small subset; there are more available
     */
    type Models =
      | "openai/gpt-4.1"
      | "openai/gpt-4.1-mini"
      | "openai/gpt-4.1-nano"
      | "openai/gpt-5"
      | "openai/gpt-5-mini"
      | "openai/gpt-5-nano";
     
    /**
     * Currently there isn't an official provider for GitHub Models,
     * but it does expose an OpenAI-compatible endpoint
     */
    export const githubModels = createOpenAICompatible<
      Models, // chat model IDs
      Models, // completion model IDs
      Models, // embedding model IDs
      "" // image model IDs (none used here)
    >({
      name: "github-models",
      baseURL: "https://models.github.ai/inference",
      apiKey: process.env.GITHUB_TOKEN,
    });
  3. Generate a GitHub token with the models:read permission, then copy the token into your .env file

    .env
    GITHUB_TOKEN=<your-github-token>
  4. Call a model

    my-app.ts
    import { generateText } from "ai";
    import { githubModels } from "@/lib/ai";
     
    const { text } = await generateText({
      model: githubModels("openai/gpt-4.1-mini"),
      prompt: "What is React?",
    });
     
    console.log(text);
  5. Run it with bun run my-app.ts

After running that, you should see the model's response printed in your terminal.
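
The provider also plugs into the rest of the AI SDK. As a quick sketch of what that looks like, here's the same call using streamText so the response prints as it arrives, reusing the model from step 4:

    import { streamText } from "ai";
    import { githubModels } from "@/lib/ai";

    const result = streamText({
      model: githubModels("openai/gpt-4.1-mini"),
      prompt: "What is React?",
    });

    // Print each chunk as it arrives instead of waiting for the full text
    for await (const chunk of result.textStream) {
      process.stdout.write(chunk);
    }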

Rate Limits

GitHub Models applies different rate limits depending on which model you're using.

A lot of the models only allow 1 request per minute and a total of 8 requests per day, which makes them mostly unusable. The "Low" rate limit tier is much more workable though, allowing 15 requests per minute and a total of 150 requests per day.
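
If you're bumping into those limits, it's worth catching 429 responses and waiting before retrying. Here's a rough sketch using the AI SDK's APICallError to detect rate-limit errors; the attempt count and one-minute delay are assumptions you'd tune per model:

    import { APICallError, generateText } from "ai";
    import { githubModels } from "@/lib/ai";

    async function generateWithRetry(prompt: string, attempts = 3): Promise<string> {
      for (let attempt = 1; ; attempt++) {
        try {
          const { text } = await generateText({
            model: githubModels("openai/gpt-4.1-mini"),
            prompt,
            maxRetries: 0, // disable the SDK's built-in retries so we control the delay
          });
          return text;
        } catch (error) {
          // APICallError exposes the HTTP status; 429 means we hit a rate limit
          if (!APICallError.isInstance(error) || error.statusCode !== 429 || attempt >= attempts) {
            throw error;
          }
          await new Promise((resolve) => setTimeout(resolve, 60_000));
        }
      }
    }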

GitHub's documentation doesn't make it easy to see which models are in the "Low" rate limit tier; you have to manually click into each model in the model catalog to see its rate limits. For a quick reference, the ones I've found in the Low rate limit tier are: