Inference

Access 9 Inference models through Mastra's model router. Mastra reads the API key from the INFERENCE_API_KEY environment variable, so no authentication code is needed.

Learn more in the Inference documentation.

.env
INFERENCE_API_KEY=your-api-key
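Because the key is read from the environment, a missing variable may only surface on the first model call. A minimal sketch of a fail-fast check follows; `requireApiKey` is a hypothetical helper for illustration and is not part of Mastra.

```typescript
// requireApiKey is a hypothetical helper, not a Mastra API.
// The environment is passed in as a parameter so the check is easy to test.
function requireApiKey(env: Record<string, string | undefined>): string {
  const key = env.INFERENCE_API_KEY?.trim();
  if (!key) {
    throw new Error(
      "INFERENCE_API_KEY is not set; add it to .env before creating agents"
    );
  }
  return key;
}
```

In an application this could be called once at startup as `requireApiKey(process.env)`.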

src/mastra/agents/my-agent.ts
import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  id: "my-agent",
  name: "My Agent",
  instructions: "You are a helpful assistant",
  model: "inference/google/gemma-3"
});

// Generate a response
const response = await agent.generate("Hello!");

// Stream a response
const stream = await agent.stream("Tell me a story");
for await (const chunk of stream) {
  console.log(chunk);
}

info

Mastra uses the OpenAI-compatible /chat/completions endpoint. Some provider-specific features may not be available. Check the Inference documentation for details.
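To illustrate what an OpenAI-compatible /chat/completions request looks like, here is a sketch that builds such a request without sending it. It is not Mastra's internal code. The base URL is the one used in the Custom Headers example on this page; the `Authorization: Bearer` header and stripping the `inference/` router prefix from the model ID are assumptions based on the usual OpenAI-style convention, and `buildChatRequest` is a hypothetical helper.

```typescript
type ChatMessage = {
  role: "system" | "user" | "assistant";
  content: string;
};

interface ChatRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// buildChatRequest is a hypothetical helper, not a Mastra API.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  routerModelId: string,
  messages: ChatMessage[]
): ChatRequest {
  // Assumption: the provider expects the ID without the "inference/" router prefix.
  const providerModelId = routerModelId.replace(/^inference\//, "");
  return {
    // Drop trailing slashes so the joined path has exactly one separator.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        // Assumption: standard OpenAI-style bearer authentication.
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify({ model: providerModelId, messages }),
    },
  };
}
```

The result can be passed to `fetch(request.url, request.init)`; features outside this request shape are the ones most likely to be unavailable through the router.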

Models

9 available models
Model	Context	Input $/1M	Output $/1M
`inference/google/gemma-3`	125K	$0.15	$0.30
`inference/meta/llama-3.1-8b-instruct`	16K	$0.03	$0.03
`inference/meta/llama-3.2-11b-vision-instruct`	16K	$0.06	$0.06
`inference/meta/llama-3.2-1b-instruct`	16K	$0.01	$0.01
`inference/meta/llama-3.2-3b-instruct`	16K	$0.02	$0.02
`inference/mistral/mistral-nemo-12b-instruct`	16K	$0.04	$0.10
`inference/osmosis/osmosis-structure-0.6b`	4K	$0.10	$0.50
`inference/qwen/qwen-2.5-7b-vision-instruct`	125K	$0.20	$0.20
`inference/qwen/qwen3-embedding-4b`	32K	$0.01	—
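The prices in the table are per 1 million tokens, so a request's cost is (input tokens / 1,000,000) × input price plus (output tokens / 1,000,000) × output price. The sketch below applies that arithmetic to a few rows copied from the table; `estimateCostUsd` is a hypothetical helper, and treating the embedding model's missing output price as zero is an assumption.

```typescript
// Prices in USD per 1M tokens, copied from the table above.
// null marks the embedding model, which lists no output price.
type Pricing = { inputPerM: number; outputPerM: number | null };

const PRICING: Record<string, Pricing> = {
  "inference/google/gemma-3": { inputPerM: 0.15, outputPerM: 0.3 },
  "inference/meta/llama-3.1-8b-instruct": { inputPerM: 0.03, outputPerM: 0.03 },
  "inference/mistral/mistral-nemo-12b-instruct": { inputPerM: 0.04, outputPerM: 0.1 },
  "inference/qwen/qwen3-embedding-4b": { inputPerM: 0.01, outputPerM: null },
};

// estimateCostUsd is a hypothetical helper, not a Mastra API.
function estimateCostUsd(
  model: string,
  inputTokens: number,
  outputTokens: number
): number {
  const price = PRICING[model];
  if (!price) throw new Error(`No pricing entry for ${model}`);
  const inputCost = (inputTokens / 1e6) * price.inputPerM;
  // Assumption: a model without an output price produces no billable output tokens.
  const outputCost =
    price.outputPerM === null ? 0 : (outputTokens / 1e6) * price.outputPerM;
  return inputCost + outputCost;
}
```

For example, 2,000 input tokens and 500 output tokens on `inference/google/gemma-3` come to 0.0003 + 0.00015 = 0.00045 USD.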

Advanced Configuration

Custom Headers

src/mastra/agents/my-agent.ts
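import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  id: "custom-agent",
  name: "Custom Agent",
  instructions: "You are a helpful assistant",
  model: {
    url: "https://inference.net/v1",
    id: "inference/google/gemma-3",
    apiKey: process.env.INFERENCE_API_KEY,
    // Extra headers sent with every request to the provider
    headers: {
      "X-Custom-Header": "value"
    }
  }
});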

Dynamic Model Selection

src/mastra/agents/my-agent.ts
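import { Agent } from "@mastra/core/agent";

const agent = new Agent({
  id: "dynamic-agent",
  name: "Dynamic Agent",
  instructions: "You are a helpful assistant",
  // The model is resolved per request from the request context
  model: ({ requestContext }) => {
    const useAdvanced = requestContext.task === "complex";
    // Both branches return chat models; the embedding model in the table
    // cannot serve as an agent's model.
    return useAdvanced
      ? "inference/qwen/qwen-2.5-7b-vision-instruct"
      : "inference/google/gemma-3";
  }
});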
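The selection logic inside a dynamic `model` function can be any rule that returns a model ID. As one illustration, the sketch below picks a model by whether the prompt fits its context window. `pickModelId` and `estimateTokens` are hypothetical helpers; reading "16K" and "125K" from the Models table as 16,000 and 125,000 tokens, and the 4-characters-per-token estimate, are both assumptions.

```typescript
// Context windows taken from the Models table on this page.
// Assumption: "16K" and "125K" mean 16,000 and 125,000 tokens.
const SMALL_MODEL = {
  id: "inference/meta/llama-3.2-3b-instruct",
  contextTokens: 16000,
};
const LARGE_MODEL = {
  id: "inference/google/gemma-3",
  contextTokens: 125000,
};

// Rough heuristic (assumption): about 4 characters per token.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// pickModelId is a hypothetical helper, not a Mastra API.
// It prefers the cheaper small model and falls back to the larger window.
function pickModelId(prompt: string, reservedForOutput = 1024): string {
  const needed = estimateTokens(prompt) + reservedForOutput;
  if (needed <= SMALL_MODEL.contextTokens) return SMALL_MODEL.id;
  if (needed <= LARGE_MODEL.contextTokens) return LARGE_MODEL.id;
  throw new Error(`Prompt needs about ${needed} tokens, more than any listed model`);
}
```

A dynamic `model` function could return `pickModelId(prompt)` if the prompt text is made available through the request context; how it gets there depends on the application.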
