
Answer Relevancy Scorer

The createAnswerRelevancyScorer() function accepts a single options object with the following properties:

Parameters

  • `model` (`LanguageModel`): Configuration for the model used to evaluate relevancy.
  • `uncertaintyWeight` (`number`, default: `0.3`): Weight given to 'unsure' verdicts in scoring (0-1).
  • `scale` (`number`, default: `1`): Maximum score value.

This function returns an instance of the MastraScorer class. The .run() method accepts the same input as other scorers (see the MastraScorer reference), but the return value includes LLM-specific fields as documented below.

.run() Returns

  • `runId` (`string`, optional): The id of the run.
  • `score` (`number`): Relevancy score (0 to `scale`, default 0-1).
  • `preprocessPrompt` (`string`, optional): The prompt sent to the LLM for the preprocess step.
  • `preprocessStepResult` (`object`): Object with extracted statements: `{ statements: string[] }`.
  • `analyzePrompt` (`string`, optional): The prompt sent to the LLM for the analyze step.
  • `analyzeStepResult` (`object`): Object with results: `{ results: Array<{ result: 'yes' | 'unsure' | 'no', reason: string }> }`.
  • `generateReasonPrompt` (`string`, optional): The prompt sent to the LLM for the reason step.
  • `reason` (`string`): Explanation of the score.
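Taken together, the fields above can be sketched as a TypeScript type. This is an illustration derived from this reference, not the library's exported type, and the sample values are invented:

```typescript
// Illustrative shape of the .run() result, based on the fields
// documented in this reference. The library's actual exported
// type may differ in name and detail.
type AnswerRelevancyRunResult = {
  runId?: string;
  score: number; // 0 to `scale`, default 0-1
  preprocessPrompt?: string;
  preprocessStepResult: { statements: string[] };
  analyzePrompt?: string;
  analyzeStepResult: {
    results: Array<{ result: "yes" | "unsure" | "no"; reason: string }>;
  };
  generateReasonPrompt?: string;
  reason: string;
};

// A made-up value conforming to the sketch:
const sample: AnswerRelevancyRunResult = {
  score: 0.65,
  preprocessStepResult: { statements: ["Exercise improves heart health."] },
  analyzeStepResult: {
    results: [{ result: "yes", reason: "Directly addresses the query." }],
  },
  reason: "The answer is mostly relevant to the query.",
};

console.log(sample.analyzeStepResult.results[0].result);
```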

Scoring Details

The scorer evaluates relevancy through query-answer alignment, considering completeness and detail level, but not factual correctness.

Scoring Process

  1. Statement preprocessing:
    • Breaks the output into meaningful statements while preserving context.
  2. Relevancy analysis:
    • Each statement is evaluated as:
      • "yes": full weight for direct matches
      • "unsure": partial weight for approximate matches (default: 0.3)
      • "no": zero weight for irrelevant content
  3. Score calculation:
    • ((direct + uncertainty * partial) / total_statements) * scale
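The calculation step can be sketched as a small self-contained function. This is a hypothetical helper illustrating the formula above, not part of the library:

```typescript
type Verdict = "yes" | "unsure" | "no";

// Hypothetical implementation of the documented formula:
// ((direct + uncertaintyWeight * partial) / total_statements) * scale
function computeRelevancyScore(
  verdicts: Verdict[],
  uncertaintyWeight = 0.3,
  scale = 1,
): number {
  if (verdicts.length === 0) return 0;
  const direct = verdicts.filter((v) => v === "yes").length;
  const partial = verdicts.filter((v) => v === "unsure").length;
  return ((direct + uncertaintyWeight * partial) / verdicts.length) * scale;
}

// Two direct matches, one near match, one miss:
// (2 + 0.3 * 1) / 4 = 0.575
console.log(computeRelevancyScore(["yes", "yes", "unsure", "no"]));
```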

Score Interpretation

A relevancy score between 0 and 1:

  • 1.0: The response fully answers the query with relevant, focused information.
  • 0.7–0.9: The response mostly answers the query but may include minor irrelevant content.
  • 0.4–0.6: The response partially addresses the query with a mix of relevant and irrelevant information.
  • 0.1–0.3: The response contains little relevant content and largely misses the intent of the query.
  • 0.0: The response is entirely irrelevant and does not answer the query.
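The bands above could be mapped to labels with a small helper. This is a hypothetical sketch for readability, not a library API:

```typescript
// Hypothetical helper mapping a relevancy score (0-1) to the
// interpretation bands documented above.
function interpretRelevancy(score: number): string {
  if (score >= 1.0) return "fully relevant";
  if (score >= 0.7) return "mostly relevant";
  if (score >= 0.4) return "partially relevant";
  if (score > 0.0) return "marginally relevant";
  return "irrelevant";
}

console.log(interpretRelevancy(0.575)); // falls in the 0.4-0.6 band
```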

Example

Evaluate agent responses for relevancy across different scenarios:

src/example-answer-relevancy.ts

```typescript
import { runEvals } from "@mastra/core/evals";
import { createAnswerRelevancyScorer } from "@mastra/evals/scorers/prebuilt";

import { myAgent } from "./agent";

const scorer = createAnswerRelevancyScorer({ model: "openai/gpt-4o" });

const result = await runEvals({
  data: [
    { input: "What are the health benefits of regular exercise?" },
    { input: "What should a healthy breakfast include?" },
    { input: "What are the benefits of meditation?" },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
      reason: scorerResults[scorer.id].reason,
    });
  },
});

console.log(result.scores);
```

For more details on runEvals, see the runEvals reference.

To add this scorer to an agent, see the Scorers overview guide.
