Answer Relevancy Scorer
The createAnswerRelevancyScorer() function accepts a single options object with the following properties:
Parameters
model:
uncertaintyWeight:
scale:
This function returns an instance of the MastraScorer class. The .run() method accepts the same input as other scorers (see the MastraScorer reference), but the return value includes LLM-specific fields as documented below.
.run() Returns
runId:
score:
preprocessPrompt:
preprocessStepResult:
analyzePrompt:
analyzeStepResult:
generateReasonPrompt:
reason:
Scoring Details
The scorer evaluates relevancy by measuring how well the answer aligns with the query. It considers completeness and level of detail, but it does not assess factual correctness.
Scoring Process
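- Statement preprocessing:
  - Breaks the output into meaningful statements while preserving context.
- Relevance analysis:
  - Each statement is evaluated as:
    - "yes": full weight for direct matches
    - "unsure": partial weight (default: 0.3) for approximate matches
    - "no": zero weight for irrelevant content
- Score calculation:
  - ((direct + uncertainty * partial) / total_statements) * scale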
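The score formula above can be sketched as a small standalone function. This is an illustration only, not the library's implementation: the `Verdict` type and the `relevancyScore` function are hypothetical names, the default `scale` of 1 is an assumption, and returning 0 for an empty statement list is an assumed convention to avoid dividing by zero.

```typescript
// Hypothetical type: the labels mirror the three verdicts described above.
type Verdict = "yes" | "unsure" | "no";

// Sketch of ((direct + uncertainty * partial) / total_statements) * scale.
// uncertaintyWeight defaults to 0.3 as documented; scale = 1 is an assumption.
function relevancyScore(
  verdicts: Verdict[],
  uncertaintyWeight = 0.3,
  scale = 1,
): number {
  const total = verdicts.length;
  // Assumed convention: no statements means no relevant content.
  if (total === 0) return 0;

  const direct = verdicts.filter((v) => v === "yes").length;
  const partial = verdicts.filter((v) => v === "unsure").length;

  return ((direct + uncertaintyWeight * partial) / total) * scale;
}

// Two direct matches, one approximate match, one irrelevant statement:
// (2 + 0.3 * 1) / 4 * 1
console.log(relevancyScore(["yes", "yes", "unsure", "no"]));
```

Raising `uncertaintyWeight` gives approximate matches more credit, while `scale` only stretches the final range.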
Score Interpretation
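A relevancy score between 0 and 1:
- 1.0: The response fully answers the query with relevant, focused information.
- 0.7–0.9: The response mostly answers the query but may include minor unrelated content.
- 0.4–0.6: The response partially answers the query, mixing relevant and unrelated information.
- 0.1–0.3: The response includes minimal relevant content and largely misses the intent of the query.
- 0.0: The response is entirely unrelated and does not answer the query.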
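If you want to turn a score into one of the bands above, for example when logging, a helper along these lines may be useful. It is a hypothetical sketch: `describeRelevancy` and its labels are not part of the library. The listed bands leave gaps (for instance 0.65 or 0.95), so this sketch assumes each gap belongs to the lower band, and it assumes scores on a 0 to 1 range.

```typescript
// Hypothetical helper: maps a 0..1 relevancy score to a descriptive band.
// Assumption: values falling between the documented bands (e.g. 0.95)
// are assigned to the lower band.
function describeRelevancy(score: number): string {
  if (Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError(`score must be between 0 and 1, got ${score}`);
  }
  if (score >= 1) return "fully relevant";
  if (score >= 0.7) return "mostly relevant";
  if (score >= 0.4) return "partially relevant";
  if (score > 0) return "minimally relevant";
  return "not relevant";
}

console.log(describeRelevancy(0.8));
```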
Example
Evaluate agent responses for relevancy across different scenarios:
import { runEvals } from "@mastra/core/evals";
import { createAnswerRelevancyScorer } from "@mastra/evals/scorers/prebuilt";
import { myAgent } from "./agent";

const scorer = createAnswerRelevancyScorer({ model: "openai/gpt-4o" });

const result = await runEvals({
  data: [
    {
      input: "What are the health benefits of regular exercise?",
    },
    {
      input: "What should a healthy breakfast include?",
    },
    {
      input: "What are the benefits of meditation?",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
      reason: scorerResults[scorer.id].reason,
    });
  },
});

console.log(result.scores);
For more details on runEvals, see the runEvals reference.
To add this scorer to an agent, see the Scorers overview guide.
Related