
Content Similarity Scorer

The createContentSimilarityScorer() function measures the textual similarity between two strings and returns a score indicating how closely they match. It supports configurable options for case sensitivity and whitespace handling.

Parameters

The createContentSimilarityScorer() function accepts a single options object with the following properties:

ignoreCase: boolean = true
Whether to ignore case differences when comparing strings.

ignoreWhitespace: boolean = true
Whether to normalize whitespace when comparing strings.
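To illustrate what the two options might do to a string before comparison, here is a minimal sketch. The `normalizeText` helper and `NormalizeOptions` type are hypothetical and not part of @mastra/evals, and the exact whitespace rule (collapse runs to one space, then trim) is an assumption rather than the library's documented behavior.

```typescript
// Hypothetical helper for illustration; not part of @mastra/evals.
interface NormalizeOptions {
  ignoreCase?: boolean;
  ignoreWhitespace?: boolean;
}

function normalizeText(
  text: string,
  { ignoreCase = true, ignoreWhitespace = true }: NormalizeOptions = {},
): string {
  let result = text;
  if (ignoreCase) {
    // Fold case so "Hello" and "hello" compare as equal.
    result = result.toLowerCase();
  }
  if (ignoreWhitespace) {
    // Assumption: collapse whitespace runs to one space and trim both ends.
    result = result.replace(/\s+/g, " ").trim();
  }
  return result;
}

console.log(normalizeText("  Hello   World "));                        // "hello world"
console.log(normalizeText("  Hello   World ", { ignoreCase: false })); // "Hello World"
```

With both options at their default of true, strings that differ only in letter case or spacing normalize to the same text.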

This function returns an instance of the MastraScorer class. See the MastraScorer reference for details on the .run() method and its input/output.

.run() Returns

runId: string
The id of the run (optional).

preprocessStepResult: object
Object with processed input and output: { processedInput: string, processedOutput: string }

analyzeStepResult: object
Object with similarity: { similarity: number }

score: number
Similarity score (0-1) where 1 indicates perfect similarity.
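The fields above can be written down as a TypeScript type, which may help when consuming results in typed code. This is a sketch built only from the field list: the `ContentSimilarityResult` name and the `describeResult` helper are hypothetical, and the sample values are hand-written rather than taken from a real run.

```typescript
// Shape inferred from the documented fields; the interface name is hypothetical.
interface ContentSimilarityResult {
  runId?: string; // documented as optional
  preprocessStepResult: { processedInput: string; processedOutput: string };
  analyzeStepResult: { similarity: number };
  score: number; // 0 to 1, where 1 indicates perfect similarity
}

// Hypothetical helper that formats a result for logging.
function describeResult(result: ContentSimilarityResult): string {
  const { processedInput, processedOutput } = result.preprocessStepResult;
  return `"${processedInput}" vs "${processedOutput}": score ${result.score.toFixed(2)}`;
}

// Hand-written sample values, not output from a real run.
const sample: ContentSimilarityResult = {
  runId: "run-123",
  preprocessStepResult: { processedInput: "hello world", processedOutput: "hello there" },
  analyzeStepResult: { similarity: 0.5 },
  score: 0.5,
};

console.log(describeResult(sample)); // "hello world" vs "hello there": score 0.50
```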

Scoring Details

The scorer evaluates textual similarity through character-level matching and configurable text normalization.

Scoring Process

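  1. Normalizes text:
    • Case normalization (if ignoreCase is true)
    • Whitespace normalization (if ignoreWhitespace is true)
  2. Compares the processed strings using a string-similarity algorithm:
    • Analyzes character sequences
    • Aligns word boundaries
    • Considers relative positions
    • Accounts for length differences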

Final score: similarity_value * scale
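The comparison and scaling steps can be sketched as follows. The text above does not name the string-similarity algorithm, so this example assumes a Sørensen-Dice coefficient over character bigrams as a stand-in; it reflects character sequences and length differences but does not attempt word-boundary alignment. The `diceSimilarity` and `finalScore` functions and the `scale` parameter (defaulting to 1) are hypothetical names for illustration, and the inputs are assumed to be already normalized.

```typescript
// Count character bigrams, e.g. "night" -> ni, ig, gh, ht.
function bigramCounts(text: string): Map<string, number> {
  const counts = new Map<string, number>();
  for (let i = 0; i < text.length - 1; i++) {
    const gram = text.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  return counts;
}

// Sørensen-Dice coefficient: an assumed stand-in for the unnamed algorithm.
function diceSimilarity(a: string, b: string): number {
  if (a === b) return 1;
  if (a.length < 2 || b.length < 2) return 0;
  const countsA = bigramCounts(a);
  const countsB = bigramCounts(b);
  let overlap = 0;
  countsA.forEach((countA, gram) => {
    // Shared bigrams count at most as often as they occur in both strings.
    overlap += Math.min(countA, countsB.get(gram) ?? 0);
  });
  const totalBigrams = a.length - 1 + (b.length - 1);
  return (2 * overlap) / totalBigrams;
}

// Final score: similarity_value * scale (scale is a hypothetical parameter).
function finalScore(a: string, b: string, scale = 1): number {
  return diceSimilarity(a, b) * scale;
}

console.log(finalScore("night", "nacht")); // 0.25 (one shared bigram, "ht", out of 4 + 4)
```

Because the denominator counts the bigrams of both strings, a large length mismatch lowers the score even when the shorter string is fully contained in the longer one.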

Example

Evaluate textual similarity between expected and actual agent outputs:

src/example-content-similarity.ts
import { runEvals } from "@mastra/core/evals";
import { createContentSimilarityScorer } from "@mastra/evals/scorers/prebuilt";
import { myAgent } from "./agent";

// Use the defaults: ignoreCase and ignoreWhitespace are both true.
const scorer = createContentSimilarityScorer();

const result = await runEvals({
  data: [
    {
      input: "Summarize the benefits of TypeScript",
      groundTruth:
        "TypeScript provides static typing, better tooling support, and improved code maintainability.",
    },
    {
      input: "What is machine learning?",
      groundTruth:
        "Machine learning is a subset of AI that enables systems to learn from data without explicit programming.",
    },
  ],
  scorers: [scorer],
  target: myAgent,
  // Log this scorer's result for each data item as it completes.
  onItemComplete: ({ scorerResults }) => {
    console.log({
      score: scorerResults[scorer.id].score,
      groundTruth: scorerResults[scorer.id].groundTruth,
    });
  },
});

console.log(result.scores);

For more details on runEvals, see the runEvals reference.

To add this scorer to an agent, see the Scorers overview guide.

Score interpretation

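A similarity score between 0 and 1:

  • 1.0: Perfect match. The content is nearly identical.
  • 0.7–0.9: High similarity. Only minor differences in word choice or sentence structure.
  • 0.4–0.6: Moderate similarity. Clear overall overlap, but with noticeable differences.
  • 0.1–0.3: Low similarity. Few shared elements or little common meaning.
  • 0.0: No similarity. The content is completely different.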
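For reporting, the bands above can be turned into labels with a small helper. This is a sketch under a stated assumption: the listed ranges leave gaps (for example, between 0.9 and 1.0), so each band here is widened to run up to the start of the next one. The `interpretScore` function and `SimilarityBand` type are hypothetical and not part of @mastra/evals.

```typescript
type SimilarityBand = "perfect" | "high" | "moderate" | "low" | "none";

// Assumption: gaps between the listed ranges are assigned to the lower band,
// so "high" covers [0.7, 1), "moderate" [0.4, 0.7), and "low" (0, 0.4).
function interpretScore(score: number): SimilarityBand {
  if (Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError(`score must be between 0 and 1, got ${score}`);
  }
  if (score === 1) return "perfect";
  if (score >= 0.7) return "high";
  if (score >= 0.4) return "moderate";
  if (score > 0) return "low";
  return "none";
}

console.log(interpretScore(0.85)); // "high"
console.log(interpretScore(0.5));  // "moderate"
```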

Related