Building a Research Paper Assistant with RAG

In this guide, you'll create an AI research assistant that can analyze academic papers and answer specific questions about their content using Retrieval Augmented Generation (RAG).

You'll use the foundational Transformer paper "Attention Is All You Need" (https://arxiv.org/html/1706.03762) as your example. For storage, you'll use a local libSQL database.

Prerequisites

- Node.js v22.13.0 or later installed
- An API key from a supported model provider
- An existing Mastra project (follow the installation guide to set up a new project)

How RAG works

Let's look at how RAG works and how you'll implement each component.

Knowledge Store/Index

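- Converts text into vector representations
- Creates numerical representations of content
- Implementation: You'll use OpenAI's text-embedding-3-small to create embeddings and store them in LibSQLVector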

Retriever

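- Finds relevant content via similarity search
- Matches query embeddings against stored vectors
- Implementation: You'll use LibSQLVector to run similarity searches over the stored embeddings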
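
To make "matching query embeddings against stored vectors" concrete, here is a minimal sketch that ranks a few hand-written vectors by cosine similarity. The `StoredChunk` type, the `topK` helper, and the three-dimensional toy vectors are assumptions made for this illustration only. In the guide, LibSQLVector handles the real search over 1536-dimensional embeddings.

```typescript
// Toy illustration of similarity search; this is not LibSQLVector's implementation.
type StoredChunk = { text: string; vector: number[] };

// Cosine similarity: dot product divided by the product of the vector lengths.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("Vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Treat a zero vector as having no similarity to anything.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Score every stored chunk against the query and keep the k best matches.
function topK(query: number[], store: StoredChunk[], k: number) {
  return store
    .map((chunk) => ({
      text: chunk.text,
      score: cosineSimilarity(query, chunk.vector),
    }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k);
}

const store: StoredChunk[] = [
  { text: "self-attention", vector: [1, 0, 0] },
  { text: "positional encoding", vector: [0, 1, 0] },
  { text: "multi-head attention", vector: [0.9, 0.1, 0] },
];

const results = topK([1, 0, 0], store, 2);
console.log(results.map((r) => r.text));
// The two closest chunks are "self-attention" and "multi-head attention".
```

A real vector store typically uses an index rather than scoring every row, but the ranking idea is the same.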

Generator

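- Processes the retrieved content with an LLM
- Creates contextually informed responses
- Implementation: You'll use GPT-4o-mini to generate answers based on the retrieved content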
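
The generation step boils down to handing the model the retrieved passages together with the question. The sketch below shows one way that prompt could be assembled. `RetrievedChunk` and `buildGroundedPrompt` are hypothetical names for this example; in the guide itself the agent does this implicitly by calling the vector query tool.

```typescript
// Hypothetical helper: turns retrieved chunks into a grounded prompt.
type RetrievedChunk = { text: string; source: string };

function buildGroundedPrompt(question: string, chunks: RetrievedChunk[]): string {
  // With no context, instruct the model to acknowledge the gap instead of guessing.
  if (chunks.length === 0) {
    return `No relevant context was found. Say that you cannot answer the question: ${question}`;
  }
  // Number each chunk so an answer can refer back to its source.
  const context = chunks
    .map((chunk, i) => `[${i + 1}] (${chunk.source}) ${chunk.text}`)
    .join("\n");
  return [
    "Answer using only the context below.",
    "Context:",
    context,
    `Question: ${question}`,
  ].join("\n");
}

const prompt = buildGroundedPrompt("What is attention?", [
  { text: "An attention function maps a query and key-value pairs to an output.", source: "transformer-paper" },
]);
console.log(prompt);
```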

Your implementation will:

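- Process the Transformer paper into embeddings
- Store them in LibSQLVector for fast retrieval
- Use similarity search to find relevant sections
- Generate accurate responses using the retrieved context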

Creating the Agent

Let's define the agent's behavior, connect it to your Mastra project, and create the vector store.

Install additional dependencies
After running the installation guide, you'll need to install additional dependencies:
npm install @mastra/rag@latest ai
pnpm add @mastra/rag@latest ai
yarn add @mastra/rag@latest ai
bun add @mastra/rag@latest ai

Now you'll create your RAG-enabled research assistant. The agent uses:

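- A vector query tool that performs semantic search over the vector store to find relevant content in the paper
- GPT-4o-mini for understanding queries and generating responses
- Custom instructions that guide the agent on how to analyze papers, use retrieved content effectively, and acknowledge its limitations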

Create a new file src/mastra/agents/researchAgent.ts and define your agent:

src/mastra/agents/researchAgent.ts
import { Agent } from "@mastra/core/agent";
import { ModelRouterEmbeddingModel } from "@mastra/core/llm";
import { createVectorQueryTool } from "@mastra/rag";

// Create a tool for semantic search over the paper embeddings
const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: "libSqlVector",
  indexName: "papers",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
});

export const researchAgent = new Agent({
  id: "research-agent",
  name: "Research Assistant",
  instructions: `You are a helpful research assistant that analyzes academic papers and technical documents.
    Use the provided vector query tool to find relevant information from your knowledge base,
    and provide accurate, well-supported answers based on the retrieved content.
    Focus on the specific content available in the tool and acknowledge if you cannot find sufficient information to answer a question.
    Base your responses only on the content provided, not on general knowledge.`,
  model: "openai/gpt-4o-mini",
  tools: {
    vectorQueryTool,
  },
});

In the root of your project, get the absolute path with the pwd command. The path might look like this:
```
> pwd
/Users/your-name/guides/research-assistant
```
In your src/mastra/index.ts file, add the following to your existing file and configuration:
src/mastra/index.ts
```
import { Mastra } from "@mastra/core";
import { LibSQLVector } from "@mastra/libsql";

const libSqlVector = new LibSQLVector({
  id: 'research-vectors',
  url: "file:/Users/your-name/guides/research-assistant/vector.db",
});

export const mastra = new Mastra({
  vectors: { libSqlVector },
});
```
For the url, use the absolute path you got from the pwd command. This way the vector.db file is created at the root of your project.
Note
This guide uses a hardcoded absolute path to a local libSQL file. That won't work in production, where you should use a remote, persistent database instead.
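
If you'd rather not paste the absolute path by hand, the url can be derived at runtime. The following is a minimal sketch under two assumptions: `VECTOR_DB_URL` is a hypothetical environment variable name invented for this example, and the script is started from the project root so that `process.cwd()` matches the output of pwd.

```typescript
import * as path from "node:path";

// VECTOR_DB_URL is a hypothetical variable name chosen for this sketch.
// When it is set (for example to a remote database URL), it takes precedence.
function resolveVectorDbUrl(
  env: Record<string, string | undefined>,
  cwd: string,
): string {
  const configuredUrl = env.VECTOR_DB_URL;
  if (configuredUrl) return configuredUrl;
  // Otherwise fall back to a local vector.db file in the given directory.
  return `file:${path.resolve(cwd, "vector.db")}`;
}

console.log(resolveVectorDbUrl(process.env, process.cwd()));
```

The returned string could then be passed as the `url` option of `LibSQLVector`. Note that `process.cwd()` depends on where the command is run from, which is why the guide uses a fixed path.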

In the src/mastra/index.ts file, add the agent to Mastra:

src/mastra/index.ts
import { Mastra } from "@mastra/core";
import { LibSQLVector } from "@mastra/libsql";
import { researchAgent } from "./agents/researchAgent";

const libSqlVector = new LibSQLVector({
  id: 'research-vectors',
  url: "file:/Users/your-name/guides/research-assistant/vector.db",
});

export const mastra = new Mastra({
  agents: { researchAgent },
  vectors: { libSqlVector },
});

Processing documents

In the following steps, you'll fetch the research paper, split it into smaller chunks, generate embeddings for them, and store these chunks in the vector database.

In this step, the research paper is fetched from a URL, converted to a document object, and split into smaller, manageable chunks. Smaller chunks are cheaper to embed and let the retriever return only the passages relevant to a question.
Create a new file src/store.ts and add the following:
src/store.ts
```
import { MDocument } from "@mastra/rag";

// Load the paper
const paperUrl = "https://arxiv.org/html/1706.03762";
const response = await fetch(paperUrl);
const paperText = await response.text();

// Create document and chunk it
const doc = MDocument.fromText(paperText);
const chunks = await doc.chunk({
  strategy: "recursive",
  maxSize: 512,
  overlap: 50,
  separators: ["\n\n", "\n", " "],
});

console.log("Number of chunks:", chunks.length);
```
Run the file in your terminal:
```
npx bun src/store.ts
```
You should get output like this:
```
Number of chunks: 892
```
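
To build intuition for the `maxSize` and `overlap` settings, here is a simplified fixed-size sliding-window chunker. It is not Mastra's "recursive" strategy, which also splits on the configured separators. `chunkWithOverlap` is a hypothetical helper, and measuring sizes in characters is an assumption of this sketch.

```typescript
// Simplified sliding-window chunker; an illustration only, not Mastra's recursive strategy.
function chunkWithOverlap(text: string, maxSize: number, overlap: number): string[] {
  if (maxSize <= 0 || overlap < 0 || overlap >= maxSize) {
    throw new Error("Require maxSize > 0 and 0 <= overlap < maxSize");
  }
  const chunks: string[] = [];
  // Each new chunk starts (maxSize - overlap) characters after the previous one,
  // so neighbouring chunks share `overlap` characters.
  const step = maxSize - overlap;
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + maxSize));
    // Stop once a chunk has reached the end of the text.
    if (start + maxSize >= text.length) break;
  }
  return chunks;
}

const demo = chunkWithOverlap("abcdefghij", 4, 1);
console.log(demo);
// chunks: "abcd", "defg", "ghij" (each shares one character with its neighbour)
```

The overlap means a sentence that straddles a chunk boundary still appears intact in at least one chunk, at the cost of storing some text twice.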

Finally, you'll prepare the content for RAG by:

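- Generating embeddings for each chunk of text
- Creating a vector store index to hold the embeddings
- Storing both the embeddings and metadata (original text and source information) in the vector database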

Note

This metadata is crucial because it lets the vector store return the actual content when it finds relevant matches.

This allows the agent to efficiently search and retrieve relevant information.

Open the src/store.ts file and update it to the following:

src/store.ts
import { MDocument } from "@mastra/rag";
import { ModelRouterEmbeddingModel } from "@mastra/core/llm";
import { embedMany } from "ai";
import { mastra } from "./mastra";

// Load the paper
const paperUrl = "https://arxiv.org/html/1706.03762";
const response = await fetch(paperUrl);
const paperText = await response.text();

// Create document and chunk it
const doc = MDocument.fromText(paperText);
const chunks = await doc.chunk({
  strategy: "recursive",
  maxSize: 512,
  overlap: 50,
  separators: ["\n\n", "\n", " "],
});

// Generate embeddings
const { embeddings } = await embedMany({
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  values: chunks.map((chunk) => chunk.text),
});

// Get the vector store instance from Mastra
const vectorStore = mastra.getVector("libSqlVector");

// Create an index for paper chunks
await vectorStore.createIndex({
  indexName: "papers",
  dimension: 1536,
});

// Store embeddings
await vectorStore.upsert({
  indexName: "papers",
  vectors: embeddings,
  metadata: chunks.map((chunk) => ({
    text: chunk.text,
    source: "transformer-paper",
  })),
});
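
The upsert call relies on two things lining up: there is one embedding per chunk, and every embedding has the dimension declared for the index (1536 here). A small pre-flight check can catch a mismatch before anything is written. The helper below is a hypothetical sketch and not part of Mastra.

```typescript
// Hypothetical pre-flight check; the name and error messages are illustrative.
function assertEmbeddingsMatch(
  embeddings: number[][],
  chunkCount: number,
  dimension: number,
): void {
  // One embedding per chunk, so metadata[i] describes vectors[i].
  if (embeddings.length !== chunkCount) {
    throw new Error(`Expected ${chunkCount} embeddings, got ${embeddings.length}`);
  }
  // Every vector must match the dimension the index was created with.
  embeddings.forEach((vector, i) => {
    if (vector.length !== dimension) {
      throw new Error(`Embedding ${i} has dimension ${vector.length}, expected ${dimension}`);
    }
  });
}

// Toy data: two chunks with three-dimensional vectors.
assertEmbeddingsMatch([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]], 2, 3);
console.log("embeddings look consistent");
```

In the script above, this could be called as `assertEmbeddingsMatch(embeddings, chunks.length, 1536)` just before `vectorStore.upsert`.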

Lastly, run the script again to store the embeddings:

npx bun src/store.ts

If the operation succeeded, you should see no output or errors in your terminal.

Test the Assistant

Now that the vector database contains all the embeddings, you can test the research assistant with different types of queries.

Create a new file src/ask-agent.ts and add different types of queries:

src/ask-agent.ts
import { mastra } from "./mastra";
const agent = mastra.getAgent("researchAgent");

// Basic query about concepts
const query1 =
  "What problems does sequence modeling face with neural networks?";
const response1 = await agent.generate(query1);
console.log("\nQuery:", query1);
console.log("Response:", response1.text);

Run the script:

npx bun src/ask-agent.ts

You should see output like:

Query: What problems does sequence modeling face with neural networks?
Response: Sequence modeling with neural networks faces several key challenges:
1. Vanishing and exploding gradients during training, especially with long sequences
2. Difficulty handling long-term dependencies in the input
3. Limited computational efficiency due to sequential processing
4. Challenges in parallelizing computations, resulting in longer training times

Try another question:

src/ask-agent.ts
import { mastra } from "./mastra";
const agent = mastra.getAgent("researchAgent");

// Query about specific findings
const query2 = "What improvements were achieved in translation quality?";
const response2 = await agent.generate(query2);
console.log("\nQuery:", query2);
console.log("Response:", response2.text);

Output:

Query: What improvements were achieved in translation quality?
Response: The model showed significant improvements in translation quality, achieving more than 2.0
BLEU points improvement over previously reported models on the WMT 2014 English-to-German translation
task, while also reducing training costs.
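
Rather than editing the script for each question, you could loop over a list of queries. The sketch below is written against a minimal `TextAgent` shape (a `generate` method that resolves to an object with a `text` field, as used above) and runs with a stub so that it works without an API key. `TextAgent`, `askAll`, and `stubAgent` are names made up for this example.

```typescript
// Minimal shape this sketch needs from an agent; an assumption based on how
// agent.generate() and response.text are used in the script above.
type TextAgent = { generate(prompt: string): Promise<{ text: string }> };

async function askAll(agent: TextAgent, queries: string[]): Promise<string[]> {
  const answers: string[] = [];
  for (const query of queries) {
    // Run queries one at a time so the printed output stays in order.
    const response = await agent.generate(query);
    console.log("\nQuery:", query);
    console.log("Response:", response.text);
    answers.push(response.text);
  }
  return answers;
}

// Stub agent so the sketch runs without a model provider.
const stubAgent: TextAgent = {
  generate: async (prompt) => ({ text: `stub answer for: ${prompt}` }),
};

const answersPromise = askAll(stubAgent, [
  "What problems does sequence modeling face with neural networks?",
  "What improvements were achieved in translation quality?",
]);
```

In src/ask-agent.ts, the agent returned by `mastra.getAgent("researchAgent")` should be usable in place of the stub, assuming its `generate` method accepts a plain string as it does in the script above.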

Serve the Application

Start the Mastra server to expose your research assistant via API:

mastra dev

Your research assistant will be available at:

http://localhost:4111/api/agents/researchAgent/generate

Test with curl:

curl -X POST http://localhost:4111/api/agents/researchAgent/generate \
  -H "Content-Type: application/json" \
  -d '{
    "messages": [
      { "role": "user", "content": "What were the main findings about model parallelization?" }
    ]
  }'
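
The same request can be made from TypeScript. This sketch only builds the request so it can run without a server; the commented lines show how it might be sent with `fetch` while `mastra dev` is running. `buildGenerateRequest` is a hypothetical helper, and the URL and body simply mirror the curl command above.

```typescript
type GenerateRequest = {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
};

// Hypothetical helper mirroring the curl command: same endpoint, headers, and body.
function buildGenerateRequest(
  question: string,
  baseUrl = "http://localhost:4111",
): GenerateRequest {
  return {
    url: `${baseUrl}/api/agents/researchAgent/generate`,
    init: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        messages: [{ role: "user", content: question }],
      }),
    },
  };
}

const request = buildGenerateRequest(
  "What were the main findings about model parallelization?",
);
console.log(request.url);

// To send it while the server is running:
// const response = await fetch(request.url, request.init);
// console.log(await response.json());
```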

Advanced RAG Examples

Explore these examples for more advanced RAG techniques:

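- Filter RAG for filtering results using metadata
- Cleanup RAG for optimizing information density
- Chain of Thought RAG for complex reasoning queries using workflows
- Rerank RAG for improved result relevance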
