createVectorQueryTool()

The createVectorQueryTool() function creates a tool for semantic search over vector stores. It supports filtering, reranking, database-specific configurations, and integrates with various vector store backends.

Basic Usage

import { createVectorQueryTool } from "@mastra/rag";
import { ModelRouterEmbeddingModel } from "@mastra/core/llm";

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
});

Parameters

note

Parameter requirements: Most fields can be set with defaults at creation. Some fields can be overridden at runtime via the request context or tool input. If a required field is missing at both creation and runtime, an error is thrown. Note that model, id, and description can only be set at creation.

id?: string
Custom ID for the tool. Defaults to 'VectorQuery {vectorStoreName} {indexName} Tool'. (Set at creation only.)

description?: string
Custom description for the tool. Defaults to 'Access the knowledge base to find information needed to answer user questions'. (Set at creation only.)

model: EmbeddingModel
Embedding model to use for vector search. (Set at creation only.)

vectorStoreName: string
Name of the vector store to query. (Can be set at creation or overridden at runtime.)

indexName: string
Name of the index within the vector store. (Can be set at creation or overridden at runtime.)

enableFilter?: boolean = false
Enable filtering of results based on metadata. (Set at creation only, but automatically enabled if a filter is provided in the request context.)

includeVectors?: boolean = false
Include the embedding vectors in the results. (Can be set at creation or overridden at runtime.)

includeSources?: boolean = true
Include the full retrieval objects in the results. (Can be set at creation or overridden at runtime.)

reranker?: RerankConfig
Options for reranking results. (Can be set at creation or overridden at runtime.)

databaseConfig?: DatabaseConfig
Database-specific configuration options for optimizing queries. (Can be set at creation or overridden at runtime.)

providerOptions?: Record<string, Record<string, any>>
Provider-specific options for the embedding model (e.g., outputDimensionality). Important: this only works with AI SDK EmbeddingModelV2 models. For V1 models, configure options when creating the model itself. (See the sketch after this list.)

vectorStore?: MastraVector | VectorStoreResolver
Direct vector store instance or a resolver function for dynamic selection. Use a function for multi-tenant applications where the vector store is selected based on request context. When provided, vectorStoreName becomes optional.
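As an illustration of providerOptions, the sketch below passes a reduced embedding dimensionality. The "openai" provider key and the exact option shape are assumptions inferred from the Record<string, Record<string, any>> type above, not a confirmed Mastra API:

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  providerOptions: {
    // Hypothetical provider key and option; applies only to AI SDK EmbeddingModelV2 models
    openai: { outputDimensionality: 512 },
  },
});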

DatabaseConfig

The DatabaseConfig type allows you to specify database-specific configurations that are automatically applied to query operations. This enables you to take advantage of unique features and optimizations offered by different vector stores.

pinecone?: PineconeConfig
Configuration specific to the Pinecone vector store.

  namespace?: string
  Pinecone namespace for organizing vectors.

  sparseVector?: { indices: number[]; values: number[]; }
  Sparse vector for hybrid search.

pgvector?: PgVectorConfig
Configuration specific to PostgreSQL with the pgvector extension.

  minScore?: number
  Minimum similarity score threshold for results.

  ef?: number
  HNSW search parameter - controls the accuracy vs. speed tradeoff.

  probes?: number
  IVFFlat probe parameter - number of cells to visit during search.

chroma?: ChromaConfig
Configuration specific to the Chroma vector store.

  where?: Record<string, any>
  Metadata filtering conditions.

  whereDocument?: Record<string, any>
  Document content filtering conditions.

RerankConfig

model: MastraLanguageModel
Language model to use for reranking.

options?: RerankerOptions
Options for the reranking process.

  weights?: WeightConfig
  Weights for scoring components (semantic: 0.4, vector: 0.4, position: 0.2).

  topK?: number
  Number of top results to return.

Returns

The tool returns an object with:

relevantContext: string
Combined text from the most relevant document chunks.

sources: QueryResult[]
Array of full retrieval result objects. Each object contains all information needed to reference the original document, chunk, and similarity score.

QueryResult object structure

{
  id: string;       // Unique chunk/document identifier
  metadata: any;    // All metadata fields (document ID, etc.)
  vector: number[]; // Embedding vector (if available)
  score: number;    // Similarity score for this retrieval
  document: string; // Full chunk/document text (if available)
}
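As a minimal sketch of consuming these results, assuming a queryResult returned by the tool's execute call (shown later under Usage Without a Mastra Server):

for (const source of queryResult.sources) {
  console.log(source.id, source.score);       // chunk identifier and similarity score
  console.log(source.metadata);               // metadata fields (document ID, etc.)
  console.log(source.document?.slice(0, 80)); // preview of the chunk text, if available
}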

Default Tool Description

The default description focuses on:

  • Finding relevant information in stored knowledge
  • Answering user questions
  • Retrieving factual content

Result Handling

The tool determines the number of results to return based on the user's query, with a default of 10 results. This can be adjusted based on the query requirements, as shown in the sketch below.
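For example, topK can be supplied through the request context. This is a minimal sketch; the agent and queryTool from Basic Usage and the RequestContext import shown later are assumed:

const requestContext = new RequestContext();
requestContext.set("topK", 3); // Return 3 results instead of the default 10

const response = await agent.generate("Summarize the deployment docs", {
  requestContext,
});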

Example with Filters

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  enableFilter: true,
});

With filtering enabled, the tool processes queries to construct metadata filters that combine with semantic search. The process works as follows:

  1. A user makes a query with specific filter requirements, such as "find content where the 'version' field is greater than 2.0"

  2. The agent analyzes the query and constructs the appropriate filter:

    {
      "version": { "$gt": 2.0 }
    }

This agent-driven approach:

  • Processes natural language queries into filter specifications
  • Implements vector store-specific filter syntax
  • Translates query terms into filter operators

For detailed filter syntax and store-specific capabilities, see the Metadata Filters documentation.

For an example of how agent-driven filtering works, see the Agent-Driven Metadata Filtering example.

Example with Reranking

const queryTool = createVectorQueryTool({
  vectorStoreName: "milvus",
  indexName: "documentation",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  reranker: {
    model: "openai/gpt-5.1",
    options: {
      weights: {
        semantic: 0.5, // Semantic relevance weight
        vector: 0.3,   // Vector similarity weight
        position: 0.2, // Original position weight
      },
      topK: 5,
    },
  },
});

Reranking improves result quality by combining:

  • Semantic relevance: LLM-based scoring of text similarity
  • Vector similarity: the original vector distance scores
  • Position bias: consideration of the original result ordering
  • Query analysis: adjustments based on query characteristics

The reranker processes the initial vector search results and returns a reordered list optimized for relevance.
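Conceptually, the final score is a weighted blend of those components. The sketch below is illustrative only, not Mastra's actual implementation, using the weights from the example above:

// Illustrative scoring sketch (not the library's exact formula)
function combinedScore(semantic: number, vector: number, position: number): number {
  const weights = { semantic: 0.5, vector: 0.3, position: 0.2 };
  return (
    weights.semantic * semantic +
    weights.vector * vector +
    weights.position * position
  );
}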

Example with Custom Description

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  description:
    "Search through document archives to find relevant information for answering questions about company policies and procedures",
});

This example shows how to customize the tool description for a specific use case while maintaining its core purpose of information retrieval.

Database-Specific Configuration Examples

The databaseConfig parameter allows you to leverage unique features and optimizations specific to each vector database. These configurations are automatically applied during query execution.

Pinecone Configuration

const pineconeQueryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  databaseConfig: {
    pinecone: {
      namespace: "production", // Organize vectors by environment
      sparseVector: {          // Enable hybrid search
        indices: [0, 1, 2, 3],
        values: [0.1, 0.2, 0.15, 0.05],
      },
    },
  },
});

Pinecone features:

  • Namespaces: isolate different datasets within the same index
  • Sparse vectors: combine dense and sparse embeddings for better search quality
  • Use cases: multi-tenant applications, hybrid semantic search
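The other documented configs follow the same pattern. Below is a minimal sketch for pgvector and Chroma using only the PgVectorConfig and ChromaConfig fields listed above; the parameter values and filter conditions are illustrative, not recommendations:

const pgVectorQueryTool = createVectorQueryTool({
  vectorStoreName: "pgvector",
  indexName: "docs",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  databaseConfig: {
    pgvector: {
      minScore: 0.7, // Drop results below this similarity score
      ef: 200,       // HNSW: higher values trade speed for accuracy
      probes: 10,    // IVFFlat: number of cells to visit during search
    },
  },
});

const chromaQueryTool = createVectorQueryTool({
  vectorStoreName: "chroma",
  indexName: "docs",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  databaseConfig: {
    chroma: {
      where: { category: "docs" },                // Metadata filtering conditions
      whereDocument: { $contains: "deployment" }, // Document content filtering conditions
    },
  },
});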

Runtime Configuration Override

You can override database configurations at runtime to adapt to different scenarios:

import { RequestContext } from "@mastra/core/request-context";

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  databaseConfig: {
    pinecone: {
      namespace: "development",
    },
  },
});

// Override at runtime
const requestContext = new RequestContext();
requestContext.set("databaseConfig", {
  pinecone: {
    namespace: "production", // Switch to production namespace
  },
});

const response = await agent.generate("Find information about deployment", {
  requestContext,
});

This approach allows you to:

  • Switch between environments (development/staging/production)
  • Adjust performance parameters based on load
  • Apply different filtering strategies per request

Example: Using Request Context

const queryTool = createVectorQueryTool({
  vectorStoreName: "pinecone",
  indexName: "docs",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
});

When using request context, provide required parameters at execution time via the request context:

const requestContext = new RequestContext<{
  vectorStoreName: string;
  indexName: string;
  topK: number;
  filter: VectorFilter;
  databaseConfig: DatabaseConfig;
}>();
requestContext.set("vectorStoreName", "my-store");
requestContext.set("indexName", "my-index");
requestContext.set("topK", 5);
requestContext.set("filter", { category: "docs" });
requestContext.set("databaseConfig", {
  pinecone: { namespace: "runtime-namespace" },
});
requestContext.set("model", "openai/text-embedding-3-small");

const response = await agent.generate(
  "Find documentation from the knowledge base.",
  {
    requestContext,
  },
);

For more information on request context, see the Request Context documentation.

Usage Without a Mastra Server

The tool can be used by itself to retrieve documents matching a query:

src/index.ts

import { RequestContext } from "@mastra/core/request-context";
import { createVectorQueryTool } from "@mastra/rag";
import { ModelRouterEmbeddingModel } from "@mastra/core/llm";
import { PgVector } from "@mastra/pg";

const pgVector = new PgVector({
  id: "pg-vector",
  connectionString: process.env.POSTGRES_CONNECTION_STRING!,
});

const vectorQueryTool = createVectorQueryTool({
  vectorStoreName: "pgVector", // optional since we're passing in a store
  vectorStore: pgVector,
  indexName: "embeddings",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
});

const requestContext = new RequestContext();
const queryResult = await vectorQueryTool.execute(
  { queryText: "foo", topK: 1 },
  { requestContext },
);

console.log(queryResult.sources);

Dynamic Vector Store for Multi-Tenant Applications

For multi-tenant applications where each tenant has isolated data (e.g., separate PostgreSQL schemas), you can pass a resolver function instead of a static vector store instance. The function receives the request context and can return the appropriate vector store for the current tenant:

src/index.ts

import { createVectorQueryTool, VectorStoreResolver } from "@mastra/rag";
import { ModelRouterEmbeddingModel } from "@mastra/core/llm";
import { RequestContext } from "@mastra/core/request-context";
import { PgVector } from "@mastra/pg";

// Cache for tenant-specific vector stores
const vectorStoreCache = new Map<string, PgVector>();

// Resolver function that returns the correct vector store based on tenant
const vectorStoreResolver: VectorStoreResolver = async ({ requestContext }) => {
  const tenantId = requestContext?.get("tenantId");

  if (!tenantId) {
    throw new Error("tenantId is required in request context");
  }

  // Return cached instance or create new one
  if (!vectorStoreCache.has(tenantId)) {
    vectorStoreCache.set(
      tenantId,
      new PgVector({
        id: `pg-vector-${tenantId}`,
        connectionString: process.env.POSTGRES_CONNECTION_STRING!,
        schemaName: `tenant_${tenantId}`, // Each tenant has their own schema
      }),
    );
  }

  return vectorStoreCache.get(tenantId)!;
};

const vectorQueryTool = createVectorQueryTool({
  indexName: "embeddings",
  model: new ModelRouterEmbeddingModel("openai/text-embedding-3-small"),
  vectorStore: vectorStoreResolver, // Dynamic resolution!
});

// Usage with tenant context
const requestContext = new RequestContext();
requestContext.set("tenantId", "acme-corp");

const result = await vectorQueryTool.execute(
  { queryText: "company policies", topK: 5 },
  { requestContext },
);

This pattern is similar to how Agent.memory supports dynamic configuration, and it enables:

  • Schema isolation: each tenant's data lives in a separate PostgreSQL schema
  • Database isolation: route to a different database instance per tenant
  • Dynamic configuration: adjust vector store settings based on request context

Tool Details

The tool is created with:

  • ID: VectorQuery {vectorStoreName} {indexName} Tool
  • Input schema: requires queryText and a filter object
  • Output schema: returns a relevantContext string
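Put together, the tool's input and output roughly follow these shapes. This is a sketch based on the schemas described in this page, not actual exported types:

// Sketch of the shapes described above; not exported by @mastra/rag.
type VectorQueryInput = {
  queryText: string;     // The search query
  filter?: VectorFilter; // Metadata filter (used when filtering is enabled)
  topK?: number;         // Number of results (defaults to 10)
};

type VectorQueryOutput = {
  relevantContext: string; // Combined text from the most relevant chunks
  sources: QueryResult[];  // Full retrieval result objects
};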
