Chroma Vector Store
The ChromaVector class provides vector search using Chroma, an open-source embedding database. It supports metadata filtering and hybrid search.
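Chroma Cloud provides serverless vector and full-text search. It is fast, cost-effective, scalable, and easy to operate. Create a database and try it out in under 30 seconds with $5 of free credits.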
Constructor Options
host?:
port?:
ssl?:
apiKey?:
tenant?:
database?:
headers?:
fetchOptions?:
Running a Chroma Server
If you are a Chroma Cloud user, pass your API key, tenant, and database name to the ChromaVector constructor.
Installing the @mastra/chroma package also gives you access to the Chroma CLI, which can write these values to an environment file for you: chroma db connect [DB-NAME] --env-file.
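As a minimal sketch of the Chroma Cloud setup described above, the helper below collects the three Cloud settings from an environment-like object and fails early if any are missing. The variable names CHROMA_API_KEY, CHROMA_TENANT, and CHROMA_DATABASE and the string value types are assumptions; check the file the Chroma CLI actually writes.

```typescript
// Option names mirror the constructor options listed on this page; the
// string value types are an assumption, since the page does not state them.
interface ChromaCloudOptions {
  apiKey: string;
  tenant: string;
  database: string;
}

// Assumed variable names; verify them against the env file from the CLI.
const ENV_KEYS = {
  apiKey: "CHROMA_API_KEY",
  tenant: "CHROMA_TENANT",
  database: "CHROMA_DATABASE",
} as const;

type Env = Record<string, string | undefined>;

// Read the Cloud settings and name every variable that is missing or empty.
function cloudOptionsFromEnv(env: Env): ChromaCloudOptions {
  const names = [ENV_KEYS.apiKey, ENV_KEYS.tenant, ENV_KEYS.database];
  const missing = names.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing Chroma settings: ${missing.join(", ")}`);
  }
  return {
    apiKey: env[ENV_KEYS.apiKey] as string,
    tenant: env[ENV_KEYS.tenant] as string,
    database: env[ENV_KEYS.database] as string,
  };
}
```

The returned object could then be passed to the constructor, for example new ChromaVector(cloudOptionsFromEnv(process.env)).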
Otherwise, you have several options for setting up a single-node Chroma server:
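- Run an instance locally with the Chroma CLI: chroma run. More configuration options are listed in the Chroma docs.
- Run on Docker using the official Chroma image.
- Deploy your own Chroma server on the provider of your choice. Chroma offers example templates for AWS, Azure, and GCP.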
Methods
createIndex()
indexName:
dimension:
metric?:
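A common pattern is to create an index only when it does not exist yet. The sketch below combines createIndex() with listIndexes(). IndexAdmin is a hypothetical structural interface written for this example, the Promise return types are assumptions, and the metric values are taken from the describeIndex() section of this page.

```typescript
type Metric = "cosine" | "euclidean" | "dotproduct";

interface CreateIndexParams {
  indexName: string;
  dimension: number;
  metric?: Metric;
}

// Hypothetical stand-in for the two documented methods used here.
// A ChromaVector instance is expected to fit, but that is an assumption.
interface IndexAdmin {
  listIndexes(): Promise<string[]>;
  createIndex(params: CreateIndexParams): Promise<void>;
}

// Create the index only if it is not already listed.
// Resolves to true when a new index was created, false otherwise.
async function ensureIndex(
  store: IndexAdmin,
  indexName: string,
  dimension: number,
  metric?: Metric,
): Promise<boolean> {
  if (!Number.isInteger(dimension) || dimension <= 0) {
    throw new Error(`dimension must be a positive integer, got ${dimension}`);
  }
  const existing = await store.listIndexes();
  if (existing.indexOf(indexName) !== -1) {
    return false;
  }
  const params: CreateIndexParams = { indexName, dimension };
  if (metric !== undefined) {
    params.metric = metric;
  }
  await store.createIndex(params);
  return true;
}
```

The dimension has to match the output size of the embedding model used to produce the vectors.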
forkIndex()
Note: Forking is only supported on Chroma Cloud, or if you deploy your own OSS distributed Chroma.
forkIndex lets you fork an existing Chroma index instantly. Operations on the forked index do not affect the original. See the Chroma docs for more information.
indexName:
newIndexName:
upsert()
indexName:
vectors:
metadata?:
ids?:
documents?:
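The sketch below shows one way to assemble the upsert() arguments from a list of chunks. It assumes the arrays are parallel, meaning position i in vectors, metadata, ids, and documents describes the same record. Chunk and buildUpsertParams are hypothetical names introduced for this example.

```typescript
type Metadata = Record<string, string | number | boolean>;

// Hypothetical input shape: one embedded text chunk.
interface Chunk {
  id: string;
  text: string;
  embedding: number[];
  metadata: Metadata;
}

// Field names follow the upsert() parameter list on this page.
interface UpsertParams {
  indexName: string;
  vectors: number[][];
  metadata: Metadata[];
  ids: string[];
  documents: string[];
}

// Split chunks into parallel arrays so every position lines up.
function buildUpsertParams(indexName: string, chunks: Chunk[]): UpsertParams {
  const params: UpsertParams = {
    indexName,
    vectors: [],
    metadata: [],
    ids: [],
    documents: [],
  };
  for (const chunk of chunks) {
    params.vectors.push(chunk.embedding);
    params.metadata.push(chunk.metadata);
    params.ids.push(chunk.id);
    params.documents.push(chunk.text);
  }
  return params;
}
```

The result could then be passed to the store, for example await vectorStore.upsert(buildUpsertParams("docs", chunks)).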
query()
Query an index using a queryVector. Returns an array of semantically similar records, ordered by distance from the queryVector. Each record has the shape:
{
id: string;
score: number;
document?: string;
metadata?: Record<string, string | number | boolean>;
embedding?: number[]
}
You can also supply the shape of your metadata to the query call for type inference: query<T>().
indexName:
queryVector:
topK?:
filter?:
includeVector?:
documentFilter?:
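To make "ordered by distance" concrete, here is a brute-force sketch that ranks candidates by cosine distance and keeps the topK closest. It only illustrates the ordering; it is not how Chroma searches internally, and the cosine metric is assumed for the example. Candidate and nearest are hypothetical names.

```typescript
interface Candidate {
  id: string;
  vector: number[];
}

// Cosine distance: 0 for vectors pointing the same way, 1 for orthogonal ones.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("vectors must have the same dimension");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("cosine distance is undefined for a zero vector");
  }
  return 1 - dot / Math.sqrt(normA * normB);
}

// Rank every candidate against the query and keep the topK closest.
function nearest(queryVector: number[], candidates: Candidate[], topK: number) {
  return candidates
    .map((c) => ({ id: c.id, distance: cosineDistance(queryVector, c.vector) }))
    .sort((x, y) => x.distance - y.distance)
    .slice(0, topK);
}
```

A real vector store avoids this full scan by using an approximate nearest-neighbor index, but the order of the returned records follows the same idea.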
get()
Get records from your Chroma index by IDs, metadata filters, and document filters. Returns an array of records of the shape:
{
id: string;
document?: string;
metadata?: Record<string, string | number | boolean>;
embedding?: number[]
}
You can also supply the shape of your metadata to the get call for type inference: get<T>().
indexName:
ids?:
filter?:
includeVector?:
documentFilter?:
limit?:
offset?:
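The limit and offset parameters make it possible to read an index page by page. The sketch below assumes the usual semantics (limit caps the number of records returned, offset skips that many records first), which this page does not spell out. RecordReader is a hypothetical structural interface for the documented get() call.

```typescript
interface StoredRecord {
  id: string;
  document?: string;
  metadata?: Record<string, string | number | boolean>;
  embedding?: number[];
}

// Hypothetical stand-in for the documented get() method.
interface RecordReader {
  get(params: {
    indexName: string;
    limit?: number;
    offset?: number;
  }): Promise<StoredRecord[]>;
}

// Fetch every record by requesting fixed-size pages until a short page
// signals the end of the index.
async function getAllRecords(
  store: RecordReader,
  indexName: string,
  pageSize = 100,
): Promise<StoredRecord[]> {
  if (!Number.isInteger(pageSize) || pageSize <= 0) {
    throw new Error("pageSize must be a positive integer");
  }
  const all: StoredRecord[] = [];
  let offset = 0;
  for (;;) {
    const page = await store.get({ indexName, limit: pageSize, offset });
    all.push(...page);
    if (page.length < pageSize) {
      break;
    }
    offset += pageSize;
  }
  return all;
}
```

For large indexes it is usually better to process each page as it arrives instead of accumulating everything in memory.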
listIndexes()
Returns an array of index names as strings.
describeIndex()
indexName:
Returns:
interface IndexStats {
dimension: number;
count: number;
metric: "cosine" | "euclidean" | "dotproduct";
}
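One use for these stats is checking that new embeddings match the index dimension before writing them. A minimal sketch, where mismatchedVectors is a hypothetical helper and not part of the package:

```typescript
interface IndexStats {
  dimension: number;
  count: number;
  metric: "cosine" | "euclidean" | "dotproduct";
}

// Return the positions of vectors whose length differs from the index dimension.
function mismatchedVectors(stats: IndexStats, vectors: number[][]): number[] {
  const bad: number[] = [];
  vectors.forEach((vector, position) => {
    if (vector.length !== stats.dimension) {
      bad.push(position);
    }
  });
  return bad;
}
```

An empty result means every vector has the expected length.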
deleteIndex()
indexName:
updateVector()
Update a single vector by ID or by metadata filter. Exactly one of id or filter must be provided.
indexName:
id?:
filter?:
update:
The update object can contain:
vector?:
metadata?:
Example:
// Update by ID
await vectorStore.updateVector({
indexName: 'docs',
id: 'vec_123',
update: { metadata: { status: 'reviewed' } }
});
// Update by filter
await vectorStore.updateVector({
indexName: 'docs',
filter: { source_id: 'manual.pdf' },
update: { metadata: { version: 2 } }
});
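The "id or filter, but not both" rule can also be expressed in types. The sketch below uses a union with never so that passing both is a compile-time error, plus a runtime check for untyped callers. These types are hypothetical illustrations, not the types exported by @mastra/chroma.

```typescript
type Filter = Record<string, unknown>;

interface UpdatePayload {
  vector?: number[];
  metadata?: Record<string, string | number | boolean>;
}

// Marking the other key as `never` rejects objects that set both.
type UpdateTarget =
  | { id: string; filter?: never }
  | { filter: Filter; id?: never };

type UpdateVectorParams = { indexName: string; update: UpdatePayload } & UpdateTarget;

// Runtime guard for plain JavaScript callers or data parsed from JSON.
function checkUpdateTarget(params: { id?: string; filter?: Filter }): void {
  const hasId = params.id !== undefined;
  const hasFilter = params.filter !== undefined;
  if (hasId === hasFilter) {
    throw new Error("Provide exactly one of id or filter");
  }
}

// Type-checks: only `id` is set.
const byId: UpdateVectorParams = {
  indexName: "docs",
  id: "vec_123",
  update: { metadata: { status: "reviewed" } },
};
checkUpdateTarget(byId);
```

The same pattern would apply to deleteVectors(), which takes ids or filter under the same rule.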
deleteVector()
indexName:
id:
deleteVectors()
Delete multiple vectors by IDs or by metadata filter. This method enables bulk deletion and source-based vector management. Exactly one of ids or filter must be provided.
indexName:
ids?:
filter?:
Example:
// Delete all chunks from a document
await vectorStore.deleteVectors({
indexName: 'docs',
filter: { source_id: 'manual.pdf' }
});
// Delete multiple vectors by ID
await vectorStore.deleteVectors({
indexName: 'docs',
ids: ['vec_1', 'vec_2', 'vec_3']
});
// Delete old temporary documents
await vectorStore.deleteVectors({
indexName: 'docs',
filter: {
$and: [
{ bucket: 'temp' },
{ indexed_at: { $lt: '2025-01-01' } }
]
}
});
Response Types
Query results are returned in this format:
interface QueryResult {
id: string;
score: number;
metadata: Record<string, any>;
document?: string; // Chroma-specific: Original document if it was stored
vector?: number[]; // Only included if includeVector is true
}
Error Handling
The store throws typed errors that can be caught:
try {
await store.query({
indexName: "index_name",
queryVector: queryVector,
});
} catch (error) {
if (error instanceof VectorStoreError) {
console.log(error.code); // 'connection_failed' | 'invalid_dimension' | etc
console.log(error.details); // Additional error context
}
}
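Building on the error codes shown above, the sketch below retries an operation only when the error code is 'connection_failed' and rethrows anything else immediately. It reads the code by duck typing because the import path of VectorStoreError is not shown on this page. withRetry and errorCode are hypothetical helpers, and treating 'connection_failed' as retryable is an assumption.

```typescript
// Extract a string `code` property from an unknown error, if present.
function errorCode(error: unknown): string | undefined {
  if (typeof error === "object" && error !== null && "code" in error) {
    const code = (error as { code?: unknown }).code;
    return typeof code === "string" ? code : undefined;
  }
  return undefined;
}

// Run `operation`, retrying with a linearly growing delay on connection
// failures. Other errors, and the final failed attempt, are rethrown.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts = 3,
  delayMs = 200,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      const retryable = errorCode(error) === "connection_failed";
      if (!retryable || attempt === maxAttempts) {
        break;
      }
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs * attempt));
    }
  }
  throw lastError;
}
```

A query could then be wrapped as withRetry(() => store.query({ indexName: "index_name", queryVector })).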
Related