CompositeVoice

The CompositeVoice class lets you combine different voice providers for text-to-speech and speech-to-text operations. This is particularly useful when you want the best provider for each operation, for example OpenAI for speech-to-text and PlayAI for text-to-speech.

CompositeVoice supports both Mastra voice providers and AI SDK model providers.

Constructor Parameters

config:

object
Configuration object for the composite voice service

config.input?:

MastraVoice | TranscriptionModel
Voice provider or AI SDK transcription model to use for speech-to-text operations. AI SDK models are automatically wrapped.

config.output?:

MastraVoice | SpeechModel
Voice provider or AI SDK speech model to use for text-to-speech operations. AI SDK models are automatically wrapped.

config.realtime?:

MastraVoice
Voice provider to use for real-time speech-to-speech operations
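
As a rough mental model of how these slots are used, `input` serves listen() and `output` serves speak(), and a call fails when the matching slot is empty (as the method notes in this reference state). The sketch below illustrates that routing in a self-contained way. It is not Mastra's actual implementation: `SketchProvider`, `SketchConfig`, and `SketchCompositeVoice` are hypothetical names, plain strings stand in for audio streams, the methods are synchronous for brevity, and the `realtime` slot is left out.

```typescript
// Hypothetical, simplified provider shape (strings stand in for audio streams).
interface SketchProvider {
  speak?: (text: string) => string;
  listen?: (audio: string) => string;
}

// Mirrors the optional `input` / `output` slots described above.
interface SketchConfig {
  input?: SketchProvider; // used for speech-to-text
  output?: SketchProvider; // used for text-to-speech
}

class SketchCompositeVoice {
  private readonly input?: SketchProvider;
  private readonly output?: SketchProvider;

  constructor(config: SketchConfig) {
    this.input = config.input;
    this.output = config.output;
  }

  // Delegates to the `output` provider, or throws if none is configured.
  speak(text: string): string {
    const provider = this.output;
    if (!provider || !provider.speak) {
      throw new Error("No speaking provider configured");
    }
    return provider.speak(text);
  }

  // Delegates to the `input` provider, or throws if none is configured.
  listen(audio: string): string {
    const provider = this.input;
    if (!provider || !provider.listen) {
      throw new Error("No listening provider configured");
    }
    return provider.listen(audio);
  }
}
```

Keeping the two slots independent is what allows one vendor to handle transcription while another handles synthesis.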

Methods

speak()

Converts text to speech using the configured speaking provider.

input:

string | NodeJS.ReadableStream
Text to convert to speech

options?:

object
Provider-specific options passed to the speaking provider

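Notes:

  • If no speaking provider is configured, this method throws an error
  • Options are passed through to the configured speaking provider
  • Returns a stream of audio data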
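
Because speak() returns a stream rather than a finished buffer, callers usually consume it chunk by chunk. The helper below is one possible way to gather such a stream into a single byte array, for instance before writing it to a file. It assumes the stream is async-iterable and yields `Uint8Array` chunks (Node.js readable streams are async-iterable, and `Buffer` is a `Uint8Array` subclass); `collectAudio` is a hypothetical helper name and not part of Mastra.

```typescript
// Hypothetical helper: drain an async-iterable audio stream into one Uint8Array.
async function collectAudio(stream: AsyncIterable<Uint8Array>): Promise<Uint8Array> {
  const chunks: Uint8Array[] = [];
  let total = 0;

  // Read every chunk and track the combined length.
  for await (const chunk of stream) {
    chunks.push(chunk);
    total += chunk.length;
  }

  // Copy the chunks back to back into a single array.
  const out = new Uint8Array(total);
  let offset = 0;
  for (const chunk of chunks) {
    out.set(chunk, offset);
    offset += chunk.length;
  }
  return out;
}
```

With a real provider this might be applied to the value returned by `voice.speak(...)`, possibly with a type cast depending on the declared return type. For long outputs, piping the stream directly to its destination avoids holding all of the audio in memory.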

listen()

Converts speech to text using the configured listening provider.

audioStream:

NodeJS.ReadableStream
Audio stream to convert to text

options?:

object
Provider-specific options passed to the listening provider

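Notes:

  • If no listening provider is configured, this method throws an error
  • Options are passed through to the configured listening provider
  • Returns either a string or a stream of transcribed text, depending on the provider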
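
Since the return type varies by provider, calling code that only wants the final text may find it convenient to normalize both cases. The sketch below shows one way to do that. It assumes a streamed transcript is async-iterable and yields either strings or UTF-8 encoded bytes; `Transcript` and `transcriptToString` are hypothetical names, not part of Mastra.

```typescript
// Hypothetical union covering both return shapes described above.
type Transcript = string | AsyncIterable<string | Uint8Array>;

// Hypothetical helper: resolve either shape to a plain string.
async function transcriptToString(result: Transcript): Promise<string> {
  if (typeof result === "string") {
    return result;
  }

  const decoder = new TextDecoder(); // UTF-8 by default
  let text = "";
  for await (const chunk of result) {
    // `stream: true` keeps multi-byte characters intact across chunk boundaries.
    text += typeof chunk === "string" ? chunk : decoder.decode(chunk, { stream: true });
  }
  // Flush any bytes the decoder is still holding.
  return text + decoder.decode();
}
```

A caller could then write `const text = await transcriptToString(await voice.listen(audioStream))`, adding a cast if the provider's declared return type is wider than `Transcript`.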

getSpeakers()

Returns a list of available voices from the speaking provider, where each entry contains:

voiceId:

string
Unique identifier for the voice

key?:

value
Additional voice properties that vary by provider (e.g., name, language)

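Notes:

  • Returns voices from the speaking provider only
  • If no speaking provider is configured, returns an empty array
  • Each voice object has at least a voiceId property
  • Additional voice properties depend on the voice provider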
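
Because the list can be empty and the extra properties differ between providers, code that selects a voice benefits from a fallback. The following sketch shows one defensive way to pick a `voiceId`. The `Speaker` interface and `pickVoiceId` function are hypothetical names introduced here, and the `language` property used in the usage note is only an example; whether a given provider returns it is an assumption.

```typescript
// Shape described above: a required voiceId plus provider-specific extras.
interface Speaker {
  voiceId: string;
  [key: string]: unknown;
}

// Hypothetical helper: prefer a matching voice, then the first voice,
// then a caller-supplied default when the list is empty.
function pickVoiceId(
  speakers: Speaker[],
  matches: (speaker: Speaker) => boolean,
  fallback: string,
): string {
  const chosen = speakers.find(matches) ?? speakers[0];
  return chosen ? chosen.voiceId : fallback;
}
```

For example, `pickVoiceId(await voice.getSpeakers(), (s) => s["language"] === "en", "default")` would return the first voice tagged as English if the provider exposes such a property, and `"default"` if no voices are available.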

Usage Examples

Using Mastra Voice Providers

import { CompositeVoice } from "@mastra/core/voice";
import { OpenAIVoice } from "@mastra/voice-openai";
import { PlayAIVoice } from "@mastra/voice-playai";

// Create voice providers
const openai = new OpenAIVoice();
const playai = new PlayAIVoice();

// Use OpenAI for listening (speech-to-text) and PlayAI for speaking (text-to-speech)
const voice = new CompositeVoice({
  input: openai,
  output: playai,
});

// Convert speech to text using OpenAI
const text = await voice.listen(audioStream);

// Convert text to speech using PlayAI
const audio = await voice.speak("Hello, world!");

Using AI SDK Model Providers

You can pass AI SDK transcription and speech models directly to CompositeVoice:

import { CompositeVoice } from "@mastra/core/voice";
import { openai } from "@ai-sdk/openai";
import { elevenlabs } from "@ai-sdk/elevenlabs";

// Use AI SDK models directly - they will be auto-wrapped
const voice = new CompositeVoice({
  input: openai.transcription('whisper-1'), // AI SDK transcription
  output: elevenlabs.speech('eleven_turbo_v2'), // AI SDK speech
});

// Works the same way as with Mastra providers
const text = await voice.listen(audioStream);
const audio = await voice.speak("Hello from AI SDK!");

Mix and Match

You can combine Mastra providers with AI SDK models:

import { CompositeVoice } from "@mastra/core/voice";
import { PlayAIVoice } from "@mastra/voice-playai";
import { groq } from "@ai-sdk/groq";

const voice = new CompositeVoice({
  input: groq.transcription('whisper-large-v3'), // AI SDK for STT
  output: new PlayAIVoice(), // Mastra for TTS
});