OpenAI
The OpenAIVoice class in Mastra provides text-to-speech and speech-to-text capabilities using OpenAI's models.
Usage Example
import { OpenAIVoice } from "@mastra/voice-openai";

// Initialize with default configuration using environment variables
const voice = new OpenAIVoice();

// Or initialize with specific configuration
const voiceWithConfig = new OpenAIVoice({
  speechModel: {
    name: "tts-1-hd",
    apiKey: "your-openai-api-key",
  },
  listeningModel: {
    name: "whisper-1",
    apiKey: "your-openai-api-key",
  },
  speaker: "alloy", // Default voice
});

// Convert text to speech
const audioStream = await voice.speak("Hello, how can I help you?", {
  speaker: "nova", // Override default voice
  speed: 1.2, // Adjust speech speed
});

// Convert speech to text
const text = await voice.listen(audioStream, {
  filetype: "mp3",
});
Configuration
Constructor Options
speechModel?: OpenAIConfig = { name: 'tts-1' }
Configuration for text-to-speech synthesis.
listeningModel?: OpenAIConfig = { name: 'whisper-1' }
Configuration for speech-to-text recognition.
speaker?: OpenAIVoiceId = 'alloy'
Default voice ID for speech synthesis.
OpenAIConfig
name?: 'tts-1' | 'tts-1-hd' | 'whisper-1'
Model name. Use 'tts-1-hd' for higher quality audio.
apiKey?: string
OpenAI API key. Falls back to the OPENAI_API_KEY environment variable.
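The defaulting and API key fallback described above can be sketched as a small standalone function. This is only an illustration of the documented behavior, not Mastra's actual implementation: the OpenAIConfigLike type and the resolveConfig helper are hypothetical names introduced here, and the environment is passed in as a plain object so the sketch runs without a real key.

```typescript
// Hypothetical local mirror of the documented OpenAIConfig shape.
type ModelName = "tts-1" | "tts-1-hd" | "whisper-1";

interface OpenAIConfigLike {
  name?: ModelName;
  apiKey?: string;
}

// Hypothetical helper (not part of @mastra/voice-openai): apply a default
// model name and fall back to OPENAI_API_KEY when no apiKey is given.
function resolveConfig(
  config: OpenAIConfigLike | undefined,
  defaultName: ModelName,
  env: Record<string, string | undefined> = process.env,
): Required<OpenAIConfigLike> {
  const apiKey = config?.apiKey ?? env.OPENAI_API_KEY;
  if (!apiKey) {
    throw new Error("No OpenAI API key: pass apiKey or set OPENAI_API_KEY");
  }
  return { name: config?.name ?? defaultName, apiKey };
}

// An explicit model name is kept; the key comes from the (stand-in) environment.
const speechConfig = resolveConfig(
  { name: "tts-1-hd" },
  "tts-1",
  { OPENAI_API_KEY: "key-from-env" },
);
```

An explicitly passed apiKey takes precedence over the environment variable in this sketch, which matches the "falls back to" wording of the table.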
Methods
speak()
Converts text to speech using OpenAI's text-to-speech models.
input: string | NodeJS.ReadableStream
Text or text stream to convert to speech.
options.speaker?: OpenAIVoiceId = Constructor's speaker value
Voice ID to use for speech synthesis.
options.speed?: number = 1.0
Speech speed multiplier.
Returns: Promise<NodeJS.ReadableStream>
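Because speak() resolves to a stream rather than a finished file, callers usually need to drain it before saving or sending the audio. The sketch below shows one way to collect a NodeJS.ReadableStream into a single Buffer. The streamToBuffer helper is a hypothetical name, and the demo uses an in-memory stream as a stand-in for the real return value, so no API call is made.

```typescript
import { Readable } from "node:stream";

// Hypothetical helper: collect every chunk of a readable stream into one Buffer.
async function streamToBuffer(stream: NodeJS.ReadableStream): Promise<Buffer> {
  const chunks: Buffer[] = [];
  for await (const chunk of stream) {
    // Streams may yield strings or Buffers; normalize to Buffer.
    chunks.push(typeof chunk === "string" ? Buffer.from(chunk) : chunk);
  }
  return Buffer.concat(chunks);
}

// Stand-in for the stream returned by voice.speak(); the bytes are made up.
const fakeAudio = Readable.from([Buffer.from("ab"), Buffer.from("cd")]);
streamToBuffer(fakeAudio).then((audio) => {
  console.log(`collected ${audio.length} bytes`);
});
```

With a real stream, the resulting Buffer could be written to disk with fs.promises.writeFile. For large outputs, piping the stream directly to a file avoids holding the whole clip in memory.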
listen()
Transcribes audio using OpenAI's Whisper model.
audioStream: NodeJS.ReadableStream
Audio stream to transcribe.
options.filetype?: string = 'mp3'
Audio format of the input stream.
Returns: Promise<string>
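When the audio comes from a file on disk, the filetype option can be derived from the file extension instead of being hard-coded. The following is a minimal sketch under that assumption. The filetypeFromPath helper is hypothetical, and its list of formats covers only the ones this page names (mp3, wav, webm); the underlying API may accept others.

```typescript
import { extname } from "node:path";

// Formats named on this page; treated here as the known set.
const KNOWN_FILETYPES = ["mp3", "wav", "webm"] as const;
type KnownFiletype = (typeof KNOWN_FILETYPES)[number];

// Hypothetical helper: map a file path to a filetype value,
// or undefined when the extension is missing or not in the known set.
function filetypeFromPath(path: string): KnownFiletype | undefined {
  const ext = extname(path).slice(1).toLowerCase();
  return (KNOWN_FILETYPES as readonly string[]).includes(ext)
    ? (ext as KnownFiletype)
    : undefined;
}

// Fall back to the documented default of 'mp3' when nothing matches.
const filetype = filetypeFromPath("recordings/meeting.WAV") ?? "mp3";
```

The resulting value can then be passed as the filetype option of listen(). Returning undefined for unknown extensions leaves the choice of fallback to the caller.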
getSpeakers()
Returns an array of available voice options, where each entry contains:
voiceId: string
Unique identifier for the voice
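One use for this list is checking a requested voice before calling speak(). The sketch below assumes only the documented shape (objects with a voiceId string). The SpeakerInfo interface and pickSpeaker helper are hypothetical, and the sample array is stand-in data rather than the real output of getSpeakers().

```typescript
// Shape documented above: each entry carries a voiceId string.
interface SpeakerInfo {
  voiceId: string;
}

// Hypothetical helper: use the preferred voice if it is listed,
// otherwise fall back (by default to the documented 'alloy' voice).
function pickSpeaker(
  speakers: SpeakerInfo[],
  preferred: string,
  fallback: string = "alloy",
): string {
  return speakers.some((s) => s.voiceId === preferred) ? preferred : fallback;
}

// Stand-in data; in practice this would come from voice.getSpeakers().
const sampleSpeakers: SpeakerInfo[] = [{ voiceId: "alloy" }, { voiceId: "nova" }];
const chosenSpeaker = pickSpeaker(sampleSpeakers, "nova");
```

The chosen ID can then be passed as options.speaker to speak().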
Notes
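- API keys can be provided via constructor options or the OPENAI_API_KEY environment variable
- The tts-1-hd model provides higher quality audio but may be slower to process
- Speech recognition supports multiple audio formats, including mp3, wav, and webm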