Octave (Omni-capable text and voice engine) isn't a traditional TTS model. It's a voice-based LLM. That means it understands what words mean in context, so it can predict the emotion, cadence, and other delivery cues a line calls for.
Octave is the first TTS system that can take natural-language instructions to change emotional delivery and speaking style. Give directions like "sound sarcastic" or "whisper fearfully." For the first time, creators have direct control over how every line is delivered.
Developers can use Instant Mode with our streaming API to reduce time to first token (TTFT) to 200 ms. Hume TTS, powered by Octave, is perfect for AI assistants, avatars, phone calling, and so much more.
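To check a latency figure like the one above in your own setup, you can time how long a streaming response takes to deliver its first chunk. The sketch below is a minimal illustration, not Hume's SDK: `fake_audio_stream` and `measure_first_chunk_latency` are hypothetical names, and the fake stream only simulates a server delay. In practice you would pass in whatever iterator of audio bytes your streaming client returns.

```python
import time
from typing import Iterable, Iterator, Optional, Tuple


def fake_audio_stream(first_chunk_delay: float = 0.05, chunks: int = 3) -> Iterator[bytes]:
    """Hypothetical stand-in for a streaming TTS response (not a real API)."""
    # Simulate the wait before the server sends its first audio chunk.
    time.sleep(first_chunk_delay)
    for i in range(chunks):
        yield bytes([i]) * 4  # four placeholder bytes per chunk


def measure_first_chunk_latency(stream: Iterable[bytes]) -> Tuple[Optional[float], bytes]:
    """Consume a stream; return (seconds until first chunk, all audio bytes).

    The latency is None if the stream yields nothing.
    """
    start = time.perf_counter()
    first_latency: Optional[float] = None
    audio = bytearray()
    for chunk in stream:
        if first_latency is None:
            # Record the elapsed time once, when the first chunk arrives.
            first_latency = time.perf_counter() - start
        audio.extend(chunk)
    return first_latency, bytes(audio)


if __name__ == "__main__":
    latency, audio = measure_first_chunk_latency(fake_audio_stream())
    print(f"first chunk after {latency * 1000:.0f} ms, {len(audio)} bytes total")
```

Measured this way, the number includes network round-trip time and client overhead, so it may differ from a figure measured on the server side.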