The world's most realistic voice AI, in real time
Prompt the first LLM for text-to-speech to create new voices, instruct emotions, and more
A text-to-speech system that understands what it's saying
Octave (Omni-capable text and voice engine) isn't a traditional TTS model. It’s a voice-based LLM. That means it understands what words mean in context, so it can predict emotions, cadence, and more.
Create any voice you can imagine with Octave Voice Design
Sarcastic medieval peasant
Full prompt: The speaker is a medieval peasant with a cockney accent, raspy voice, dripping with sarcasm.
Literature professor
Full prompt: A retired Black female literature professor who analyzes poetry with precise academic language and references to her own published criticism.
Charming cowboy
Full prompt: The speaker is a grizzled old cowboy with a folksy Texan drawl Southern accent, speaking in a charismatic tone with a deep but relaxed vibe.
Sitcom inner monologue
Full prompt: The star of a popular sitcom, with frequent inner monologues about her life.
Dungeon master
Full prompt: A know-it-all dungeons and dragons dungeon master speaking excitedly with a lisp.
Warm English narrator
Full prompt: The speaker is a sophisticated British female narrator with a gentle, warm voice, recounting the ending of a classic romance novel.
Unserious movie trailer guy
Full prompt: The speaker is an American, deep middle-aged male film trailer narrator for a film about chickens.
Raspy evil vampire
Full prompt: A villainous undead vampire, with a horrifying raspy voice, and a slight Transylvanian accent.
Reminiscing man
Full prompt: A middle-aged African American man, reminiscing with a slightly gravelly voice and a tone of hard-earned wisdom.
Nature documentary narrator
Full prompt: The speaker is a distinguished British narrator, whose voice carries a deep sense of wisdom and curiosity.
Texan fishing guru
Full prompt: The speaker has a booming, charismatic radio voice, like a Texan fishing guru with a hint of gravel and an infectious laugh, perfect for reeling in listeners to 'Big Dicky's live fishing frenzy.'
Any emotion or speaking style, on command
Octave is the first TTS system that can take natural language instructions to change emotional delivery and speaking style. Give directions like "sound sarcastic" or "whisper fearfully." For the first time, creators have total control.
For creators and developers alike
Octave was built to generate the most expressive AI voices for any content: podcasts, voiceovers, audiobooks, and more. With our API, you can bring it to any application.

We research foundation models and how to align them with human well-being
The world's most realistic and instructible speech-to-speech model
EVI 3 is a speech-language model: the same intelligence handles transcription, language, and speech. That brings more expressiveness, realism, and emotional understanding to voice AI.
Octave Text to Speech
Hume's text-to-speech model, Octave, is available today for content creators and developers. Octave understands what words mean in context, so it can predict emotions, cadence, and more. It can also take natural language instructions to change emotional delivery and speaking style. Give directions like "sound sarcastic" or "whisper fearfully." For the first time, creators have total control.

Emotional intelligence for any application
Measure emotional expression with unmatched precision. One API, four modalities, hundreds of dimensions of emotional expression.

Developer Resources

Platform
Create your Hume account, get your API keys, monitor your usage, and explore our products in the interactive platform.

Documentation
Explore our documentation with concise guides, hands-on tutorials, and an in-depth API reference—crafted to support your integration.

Community
Join our community of developers and researchers working with Hume APIs—your go-to hub for collaboration, support, and knowledge sharing.