Voice Training Data Built by Researchers, for Researchers
Datasets for creating realistic voices across global languages—powering our own state-of-the-art models, and now available to power yours.
Voice AI Datasets
Covering the Full Spectrum of Voice
Expression & Multimodal
Datasets for expression measurement and analysis
Purpose-built for training expression analysis models across face, voice, and text. The same data behind our peer-reviewed research and production models.
Speech Prosody
Annotated speech rhythm, stress, and intonation patterns across diverse speakers and emotional contexts.
Vocal Expression
Voice samples labeled for timbre, resonance, and vocal quality across 48 emotion dimensions.
Vocal Bursts
Laughter, sighs, gasps, and non-verbal vocalizations categorized by type and emotional context.
FACS 2.0
Facial Action Coding System data with precise action unit annotations and intensity scoring.
Dynamic Reaction
Temporal expression sequences capturing how facial and vocal responses change over time.
Facial Expression
Cross-cultural facial emotion samples spanning 48 expression categories and varied lighting conditions.
Language
Text samples annotated for emotional expression, sentiment, and content safety across multiple languages.
How It Works
From Research Question to Production-Ready Data
Hume operates a research-grade data pipeline purpose-built for voice.
Request Samples
Start with curated speech datasets from our library.
Create Your Own
Launch custom collections with defined speakers and recording conditions.
License Access
Licensed datasets ship with rich metadata: speaker demographics, acoustic properties, and annotation labels.
API Access
Programmatically refresh or generate new training data.
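To make the steps above concrete, here is a minimal sketch of the kind of per-sample metadata record a licensed dataset might include, and a filter a training pipeline might apply before ingestion. All field names (`sample_id`, `acoustics`, `top_emotions`, etc.) are hypothetical illustrations, not Hume's actual schema or API:

```python
import json

# Hypothetical per-sample metadata: demographics, acoustics, and labels.
# Field names are illustrative only, not Hume's actual schema.
record = json.loads("""
{
  "sample_id": "vb_000123",
  "dataset": "vocal_bursts",
  "demographics": {"age_range": "25-34", "language": "en"},
  "acoustics": {"sample_rate_hz": 48000, "duration_s": 1.4},
  "labels": {"type": "laughter", "top_emotions": ["Amusement", "Joy"]}
}
""")

def is_usable(rec, min_duration_s=0.5):
    """Keep samples long enough to train on and with at least one label."""
    return (rec["acoustics"]["duration_s"] >= min_duration_s
            and len(rec["labels"]["top_emotions"]) > 0)

print(is_usable(record))  # check whether this sample passes the filter
```

In practice, a pipeline like this would run such checks over every record returned from a dataset refresh before the audio is fetched for training.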
Ready to explore our training data?
Talk to our research team about how Hume's datasets can accelerate your voice and emotion AI development.