Training Data

Voice Training Data Built by Researchers, for Researchers

Datasets for creating realistic voices across global languages—powering our own state-of-the-art models, and now available to power yours.

50+ Languages
48+ Emotions
600+ Voice Descriptors

Voice AI Datasets

Covering the Full Spectrum of Voice

Conversational Audio

Turn-taking, interruptions, and multi-speaker dynamics.

Emotional Reproduction

Fine-grained annotations across a wide range of expressive speech.

Multilingual Audio

Native speaker recordings across global languages and dialects.

Voice Realism

Prosody, intonation, pacing, and expressive range.

Domain-Specific

Industry contexts like healthcare, education, and customer service.

Task-Specific

Conversations for assistants, support, tutoring, and research.

Expression & Multimodal

Datasets for Expression Measurement and Analysis

Purpose-built for training expression analysis models across face, voice, and text. The same data behind our peer-reviewed research and production models.

Speech Prosody

Annotated speech rhythm, stress, and intonation patterns across diverse speakers and emotional contexts.

audio

Vocal Expression

Voice timbre, resonance, and vocal quality samples labeled across 48 emotion dimensions.

audio

Vocal Bursts

Laughter, sighs, gasps, and non-verbal vocalizations categorized by type and emotional context.

audio

FACS 2.0

Facial Action Coding System data with precise action unit annotations and intensity scoring.

visual

Dynamic Reaction

Temporal expression sequences capturing how facial and vocal responses change over time.

visual, audio

Facial Expression

Cross-cultural facial emotion samples spanning 48 expression categories and varied lighting conditions.

visual

Language

Text samples annotated for emotional expression, sentiment, and content safety across multiple languages.

text

How It Works

From Research Question to Production-Ready Data

Hume operates a research-grade data pipeline purpose-built for voice.

1. Request Samples

Start with curated speech datasets from our library.

2. Create Your Own

Launch custom collections with defined speakers and recording conditions.

3. License Access

Datasets include rich metadata: demographics, acoustics, and labels.

4. API Access

Programmatically refresh or generate new training data.

Ready to explore our training data?

Talk to our research team about how Hume's datasets can accelerate your voice and emotion AI development.
