What is semantic space theory?
By Jeffrey Brooks, PhD on Feb 21, 2024
Our models and products are built on a cutting-edge approach to understanding emotion: semantic space theory (SST), which uses computational methods and data-driven approaches to map the full spectrum of our feelings.
Emotions are deeply personal, shaping the multitude of patterns and preferences that make up who we are: our social interactions, our relationships, our responses to art and music, and the decisions we make about what to work on and focus on in life.
But for years the scientific study of emotion has been grappling with foundational questions: what is an emotion? How are emotions best represented for scientific study in the first place?
Scientific approaches to emotion have typically simplified emotions into a small number of categories (e.g. the “basic 6”: anger, disgust, fear, happiness, sadness, and surprise) or dimensions (e.g. valence and arousal).
These ideas have shaped the study of emotion for decades, from the stimulus sets available to researchers, to the outputs provided by computational models. Even researchers critical of the “basic 6” approach tend to only use the basic 6 as conditions in their studies due to these practical limitations.
How can we move beyond this to capture the complex, nuanced appraisals that people actually make about emotion?
The foundation of our research at Hume is semantic space theory (SST), an inductive, data-driven approach to these questions. Using wide-ranging naturalistic data and open-ended, cutting-edge statistical approaches, SST maps the full spectrum of human emotion.
A data-driven, computational approach to emotion
With recent advances in computing, data storage, and online data collection, the stage has been set for a new generation of emotion science. Collecting large amounts of naturalistic data and describing them quantitatively with statistical models has become far more tractable.
Inductive, data-driven approaches make fewer assumptions in advance about how the data are distributed. As a result, they can pick up on a variety of contributors to emotion: both the actual structure we’re trying to measure and the confounds that have been notoriously hard to measure in the past. Combining these techniques with large-scale data collection makes it possible to train models that pick up on specific cues and behaviors of interest while reducing the contribution of bias.
SST moves beyond low-dimensional theories of emotion, conceiving of emotion as a high-dimensional semantic space, which these techniques allow us to map in detail.
These approaches have led to three major insights that pave the way for a detailed computational understanding of emotion.
1. Emotion is high-dimensional
How many kinds of emotion are distinguished within a semantic space?
Instead of deciding in advance that only 6 categories define the space of emotion and only including stimuli and response options that conform to this assumption, the SST approach has been to conduct experiments with a vast array of expressive and evocative stimuli, collecting responses for a broad range of emotions, mental states, and nonverbal behaviors. Identifying how many dimensions capture the systematic variation in these responses requires data-driven statistical modeling such as multidimensional reliability analysis.
These approaches have been used to map the semantic spaces of emotion represented in self-reported experience, facial expressions, speech prosody, nonverbal vocalizations (vocal bursts), the feelings evoked by music, and the expressions depicted in ancient art. Many of these studies have taken place in multiple cultures, including our most recent work, which we conducted in 5-6 countries.
These studies consistently find that upwards of 20 distinct dimensions characterize the emotions people use to describe these various domains. The specific number of dimensions matters less than the general insight: emotion is at least four times more complex than traditionally assumed.
Semantic spaces of emotion. A) The semantic space framework is able to provide a quantitative understanding of how many dimensions of emotion there are, how these emotions are distributed in a semantic space, and how these emotions are best conceptualized. B) The semantic space of emotions expressed in facial expressions and vocal bursts. C) The semantic space of emotions described in self-reports of emotional experience. D) The semantic space of feelings evoked by music across cultures. E) The semantic space of emotions expressed in speech prosody. F) The semantic space of emotions depicted in ancient American art. Figure presented courtesy of Trends in Cognitive Sciences
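The question of how many dimensions are reliably present in a set of ratings can be illustrated with a small simulation. The sketch below is a simplified stand-in for multidimensional reliability analysis, not the procedure used in the studies: it plants three latent dimensions in synthetic ratings, averages them over two independent groups of raters, and counts the dimensions whose shared variance replicates across the two groups. The loadings, noise level, 10% threshold, and all function names are assumptions made for illustration.

```python
import random

def simulate_halves(n_stimuli, loadings, noise_sd, rng):
    """Mean ratings (stimuli x categories) from two independent rater halves."""
    n_categories = len(loadings[0])
    half_a, half_b = [], []
    for _ in range(n_stimuli):
        latent = [rng.gauss(0.0, 1.0) for _ in loadings]
        signal = [sum(z * row[c] for z, row in zip(latent, loadings))
                  for c in range(n_categories)]
        # Each half sees the same underlying signal with its own rater noise.
        half_a.append([s + rng.gauss(0.0, noise_sd) for s in signal])
        half_b.append([s + rng.gauss(0.0, noise_sd) for s in signal])
    return half_a, half_b

def center(matrix):
    """Subtract each column's mean."""
    n = len(matrix)
    means = [sum(col) / n for col in zip(*matrix)]
    return [[x - m for x, m in zip(row, means)] for row in matrix]

def split_half_cross_covariance(half_a, half_b):
    """Symmetrized category-by-category covariance between the two halves."""
    a, b = center(half_a), center(half_b)
    n, k = len(a), len(a[0])
    raw = [[sum(a[s][i] * b[s][j] for s in range(n)) / n for j in range(k)]
           for i in range(k)]
    return [[(raw[i][j] + raw[j][i]) / 2 for j in range(k)] for i in range(k)]

def eigenvalues_by_power_iteration(matrix, iterations=300):
    """Eigenvalue estimates, largest magnitude first, via deflation."""
    k = len(matrix)
    m = [row[:] for row in matrix]
    values = []
    for _ in range(k):
        v = [1.0 / (i + 1) for i in range(k)]
        for _ in range(iterations):
            w = [sum(m[i][j] * v[j] for j in range(k)) for i in range(k)]
            norm = sum(x * x for x in w) ** 0.5
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        unit = sum(x * x for x in v) ** 0.5
        v = [x / unit for x in v]
        mv = [sum(m[i][j] * v[j] for j in range(k)) for i in range(k)]
        lam = sum(x * y for x, y in zip(v, mv))  # Rayleigh quotient
        values.append(lam)
        # Remove the component just found before searching for the next one.
        m = [[m[i][j] - lam * v[i] * v[j] for j in range(k)] for i in range(k)]
    return values

# Hypothetical setup: 9 rated categories driven by 3 planted latent dimensions.
LOADINGS = [
    [2.0] * 3 + [0.0] * 6,
    [0.0] * 3 + [1.5] * 3 + [0.0] * 3,
    [0.0] * 6 + [1.0] * 3,
]
rng = random.Random(0)
half_a, half_b = simulate_halves(200, LOADINGS, noise_sd=0.3, rng=rng)
eigenvalues = eigenvalues_by_power_iteration(
    split_half_cross_covariance(half_a, half_b))

# Ad hoc cutoff (an assumption): keep dimensions above 10% of the largest.
threshold = 0.1 * max(eigenvalues)
n_dims = sum(1 for value in eigenvalues if value > threshold)
print("reliable dimensions:", n_dims)
```

Because rater noise is independent across the two halves, it contributes little to the cross-covariance, so only the planted dimensions should clear the cutoff. A real analysis would replace the fixed threshold with a statistical test on held-out raters.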
2. Emotion categories are not discrete
How are emotions distributed in a semantic space?
The methods of SST allow the structure of emotion to emerge from the data, and the data routinely show that emotions are not discrete but heterogeneous and often blended.
When mapping the semantic space of emotion, many emotions that traditional theories might treat as discrete and separate, like anger and disgust, or awe and fear, are instead bridged by continuous gradients in meaning. Many states lie between awe and fear, for example, and SST shows that these states are accompanied by composite feelings and expressions that reliably represent these blended, intermediate meanings.
Human experience is complex, but SST provides a scientific approach that can capture and understand this complexity.
3. Specific emotions are real and they organize emotional behavior
Is emotion best understood using distinct categories, like anger, awe, or fear? Or are these states best described by other constructs, like valence and arousal?
Our studies show that valence and arousal capture only a small fraction of the variance in ratings of emotional states. Meanwhile, ratings of emotion categories capture almost the entirety of the variance in valence and arousal. In other words, the ways people describe their own experiences, and the meanings they infer from expressions, are better captured statistically by emotion categories than by broader “core affect” dimensions.
Moreover, people both within and across the different cultures we’ve studied show much more consistency in their ratings of emotion and mental state terms than they do in their ratings of valence and arousal.
That’s why, at Hume, our expression measurement models provide outputs that correspond to emotion categories (like fear, triumph, and sadness) rather than trying to reduce them to underlying dimensions. Emotions are best understood and described using emotion categories.
What traditional models capture. Venn diagrams show the proportion of reliable variance in reported emotional experience and perception captured by the Basic 6 and valence/arousal compared to semantic space theory. Altogether, traditional models based on the Basic 6 and valence and arousal largely fail to capture the rich space of meanings that emotional expressions convey and which emerge in self-reports. Figure presented courtesy of Trends in Cognitive Sciences.
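The asymmetry described in this section (categories predict valence and arousal well, but not the reverse) can be made concrete with a pair of regressions. The sketch below uses synthetic data in which valence and arousal are, by construction, noisy weighted sums of eight independent category ratings, so it illustrates the comparison rather than providing evidence for it. The category list, weights, noise level, and helper names are all hypothetical.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x

def r_squared(predictors, target):
    """In-sample R^2 of a least-squares fit; centering stands in for an intercept."""
    n = len(target)
    cols = [[v - sum(col) / n for v in col] for col in zip(*predictors)]
    y_mean = sum(target) / n
    y = [t - y_mean for t in target]
    gram = [[sum(p * q for p, q in zip(ci, cj)) for cj in cols] for ci in cols]
    rhs = [sum(p * q for p, q in zip(ci, y)) for ci in cols]
    beta = solve(gram, rhs)
    residual = [yi - sum(b * ci[s] for b, ci in zip(beta, cols))
                for s, yi in enumerate(y)]
    return 1.0 - sum(r * r for r in residual) / sum(v * v for v in y)

# Hypothetical categories and weights, chosen only for illustration.
CATEGORIES = ["joy", "awe", "amusement", "calmness",
              "fear", "anger", "sadness", "disgust"]
VALENCE_W = [1.0, 0.5, 1.0, 0.5, -1.0, -1.0, -1.0, -1.0]
AROUSAL_W = [0.5, 1.0, 0.5, -1.0, 1.0, 1.0, -0.5, 0.0]

rng = random.Random(0)
ratings = [[rng.gauss(0.0, 1.0) for _ in CATEGORIES] for _ in range(300)]
valence = [sum(w * x for w, x in zip(VALENCE_W, row)) + rng.gauss(0.0, 0.3)
           for row in ratings]
arousal = [sum(w * x for w, x in zip(AROUSAL_W, row)) + rng.gauss(0.0, 0.3)
           for row in ratings]
affect = [[v, a] for v, a in zip(valence, arousal)]

# Direction 1: how well do category ratings predict valence and arousal?
r2_valence = r_squared(ratings, valence)
r2_arousal = r_squared(ratings, arousal)

# Direction 2: how well do valence and arousal predict each category?
r2_categories = [r_squared(affect, [row[i] for row in ratings])
                 for i in range(len(CATEGORIES))]
mean_r2_categories = sum(r2_categories) / len(r2_categories)

print("categories -> valence R^2:", round(r2_valence, 3))
print("categories -> arousal R^2:", round(r2_arousal, 3))
print("valence/arousal -> categories mean R^2:", round(mean_r2_categories, 3))
```

Two predictors can span at most two directions of an eight-dimensional rating space, which is why the second direction explains far less variance than the first in this toy setup. On real ratings, the same comparison would need held-out data and reliability-corrected variance, which this sketch omits.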
Conclusion
Semantic space theory is a data-driven, computational approach to emotion science that overcomes the constraints of existing approaches, both in research methods and in frameworks for thinking about emotion. The resulting approach isn't reductive, captures the complexity of emotion, and has the power to explain and integrate the nuanced theoretical claims in the field.
SST harnesses advanced computational, statistical, and data collection approaches to provide a path forward: a framework for the scientific study of emotion that acknowledges the full complexity of human experiences and feelings.
This framework guides how we train our models and build our products at Hume. These discoveries also pave the way for further efforts at mapping emotion in the brain, in the body, in social relationships, and in the expressions of emotion in myriad cultural forms.
Keep up with our latest developments at https://hume.ai/science. For more information, and to read some of our published research, visit https://dev.hume.ai/docs/resources/science#published-research.