Announcing our Custom Model API
Published on Dec 14, 2023
Meet our Custom Model API: a cutting-edge AI tool that integrates language, voice, and/or facial movement to predict human preferences and needs more accurately than any LLM.
You can now use our Custom Model API to predict well-being, satisfaction, mental health, and more. Using a few labeled examples, our API integrates dynamic patterns of language, vocal expression, and/or facial expression into a custom multimodal model.
Leveraging Hume’s AI models pretrained on millions of videos and audio files, our API can usually predict your labels accurately after seeing just a few dozen examples. That means that with just a few labeled examples and a few clicks, you can deploy powerful AI models that predict the outcomes your users care about most. Of course, the models you train using our API are yours alone to deploy and share.
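To make the "few labeled examples" concrete, here is a minimal sketch in Python of what a small labeled dataset might look like. The file names, label values, and dictionary layout are illustrative assumptions, not Hume's actual upload schema; see the documentation for the supported formats.

```python
# A minimal sketch of a labeled dataset for a custom model.
# File names, label values, and the dict layout are illustrative
# assumptions, not Hume's actual upload schema.
labeled_examples = [
    {"file": "call_001.mp4", "label": "satisfied"},
    {"file": "call_002.mp4", "label": "dissatisfied"},
    {"file": "call_003.mp4", "label": "satisfied"},
    # ...a few dozen examples is typically enough for an accurate model
]
```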
Visit dev.hume.ai/docs/custom-models or log in to beta.hume.ai to get started!
Our new API translates nuanced multimodal measures into personalized insights more accurately than any LLM
![image](https://hume-website.directus.app/assets/6e18370e-7267-408d-ab5a-50bb940fa906/image.png?width=1920&height=868&quality=75&format=webp&fit=inside)
For instance, we partnered with Lawyer.com to predict the quality of customer support calls. Using just 73 calls, we were able to train a model that predicted expert ratings of whether a call went well or poorly with 97.3% accuracy. By contrast, using language models alone—including one of the world’s most capable language models along with our in-house language emotion model—resulted in a 3x higher error rate.
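To put that comparison in concrete terms, here is a quick back-of-the-envelope calculation of what a 3x higher error rate implies; the language-only accuracy shown is derived from the reported figures, not a separately reported number.

```python
# Back-of-the-envelope comparison of the two error rates.
custom_model_accuracy = 0.973
custom_model_error = 1 - custom_model_accuracy    # 0.027, i.e. 2.7%
language_only_error = 3 * custom_model_error      # ~0.081, i.e. ~8.1%
language_only_accuracy = 1 - language_only_error  # ~0.919 (implied, not reported)
print(f"{custom_model_error:.1%} vs {language_only_error:.1%} error")
```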
Leveraging dynamic patterns of language, vocal expression, and/or facial movement
Our Custom Model API works by integrating complex patterns of language, vocal expression, and/or facial movement captured using Hume’s expression AI models.
![emoclouds for dimensions](https://hume-website.directus.app/assets/5bdd18ac-986b-4bda-a83f-b077be42af4e/Screen_Shot_2023-12-18_at_3.31.53_PM_11zon.png?width=1920&height=976&quality=75&format=webp&fit=inside)
To combine these signals with language, we feed the expression measures extracted by each of our expression models, along with transcribed language, into a novel empathic large language model (eLLM), which we then pretrained on millions of human interactions.
![highres flow](https://hume-website.directus.app/assets/f50eb213-9579-4bfe-905e-88f8439691ba/highres_flow.jpeg?width=1920&height=623&quality=75&format=webp&fit=inside)
When you train a custom model using our Custom Model API, you are leveraging the joint language-expression embeddings extracted by our eLLM to predict your own labels.
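If you think of the joint embeddings as fixed feature vectors, training a custom model on your own labels is analogous to fitting a lightweight classifier on top of them. The scikit-learn sketch below illustrates that idea under that assumption; it is not the API's internal training procedure, and the random placeholder data stands in for real embeddings and labels.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

# Placeholder data: one joint language-expression vector per labeled example.
# The shapes and labels are illustrative assumptions only.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(73, 816))   # 73 labeled calls, as in the example above
labels = rng.integers(0, 2, size=73)      # 0 = poor call, 1 = good call

# Fit a simple classifier on top of the embeddings and estimate accuracy.
clf = LogisticRegression(max_iter=1000)
scores = cross_val_score(clf, embeddings, labels, cv=5)
print(f"cross-validated accuracy: {scores.mean():.1%}")
```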
Pricing
There are two steps to using our Custom Model API: (1) Training and (2) Inference. A rough sketch of this flow in code follows the list below.
1. Training: During our beta release, the model training process is completely free. This includes uploading data, training, evaluating results, and retraining.
2. Inference: When deploying your custom model in your application, a fee is charged for each file processed by your model. Detailed pricing can be found on our pricing page.
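In application code, the two steps typically look something like the request flow below. The endpoint paths, payload fields, and response shape here are placeholders assumed for illustration only; refer to dev.hume.ai/docs/custom-models for the actual API.

```python
import requests

API_KEY = "YOUR_HUME_API_KEY"
BASE_URL = "https://api.hume.ai"  # base URL assumed for illustration
headers = {"X-Hume-Api-Key": API_KEY}

# Step 1 (Training): submit a labeled dataset to train a custom model.
# Endpoint path and payload fields are placeholders, not the documented API.
train_resp = requests.post(
    f"{BASE_URL}/v0/custom-models/train",
    headers=headers,
    json={"dataset_id": "my-labeled-dataset", "name": "call-quality-model"},
)
model_id = train_resp.json().get("model_id")

# Step 2 (Inference): run the trained model on a new file (billed per file).
infer_resp = requests.post(
    f"{BASE_URL}/v0/custom-models/{model_id}/predict",
    headers=headers,
    json={"url": "https://example.com/new_call.mp4"},
)
print(infer_resp.json())
```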
Hope you enjoy using our Custom Model API! If you have any questions, you can post them on our Discord channel. We look forward to seeing what you build.