Product Updates

Tutorial: Hands-on with Hume's Custom Model API

Published on Dec 18, 2023

Meet our Custom Model API, a cutting-edge AI tool that integrates language, voice, and/or facial movement to predict human preferences and needs more accurately than any LLM.

You can now use our Custom Model API to predict well-being, satisfaction, mental health, and more. Using a few labeled examples, our API integrates dynamic patterns of language, vocal expression, and/or facial expression into a custom multimodal model. 

Because it builds on Hume’s AI models, which are pretrained on millions of videos and audio files, our API can usually predict your labels accurately after seeing just a few dozen examples. That means that with a small labeled dataset and a few clicks, you can deploy powerful AI models that predict the outcomes your users care about most. Of course, the models you train using our API are yours alone to deploy and share.

Visit dev.hume.ai/docs/custom-models or log in to beta.hume.ai to get started!

Step 1: Prepare your dataset

The cool thing about our Custom Model API is that it learns rapidly from your own data. You just need to choose a dataset of image, video, or audio files for it to learn from—ideally, one that captures the different states, preferences, or outcomes that are important for your application.

For example, consider a virtual education platform that is building a feature to help users stay focused. This customer could start by putting together a few video snippets recorded when students reported paying attention or being distracted. This dataset could then be submitted to our Custom Model API to build a model that detects signs of distraction automatically.

The model creation process will be easier if you sort your image, video, or audio files into subfolders based on their labels. For example, you could create an umbrella folder called ‘Student Focus’ with subfolders called ‘Attentive’ and ‘Distracted’. Please note that we currently don't support mixed-media datasets, so each dataset should contain a single file type.
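If your files currently sit in one flat folder, a short script can sort them into label subfolders before you upload. This is a minimal sketch, assuming you already have a mapping from filename to label; the function name, the `labels` dictionary, and the folder names are hypothetical and not part of Hume's platform.

```python
import shutil
from pathlib import Path


def sort_into_label_folders(source_dir, labels, dataset_dir):
    """Copy each file into dataset_dir/<label>/ based on a filename-to-label mapping.

    `labels` is a hypothetical dict such as {"clip_001.mp4": "Attentive"}.
    Returns the names of files that had no label and were skipped.
    """
    source_dir, dataset_dir = Path(source_dir), Path(dataset_dir)
    skipped = []
    for path in sorted(source_dir.iterdir()):
        if not path.is_file():
            continue
        label = labels.get(path.name)
        if label is None:
            # Leave unlabeled files out rather than guessing a label.
            skipped.append(path.name)
            continue
        target = dataset_dir / label
        target.mkdir(parents=True, exist_ok=True)
        shutil.copy2(path, target / path.name)
    return skipped
```

Calling `sort_into_label_folders("raw_clips", labels, "Student Focus")` would produce a ‘Student Focus’ folder with one subfolder per label. Files are copied rather than moved, so the original folder stays untouched.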

The amount of data you’ll need to build an accurate model depends on the complexity of your goal. Generally, it is good practice to have a similar number of samples for each label you want to predict. You should also consider other forms of imbalance or bias in your dataset. Remember that file length, number of speakers, and language spoken will also affect the model's predictive accuracy. Essentially, you'll want to put together a training dataset that closely resembles the kinds of files you'll ultimately want to predict on. For more information, see Data Tips: What should your data look like?
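Before uploading, it can help to count your samples per label and check that the folder holds a single media type. The sketch below does both. The extension-to-media mapping and the 1.5x imbalance threshold are assumptions chosen for illustration, not an official list of supported formats or a rule enforced by the platform.

```python
from collections import Counter
from pathlib import Path

# Assumed extension groups for illustration; not an official list of supported formats.
MEDIA_TYPES = {
    ".jpg": "image", ".jpeg": "image", ".png": "image",
    ".mp4": "video", ".mov": "video",
    ".wav": "audio", ".mp3": "audio",
}


def summarize_dataset(dataset_dir):
    """Return (samples per label, files per media type) for a folder of label subfolders."""
    per_label, per_media = Counter(), Counter()
    for label_dir in sorted(Path(dataset_dir).iterdir()):
        if not label_dir.is_dir():
            continue
        for path in label_dir.iterdir():
            media = MEDIA_TYPES.get(path.suffix.lower())
            # Files with unrecognized extensions are ignored.
            if path.is_file() and media is not None:
                per_label[label_dir.name] += 1
                per_media[media] += 1
    return per_label, per_media


def dataset_warnings(per_label, per_media, max_ratio=1.5):
    """List human-readable warnings; max_ratio is an arbitrary illustrative threshold."""
    warnings = []
    if len(per_media) > 1:
        warnings.append("mixed media: " + ", ".join(sorted(per_media)))
    if per_label and max(per_label.values()) > max_ratio * min(per_label.values()):
        warnings.append("imbalanced labels: " + str(dict(per_label)))
    return warnings
```

An empty list from `dataset_warnings` only means these two checks passed. It says nothing about subtler issues such as differences in file length or speakers between labels.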

Step 2: Create your dataset

Once you’ve assembled your dataset, it’s time to visit beta.hume.ai.

After logging in, navigate to the ‘Datasets’ page. 

Step 2.1: Find the ‘+ Create Dataset’ button on the top right corner of your screen. Clicking this button will take you to the page where you can add your dataset to our Custom Model platform. 

Step 2.2: Give your dataset a title, then enter one of your column names along with its data type (categorical or numerical). Don't worry, you can always go back and make edits later. Your column is where you store your “label names”, so its name should just be the overall category. In our example, ‘Student Focus’ would be a suitable name.

Step 2.3: Now you can just drag and drop the folder that contains your dataset.

Remember, the folder should include a subfolder for each label, containing the corresponding samples. For example, if you drag in a folder with the subfolders ‘Attentive’ and ‘Distracted’, our platform will interpret ‘Attentive’ and ‘Distracted’ as labels belonging to the samples in each respective subfolder.

Step 2.4: Assign a label name in the pop-up window. Your “label name” should just be the overall category. In our example, ‘Student Focus’ would be a suitable name. Then, hit ‘Save Labels and Continue’ and approve the upload.

Step 2.5: Verify your uploads. Check the total file count and address any detected issues. 

Step 2.6: Hit the ‘Save’ button on the top right of the page once you’re ready. If you accidentally uploaded a mixed-media dataset, a pop-up window will ask you to select the single file type you would like to keep. 

Step 2.7: Now, you’re ready to create your model! Click ‘Create Custom Model’ on the top right of your screen.

Step 3: Create your model

Step 3.1: Select your Training Dataset. If you’re navigating to this page from a specific dataset (as recommended in this guide), this step will already be completed for you. However, if you want to check or change your dataset, you can do this by clicking on the ‘Edit’ button to the right of the heading ‘Select Training Dataset’. 

Step 3.2: Select which labels you want to predict and hit ‘Continue.’ 

Step 3.3: Select your Task Type. Based on your data, we'll recommend creating either a Classification or Regression model. Next, select the specific type of model you want to create. You can currently choose between binary and multiclass classification, or between univariate and multivariate regression.
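To get a feel for how the four model types relate to your labels, here is a rough heuristic in Python. It is only an illustration of the distinction and makes no claim about how the platform's own recommendation works; the function name and its input format are hypothetical.

```python
def suggest_task_type(label_columns):
    """Suggest a task type from a dict of column name -> list of label values.

    Heuristic for illustration only: numeric columns are treated as regression
    targets, and a single non-numeric column as a classification target.
    """
    def is_numeric(values):
        return all(isinstance(v, (int, float)) and not isinstance(v, bool) for v in values)

    if not label_columns or any(len(v) == 0 for v in label_columns.values()):
        raise ValueError("every label column needs at least one value")

    if all(is_numeric(values) for values in label_columns.values()):
        # One numeric target is univariate; several are multivariate.
        return "univariate regression" if len(label_columns) == 1 else "multivariate regression"

    if len(label_columns) != 1:
        raise ValueError("this sketch handles only a single categorical column")
    (values,) = label_columns.values()
    n_classes = len(set(values))
    if n_classes < 2:
        raise ValueError("classification needs at least two distinct labels")
    return "binary classification" if n_classes == 2 else "multiclass classification"
```

In the running example, a ‘Student Focus’ column containing only ‘Attentive’ and ‘Distracted’ maps to binary classification, while a numeric attention score would map to univariate regression.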

Step 3.4: Fill in your new Custom Model’s name and description. 

Step 3.5: Hit ‘Start Training’ once you’re ready. 

That’s it! You’ve successfully begun training your Custom Model. 

You should be redirected to a page confirming that your model is actively training. 

To check on the status of your model, click ‘View Jobs’. To see existing, finished models, click ‘View Models’.

Step 4: View and evaluate your model

Step 4.1: Find your Custom Model on the ‘Models’ page, where it appears once it is done training. Click it to explore how it performed.

Step 4.2: View your model’s estimated accuracy. The Custom Models page includes statistics on the accuracy of your trained model. These are obtained by iteratively training on ~90% of your data and testing on the remaining ~10%, repeating the split across the samples in your dataset.
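Training on ~90% of the data and testing on the remaining ~10%, repeated over different splits, resembles 10-fold cross-validation. The sketch below illustrates that general technique in plain Python. It is not Hume's actual evaluation code, and the majority-class "model" is only a stand-in so that the example runs on its own.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k nearly equal test folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def cross_validated_accuracy(labels, train_and_predict, k=10, seed=0):
    """Estimate accuracy by holding out each fold once and training on the rest."""
    folds = k_fold_indices(len(labels), k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        predictions = train_and_predict(train_idx, test_idx)
        correct += sum(p == labels[i] for p, i in zip(predictions, test_idx))
    return correct / len(labels)


def majority_baseline(labels):
    """Stand-in 'model': always predicts the most common training label."""
    def train_and_predict(train_idx, test_idx):
        most_common = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
        return [most_common] * len(test_idx)
    return train_and_predict


labels = ["Attentive"] * 30 + ["Distracted"] * 10
print(cross_validated_accuracy(labels, majority_baseline(labels)))
```

With 30 ‘Attentive’ and 10 ‘Distracted’ samples, this baseline always predicts ‘Attentive’ and scores 0.75, which is why a balanced dataset makes the reported accuracy easier to interpret: a useful model should beat the accuracy you would get by always guessing the most common label.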

Step 4.3: Interpret confidence scores and misclassified samples. Explore the confidence scores assigned to each sample, as well as the samples that were misclassified, by navigating the “Classification confidence” visualization. Note that the confidence scores give you a continuous measure that might have additional utility beyond your original labels, potentially representing gradations between your classes.
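If you keep a record of per-sample confidence scores, the same kind of review can be sketched offline. The example below assumes a hypothetical list of (filename, true label, confidence) tuples for a binary model, where the confidence is the score for the positive class. The record format, the 0.5 threshold, and the 0.1 margin are assumptions for illustration, not the platform's export format or defaults.

```python
def review_predictions(records, positive="Distracted", negative="Attentive",
                       threshold=0.5, margin=0.1):
    """Split binary predictions into misclassified and borderline samples.

    Each record is a hypothetical (filename, true_label, confidence) tuple,
    where confidence is the model's score for the positive class.
    """
    misclassified, borderline = [], []
    for name, true_label, confidence in records:
        predicted = positive if confidence >= threshold else negative
        if predicted != true_label:
            misclassified.append(name)
        # Scores near the threshold may reflect gradations between classes.
        if abs(confidence - threshold) <= margin:
            borderline.append(name)
    return misclassified, borderline


records = [
    ("a.mp4", "Attentive", 0.08),
    ("b.mp4", "Distracted", 0.93),
    ("c.mp4", "Attentive", 0.56),
    ("d.mp4", "Distracted", 0.45),
    ("e.mp4", "Distracted", 0.55),
    ("f.mp4", "Attentive", 0.90),
]
misclassified, borderline = review_predictions(records)
```

Borderline samples are worth a second look even when they are classified correctly, since they may capture in-between states such as a student who is only partly paying attention.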

Step 4.4: Use the 'Descriptive statistics' visualization to interpret how different expression modalities relate to your labels. You can explore how your labels are differentiated by the different modalities of emotional expression measured by our core models (Facial Expression, Prosody, Vocal Bursts, and Language). Toggle between 'Emotions' and 'Classes'.

If you’re happy with your model’s performance, it’s time to put it to use.

Step 5: Test your Custom Model on new files

Step 5.1: Navigate to the ‘Playground’ page. 

Step 5.2: Locate the model dropdown menu and select the custom model you created.

Step 5.3: To select a file to analyze, locate the 'Upload files' button. You can upload files stored locally on your computer or select files you have already uploaded to Hume. You can also use example files for the custom models.

Step 5.4: Click ‘Analyze’ to evaluate your selected file with your custom model.

That’s it! You’ve successfully applied your Custom Model to a new file. For more information on how to interpret your results, see How do I interpret my results?

Your model is automatically deployed on our API so that you can build it into your application. For further instructions, see Start Custom Model Inference Job.

In closing

The Hume AI Platform strives to be the only toolkit developers need to measure verbal and nonverbal cues in audio, video, or images, based on rigorous scientific studies of human expressive behavior. Our Custom Model API is the most powerful way to apply our models to your specialized use case. 

In this post, we walked through the basics of how to use our new Custom Model API. For more details and the latest documentation, be sure to bookmark our tutorials. And if you plan to develop an application using our API, note that it will need to adhere to the ethical guidelines of The Hume Initiative.

We’re excited to expand our private beta over the next few months, and look forward to what you’ll build with it! If you have any questions or want more direct support, please join our Discord channel.

Connect with us

Follow us on Twitter @hume_ai or join our Discord channel. If you’re interested in beta access, you can sign up.

