Author: Lexie

Editor: Lu

In the grand debate over AI, people cast it either as our most capable and efficient assistant or as a "machine army" that will one day overthrow us. Friend or foe, AI is expected not only to complete the tasks humans assign it, but also to "read" what is in people's hearts. This mind-reading ability has been a highlight of the AI field this year.

In PitchBook's enterprise SaaS emerging technology research report released this year, "emotion AI" stands out as a major technology highlight. The term refers to using affective computing and artificial intelligence to perceive, understand, and interact with human emotions, analyzing text, facial expressions, voice, and other physiological signals to infer how people feel. Simply put, emotion AI wants machines to "read" emotions the way humans do, or even better.

Its main technologies include the following (a toy sketch of fusing these signals follows the list):

  • Facial expression analysis: detecting micro-expressions and facial muscle movements using cameras, computer vision, and deep learning.

  • Voice analysis: identifying emotional states from voiceprint, intonation, and rhythm.

  • Text analysis: mining the sentiment in sentences and their context using natural language processing (NLP).

  • Physiological signal monitoring: analyzing heart rate, skin response, and other signals from wearable devices to make interactions more personalized and emotionally rich.
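
In practice these modalities are often combined. As a purely illustrative sketch, the following Python snippet late-fuses hypothetical per-modality emotion scores with fixed weights; the labels, scores, and weights are all made up for illustration and do not come from any product described in this article.

```python
# Toy late-fusion of per-modality emotion scores (all values illustrative).
from typing import Dict

LABELS = ["joy", "anger", "sadness", "surprise"]

def fuse(scores_by_modality: Dict[str, Dict[str, float]],
         weights: Dict[str, float]) -> Dict[str, float]:
    """Weighted average of per-modality probability distributions."""
    fused = {label: 0.0 for label in LABELS}
    total = sum(weights[m] for m in scores_by_modality)
    for modality, scores in scores_by_modality.items():
        w = weights[modality] / total
        for label in LABELS:
            fused[label] += w * scores.get(label, 0.0)
    return fused

# Hypothetical outputs of separate face / voice / text classifiers.
scores = {
    "face":  {"joy": 0.70, "anger": 0.05, "sadness": 0.10, "surprise": 0.15},
    "voice": {"joy": 0.40, "anger": 0.20, "sadness": 0.30, "surprise": 0.10},
    "text":  {"joy": 0.55, "anger": 0.10, "sadness": 0.25, "surprise": 0.10},
}
weights = {"face": 0.4, "voice": 0.3, "text": 0.3}

fused = fuse(scores, weights)
print(max(fused, key=fused.get), fused)  # prints the top fused emotion
```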

Emotion AI's predecessor is sentiment analysis, which works mainly on text, for example extracting users' feelings from their social media posts. With the support of AI and the integration of additional inputs such as vision and audio, emotion AI promises more accurate and complete sentiment analysis.
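
For a feel of what this text-only predecessor looks like in code, here is a minimal sketch using Hugging Face's transformers pipeline; the choice of library is my assumption, not something the article specifies, and the default English sentiment model is downloaded on first run.

```python
# Minimal text sentiment analysis: the text-only predecessor of emotion AI.
# Requires: pip install transformers torch
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # loads a default sentiment model

posts = [
    "Customer support resolved my issue in five minutes. Amazing!",
    "Third outage this week. I'm done with this service.",
]
for post, result in zip(posts, classifier(posts)):
    # Each result looks like {"label": "POSITIVE", "score": 0.9997}
    print(f"{result['label']:8} {result['score']:.3f}  {post}")
```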

01 VCs pour in money, and startups land huge rounds

Silicon Rabbit observes that emotion AI's potential has caught many investors' attention, and startups focused on the space, such as Uniphore and MorphCast, have already raised substantial funding.

California-based Uniphore has been building automated conversation solutions for enterprises since 2008. It has developed multiple product lines, including U-Self Serve, U-Assist, U-Capture, and U-Analyze, which help customers hold more personalized, emotionally aware interactions using voice, text, vision, and emotion AI technologies. U-Self Serve focuses on accurately identifying emotion and tone in conversations, letting enterprises deliver more personalized service that improves user engagement and satisfaction.

U-Assist improves customer service agents' efficiency through real-time guidance and workflow automation; U-Capture gives companies deep insight into customer needs and satisfaction through automated emotion data collection and analysis; and U-Analyze helps customers identify key trends and sentiment shifts in interactions, providing data-driven decision support that strengthens brand loyalty.

Uniphore's technology is not just about making machines understand language; it is about capturing and interpreting the emotion behind tone and expression during interactions with humans. This capability lets companies meet customers' emotional needs rather than respond mechanically. By using Uniphore, companies can reportedly achieve 87% user satisfaction and improve customer service performance by 30%.

Uniphore has raised more than US$620 million to date. Its most recent round, a US$400 million financing led by NEA in 2022 with participation from existing investors such as March Capital, valued the company at US$2.5 billion.

Hume AI has launched what it calls the world's first empathic voice AI. The company was founded by former Google scientist Alan Cowen, known for pioneering semantic space theory, which maps emotional experience and expression by teasing apart the nuances of voice, face, and gesture. Cowen's research has appeared in journals including Nature and Trends in Cognitive Sciences and draws on some of the most extensive and diverse emotion samples studied to date.

Building on this research, Hume developed its conversational voice API, EVI, which combines large language models with empathy algorithms to understand and parse human emotional states in depth. It not only recognizes emotion in the voice but also produces more nuanced, personalized responses when interacting with users. Developers can access these capabilities with just a few lines of code and build them into any application.
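
To give a sense of what "a few lines of code" could look like, here is a hypothetical sketch of calling an empathic-voice endpoint over REST. The URL, headers, and response fields below are placeholders I invented for illustration; they are not Hume's documented EVI API, which developers should consult directly.

```python
# Hypothetical sketch of calling an empathic voice API over REST.
# NOTE: the endpoint, headers, and response fields are invented
# placeholders; consult Hume's documentation for the real EVI API.
import requests

API_KEY = "YOUR_HUME_API_KEY"  # placeholder credential

with open("user_utterance.wav", "rb") as f:
    resp = requests.post(
        "https://api.example.com/v0/empathic-voice",  # placeholder URL
        headers={"X-API-Key": API_KEY},
        files={"audio": f},
    )
resp.raise_for_status()
data = resp.json()

# Hypothetical response: inferred emotions plus an emotion-aware reply.
print(data["emotions"])       # e.g. [{"name": "calmness", "score": 0.72}, ...]
print(data["reply"]["text"])  # reply text adapted to the detected emotion
```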

One major limitation of most current AI systems is that their instructions come mainly from humans, and those instructions and prompts are error-prone and fail to tap AI's full potential. The empathic large language model (eLLM) that Hume developed adjusts its wording and tone according to context and the user's emotional expression. By treating human happiness as the first principle for machine learning, adaptation, and interaction, it can give users a more natural, authentic experience in scenarios ranging from mental health and education and training to emergency calls and brand analysis.

Just this March, Hume AI closed a US$50 million Series B led by EQT Ventures, with participation from Union Square Ventures, Nat Friedman & Daniel Gross, Metaplanet, and Northwell Holdings.

Also in this field is Entropik, which specializes in measuring consumers' cognitive and emotional responses. Its Decode platform combines emotion AI, behavioral AI, generative AI, and predictive AI to better understand consumer behavior and preferences and to deliver more personalized marketing recommendations. Entropik completed a US$25 million Series B in February 2023, with investors including SIG Venture Capital and Bessemer Venture Partners.

02 The giants join the fray

Technology giants have also staked out positions in emotion AI, each playing to its own strengths.

They include the Microsoft Azure Cognitive Services Emotion API, which identifies joy, anger, sadness, surprise, and other emotions in images and videos by analyzing facial expressions.

IBM Watson's Natural Language Understanding API can process large volumes of text and identify the underlying sentiment (positive, negative, or neutral) to interpret user intent more accurately.
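
For concreteness, a minimal document-sentiment sketch with the ibm-watson Python SDK might look like the following; the API key, service URL, and version date are placeholders to fill in from your own IBM Cloud instance.

```python
# Document-level sentiment with IBM Watson Natural Language Understanding.
# Requires: pip install ibm-watson
from ibm_watson import NaturalLanguageUnderstandingV1
from ibm_watson.natural_language_understanding_v1 import Features, SentimentOptions
from ibm_cloud_sdk_core.authenticators import IAMAuthenticator

nlu = NaturalLanguageUnderstandingV1(
    version="2022-04-07",  # an API version date; placeholder
    authenticator=IAMAuthenticator("YOUR_API_KEY"),  # placeholder
)
nlu.set_service_url("YOUR_SERVICE_URL")  # placeholder

result = nlu.analyze(
    text="The onboarding was confusing, but support was wonderful.",
    features=Features(sentiment=SentimentOptions()),
).get_result()

doc = result["sentiment"]["document"]
print(doc["label"], doc["score"])  # e.g. "positive" with a signed score
```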

Google Cloud AI's Cloud Vision API offers powerful image analysis that quickly identifies emotional expressions in images, and it also supports text recognition and emotion association.
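
A minimal face-emotion sketch with the google-cloud-vision Python client could look like this; it assumes Google application default credentials are configured, and the image file name is a placeholder.

```python
# Face detection with per-face emotion likelihoods via Cloud Vision.
# Requires: pip install google-cloud-vision (plus configured credentials)
from google.cloud import vision

client = vision.ImageAnnotatorClient()
with open("photo.jpg", "rb") as f:  # placeholder image file
    image = vision.Image(content=f.read())

response = client.face_detection(image=image)
for face in response.face_annotations:
    # Likelihoods are enum values such as VERY_LIKELY / UNLIKELY.
    print("joy:", face.joy_likelihood.name,
          "anger:", face.anger_likelihood.name,
          "sorrow:", face.sorrow_likelihood.name,
          "surprise:", face.surprise_likelihood.name)
```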

AWS's Rekognition can likewise detect emotions, recognize facial features, and track changes in expression, and it can be combined with other AWS services into a complete social media analysis or emotion AI-driven marketing application.
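
And a minimal Rekognition sketch with boto3, with the image file name a placeholder and AWS credentials assumed to be configured in the environment:

```python
# Emotion detection on a local image with Amazon Rekognition via boto3.
# Requires: pip install boto3 (plus configured AWS credentials and region)
import boto3

client = boto3.client("rekognition")
with open("photo.jpg", "rb") as f:  # placeholder image file
    response = client.detect_faces(
        Image={"Bytes": f.read()},
        Attributes=["ALL"],  # "ALL" includes the Emotions attribute
    )

for face in response["FaceDetails"]:
    # Report each face's highest-confidence emotion label.
    top = max(face["Emotions"], key=lambda e: e["Confidence"])
    print(f"{top['Type']} ({top['Confidence']:.1f}%)")
```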

Some startups are moving faster than the giants in emotion AI, so fast that the giants are poaching their talent. Unicorn Inflection AI, for example, caught Microsoft's eye with its AI team and models. After Microsoft, together with Bill Gates, Eric Schmidt, NVIDIA, and others, invested US$1.3 billion in Inflection AI, it extended an olive branch to Mustafa Suleyman, an AI leader and one of Inflection AI's co-founders. Suleyman then joined Microsoft along with more than 70 employees, and Microsoft paid nearly US$650 million for the move.

Inflection AI quickly regrouped, however, assembling a new team with backgrounds spanning Google Translate, AI consulting, and AR to keep building its core product, Pi. Pi is a personal assistant that understands and responds to user emotions. Unlike traditional AI assistants, Pi focuses on building an emotional connection with users, sensing how they feel by analyzing voice, text, and other inputs and showing empathy in conversation. Inflection AI positions Pi as a coach, confidant, listener, and creative partner rather than a simple AI assistant. Pi also has a powerful memory that retains users' past conversations to improve the continuity and personalization of each interaction.

03 On the development path, attention and doubt coexist

Although emotion AI carries our hopes for a more humane way of interacting, its rollout, like that of all AI technologies, comes with concerns and doubts. First, can emotion AI really interpret human emotions accurately? In theory the technology can enrich services, devices, and experiences, but in practice human emotions are inherently ambiguous and subjective. As early as 2019, researchers challenged the technology, arguing that facial expressions do not reliably reflect people's true emotions. Relying solely on machines reading facial expressions, posture, and tone of voice to infer emotion therefore has real limitations.

Second, strict regulation has long been a stumbling block for AI. The EU AI Act, for example, prohibits computer vision emotion detection systems in settings such as education, which may limit the rollout of certain emotion AI solutions. US states such as Illinois prohibit collecting biometric data without consent, directly constraining the conditions under which some emotion AI technologies can be used. Data privacy and protection matter even more: emotion AI is typically deployed in fields such as education, health, and insurance, where privacy requirements are especially strict, so ensuring that emotion data is secure and legally used is a question every emotion AI company must face.

Finally, communication and emotional interpretation across cultures and regions is hard for people, and even harder for AI. Different regions understand and express emotions differently, which can undermine an emotion AI system's effectiveness and completeness. Emotion AI may also run into considerable difficulty with biases around race, gender, and gender identity.

Emotion AI promises not only to save labor and boost efficiency but also to read people's hearts with something like care. But can it really become a universal solution for human interaction, or will it end up like Siri, an intelligent assistant that is mediocre at tasks requiring genuine emotional understanding? Perhaps AI "mind reading" will one day transform human-computer interaction, and even interaction between people, but for now, truly understanding and responding to human emotion may still require more human participation and prudence.