Microsoft recently launched a new voice technology that makes its synthesized voices more nuanced and controllable. The new AI speech technology lets users adjust the intensity of a voice across multiple emotions, the so-called “multi-emotional” level.
It is widely accepted that human emotions come through in the slightest change of voice and tone. The new Smart Voice technology from Microsoft can express distinct emotions in a voice, including happy, sad, angry, fearful, disgruntled, serious, affectionate, gentle, depressed, embarrassed, and calm, among others. The calm tone serves as the zero point, and the emotional level is adjusted in units of one percent. The Chinese voices Xiaoxiao, Yunxi, Yunye, Xiaohan, Xiaoxuan, Xiaomo, and Xiaorui all support the “emotional level” adjustment technology. Moreover, the Smart Voice lineup covers different ages, genders, and personalities.
The new intelligent speech emotion adjustment is based on an adaptive neural network. Developers use SSML (Speech Synthesis Markup Language) tags to control the degree of emotion. However, the feature is not limited to developers who know SSML markup; users without any programming training can also access it through the audio content creation platform.
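As a rough illustration of controlling the degree of emotion through SSML tags, the Python sketch below builds an SSML document by hand. It is a minimal sketch under several assumptions that the article does not confirm: the `mstts:express-as` element with its `style` and `styledegree` attributes, the full voice identifier `zh-CN-XiaoxiaoNeural`, the style name `cheerful`, and the 0.01 to 2.0 range. The helper `build_emotional_ssml` is hypothetical, and its numeric scale may not match the percent-based scale described above.

```python
from xml.sax.saxutils import escape, quoteattr

# Assumed limits for the intensity attribute; check the current documentation.
MIN_DEGREE = 0.01
MAX_DEGREE = 2.0


def build_emotional_ssml(text, voice="zh-CN-XiaoxiaoNeural",
                         style="cheerful", degree=1.0, lang="zh-CN"):
    """Return an SSML string asking `voice` to read `text` in `style`.

    `degree` is the emotional intensity (assumed attribute: styledegree).
    """
    if not MIN_DEGREE <= degree <= MAX_DEGREE:
        raise ValueError(
            f"degree must be between {MIN_DEGREE} and {MAX_DEGREE}")
    return (
        '<speak version="1.0" '
        'xmlns="http://www.w3.org/2001/10/synthesis" '
        'xmlns:mstts="https://www.w3.org/2001/mstts" '
        f'xml:lang={quoteattr(lang)}>'
        f'<voice name={quoteattr(voice)}>'
        f'<mstts:express-as style={quoteattr(style)} '
        f'styledegree={quoteattr(str(degree))}>'
        f'{escape(text)}'  # escape &, <, > so the markup stays well-formed
        '</mstts:express-as></voice></speak>'
    )


if __name__ == "__main__":
    # Ask for a stronger-than-default cheerful reading of a short sentence.
    print(build_emotional_ssml("今天天气真好！", style="cheerful", degree=1.5))
```

The resulting string would then be submitted to the speech service for synthesis; that step depends on the SDK or endpoint in use and is left out here.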
The Smart Voice technology integrates automatic text sentiment analysis to predict the likely emotional categories of a passage and interpret the emotion underlying the speech. This should come in handy for content with varied speech patterns. Some reports have compared the new technology to a director casting a production, since it can interpret a work through the most suitable voice and the most appropriate emotions.
It is suitable for many scenarios, including chatbots, audiobook reading, automatic film and television dubbing, and games. Hopefully, the feature will expand to other services in the future.