Identification of formant values in pre-recorded audio samples displaying various emotions using Praat software.

Author Jaini Gandhi
Co-Author Aqsa Temrikar
DOI http://wwj

Country : India
Subject :

Keywords : speech, emotions, formant, synthesis, audio

Abstract

Detecting emotions in voices is an important step toward understanding how those emotions can be reproduced in synthesized voices. One approach is to extract formant frequencies from an existing dataset of speech displaying the six basic emotions identified by Paul Ekman (happiness, sadness, fear, disgust, surprise, and anger) and to compare them with the values obtained from neutral-voice audio. The analysis here uses RAVDESS (the Ryerson Audio-Visual Database of Emotional Speech and Song), with formants measured in Praat, and yields average values of the formants F1, F2, and F3 for each of these emotions. Displaying emotions in synthesized voices could open a new realm of enhanced user experience and better AI-human understanding.
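The comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' procedure: it assumes that F1, F2, and F3 have already been measured for each recording (for example in Praat) and only shows the grouping, averaging, and comparison with the neutral voice. The function names are hypothetical, and the numeric values in the usage lines are placeholders, not measured results. The emotion labels follow the RAVDESS filename convention, in which the third hyphen-separated field encodes the emotion.

```python
from collections import defaultdict
from pathlib import PurePath
from statistics import mean

# RAVDESS filename convention: the third hyphen-separated field is the emotion code.
RAVDESS_EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def emotion_from_filename(filename):
    """Return the emotion label encoded in a RAVDESS filename."""
    fields = PurePath(filename).stem.split("-")
    return RAVDESS_EMOTIONS[fields[2]]


def average_formants(measurements):
    """Average (F1, F2, F3) per emotion.

    measurements: iterable of (filename, (f1, f2, f3)) pairs, values in Hz.
    (Hypothetical helper; the input layout is an assumption.)
    """
    grouped = defaultdict(list)
    for filename, formants in measurements:
        grouped[emotion_from_filename(filename)].append(formants)
    # zip(*rows) turns a list of (f1, f2, f3) rows into three columns.
    return {
        emotion: tuple(mean(column) for column in zip(*rows))
        for emotion, rows in grouped.items()
    }


def shift_from_neutral(averages):
    """Difference between each emotion's average formants and the neutral average."""
    neutral = averages["neutral"]
    return {
        emotion: tuple(value - base for value, base in zip(values, neutral))
        for emotion, values in averages.items()
        if emotion != "neutral"
    }


# Usage with placeholder values (NOT measured data).
example = [
    ("03-01-01-01-01-01-01.wav", (500.0, 1500.0, 2500.0)),  # neutral
    ("03-01-05-01-01-01-01.wav", (600.0, 1700.0, 2600.0)),  # angry
    ("03-01-05-02-01-01-01.wav", (700.0, 1900.0, 2800.0)),  # angry
]
averages = average_formants(example)
print(shift_from_neutral(averages))
```

Keeping the measurement step separate from the averaging step means the same comparison works regardless of which tool produced the formant values.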
