Identification of formant values in pre-recorded audio samples displaying various emotions using Praat software.

Jaini Gandhi


Keywords: speech, emotions, formant, synthesis, audio


Abstract

Detecting emotions in human voices is a key step toward understanding how those emotions can be conveyed in synthesized voices. Here, emotion detection is done by extracting standard formant frequencies from an existing dataset covering the six basic emotions identified by Paul Ekman (happiness, sadness, fear, disgust, surprise, and anger) and comparing those frequency values against neutral voice audio. The analysis uses RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song) and yields average values of the formants F1, F2, and F3 for each of these emotions. Conveying emotion in synthesized voices could enhance user experience and improve AI-human understanding.
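As a rough illustration of the comparison step described in the abstract, the following Python sketch groups per-file formant measurements by emotion and reports how each emotion's mean F1, F2, and F3 differ from the neutral baseline. It is a minimal sketch under stated assumptions, not the procedure used in the paper: the formant values are assumed to have been measured beforehand (for example in Praat), the function names are hypothetical, and the numbers in SAMPLE are placeholders rather than results. The emotion label is read from the third field of the RAVDESS filename, following the dataset's published naming convention.

```python
from collections import defaultdict
from statistics import mean

# Third filename field in RAVDESS encodes the emotion (dataset naming convention).
EMOTION_CODES = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def emotion_from_filename(name):
    """Return the emotion label encoded in a RAVDESS-style filename."""
    stem = name.rsplit(".", 1)[0]
    fields = stem.split("-")
    return EMOTION_CODES[fields[2]]


def mean_formants_by_emotion(measurements):
    """Average (F1, F2, F3) in Hz per emotion.

    measurements: iterable of (filename, (f1_hz, f2_hz, f3_hz)) pairs,
    assumed to have been extracted beforehand (hypothetical input format).
    """
    grouped = defaultdict(list)
    for filename, formants in measurements:
        grouped[emotion_from_filename(filename)].append(formants)
    return {
        emotion: tuple(mean(column) for column in zip(*rows))
        for emotion, rows in grouped.items()
    }


def shift_from_neutral(means):
    """Difference of each emotion's mean formants from the neutral means."""
    neutral = means["neutral"]
    return {
        emotion: tuple(value - base for value, base in zip(values, neutral))
        for emotion, values in means.items()
        if emotion != "neutral"
    }


# Placeholder numbers for illustration only; NOT results from the paper.
SAMPLE = [
    ("03-01-01-01-01-01-01.wav", (500.0, 1500.0, 2500.0)),
    ("03-01-01-01-02-01-01.wav", (520.0, 1520.0, 2540.0)),
    ("03-01-05-01-01-01-01.wav", (600.0, 1650.0, 2700.0)),
    ("03-01-05-02-01-01-01.wav", (640.0, 1690.0, 2740.0)),
]

MEANS = mean_formants_by_emotion(SAMPLE)
SHIFTS = shift_from_neutral(MEANS)
```

With real measurements in place of SAMPLE, SHIFTS would hold, for each emotion, the offset in Hz of its mean F1, F2, and F3 from the neutral recordings.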
