In this work, the spectral feature MFCC is used for high accuracy, and the prosodic feature Pitch is combined with two other features, Cepstrum and DWT, to create an Emotion-Specific Feature Set. In recent years, a great deal of research has been conducted on Speech Emotion Recognition with the aim of improving human-machine interaction. A speaker's age, gender, and emotional state are all revealed in his or her speech, and recognising a single emotion from a speaker is a difficult challenge. The database in question is the Telugu-Database, which covers four emotions, joyful, angry, sad, and neutral, recorded by two male and female speakers. Different combinations of features are used to identify the corresponding emotion; these are referred to as Emotion-Specific features, and taking them into consideration improves the recognition rate. Feature information is extracted using the DWT, Cepstrum, MFCC, and Pitch features.
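As a rough illustration of that stage, the sketch below extracts the four features with the librosa and PyWavelets libraries; the frame sizes, coefficient counts, pitch range, and wavelet family are assumptions chosen for the example, not values reported in the study.

# A rough sketch of the feature extraction stage, assuming librosa and
# PyWavelets. Parameter values here are illustrative, not from the study.
import numpy as np
import librosa
import pywt

def extract_features(path, sr=16000):
    y, sr = librosa.load(path, sr=sr)

    # MFCC: 13 coefficients per frame, averaged over the utterance.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13).mean(axis=1)

    # Pitch: fundamental-frequency contour from the pYIN estimator,
    # summarised by its mean and standard deviation over voiced frames.
    f0, voiced, _ = librosa.pyin(y, fmin=50, fmax=400, sr=sr)
    f0 = f0[voiced] if voiced.any() else np.zeros(1)
    pitch = np.array([f0.mean(), f0.std()])

    # Cepstrum: inverse FFT of the log magnitude spectrum; the low
    # quefrencies carry the vocal-tract envelope.
    cepstrum = np.fft.irfft(np.log(np.abs(np.fft.rfft(y)) + 1e-10))
    cep = cepstrum[:13]

    # DWT: one-level Daubechies-4 decomposition, with each sub-band
    # summarised by its mean and standard deviation.
    cA, cD = pywt.dwt(y, "db4")
    dwt = np.array([cA.mean(), cA.std(), cD.mean(), cD.std()])

    return {"mfcc": mfcc, "pitch": pitch, "cepstrum": cep, "dwt": dwt}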
After feature extraction, the data is classified using a back-propagation neural network, and the results are analysed. The study found that increasing the number of nodes in the network and the number of training iterations raises the recognition rate above 90%, that combined feature sets give a better emotion recognition rate than individual feature sets, and that the combination DWT+Pitch+Cepstrum produced an individual emotion recognition rate of over 95%.
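To make the classification stage concrete, the following sketch concatenates the DWT, Pitch, and Cepstrum features into one Emotion-Specific Feature Set and trains a back-propagation network on it. scikit-learn's MLPClassifier, which trains by back-propagation, stands in for the network used in the study; the extract_features helper from the sketch above, the file list, the labels, and the hyper-parameter values are all hypothetical.

# A sketch of the classification stage under the assumptions stated above.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

EMOTIONS = ["joyful", "angry", "sad", "neutral"]

def combine(features, names):
    # Build one feature-set combination, e.g. DWT+Pitch+Cepstrum.
    return np.concatenate([features[n] for n in names])

# Hypothetical corpus: (wav_path, emotion_label) pairs from the Telugu database.
dataset = [("utt_001.wav", "angry"), ("utt_002.wav", "sad")]  # ... and so on

X = np.array([combine(extract_features(path), ["dwt", "pitch", "cepstrum"])
              for path, _ in dataset])
y = np.array([EMOTIONS.index(label) for _, label in dataset])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)

# The study reports that more hidden nodes and more training iterations raise
# the recognition rate; both appear here as tunable hyper-parameters.
clf = MLPClassifier(hidden_layer_sizes=(64,), max_iter=2000, random_state=0)
clf.fit(X_tr, y_tr)
print("recognition rate:", clf.score(X_te, y_te))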
Author(s) Details
JNTUA, Anantapuramu, India.
M. Kamaraju
Department of Electronics and Communication Engineering, Gudlavalleru Engineering College, Gudlavalleru, India.
V. Sumalatha
Department of Electronics and Communication Engineering, JNTU College of Engineering, Anantapuramu, India.
View Book: https://stm.bookpi.org/AAER-V12/article/view/1279