Institute of Measurement Technology

Bachelor Theses.

Vowel detection in continuous speech

Rene Eglseer

 

The sounds of spoken language can be divided into two groups: vowels and consonants. This distinction can be made automatically by a computer, but doing so requires several intermediate steps.

 

Features are calculated from a suitable database of speech samples. The voice recordings are first divided into time frames. For each frame, nearly 100 different values are calculated that help to differentiate the sounds of speech. All of the algorithms used in this thesis are commonly used in audio and signal processing.
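The framing step can be illustrated with a minimal sketch. It assumes a 16 kHz recording, 25 ms frames (400 samples) and a 10 ms hop (160 samples), and it computes only two common per-frame features, short-time energy and zero-crossing rate. These parameters, the function names, and the choice of features are illustrative assumptions and are not taken from the thesis, which uses nearly 100 values per frame.

```python
import math


def split_into_frames(signal, frame_length, hop_length):
    """Split a list of samples into overlapping frames.

    An incomplete frame at the end of the signal is dropped.
    """
    frames = []
    for start in range(0, len(signal) - frame_length + 1, hop_length):
        frames.append(signal[start:start + frame_length])
    return frames


def short_time_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(x * x for x in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)


def frame_features(signal, frame_length=400, hop_length=160):
    """Return one feature vector [energy, zero-crossing rate] per frame."""
    frames = split_into_frames(signal, frame_length, hop_length)
    return [[short_time_energy(f), zero_crossing_rate(f)] for f in frames]


# Synthetic stand-in for a recording: one second of a 200 Hz sine at 16 kHz.
sample_rate = 16000
signal = [math.sin(2 * math.pi * 200 * n / sample_rate)
          for n in range(sample_rate)]
features = frame_features(signal)
```

Voiced sounds such as vowels tend to show high energy and a low zero-crossing rate, which is why these two values are often among the first features considered.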

 

Figure 1: First steps, data processing

For machine learning, it is advantageous to have a small number of meaningful features. The number of features is therefore reduced, using algorithms such as principal component analysis.
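As a rough illustration of feature reduction, the following sketch applies principal component analysis through a singular value decomposition in NumPy. The data is synthetic (500 frames with 100 features driven by 3 hidden factors), and the function name `pca_reduce` and the number of retained components are assumptions made for this example, not values from the thesis.

```python
import numpy as np


def pca_reduce(features, n_components):
    """Project the rows of `features` onto the top principal components.

    Returns the reduced data and the fraction of the total variance
    explained by each retained component.
    """
    centered = features - features.mean(axis=0)
    # Rows of vt are the principal directions, sorted by decreasing
    # singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    explained_variance = singular_values ** 2 / (len(features) - 1)
    ratio = explained_variance[:n_components] / explained_variance.sum()
    reduced = centered @ vt[:n_components].T
    return reduced, ratio


# Synthetic example: 100 correlated features generated from 3 hidden factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(500, 3))
mixing = rng.normal(size=(3, 100))
features = latent @ mixing + 0.01 * rng.normal(size=(500, 100))

reduced, ratio = pca_reduce(features, n_components=3)
```

Because the synthetic features are almost entirely determined by three factors, three components retain nearly all of the variance. Real speech features would generally need a larger number of components.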

 

The data is then used to train an artificial intelligence (AI). Using the collected information, the AI is able to distinguish between vowels and consonants with high accuracy. Post-processing further improves this result by correcting outliers.
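One simple way to correct outliers in a sequence of per-frame decisions is a sliding majority vote. The source does not state which post-processing method the thesis uses, so the sketch below is only an assumed example; the function name `smooth_labels`, the label encoding (1 for vowel, 0 for consonant), and the window size are hypothetical.

```python
def smooth_labels(labels, window=5):
    """Replace each frame label by the majority label in a centered window.

    Labels are 1 (vowel) or 0 (consonant). Windows are truncated at the
    edges of the sequence; a tie in a truncated window resolves to 0.
    """
    if window % 2 == 0:
        raise ValueError("window must be odd")
    half = window // 2
    smoothed = []
    for i in range(len(labels)):
        neighborhood = labels[max(0, i - half):i + half + 1]
        vowel_votes = sum(neighborhood)
        smoothed.append(1 if 2 * vowel_votes > len(neighborhood) else 0)
    return smoothed


# Per-frame predictions with two isolated outliers (positions 3 and 9).
predicted = [1, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0]
smoothed = smooth_labels(predicted, window=3)
```

Since a spoken vowel normally spans many consecutive frames, a single frame that disagrees with its neighbors is more likely a classification error than a real sound change, which is the reasoning behind this kind of smoothing.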

 

Figure 2: Fitting and optimizing of the AI

The thesis begins with the basics needed to understand the algorithms and the database. All relevant language processing algorithms and the databases used are then described in greater detail. Since this thesis focuses on signal processing, machine learning is treated in less detail.

Keywords: vowel recognition, AI, formant

August 18th, 2020