Continuous HMM connected digit recognition
In this thesis we develop a system for recognizing strings of connected digits, intended for use in a hands-free telephone system. We present a detailed description of the elements of a Hidden Markov Model (HMM) based speech recognition system: an endpoint detection algorithm, the extraction of feature vectors from the speech samples, and the practical issues involved in training and recognition.
We use continuous mixture densities to approximate the observation probability density functions (pdfs) in the HMM. Although more complex to implement, continuous-observation HMMs provide better performance than discrete-observation HMMs.
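As an illustration of a continuous mixture observation density, the following minimal sketch evaluates log b(x) for a single HMM state, where b(x) is a weighted sum of component densities. It assumes Gaussian components with diagonal covariance matrices, which is a common choice but is not stated in this abstract. The function names and the parameter layout are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x.

    Assumption: Gaussian components with diagonal covariance.
    """
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def log_mixture_density(x, weights, means, variances):
    """Log of b(x) = sum_k c_k * N(x; mean_k, var_k) for one HMM state.

    The sum is evaluated with the log-sum-exp trick so that very small
    component likelihoods do not underflow to zero.
    """
    terms = [
        math.log(c) + log_gaussian_diag(x, m, v)
        for c, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


if __name__ == "__main__":
    # Hypothetical state with 4 mixture components over 2-dimensional features.
    weights = [0.4, 0.3, 0.2, 0.1]  # mixture weights, must sum to 1
    means = [[0.0, 0.0], [1.0, 1.0], [-1.0, 0.5], [2.0, -1.0]]
    variances = [[1.0, 1.0], [0.5, 0.5], [2.0, 1.0], [1.0, 3.0]]
    print(log_mixture_density([0.2, -0.1], weights, means, variances))
```

Working in the log domain is a practical choice here, because the product of many per-frame likelihoods along a state sequence quickly becomes too small for floating-point arithmetic.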
Due to the nature of the application, ours is a speaker-dependent recognition system, and we used a single speaker's speech to train and test it. In an experimental evaluation of the effect of model size on recognition performance, we observed that HMMs with 7 states and 4 mixture density components yield average recognition rates better than 99% on isolated digits. The level-building algorithm, used with the isolated digit models, produced a recognition rate better than 90% for 2-digit strings. For 3- and 4-digit strings, the recognition rates were 83% and 64%, respectively. These string recognition rates are much lower than expected for a concatenation of single digits. This is most likely due to uncertainty in the location of the concatenated digits, which increases disproportionately with the number of digits in the string.
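The gap between the reported string rates and the expectation for concatenated single digits can be made concrete with a short calculation. The sketch below assumes that digit errors are independent and that the isolated-digit rate is exactly 0.99 (the text reports "better than 99%", so this is a lower bound on the baseline). Under these assumptions an n-digit string is correct with probability 0.99^n. The 2-digit figure of 0.90 is likewise a lower bound taken from "better than 90%". The function names are hypothetical.

```python
def independent_string_rate(digit_rate, n_digits):
    """String accuracy if every digit is recognized independently."""
    return digit_rate ** n_digits


def implied_digit_rate(string_rate, n_digits):
    """Per-digit accuracy that would explain a string rate under independence."""
    return string_rate ** (1.0 / n_digits)


if __name__ == "__main__":
    # String recognition rates reported in the text, keyed by string length.
    reported = {2: 0.90, 3: 0.83, 4: 0.64}
    for n, observed in reported.items():
        baseline = independent_string_rate(0.99, n)
        per_digit = implied_digit_rate(observed, n)
        print(
            f"{n}-digit strings: independence baseline {baseline:.3f}, "
            f"reported {observed:.2f}, implied per-digit rate {per_digit:.3f}"
        )
```

With a digit rate of 0.99, the independence baseline is roughly 98%, 97%, and 96% for 2-, 3-, and 4-digit strings, well above the reported rates. This is consistent with an additional error source, such as uncertainty in digit location, that grows with string length.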