Automatic Phoneme Recognition with Segmental Hidden Markov Models

dc.contributor.author: Baghdasaryan, Areg Gagik [en]
dc.contributor.committeechair: Beex, A. A. Louis [en]
dc.contributor.committeemember: Wyatt, Christopher L. [en]
dc.contributor.committeemember: da Silva, Claudio R. C. M. [en]
dc.contributor.department: Electrical and Computer Engineering [en]
dc.date.accessioned: 2014-03-14T20:31:38Z [en]
dc.date.adate: 2010-03-10 [en]
dc.date.available: 2014-03-14T20:31:38Z [en]
dc.date.issued: 2010-01-27 [en]
dc.date.rdate: 2010-03-10 [en]
dc.date.sdate: 2010-02-08 [en]
dc.description.abstract: A speaker-independent, continuous-speech phoneme recognition and segmentation system is presented. We discuss the training and recognition phases of the phoneme recognition system and describe its integrated elements in detail. The Hidden Markov Model (HMM) based phoneme models are trained using the Baum-Welch re-estimation procedure. Recognition and segmentation of the phonemes in continuous speech are performed by a Segmental Viterbi Search on a Segmental Ergodic HMM for the phoneme states. We describe in detail the three phases of the joint phoneme recognition and segmentation system. First, we describe the extraction of the Mel-Frequency Cepstral Coefficients (MFCC) and the corresponding Delta and Delta Log Power coefficients. Second, we describe the operation of the Baum-Welch re-estimation procedure for training the phoneme HMMs, including the K-Means and Expectation-Maximization (EM) clustering algorithms used to initialize the Baum-Welch algorithm. Third, we describe the structural framework of, and the recognition procedure for, the ergodic Segmental HMM used for phoneme segmentation and recognition. We include test and simulation results for each of the individual systems integrated into the phoneme recognition system and, finally, for the phoneme recognition/segmentation system as a whole. [en]
dc.description.degree: Master of Science [en]
dc.identifier.other: etd-02082010-174617 [en]
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-02082010-174617/ [en]
dc.identifier.uri: http://hdl.handle.net/10919/31182 [en]
dc.publisher: Virginia Tech [en]
dc.relation.haspart: Baghdasaryan_AG_T_2010.pdf [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: Viterbi [en]
dc.subject: Baum Welch [en]
dc.subject: Hidden Markov Model [en]
dc.subject: Segmental HMM [en]
dc.subject: Cluster [en]
dc.subject: Speech [en]
dc.subject: Speaker [en]
dc.title: Automatic Phoneme Recognition with Segmental Hidden Markov Models [en]
dc.type: Thesis [en]
thesis.degree.discipline: Electrical and Computer Engineering [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: Master of Science [en]

Files

Original bundle
Name: Baghdasaryan_AG_T_2010.pdf
Size: 8.99 MB
Format: Adobe Portable Document Format
