Speech Coder using Line Spectral Frequencies of Cascaded Second Order Predictors

dc.contributor.author: Namburu, Visala (en)
dc.contributor.committeechair: Beex, A. A. Louis (en)
dc.contributor.committeemember: Baumann, William T. (en)
dc.contributor.committeemember: Woerner, Brian D. (en)
dc.contributor.department: Electrical and Computer Engineering (en)
dc.date.accessioned: 2014-03-14T20:47:46Z (en)
dc.date.adate: 2001-11-14 (en)
dc.date.available: 2014-03-14T20:47:46Z (en)
dc.date.issued: 2001-11-09 (en)
dc.date.rdate: 2002-11-14 (en)
dc.date.sdate: 2001-11-12 (en)
dc.description.abstract: A major objective in speech coding is to represent speech with as few bits as possible. Usual transmission parameters include autoregressive parameters, pitch parameters, excitation signals, and excitation gains. The pitch predictor makes these coders sensitive to channel errors. Aiming for robustness to channel errors, we do not use pitch prediction and compensate for its absence with a better representation of the excitation signal. We propose a new speech coding approach, Vector Sum Excited Cascaded Linear Prediction (VSECLP), based on code excited linear prediction. We implement forward linear prediction using five cascaded second-order sections, parameterized in terms of line spectral frequencies, in place of the conventional tenth-order filter. The line spectral frequency parameters estimated by the Direct Line Spectral Frequency (DLSF) adaptation algorithm are closer to the true values than those estimated by the Cascaded Recursive Least Squares - Subsection algorithm. A simplified version of DLSF is proposed to further reduce computational complexity. Split vector quantization is used to quantize the line spectral frequency parameters, and vector sum codebooks are used to quantize the excitation signals. The effect of an increased number of bits and of different split combinations on reconstructed speech quality and transmission rate is analyzed by testing VSECLP on the TIMIT database. Quantizing the excitation vectors using the discrete cosine transform resulted in a segmental signal-to-noise ratio of 4 dB at 20.95 kbps, whereas the same quality was obtained at 9.6 kbps using vector sum codebooks. (en)
dc.description.degree: Master of Science (en)
dc.identifier.other: etd-11122001-094938 (en)
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-11122001-094938/ (en)
dc.identifier.uri: http://hdl.handle.net/10919/35670 (en)
dc.publisher: Virginia Tech (en)
dc.relation.haspart: VN_etd.pdf (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: Vector Quantization (en)
dc.subject: Speech Coding (en)
dc.subject: Cascaded Second Order Predictors (en)
dc.subject: Linear Prediction (en)
dc.subject: Line Spectral Frequencies (en)
dc.title: Speech Coder using Line Spectral Frequencies of Cascaded Second Order Predictors (en)
dc.type: Thesis (en)
thesis.degree.discipline: Electrical and Computer Engineering (en)
thesis.degree.grantor: Virginia Polytechnic Institute and State University (en)
thesis.degree.level: masters (en)
thesis.degree.name: Master of Science (en)

Files

Original bundle
Name: VN_etd.pdf
Size: 1.08 MB
Format: Adobe Portable Document Format
