Mathematical Expression Detection and Segmentation in Document Images

dc.contributor.author: Bruce, Jacob Robert [en]
dc.contributor.committeechair: Abbott, A. Lynn [en]
dc.contributor.committeemember: Hsiao, Michael S. [en]
dc.contributor.committeemember: Xuan, Jianhua [en]
dc.contributor.department: Electrical and Computer Engineering [en]
dc.date.accessioned: 2014-03-20T08:00:17Z [en]
dc.date.available: 2014-03-20T08:00:17Z [en]
dc.date.issued: 2014-03-19 [en]
dc.description.abstract: Various document layout analysis techniques are employed to enhance the accuracy of optical character recognition (OCR) in document images. Type-specific document layout analysis involves localizing and segmenting specific zones in an image so that they may be recognized by specialized OCR modules. Zones of interest include titles, headers/footers, paragraphs, images, mathematical expressions, chemical equations, musical notations, tables, and circuit diagrams, among others. False positive/negative detections, oversegmentations, and undersegmentations made during the detection and segmentation stage will confuse a specialized OCR system and thus may result in garbled, incoherent output. In this work, a mathematical expression detection and segmentation (MEDS) module is implemented and then thoroughly evaluated. The module is fully integrated with the open-source OCR software Tesseract and is designed to function as a component of it. Evaluation is carried out on freely available public-domain images so that future and existing techniques may be objectively compared. [en]
dc.description.degree: Master of Science [en]
dc.format.medium: ETD [en]
dc.identifier.other: vt_gsexam:2315 [en]
dc.identifier.uri: http://hdl.handle.net/10919/46724 [en]
dc.publisher: Virginia Tech [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: document layout analysis [en]
dc.subject: optical character recognition [en]
dc.subject: mathematical expression detection and segmentation [en]
dc.subject: document image [en]
dc.subject: type-specific layout analysis [en]
dc.title: Mathematical Expression Detection and Segmentation in Document Images [en]
dc.type: Thesis [en]
thesis.degree.discipline: Computer Engineering [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: Master of Science [en]

Files

Original bundle
Name: Bruce_JR_T_2014.pdf
Size: 9.52 MB
Format: Adobe Portable Document Format

Name: Bruce_JR_T_2014_support_1.pdf
Size: 2.63 MB
Format: Adobe Portable Document Format
Description: Supporting documents
