A Comparison of Statistical Filtering Methods for Automatic Term Extraction for Domain Analysis

dc.contributor.author: Tilley, Jason W. (en)
dc.contributor.committeechair: Frakes, William B. (en)
dc.contributor.committeemember: Kulczycki, Gregory W. (en)
dc.contributor.committeemember: Belli, Gabriella M. (en)
dc.contributor.department: Computer Science (en)
dc.date.accessioned: 2014-03-14T20:30:11Z (en)
dc.date.adate: 2009-05-13 (en)
dc.date.available: 2014-03-14T20:30:11Z (en)
dc.date.issued: 2008-12-22 (en)
dc.date.rdate: 2009-05-13 (en)
dc.date.sdate: 2009-01-05 (en)
dc.description.abstract: Fourteen word-frequency metrics were tested to evaluate their effectiveness in identifying vocabulary in a domain. Fifteen domain engineering projects were examined to measure how closely the vocabularies selected by the fourteen metrics matched the vocabularies produced by domain engineers. Six filtering mechanisms were also evaluated to measure their impact on selecting proper vocabulary terms. The results of the experiment show that stemming and stop word removal do improve overlap scores and that term frequency is a valuable contributor to overlap. Variations on term frequency do not always improve overlap significantly. (en)
dc.description.degree: Master of Science (en)
dc.identifier.other: etd-01052009-103100 (en)
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-01052009-103100/ (en)
dc.identifier.uri: http://hdl.handle.net/10919/30818 (en)
dc.publisher: Virginia Tech (en)
dc.relation.haspart: JasonThesis_5_10_09.pdf (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: domain analysis (en)
dc.subject: term extraction (en)
dc.title: A Comparison of Statistical Filtering Methods for Automatic Term Extraction for Domain Analysis (en)
dc.type: Thesis (en)
thesis.degree.discipline: Computer Science (en)
thesis.degree.grantor: Virginia Polytechnic Institute and State University (en)
thesis.degree.level: masters (en)
thesis.degree.name: Master of Science (en)

Files

Original bundle
Name: JasonThesis_5_10_09.pdf
Size: 1.19 MB
Format: Adobe Portable Document Format
