Phrasal Document Analysis for Modeling



Virginia Tech


Specifications of digital hardware systems are typically written in natural language. The objective of this research is to extract information automatically from such specifications to aid model generation for system-level design automation. This is done by automatically extracting the noun phrases and verbs from the natural language specification statements. First, the sentences are parsed using a chart parser. Then, a noun phrase and verb extractor scans the resulting charts to obtain the noun phrases and their frequencies of occurrence. The noun phrases are then classified by semantic type, and the verbs are automatically reduced to their roots and classified. Next, each sentence is summarized as a sequence of "chunks": noun phrases, verbs, and prepositions. Vectors are generated from these chunks and imported into MS Excel to plot occurrence graphs of noun phrases and verbs against the sentences in which they occur. Finally, inter-term dependencies between noun phrases, and between noun phrases and verbs, are studied. Together, the frequencies of occurrence, the classification of chunks, the occurrence graphs, and the inter-term dependencies give useful information about the subject, the hardware components, and the behavior of a system described by a natural language specification document.
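As an illustration of the later stages of this pipeline, the following is a minimal Python sketch, not the tool described in the report. It assumes the sentences have already been parsed and summarized as chunks; the sample sentences, the chunk tags (NP, V, P), and all function names are hypothetical. It computes frequencies of occurrence, builds per-sentence occurrence vectors, and writes them as CSV text that a spreadsheet such as MS Excel could import for plotting. Because the abstract does not state how inter-term dependencies were measured, simple sentence-level co-occurrence counts are used here as an assumed stand-in.

```python
import csv
import io
from collections import Counter
from itertools import combinations

# Hypothetical chunked sentences: each sentence is a sequence of
# (chunk_type, text) pairs. NP = noun phrase, V = verb already reduced
# to its root, P = preposition. A real run would take these from the parser.
SENTENCES = [
    [("NP", "the controller"), ("V", "read"), ("NP", "the data bus")],
    [("NP", "the controller"), ("V", "write"), ("NP", "the status register")],
    [("NP", "the data bus"), ("V", "transfer"), ("NP", "a byte"),
     ("P", "to"), ("NP", "the status register")],
]


def term_frequencies(sentences, chunk_type):
    """Count occurrences of each term of the given chunk type."""
    return Counter(
        text for sent in sentences for kind, text in sent if kind == chunk_type
    )


def occurrence_vectors(sentences, chunk_type):
    """Map each term to a 0/1 vector with one entry per sentence."""
    terms = sorted(term_frequencies(sentences, chunk_type))
    return {
        term: [
            int(any(kind == chunk_type and text == term for kind, text in sent))
            for sent in sentences
        ]
        for term in terms
    }


def cooccurrence(sentences, chunk_type="NP"):
    """Count sentences in which each unordered pair of terms appears together.

    Assumption: co-occurrence stands in for the report's inter-term dependencies.
    """
    pairs = Counter()
    for sent in sentences:
        present = sorted({text for kind, text in sent if kind == chunk_type})
        pairs.update(combinations(present, 2))
    return pairs


def to_csv(vectors):
    """Serialize occurrence vectors as CSV text, one row per term."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    width = len(next(iter(vectors.values()))) if vectors else 0
    writer.writerow(["term"] + [f"S{i + 1}" for i in range(width)])
    for term, vec in vectors.items():
        writer.writerow([term] + vec)
    return buf.getvalue()


if __name__ == "__main__":
    print(term_frequencies(SENTENCES, "NP").most_common())
    print(to_csv(occurrence_vectors(SENTENCES, "NP")))
    print(cooccurrence(SENTENCES).most_common())
```

Each row of the CSV output is one term and each column one sentence, so plotting a row against the sentence index gives an occurrence graph of the kind the abstract describes.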



Chunk, Information Extraction, Modeling, ModelMaker, Noun Phrase, Parser