Computational Linguistics Hurricane Group
dc.contributor.author | Crowder, Nicholas | en |
dc.contributor.author | Nguyen, David | en |
dc.contributor.author | Hsu, Andy | en |
dc.contributor.author | Mecklenburg, Will | en |
dc.contributor.author | Morris, Jeff | en |
dc.date.accessioned | 2014-12-14T01:00:10Z | en |
dc.date.available | 2014-12-14T01:00:10Z | en |
dc.date.issued | 2014-12 | en |
dc.description | The report is provided in Word and PDF formats as "FinalPaper". The FinalPresentation PDF was produced with Prezi; the raw Prezi files are in FinalPresentationRaw.zip. The source code developed is in the "main" ZIP archive. | en |
dc.description.abstract | The problem/project-based learning described in our presentation and report addresses automatic summarization of web content using natural language processing. Initially, we used simple techniques such as word frequencies, WordNet, and n-grams to create summaries. Later approaches grew more complex as we introduced tools such as Mahout and k-means for topics and clustering. The work culminated in custom templates and a grammar that generate English sentences to summarize a corpus accurately. Our English summary was created using this grammar alongside regular expressions that extract information. The earlier units all built toward this final step: they yielded high-quality regular expressions, a clean dataset, and supporting tools such as a classifier trained on our data and a part-of-speech tagger. | en |
dc.description.sponsorship | NSF DUE-1141209 and IIS-1319578 | en |
dc.identifier.uri | http://hdl.handle.net/10919/51136 | en |
dc.language.iso | en_US | en |
dc.rights | Creative Commons CC0 1.0 Universal Public Domain Dedication | en |
dc.rights.uri | http://creativecommons.org/publicdomain/zero/1.0/ | en |
dc.subject | hurricanes | en |
dc.subject | nlp | en |
dc.subject | linguistics | en |
dc.subject | summary | en |
dc.subject | Typhoon Haiyan | en |
dc.subject | natural language processing | en |
dc.subject | automatic generation | en |
dc.subject | automatic summarization | en |
dc.subject | Hurricane Sandy | en |
dc.title | Computational Linguistics Hurricane Group | en |
dc.type | Presentation | en |
dc.type | Software | en |
dc.type | Technical report | en |
Files
Original bundle
- Name:
- FinalPaper.pdf
- Size:
- 428.53 KB
- Format:
- Adobe Portable Document Format
- Description:
- Final Report (PDF)
- Name:
- FinalPresentation.pdf
- Size:
- 3.83 MB
- Format:
- Adobe Portable Document Format
- Description:
- Project Presentation (Prezi presentation in PDF format)
- Name:
- main.zip
- Size:
- 28.57 KB
- Format:
- Unknown data format
- Description:
- Source Code (in ZIP format)
- Name:
- FinalPresenationRaw.zip
- Size:
- 54.13 MB
- Format:
- Unknown data format
- Description:
- Project Presentation Raw (Prezi presentation in ZIP format)
License bundle
- Name:
- license.txt
- Size:
- 1.5 KB
- Format:
- Item-specific license agreed upon to submission
- Description: