Arabic News Article Summarization
This project takes Arabic news articles in PDF form and runs them through a new program that indexes, categorizes, and summarizes them. Each summary is produced by filling out a template of predetermined attributes. The attribute values are obtained in three ways: a named entity recognizer (NER) identifies organizations and people, an LDA algorithm generates topics, and the author and date are extracted directly from each article. We use Lucidworks Fusion (a Solr-based system) to index the data and to provide an interface through which users can search and browse the articles along with their summaries; Solr handles the underlying information retrieval. The finished program should let end users sift through news articles quickly.
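As a rough illustration of the template-filling step described above, the following Python sketch shows one way the extracted values could be collected into a summary record. It is not the project's actual code: the class `ArticleSummary`, the functions `fill_template` and `render`, the field names, and the entity labels "PER" and "ORG" are all assumptions made for this example, and the sample values are placeholders rather than real article data.

```python
from dataclasses import dataclass, field
from typing import Iterable, List, Optional, Tuple


@dataclass
class ArticleSummary:
    """Hypothetical summary template; the field names are illustrative only."""
    author: Optional[str] = None
    date: Optional[str] = None
    people: List[str] = field(default_factory=list)
    organizations: List[str] = field(default_factory=list)
    topics: List[str] = field(default_factory=list)


def fill_template(
    entities: Iterable[Tuple[str, str]],
    topics: Iterable[str],
    author: Optional[str] = None,
    date: Optional[str] = None,
) -> ArticleSummary:
    """Fill the template from (text, label) pairs and topic words.

    `entities` stands in for NER output and `topics` for LDA output;
    the labels "PER" and "ORG" are assumed, not taken from the project.
    """
    summary = ArticleSummary(author=author, date=date, topics=list(topics))
    for text, label in entities:
        # Keep first-seen order and drop repeated mentions.
        if label == "PER" and text not in summary.people:
            summary.people.append(text)
        elif label == "ORG" and text not in summary.organizations:
            summary.organizations.append(text)
    return summary


def render(summary: ArticleSummary) -> str:
    """Render the filled template as one line, skipping empty attributes."""
    parts = []
    if summary.author:
        parts.append(f"Author: {summary.author}")
    if summary.date:
        parts.append(f"Date: {summary.date}")
    if summary.people:
        parts.append("People: " + ", ".join(summary.people))
    if summary.organizations:
        parts.append("Organizations: " + ", ".join(summary.organizations))
    if summary.topics:
        parts.append("Topics: " + ", ".join(summary.topics))
    return "; ".join(parts)


# Placeholder inputs, not drawn from any real article.
example = fill_template(
    entities=[("Jane Doe", "PER"), ("Example Org", "ORG"), ("Jane Doe", "PER")],
    topics=["economy", "trade"],
    author="A. Writer",
    date="2015-01-01",
)
print(render(example))
```

Keeping the template as a plain record separates extraction from presentation: the same record could be rendered as text for a reader or converted to a field dictionary for indexing, whichever the surrounding system needs.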