IDEAL Pages
Abstract
The main goal of this project is to provide a convenient Web-enabled interface to a large collection of event-related webpages, supporting two main services: browsing and searching. We first studied the events and, based on the dataset available to us, decided which fields were required to build the events index. We then configured SolrCloud with a collection based on these fields, defined in the Schema.xml file. Next, we built a Hadoop Map-Reduce function that works with SolrCloud to index documents related to the data about 60 events crawled from the Web. We then found a way to interface with the Solr server and the indexed documents through a PHP server application. Finally, we designed a convenient user interface that allows users to browse the documents by event category and event name, as well as to search the document collection for particular keywords.
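As a rough illustration of the browse and search services described above, the sketch below builds the kind of request a server application might send to Solr's standard select handler. It is written in Python rather than the project's PHP, and it is not the project's code: the collection name `events` and the field names `event_category`, `event_name`, and `content` are assumptions made for the example, as are the host and port. Only the standard Solr query parameters `q`, `fq`, `df`, `rows`, and `wt` are used.

```python
from urllib.parse import urlencode


def quote_term(value):
    """Wrap a value in double quotes for Solr, escaping backslashes and quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def build_select_url(base_url, collection, keywords=None,
                     category=None, event=None, rows=10):
    """Build a Solr select URL for browsing (filters only) or keyword search.

    Field names below are hypothetical; a real schema may name them differently.
    """
    # With no keywords, match every document so the filters alone drive browsing.
    params = [("q", keywords if keywords else "*:*")]
    # Assumed default search field for bare keywords.
    params.append(("df", "content"))
    if category:
        params.append(("fq", "event_category:" + quote_term(category)))
    if event:
        params.append(("fq", "event_name:" + quote_term(event)))
    params.append(("rows", str(rows)))
    params.append(("wt", "json"))
    return f"{base_url}/{collection}/select?{urlencode(params)}"


if __name__ == "__main__":
    base = "http://localhost:8983/solr"  # assumed local Solr address
    # Browsing: list documents for one category without any keywords.
    print(build_select_url(base, "events", category="Disaster"))
    # Searching: keyword query restricted to one event name.
    print(build_select_url(base, "events", keywords="flood", event="Example Event"))
```

Keeping browsing as filter queries (`fq`) and searching as the main query (`q`) lets both services share one request builder; whether the actual application separates them this way is not stated in the abstract.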