
dc.contributor.author: Xia, Long [en]
dc.contributor.author: Jiang, Tingting [en]
dc.contributor.author: Galad, Andrej [en]
dc.contributor.author: Maharshi, Shivam [en]
dc.date.accessioned: 2016-05-07T13:24:33Z [en]
dc.date.available: 2016-05-07T13:24:33Z [en]
dc.date.issued: 2016-05-04 [en]
dc.identifier.uri: http://hdl.handle.net/10919/70928 [en]
dc.description: This submission describes the work of the Solr team as part of the IDEAL project, with the main goal of designing and developing a distributed search infrastructure. It includes the project reports, final presentations, as well as the solutions (configuration files & Java code) developed. [en]
dc.description.abstract: This submission describes the work of the Solr team as part of the IDEAL project, with the main goal of designing and developing a distributed search infrastructure. It includes the project reports, final presentations, as well as the solutions (configuration files & Java code) developed. The main responsibility of our team was to configure Near Real Time (NRT) Indexing and implement Custom Ranking for the tweet and web page collections. The idea behind NRT Indexing is to perform incremental updates from an HBase table into the Solr index, thereby reducing the time and compute resources that indexing requires. The main motivation behind the Custom Ranking solution is to improve system precision and recall by transforming user queries using the metadata provided by the other teams. The implementation leverages three techniques: Query Expansion, Pseudo Relevance Feedback, and Query Boosting. Throughout the semester we collaborated closely with several other teams, both to gather requirements and to obtain the input data. [en]
dc.description.sponsorship: NSF grant IIS - 1319578, III: Small: Integrated Digital Event Archiving and Library (IDEAL) [en]
dc.language.iso: en_US [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: IDEAL [en]
dc.subject: Solr [en]
dc.subject: Lucene [en]
dc.subject: Custom Ranking [en]
dc.subject: Query Expansion [en]
dc.subject: Near Real Time Indexing [en]
dc.subject: Batch-Indexing [en]
dc.subject: Morphline [en]
dc.subject: Lily Indexer [en]
dc.subject: Cloudera Search [en]
dc.subject: Pseudo relevance feedback [en]
dc.title: Solr Project with IDEAL, in CS5604 (Information Storage and Retrieval) [en]
dc.type: Presentation [en]
dc.type: Software [en]
dc.type: Technical report [en]

