dc.contributor.author: Xia, Long
dc.contributor.author: Jiang, Tingting
dc.contributor.author: Galad, Andrej
dc.contributor.author: Maharshi, Shivam
dc.date.accessioned: 2016-05-07T13:24:33Z
dc.date.available: 2016-05-07T13:24:33Z
dc.date.issued: 2016-05-04
dc.identifier.uri: http://hdl.handle.net/10919/70928
dc.description: This submission describes the work of the Solr team as part of the IDEAL project with the main goal of designing and developing a distributed search infrastructure. It includes the project reports, final presentations, as well as the solutions (configuration files & Java code) developed. (en_US)
dc.description.abstract: This submission describes the work of the Solr team as part of the IDEAL project with the main goal of designing and developing a distributed search infrastructure. It includes the project reports, final presentations, as well as the solutions (configuration files & Java code) developed. The main responsibility of our team was to configure Near Real Time (NRT) Indexing and implement Custom Ranking for tweet and web page collections. The idea behind NRT Indexing is to perform incremental updates from an HBase table into the Solr index, thereby reducing the time and compute resources required. The main motivation behind the Custom Ranking solution is to improve system precision and recall by transforming user queries using the metadata provided by the other teams. The implementation leverages three techniques: Query Expansion, Pseudo Relevance Feedback, and Query Boosting. Throughout the semester we collaborated closely with several other teams to gather both requirements and input data. (en_US)
dc.description.sponsorship: NSF grant IIS-1319578, III: Small: Integrated Digital Event Archiving and Library (IDEAL) (en_US)
dc.language.iso: en_US (en_US)
dc.subject: IDEAL (en_US)
dc.subject: Solr (en_US)
dc.subject: Lucene (en_US)
dc.subject: Custom Ranking (en_US)
dc.subject: Query Expansion (en_US)
dc.subject: Near Real Time Indexing (en_US)
dc.subject: Batch-Indexing (en_US)
dc.subject: Morphline (en_US)
dc.subject: Lily Indexer (en_US)
dc.subject: Cloudera Search (en_US)
dc.subject: Pseudo relevance feedback (en_US)
dc.title: Solr Project with IDEAL, in CS5604 (Information Storage and Retrieval) (en_US)
dc.type: Presentation (en_US)
dc.type: Software (en_US)
dc.type: Technical report (en_US)

