Virginia Tech

    Solr Project with IDEAL, in CS5604 (Information Storage and Retrieval)

    Solr team Software Code package including Solr schema (Schema.xml), Morphline configuration (Morphlines.conf), batch indexing script (batch_indexing.sh), Lily indexer script (add-indexer.sh), and Java code. (7.559Mb)
    Downloads: 53
    Solr team final report in Word version (7.828Mb)
    Downloads: 1932
    Solr team final report in PDF version (7.381Mb)
    Downloads: 3658
    Solr team final presentation in PowerPoint version (2.991Mb)
    Downloads: 91
    Solr team final presentation in PDF version (2.118Mb)
    Downloads: 473
    Date
    2016-05-04
    Author
    Xia, Long
    Jiang, Tingting
    Galad, Andrej
    Maharshi, Shivam
    Abstract
    This submission describes the work of the Solr team on the IDEAL project, whose main goal was to design and develop a distributed search infrastructure. It includes the project reports, the final presentations, and the solutions developed (configuration files and Java code). Our team's main responsibility was to configure Near Real Time (NRT) Indexing and to implement Custom Ranking for the tweet and web page collections. NRT Indexing performs incremental updates from an HBase table into the Solr index, which saves both time and compute resources. The Custom Ranking solution aims to improve system precision and recall by transforming user queries using metadata provided by the other teams. The implementation relies on three techniques: Query Expansion, Pseudo Relevance Feedback, and Query Boosting. Throughout the semester we collaborated closely with several other teams, both to gather requirements and to obtain the input data.
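    The three query-transformation techniques named in the abstract can be illustrated with a minimal sketch. This is not the team's Java implementation; the function names, the synonym table, the field name `text`, and the boost values are all hypothetical, chosen only to show the general idea. The only external convention assumed is the standard Lucene/Solr `field:term^boost` query syntax.

```python
from collections import Counter


def expand_query(terms, synonyms):
    """Query expansion: append related terms for each original term.

    `synonyms` is a hypothetical lookup table; a real system might
    derive it from metadata supplied by other components.
    """
    expanded = list(terms)
    for term in terms:
        for related in synonyms.get(term, []):
            if related not in expanded:
                expanded.append(related)
    return expanded


def feedback_terms(top_docs, exclude, k=2):
    """Pseudo relevance feedback: assume the top-ranked documents are
    relevant and return the k most frequent terms not already in the query."""
    counts = Counter(
        word
        for doc in top_docs
        for word in doc.lower().split()
        if word not in exclude
    )
    return [word for word, _ in counts.most_common(k)]


def boosted_query(field, weighted_terms):
    """Query boosting: build an OR query in Lucene/Solr syntax,
    weighting each term as field:term^boost."""
    return " OR ".join(f"{field}:{term}^{boost}" for term, boost in weighted_terms)


# Usage: original terms get a higher (assumed) boost than added ones.
query = expand_query(["storm"], {"storm": ["hurricane"]})
extra = feedback_terms(
    ["hurricane flooding coast", "flooding damage flooding"],
    exclude=set(query),
    k=1,
)
weights = [(t, 2.0) for t in ["storm"]] + [(t, 1.0) for t in query[1:] + extra]
solr_q = boosted_query("text", weights)
```

    Giving original terms a larger boost than expanded or feedback terms is one common design choice: it lets the added terms widen recall without letting them outweigh what the user actually typed.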
    URI
    http://hdl.handle.net/10919/70928
    Collections
    • CS5604: Information Retrieval [51]
    • Research and Informatics Division, University Libraries [171]

    If you believe that any material in VTechWorks should be removed, please see our policy and procedure for Requesting that Material be Amended or Removed. All takedown requests will be promptly acknowledged and investigated.

    Virginia Tech | University Libraries | Contact Us
     

     
