Team 2: Search and Recommendation

Abstract

Theses and dissertations represent significant bodies of work, often containing remarkable contributions. Electronic theses and dissertations (ETDs) were introduced to simplify the storage and accessibility of these documents. However, their true value is realized only when they are accompanied by an effective system for searching and retrieving specific documents. Our project involved building an information retrieval system that supports searching, ranking, browsing, and recommendations for a large collection of ETDs. We divided the main goal into two modules: Search and Recommendation. Search is implemented using Elasticsearch; the report gives an overview of the tool, along with our goals and the implementation process. The Recommendation module provides relevant recommendations for a user and was built by experimenting with multiple algorithms in order to obtain the best results. A user manual is provided for the reference of other groups. The developer manual describes how the project was developed, including the architecture, data flow, and module overviews. The final report provides an overview of the tasks undertaken, how we planned to achieve our goals, and our milestones and timelines. By the project's conclusion, we had scaled the system to manage 500K ETDs. Our efforts improved bulk indexing and produced faster response times for searches. Additionally, we refined the existing index schema and implemented a logging mechanism within Elasticsearch to accommodate logs from all collaborating teams.

Description

Team2SearchAndRecommendationReport.pdf is the PDF version of the final report. Team2SearchAndRecommendationReport.zip is the Overleaf project version of that report. Team2SearchAndRecommendationPresentation.pdf is the PDF version of the final presentation. Team2SearchAndRecommendationPresentation.pptx is the PowerPoint version of the final presentation.

Keywords

search, recommendation, clustering, elasticsearch, experiments

Citation