    • Analyzing and Visualizing Disaster Phases from Social Media Streams 

      Lin, Xiao; Chen, Liangzhe; Wood, Andrew (2012-12-11)
      Working under the direction of CTRNet, we developed a procedure for classifying Twitter data related to natural/man-made disasters into one of the Four Phases of Emergency Management (response, recovery, mitigation, and ...
    • CINET GDS-Calculator: Graph Dynamical Systems Visualization 

      Wu, Sichao; Zhang, Yao (2012-12-10)
      This report summarizes the Graph Dynamical Systems Visualization project, a subproject under the umbrella of project CINET. Based on some input information, we extract the character of system dynamics and output ...
    • CINETGraphCrawl - Constructing graphs from blogs 

      Kaw, Rushi; Subbiah, Rajesh; Makkapati, Hemanth (2012-12-11)
      Internet forums, weblogs, social networks, and photo and video sharing websites are some forms of social media that are at the forefront of enabling communication among individuals. The rich information captured in social ...
    • Classification Project in CS5604, Spring 2016 

      Bock, Matthew; Cantrell, Michael; Shahin, Hossameldin L. (2016-05-04)
      In the grand scheme of a large Information Retrieval project, the work of our team was that of performing text classification on both tweet collections and their associated webpages. In order to accomplish this task, we ...
    • Classification of Arabic Documents 

      Elbery, Ahmed (2012-12-19)
      Arabic is a very rich language with complex morphology, so its structure is very different from, and more difficult than, that of other languages. It is therefore important to build an Arabic Text Classifier (ATC) to deal with this complex ...
    • Classification Team Project for IDEAL in CS5604, Spring 2015 

      Cui, Xuewen; Tao, Rongrong; Zhang, Ruide (2015-05-10)
      Given the tweets from the instructor and cleaned webpages from the Reducing Noise team, the planned tasks for our group were to find the best: (1) way to extract information that will be used for document representation; ...
    • Clustering and Topic Analysis in CS 5604 Information Retrieval Fall 2016 

      Bartolome, Abigail; Islam, M. D.; Vundekode, Soumya (Virginia Tech, 2016-12-08)
      The IDEAL (Integrated Digital Event Archiving and Library) and Global Event and Trend Archive Research (GETAR) projects aim to build a robust Information Retrieval (IR) system by retrieving tweets and webpages from social ...
    • Collaborative Filtering for IDEAL 

      Li, Tianyi; Nakate, Pranav; Song, Ziqian (2016-05-04)
      The students of CS5604 (Information Retrieval and Storage) have been building an Information Retrieval System based on tweet and webpage collections of the Digital Library Research Laboratory (DLRL). The students have ...
    • Collection Management for IDEAL 

      Ma, Yufeng; Nan, Dong (2016-05-04)
      The collection management portion of the information retrieval system has three major tasks. The first task is to perform incremental update of the new data flow from the tweet MySQL database to HDFS and then to HBase. ...
    • Collection Management of Electronic Theses and Dissertations (CME) CS5604 Fall 2019 

      Kaushal, Kulendra Kumar; Kulkarni, Rutwik; Sumant, Aarohi; Wang, Chaoran; Yuan, Chenhan; Yuan, Liling (Virginia Tech, 2019-12-23)
      The class "CS 5604: Information Storage and Retrieval" in the fall of 2019 is divided into six teams to enhance the usability of the corpus of electronic theses and dissertations maintained by Virginia Tech University ...
    • Collection Management Tobacco Settlement Documents (CMT) CS5604 Fall 2019 

      Muhundan, Sushmethaa; Bendelac, Alon; Zhao, Yan; Svetovidov, Andrei; Biswas, Debasmita; Marin Thomas, Ashin (Virginia Tech, 2019-12-11)
      Consumption of tobacco causes health issues, both mental and physical. Despite this widely known fact, tobacco companies had sustained their huge presence in the market over the past century owing to a variety of successful ...
    • Collection Management Tweets Project Fall 2017 

      Khaghani, Farnaz; Zeng, Junkai; Bhuiyan, Momen; Tabassum, Anika; Bandyopadhyay, Payel (Virginia Tech, 2018-01-17)
      The report included in this submission documents the work by the Collection Management Tweets (CMT) team, which is a part of the bigger effort in CS5604 on building a state-of-the-art information retrieval and analysis ...
    • Collection Management Webpages 

      Eagan, Mackenzie; Liang, Xiao; Michael, Louis; Patil, Supritha (Virginia Polytechnic Institute and State University, 2017-12-25)
      The Collection Management Webpages team is responsible for collecting, processing, and storing webpages from different sources. Our team worked on familiarizing ourselves with the necessary tools and data required to produce ...
    • Collection Management Webpages - Fall 2016 CS5604 

      Dao, Tung; Wakeley, Christopher; Weigang, Liu (Virginia Tech, 2017-03-23)
      The Collection Management Webpages (CMW) team is responsible for collecting, processing and storing webpages from different sources including tweets from multiple collections and contributors, such as those related to ...
    • CS 5604 2020: Information Storage and Retrieval TWT - Tweet Collection Management Team 

      Baadkar, Hitesh; Chimote, Pranav; Hicks, Megan; Juneja, Ikjot; Kusuma, Manisha; Mehta, Ujjval; Patil, Akash; Sharma, Irith (Virginia Tech, 2020-12-16)
      The Tweet Collection Management (TWT) Team aims to ingest 5 billion tweets, clean this data, analyze the metadata present, extract key information, classify tweets into categories, and finally, index these tweets into ...
    • CS 5604 INFORMATION STORAGE AND RETRIEVAL Front-End Team Fall 2016 Final Report 

      Kohler, Rachel; Tasooji, Reza; Sullivan, Patrick (Virginia Tech, 2016-12-08)
      Information Retrieval systems are a common tool for building research and disseminating knowledge. For this to be possible, these systems must be able to effectively show varying amounts of relevant information to the ...
    • CS 5604: Information Storage and Retrieval - Webpages (WP) Team 

      Barry-Straume, Jostein; Vives, Cristian; Fan, Wentao; Tan, Peng; Zhang, Shuaicheng; Hu, Yang; Wilson, Tishauna (Virginia Tech, 2020-12-18)
      The first major goal of this project is to build a state-of-the-art information retrieval engine for searching webpages and for opening up access to existing and new webpage collections resulting from Digital Library ...
    • CS5604 (Information Retrieval) Fall 2020 Front-end (FE) Team Project 

      Cao, Yusheng; Mazloom, Reza; Ogunleye, Makanjuola (Virginia Tech, 2020-12-16)
      With the demand and abundance of information increasing over the last two decades, generations of computer scientists are trying to improve the whole process of information searching, retrieval, and storage. With the ...
    • CS5604 Fall 2016 Classification Team Final Report 

      Williamson, Eric R.; Chakravarty, Saurabh (Virginia Tech, 2016-12-08)
      Content is generated on the Web at an exponential rate. The type of content varies from text on a traditional webpage to text on social media portals (e.g., social network sites and microblogs). One such example of social ...
    • CS5604 Fall 2016 Solr Team Project Report 

      Li, Liuqing; Pillai, Anusha; Wang, Ye; Tian, Ke (Virginia Tech, 2016-12-07)
      This submission describes the work the SOLR team completed in Fall 2016. It includes the final report and presentation, as well as key relevant materials (indexing scripts & Java code). Based on the work in Spring 2016, ...