Virginia Tech

    CS6604 Spring 2017 Global Events Team Project

    View/Open
    GlobalEvents_Code.zip (2.593Mb)
    Downloads: 58
    GlobalEvents_Presentation.pdf (1.097Mb)
    Downloads: 101
    GlobalEvents_Presentation.pptx (1.217Mb)
    Downloads: 107
    GlobalEvents_Report.docx (7.611Mb)
    Downloads: 746
    GlobalEvents_Report.pdf (5.790Mb)
    Downloads: 377
    Date
    2017-05-03
    Author
    Li, Liuqing
    Harb, Islam
    Galad, Andrej
    Abstract
    This submission describes the work the Global Events team completed in Spring 2017. It includes the final report and presentation, as well as key relevant materials (source code). Building on the reports and modules created by former teams, the Global Events team established a pipeline for processing Web ARChive (WARC) files in support of the IDEAL and GETAR projects, both funded by NSF. With the Internet Archive’s help, the team enhanced the Event Focused Crawler to retrieve more relevant webpages (i.e., pages about school shooting events) in WARC format. ArchiveSpark, an Apache Spark framework that facilitates access to Web archives, was deployed on a stand-alone server, and multiple techniques, such as parsing, Stanford NER, regular expressions, and statistical methods, were used to process and analyze the data and to describe those events. For data visualization, an integrated user interface using Gradle was designed and implemented to present trend results; it can be used easily by both CS and non-CS researchers and students. In addition, new, well-written manuals make it easier for users and developers to get familiar with ArchiveSpark, Spark, and Scala.
    URI
    http://hdl.handle.net/10919/77867
    Collections
    • CS6604: Digital Libraries [19]

    If you believe that any material in VTechWorks should be removed, please see our policy and procedure for Requesting that Material be Amended or Removed. All takedown requests will be promptly acknowledged and investigated.

    Virginia Tech | University Libraries | Contact Us
     

     
