Integrated Digital Event Archiving and Library (IDEAL): Preview of Award 1319578 - Annual Project Report

Fox, Edward A.; Hanna, Kristine; Kavanaugh, Andrea L.; Sheetz, Steven D.; Shoemaker, Donald J.

Files

2014_Integrated_Digital_Event_Archiving.pdf (180.88 KB)

Date

2014-07-09

Abstract

The goals of this project are to ingest tweets and Web-based content from social media and the general Web, including news and governmental information. In addition to archiving materials found, the project team will build an information system that includes related metadata and knowledge bases, consistent with the 5S (Societies, Scenarios, Spaces, Structures, Streams) framework, along with results from our intelligent focused crawler, to support comprehensive access to event related content. With the support of key partners, the IDEAL team will undertake important research, education, and dissemination efforts, to achieve three complementary objectives:

1. Collecting: The project team will spot, identify, and make sense of interesting events. We also will accept specific or general requests about types of events. Given resource and sampling constraints, we will integrate methods to identify appropriate URLs as seeds, and specify when to start crawling and when to stop, with regard to each event or sub-event. We will integrate focused crawling and filtering approaches in order to ingest content and generate new collections, with high precision and recall.

2. Archiving & Accessing: Permanent archiving, and access to those archives, will be ensured by our partner, Internet Archive (IA). Immediate access to ingested content will be facilitated through big data software built on top of our new Hadoop cluster.

3. Analyzing & Visualizing: We will provide a wide range of integrated services beyond the usual (faceted) browsing and searching, including: classification, clustering, summarization, text mining, theme and topic identification, and visualization.

Keywords

Digital event archiving, Twitter, Social media, Information systems

Citation

Fox, Edward A., Kristine Hanna, Andrea Kavanaugh, Steven Sheetz, Donald Shoemaker. Integrated Digital Event Archiving and Library (IDEAL). 2014

Persistent link

http://hdl.handle.net/10919/52853

Collections

Reports, Digital Library Research Laboratory
