NSF 3rd Year Report: CTRnet: Integrated Digital Library Support for Crisis, Tragedy, and Recovery
The Crisis, Tragedy, and Recovery (CTR) network, or CTRnet, is a human and digital library network that provides a range of services relating to different kinds of tragic events, including broad collaborative studies related to Egypt, Tunisia, Mexico, and Arlington, Virginia. Through this digital library, we collect and archive different types of CTR-related information and apply advanced information analysis methods to this domain. We hope that the services provided through CTRnet can help communities as they heal and recover from tragic events. We have taken several major steps toward our goal of building a digital library for CTR events. We have explored different strategies for collecting comprehensive information about various CTR events, initially using school shooting events as a testbed. Many gigabytes of related data have been collected using the web crawling tools and methodologies we developed. We have explored several methods for removing non-relevant pages (noise) from the crawled data. A focused crawler is being developed to let users build high-quality collections for CTR events that match their interests. We are also exploring the use of social media for CTRnet-related research. Software to integrate the popular social networking site Facebook with the CTRnet digital library has been prototyped and is being developed further. Integration of the popular micro-blogging site Twitter with the CTRnet digital library has proceeded well and is being further automated, becoming a key part of our methodology.