Digital Library Research Laboratory

Permanent URI for this community

https://hdl.handle.net/10919/18732

5SL: A Language for Declarative Specification and Generation of Digital Libraries
Goncalves, Marcos A.; Fox, Edward A. (2002-07-01)
Digital Libraries (DLs) are among the most complex kinds of information systems, due in part to their intrinsic multi-disciplinary nature. Nowadays DLs are built within monolithic, tightly integrated, and generally inflexible systems, or by assembling disparate components together in an ad-hoc way, with resulting problems in interoperability and adaptability. More importantly, conceptual modeling, requirements analysis, and software engineering approaches are rarely supported, making it extremely difficult to tailor DL content and behavior to the interests, needs, and preferences of particular communities. In this paper, we address these problems. In particular, we present 5SL, a declarative language for specifying and generating domain-specific digital libraries. 5SL is based on the 5S formal theory for digital libraries and enables high-level specification of DLs in five complementary dimensions, including: the kinds of multimedia information the DL supports (Stream Model); how that information is structured and organized (Structural Model); different logical and presentational properties and operations of DL components (Spatial Model); the behavior of the DL (Scenario Model); and the different societies of actors and managers of services that act together to carry out the DL behavior (Societal Model). The practical feasibility of the approach is demonstrated by the presentation of a 5SL digital library generator for the MARIAN digital library system.
The Academy: A Community of Information Retrieval Agents
France, Robert K. (1994-09-06)
We commonly picture text as a sequence of words; or alternatively as a sequence of paragraphs, each of which is composed of a sequence of sentences, each of which is itself a sequence of words. It is also worth noting that text is not so much a sequence of words as a sequence of terms, including most commonly words, but also including names, numbers, code sequences, and a variety of other $#*&)&@^ tokens. Just as we commonly simplify text into a sequence of words, so too it is common in information retrieval to regard documents as single texts. Nothing is less common, though, than a document with only a single part, and that unstructured text. Search and retrieval in such a universe involves new questions: Where does a document begin and end? How can we decide how much to show to a user? When does a query need to be matched by a single node in a hypertext, and when may partial matches in several nodes count?
Apache Solr: Indexing and Searching
Sethi, Iccha; Aslan, Serdar; Fox, Edward A. (2010-10-26)
This module addresses the basic concepts of the open source Apache Solr platform that is specifically designed for indexing documents and executing searches.
Application Software
Yang, Seungwon (2009-10-07)
This module covers commonly used application software that is specifically designed for the creation and development of digital library (DL) systems and similar types of collections and services, such as open access archives.
ArchiveSpark - MS Independent Study Final Submission
Galad, Andrej (Virginia Tech, 2016-12-13)
This project expands upon the work at the Internet Archive of researcher Vinay Goel and of Jefferson Bailey (co-PI on two NSF-funded collaborative projects with Virginia Tech: IDEAL, GETAR) on the ArchiveSpark project - a framework for efficient Web archive access, extraction, and derivation. The main goal of the project is to quantitatively and qualitatively evaluate ArchiveSpark against mainstream Web archive processing solutions and extend it as necessary with regard to the processing of testing collections. This also relates to an IMLS funded project. This report describes the efforts and contributions made as part of this project. The primary focus of these efforts lies in the comprehensive evaluation of ArchiveSpark against existing archive-processing solutions (pure Apache Spark with pre-installed Warcbase tools and HBase) in a variety of environments and setups in order to comparatively analyze performance improvements that ArchiveSpark brings to the table as well as understand the shortcomings and tradeoffs of its usage under varying scenarios.
Audacity 1.3
Brown, Chris (2011-05-01)
This module covers the use of Audacity 1.3 hosted on an IBM Cloud Instance. Topics covered include accessing and running Audacity, and manipulating audio files such as splitting, copying, pasting, merging, and exporting.
Bringing Your Library into View
Wildemuth, Barbara M. (2007-10-01)
This presentation illustrates libraries' roles in meeting end-users' information needs, such as supporting information use and re-use, creating new information objects, and supporting research and learning.
Building Digital Libraries Made Easy: Toward Open Digital Libraries
Fox, Edward A.; Suleman, Hussein; Luo, Ming (2002)
Digital libraries (DLs) promote a sharing culture among those who contribute and those who use resources. This same approach works when building Open Digital Libraries (ODLs). Leveraging the intellectual and practical investment made in the Open Archives Initiative through an eXtended Protocol for Metadata Harvesting (XPMH), one can build lightweight protocols to tie together key components that together make up the core of a DL. DL developers in various settings have learned how to apply this framework in a few hours. The ODL approach has been effective with the Computer Science Teaching Center (www.cstc.org), the Networked Digital Library of Theses and Dissertations (www.ndltd.org), and AmericanSouth.org. Hence, to support our Computing and Information Technology Interactive Digital Educational Library (www.citidel.org) and to provide a generic capability for other parts of the US National Science, Technology, Engineering, and Mathematics Education Digital Library (www.nsdl.org), we are developing a "DL-in-a-box" toolkit. When lightweight protocols, pools of components, and open standard reference models are combined carefully, as suggested in the OCKHAM discussions, both the DL user and developer communities can benefit from the principle of sharing.
Building Interoperable Digital Libraries: A Practical Guide to Creating Open Archives
Suleman, Hussein (2001)
This presentation discusses the development of the Open Archives Initiative (OAI), metadata harvesting, digital library interoperability, the Networked Digital Library of Theses and Dissertations (NDLTD), and more.
Building Interoperable Digital Libraries: A Practical Guide to Creating Open Archives
Suleman, Hussein (2002)
This presentation focuses on the system architecture and history of the Open Archives Initiative (OAI). It discusses OAI case studies and the requirements to be an OAI data provider.
Building the CODER Lexicon: The Collins English Dictionary and its Adverb Definitions
Fox, Edward A.; Wohlwend, Robert C.; Sheldon, Phyllis R.; Chen, Qi-Fan; France, Robert K. (1986-10-01)
The CODER (COmposite Document Expert/extended/effective Retrieval) project is an investigation of the applicability of artificial intelligence techniques to the information retrieval task of analyzing, storing, and retrieving heterogeneous collections of "composite documents." In order to support some of the processing desired, and to allow experimentation in information retrieval and natural language processing, a lexicon was constructed from the machine readable Collins Dictionary of the English Language. After giving background, motivation, and a survey of related work, the Collins lexicon is discussed. Following is a description of the conversion process, the format of the resulting Prolog database, and characteristics of the dictionary and relations. To illustrate what is present and to explain how it relates to the files produced from Webster's Seventh New Collegiate Dictionary, a number of comparative charts are given. Finally, a grammar for adverb definitions is presented, together with a description of defining formulae that usually indicate the type of the adverb. Ultimately it is hoped that definitions for adverbs and other words will be parsed so that the relational lexicon being constructed will include many additional relationships and other knowledge about words and their usage.
CLUTO Toolkit
Vijay, Sony; El Meligy Abdelhamid, Sherif; Malayattil, Sarosh (2010-10-21)
This module briefly introduces the basic concepts of clustering. Its primary focus is the usage of CLUTO, a clustering toolkit comprising various algorithms.
Co-located Collaboration on a Large, High-Resolution Display
Vogt, Katherine; North, Christopher L.; Andrews, Christopher; Endert, Alex (2010)
Few have studied co-located collaboration, let alone co-located collaboration and the sensemaking process. Here, we define co-located collaboration as multiple users working on the same display. Intelligence analysts often must filter through massive amounts of data which may contain large portions of text. As the benefits of collaboration [1] and large displays [2] have already separately proven themselves, we chose to examine the sensemaking process when these two aspects are combined. The environment we created also included multiple personal input devices to create a multiuser workspace. By observing the user roles adopted, collaborative processes, organization of the space, and perceived ownership or sharing of territory on the display, we hope to contribute valuable insight into the design implications of software.
Collaborative Research: Curriculum Development for Digital Library Education
Fox, Edward A.; Yang, Seungwon; Wildemuth, Barbara M.; Pomerantz, Jeffrey P.; Oh, Sanghee (2006-05-01)
This presentation provides an update on the Digital Library Curriculum Development project, including its development and evaluation plan, project timeline, and emerging objectives.
Conceptual Frameworks, Models, Theories, and Definitions
Fox, Edward A. (2011-05-11)
This module introduces several conceptual modules characterizing the digital library domain. Students will be provided with a high-level yet comprehensive knowledge of several conceptual frameworks and models, a unifying and extended terminology, and an overall scheme helping to classify further readings.
The Core: Digital Library Education in Library and Information Science Programs
Pomerantz, Jeffrey P.; Oh, Sanghee; Yang, Seungwon; Fox, Edward A.; Wildemuth, Barbara M. (Corporation for National Research Initiatives, 2006-11-01)
This paper identifies the "state of the art" in digital library education in Library and Information Science programs, by identifying the readings that are assigned in digital library courses and the topics of these readings. The most frequently-assigned readings are identified at multiple units of analysis, as are the topics on which readings are most frequently assigned. While no core set of readings emerged, there was significant consensus on the authors to be included in digital library course reading assignments, as well as the topics to be covered. Implications for the range of assigned readings and topics for digital library education in library science education are discussed.
Crawling
Fox, Edward A.; Khandeparker, Ashwin S. (2012-11-28)
This module covers the basic concepts of Web crawling, including policies and techniques, and how these can be applied to digital libraries.
Crisis, Tragedy, and Recovery Network (CTRnet)
Fox, Edward A. (2009-09-14)
This poster provides an overview of the Crisis, Tragedy, and Recovery Network (CTRnet). The objectives of CTRnet are to build a digital library and preserve information (in various formats like HTML, images, videos, etc.) relating to all kinds of community crises and tragedies, as well as to integrate communities, content, and services relating to CTR.
Crisis, Tragedy, and Recovery Network Digital Library (CTRnet)
Chitturi, Kiran; Fox, Edward A. (2013-01-10)
This presentation outlines the goals of the Crisis, Tragedy, and Recovery Network (CTRnet) project. These goals include researching the problems of integrating content, community, and services related to crises, tragedies, and recovery; integrating heterogeneous information in a specific domain, making it accessible, and preserving it for long-term reuse; extending the scope of digital libraries so they are closely but flexibly coupled with a wide variety of services to support diverse emerging communities; and supporting information exploration with advanced methods (Stepping Stones and Pathways (SSP), PathRank, and Storytelling) that facilitate searching, browsing, and discovery.
Crisis, Tragedy, and Recovery Network Digital Library (CTRnet) + Web Archiving in Qatar and VT
Fox, Edward A.; Yang, Seungwon; CTRnet Team (2013-07-01)
This presentation describes the Crisis, Tragedy, and Recovery Network's digital library development and web archiving activities in Qatar and at Virginia Tech. The presentation covers project goals, archiving tasks, dissemination efforts, and the IDEAL project.
