Recent Submissions

  • Half-Day Tutorial: Collecting, Analyzing, and Visualizing Tweets Using Open Source Tools 

    Yang, Seungwon; Kavanaugh, Andrea L. (2011-06-01)
    This tutorial introduces various open source tools and methods to archive tweets on a user's local machine and convert them into topic clouds for quick content analysis. For more in-depth techniques such as n-grams and ...
  • OAI/ODL Component Composition Exercise 

    Suleman, Hussein (2002-09-01)
    This exercise is a hands-on introduction to building digital libraries from components. It is a good introduction for individuals with modest Unix skills, and it reviews the process of using DLbox to create a working digital ...
  • Protocols 

    Singh, Ajeet; Chen, Yinlin; Santhanam, Srinivasa; Zhu, Weihua (2009-10-09)
    This module addresses the concepts, development, and implementation of digital library protocols and covers the roles of protocols in information retrieval (IR) systems and Service Oriented Architectures (SOA).
  • Application Software 

    Yang, Seungwon (2009-10-07)
    This module covers commonly used application software that is specifically designed for the creation and development of digital library (DL) systems and similar types of collections and services, such as open access archives.
  • Relevance Feedback and Query Expansion 

    Wu, Sichao; Zhang, Yao (2012-10-17)
    This module introduces methods for improving the recall of information retrieval systems, focusing mainly on relevance feedback and query expansion.
  • Pure Data Module 

    Scott, Conor (2011-05-06)
    This is the manual for the Pure Data (Pd) Module. Within this directory you will find the source for Pd and pd-l2ork (source and precompiled binary), several Pd tutorials and the patches that accompany them, as well as ...
  • Web Publishing 

    Karia, Pratik (2009-09-08)
    This module covers the general principles of web publishing and the various paradigms that can be used for storing and retrieving content within digital libraries. This module introduces various techniques to publish ...
  • Media Computation 

    Slack, Dylan (2011-05-04)
    Media Computation is a new type of introductory Computer Science class created to provide a path for those interested in doing creative, multimedia-related tasks with computing and attract them to the discipline of Computer ...
  • Audacity 1.3 

    Brown, Chris (2011-05-01)
    This module covers the use of Audacity 1.3 hosted on an IBM Cloud Instance. Topics covered include accessing and running Audacity, and manipulating audio files such as splitting, copying, pasting, merging, and exporting.
  • New Cloud Modules Fingerprint Module 

    Saraiya, Saptak (2011-05-04)
    The NIST Biometric Image Software (NBIS) distribution is developed by the National Institute of Standards and Technology (NIST) for the Federal Bureau of Investigation (FBI) and Department of Homeland Security (DHS). The ...
  • DL Architectures 

    Yang, Seungwon; Fox, Edward A. (2009-10-07)
    This module covers digital library architectures, specifically federated architectures, distributed architectures, and service-oriented architectures.
  • Metadata 

    Pomerantz, Jeffrey P. (2009-10-07)
    This module addresses the use of metadata, specific metadata standards that may be used to describe digital objects, and the creation of metadata records.
  • Digitization 

    Oh, Sanghee (2009-10-07)
    This module covers the principles and application of the digitization process for digital libraries. Students will be able to explain the digitization process, understand the critical issues and challenges of digitization ...
  • Hadoop Map-reduce 

    Shu, Xiaokui; Cohen, Ron (2010-12-10)
    Hadoop Map-Reduce is a software framework for writing applications that process large amounts of data in parallel on commodity hardware.
  • SEDNA XML Database 

    Vijay, Sony; El Meligy Abdelhamid, Sherif; Malayattil, Sarosh (2010-12-09)
    This module introduces the use of the SEDNA XML database for XML retrieval. Its primary focus is to describe the architecture of the SEDNA database and how standard XML queries can be used to retrieve data from it.
  • Overview of LucidWorks Big Data Software 

    Chitturi, Kiran (2012-09-16)
    This module introduces the basic concepts of LucidWorks Big Data software and gives an overview of it. The software is designed for searching, discovery, and analysis of massive content sets.
  • LucidWorks: Advanced Searching cURL 

    Makkapati, Hemanth; Subbiah, Rajesh; Kaw, Rushi (2012-10-07)
    This module focuses on advanced search techniques using Apache Solr through cURL. Successful completion of this module will enable students to employ advanced search techniques based on multi-values, multi-fields, phrase ...
  • Text Clustering Using LucidWorks and Apache Mahout 

    Chen, Liangzhe; Lin, Xiao; Wood, Andrew (2012-11-17)
    This module introduces algorithms and evaluation metrics for flat clustering. We focus on the usage of LucidWorks big data analysis software and Apache Mahout, an open source machine learning library in clustering of ...
  • LucidWorks: Searching with cURL 

    Schutt, Kyle; Morgan, Kyle (2012-10-01)
    This module covers using cURL and the Query admin to search documents. Students will be able to query an index, work with results, and describe query parsing.
  • Text Classification Using Mahout 

    Alam, Maksudul; Arifuzzaman, S. M.; Bhuiyan, Md Hasanuzzaman (2012-11-06)
    This module focuses on classification of text using Apache Mahout. After successful completion of this module, students will be able to explain and apply methods of classification, correctly classify a set of documents ...
