Virginia Tech

    LucidWorks Vectorize Module for the Digital Library Curriculum Initiative

    View/Open
    Module introduction (PDF) (83.58Kb)
    Downloads: 208
    Module introduction (PowerPoint) (36.57Kb)
    Downloads: 303
    Module introduction (MP4 video) (4.489Mb)
    Downloads: 242
    Report on the creation of the Module (PDF) (128.2Kb)
    Downloads: 244
    Updated introduction (PDF) (64.03Kb)
    Downloads: 136
    Updated introduction (PowerPoint) (33.88Kb)
    Downloads: 62
    The module itself (PDF) (205.1Kb)
    Downloads: 276
    An editable version of the module (Word) (37.25Kb)
    Downloads: 69
    ASR Generated Captions (10.75Kb)
    Downloads: 235
    Date
    2013-05-18
    Author
    Kniphuisen, David
    Tran, Alan
    Abstract
    The goal of our project was to create a learning module for students interested in converting large document collections into a form usable for machine learning, information retrieval, and related purposes. To that end, we wrote a module that explains how the LucidWorks Big Data software vectorizes documents using a workflow. The module describes the approach that LucidWorks implements and gives detailed instructions on how to create a collection, start the workflow, check its status, and access the results after it completes. After completing the module, users will be able to test their understanding using the example documents provided with the LucidWorks software, and will be familiar with Hadoop’s distributed file system. Once users are familiar with how the software works, they will be able to create their own vectorized representations of documents. The module also covers installing the LucidWorks software on a virtual machine, so users without access to the software can create their own instance of it. The module will also be available through http://en.wikiversity.org/wiki/Curriculum_on_Digital_Libraries.
    URI
    http://hdl.handle.net/10919/22061
    Collections
    • CS4624: Multimedia, Hypertext, and Information Access [130]

    If you believe that any material in VTechWorks should be removed, please see our policy and procedure for Requesting that Material be Amended or Removed. All takedown requests will be promptly acknowledged and investigated.

    Virginia Tech | University Libraries | Contact Us
     

     
