Now showing items 1-5 of 5

    • Are Repositories Impeding Big Data Reuse? 

      Xie, Zhiwu; Galad, Andrej; Chen, Yinlin; Fox, Edward A. (Virginia Tech, 2016-06-14)
      In this intentionally provocative presentation, we question the scalability of popular digital repositories and whether they are suitable for big data reuse. Are the layers of API these repositories have painted over file ...
    • Evaluating Cost of Cloud Execution in a Data Repository 

      Xie, Zhiwu; Chen, Yinlin; Speer, Julie; Walters, Tyler (ACM, 2016-06)
      In this paper, we utilize a set of controlled experiments to benchmark the cost associated with the cloud execution of typical repository functions such as ingestion, fixity checking, and heavy data processing. We focus ...
    • On-Demand Big Data Analysis in Digital Repositories 

      Xie, Zhiwu; Chen, Yinlin; Jiang, Tingting; Speer, Julie; Walters, Tyler; Tarazaga, Pablo A.; Kasarda, Mary (Springer International Publishing, 2015-12-18)
      We describe a use and reuse driven digital repository integrated with lightweight data analysis capabilities provided by the Docker framework. Using building sensor data collected from the Virginia Tech Goodwin Hall Living ...
    • Towards Use And Reuse Driven Big Data Management 

      Xie, Zhiwu; Chen, Yinlin; Speer, Julie; Walters, Tyler; Tarazaga, Pablo A.; Kasarda, Mary (2015-06-03)
      We propose a use and reuse driven big data management approach that fuses the data repository and data processing capabilities in a co-located, public cloud. It answers to the urgent data management needs from the growing ...
    • VTechData: An Institutional Data Repository 

      Xie, Zhiwu; Speer, Julie; Chen, Yinlin; Jiang, Tingting; Brittle, Collin; Mather, Paul (2016-06-14)
      We introduce VTechData, a Sufia/Fedora based institutional repository specifically implemented to meet the needs of research data management at Virginia Tech. Despite the rapid maturity of Hydra and Fedora code bases, the ...