    • Are Repositories Impeding Big Data Reuse? 

      Xie, Zhiwu; Galad, Andrej; Chen, Yinlin; Fox, Edward A. (Virginia Tech, 2016-06-14)
      In this intentionally provocative presentation, we question the scalability of popular digital repositories and whether they are suitable for big data reuse. Are the layers of API these repositories have painted over file ...
    • Big Data Processing in the Cloud: a Hydra/Sufia Experience 

      Brittle, Collin; Xie, Zhiwu (2014-06-10)
      Presentation video available at https://connectpro.helsinki.fi/p1txjdy74ts/ This presentation addresses the challenge of processing big data in a cloud-based data repository. Using the Hydra Project’s Hydra and Sufia ...
    • Evaluating Cost of Cloud Execution in a Data Repository 

      Xie, Zhiwu; Chen, Yinlin; Speer, Julie; Walters, Tyler (ACM, 2016-06)
      In this paper, we utilize a set of controlled experiments to benchmark the cost associated with the cloud execution of typical repository functions such as ingestion, fixity checking, and heavy data processing. We focus ...
    • Facilitate Cross-Repository Big Data Discovery and Reuse 

      Xie, Zhiwu (Virginia Tech, 2013-03-13)
      Researchers have accumulated large amounts of observational, experimental, and simulation data. Much effort has been made to collect, curate, preserve, and provide open access to them, but putting the data online is only ...
    • The Institutional Repository's Role in Preserving Research Data 

      Xie, Zhiwu; McMillan, Gail; Walters, Tyler (Virginia Tech, 2012-07-25)
      In recent years, many funding agencies have started to require long-term preservation and open access to research data. While most research universities have already run their own institutional repositories (IR), it's not ...
    • VTechData: An Institutional Data Repository 

      Xie, Zhiwu; Speer, Julie; Chen, Yinlin; Jiang, Tingting; Brittle, Collin; Mather, Paul (2016-06-14)
      We introduce VTechData, a Sufia/Fedora based institutional repository specifically implemented to meet the needs of research data management at Virginia Tech. Despite the rapid maturity of Hydra and Fedora code bases, the ...