Big Data Processing in the Cloud: a Hydra/Sufia Experience
Presentation video available at https://connectpro.helsinki.fi/p1txjdy74ts/
This presentation addresses the challenge of processing big data in a cloud-based data repository. Using the Hydra Project’s Hydra and Sufia ...
Evaluating Cost of Cloud Execution in a Data Repository
In this paper, we utilize a set of controlled experiments to benchmark the cost associated with the cloud execution of typical repository functions such as ingestion, fixity checking, and heavy data processing. We focus ...
The Institutional Repository's Role in Preserving Research Data
(Virginia Tech, 2012-07-25)
In recent years, many funding agencies have started to require long-term preservation of and open access to research data. While most research universities already run their own institutional repositories (IR), it's not ...
Facilitate Cross-Repository Big Data Discovery and Reuse
(Virginia Tech, 2013-03-13)
Researchers have accumulated large amounts of observational, experimental, and simulation data. Much effort has been made to collect, curate, preserve, and provide open access to these data, but putting the data online is only ...
VTechData: An Institutional Data Repository
We introduce VTechData, a Sufia/Fedora based institutional repository specifically implemented to meet the needs of research data management at Virginia Tech. Despite the rapid maturity of Hydra and Fedora code bases, the ...
Are Repositories Impeding Big Data Reuse?
(Virginia Tech, 2016-06-14)
In this intentionally provocative presentation, we question the scalability of popular digital repositories and whether they are suitable for big data reuse. Are the layers of APIs these repositories have painted over file ...