Research and Informatics Division, University Libraries
Browsing Research and Informatics Division, University Libraries by Author "Brittle, Collin"
- Big Data Processing in the Cloud: a Hydra/Sufia Experience
  Brittle, Collin; Xie, Zhiwu (2014-06-10)
  Presentation video available at https://connectpro.helsinki.fi/p1txjdy74ts/
  This presentation addresses the challenge of processing big data in a cloud-based data repository. Using the Hydra Project’s Hydra and Sufia Ruby gems and working with the Hydra community, we created a special repository for the project and set up background jobs. Our approach is to create the metadata with these jobs, which are distributed across multiple computing cores. This allows us to scale out our infrastructure as needed and decouples automatic metadata creation from the response times seen by the user. Although the metadata is not available immediately after ingestion, the object itself is. By distributing the jobs, we can compute complex properties without impacting the repository server. Hydra and Sufia gave us a head start by providing a simple self-deposit repository, complete with background job support via Redis and Resque.
- VTechData: An Institutional Data Repository
  Xie, Zhiwu; Griffin, Julie; Chen, Yinlin; Jiang, Tingting; Brittle, Collin; Mather, Paul (2016-06-14)
  We introduce VTechData, a Sufia/Fedora-based institutional repository specifically implemented to meet the needs of research data management at Virginia Tech. Despite the rapid maturation of the Hydra and Fedora code bases, the gaps between the released packages and a launched production-level service are still many and far from trivial. In this presentation we describe the strategy and efforts through which these gaps were filled and the lessons learned in the process of creating our first Hydra/Sufia-based repository.