Are Repositories Impeding Big Data Reuse?

Date
2016-06-14
Author
Xie, Zhiwu
Galad, Andrej
Chen, Yinlin
Fox, Edward A.
Abstract
In this intentionally provocative presentation, we question the scalability of popular digital repositories and whether they are suitable for big data reuse. Are the layers of APIs that these repositories have painted over file system primitives necessary? How essential is it for the repository to insist on being the sole manager of the content, and to arrange files in ways that prevent access other than through its own APIs? We explore these questions from the perspective of big data reuse and describe controlled reuse experiments against Fedora 4 that evaluate the cost of these practices.