Web Archiving Inconsistency: A Research Agenda

Date
2015-10
Publisher
IEEE Technical Committee on Digital Libraries
Abstract

Scaling web applications usually boils down to a tradeoff between consistency and latency. Very large web operations typically favor low latency and therefore purposefully sacrifice strict consistency in the sense of serializability. By definition, the breakdown of serializability may cause web applications to disseminate, albeit ephemerally, inaccurate and even contradictory information. If captured and preserved in web archives as historical records, such information will degrade the overall archival quality. Despite its near omnipresence on the popular web, this relaxation of data consistency has been neither widely reported nor thoroughly studied by the web archiving community.

Keywords
Web archiving, Inconsistency