Web Archiving Inconsistency: A Research Agenda

Date

2015-10

Publisher

IEEE Technical Committee on Digital Libraries

Abstract

Scaling web applications usually boils down to a tradeoff between consistency and latency. Very large web operations typically favor low latency and hence purposefully sacrifice strict consistency in the sense of serializability. By definition, the breakdown of serializability may cause web applications to disseminate, albeit ephemerally, inaccurate and even contradictory information. If captured and preserved in web archives as historical records, such information will degrade the overall archival quality. Despite its near omnipresence on the popular web, such relaxation of data consistency has been neither widely reported nor thoroughly studied by the web archiving community.

Keywords

Web archiving, Inconsistency
