Performance Evaluation of Web Archiving Through In-Memory Page Cache
This study proposes and evaluates a new method for Web archiving that leverages the caching
infrastructure in Web servers. Redis serves as the page cache, and its persistence mechanism is
exploited for archiving. We experimentally evaluate the performance
of our archival technique using the Greek version of Wikipedia deployed on Amazon cloud
infrastructure. Archiving slightly increases the latency of rendered pages. Server performance
is comparable at larger page cache sizes, but at smaller cache sizes the maximum throughput the
server can handle decreases significantly because archiving causes more disk write
operations. Since pages are dynamically rendered
and the technology stack of Wikipedia is extensively used in a number of Web applications,
our results should have broad impact.
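
The relationship between cache size and archival disk writes can be illustrated with a toy model. The sketch below is not the study's implementation and does not use Redis; it assumes a simple LRU page cache that appends every newly rendered page to an append-only log, which is only loosely analogous to a persistence mechanism. The class `ArchivingPageCache`, the helper `count_archive_writes`, the renderer, and the file name `archive.jsonl` are all hypothetical names introduced for illustration.

```python
import json
import os
import tempfile
from collections import OrderedDict


class ArchivingPageCache:
    """Toy LRU page cache that appends every newly rendered page to an archive log."""

    def __init__(self, capacity, archive_path, render):
        self.capacity = capacity
        self.archive_path = archive_path
        self.render = render            # hypothetical renderer: url -> html
        self.pages = OrderedDict()      # url -> html, least recently used first
        self.archive_writes = 0

    def get(self, url):
        if url in self.pages:
            self.pages.move_to_end(url)  # cache hit: mark as most recently used
            return self.pages[url]
        html = self.render(url)          # cache miss: render the page
        self._archive(url, html)
        self.pages[url] = html
        if len(self.pages) > self.capacity:
            self.pages.popitem(last=False)  # evict the least recently used page
        return html

    def _archive(self, url, html):
        # Append one JSON record per rendered page; the log is never rewritten.
        with open(self.archive_path, "a", encoding="utf-8") as log:
            log.write(json.dumps({"url": url, "html": html}) + "\n")
        self.archive_writes += 1


def count_archive_writes(capacity, requests):
    """Replay `requests` against a cache of the given capacity; return disk writes."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "archive.jsonl")
        cache = ArchivingPageCache(capacity, path, lambda url: f"<html>{url}</html>")
        for url in requests:
            cache.get(url)
        return cache.archive_writes
```

Under these assumptions, replaying the same request sequence against a smaller cache produces more misses, and therefore more renders and more archive writes. This is consistent in spirit with the throughput drop reported at lower cache sizes, although the model says nothing about the actual magnitudes measured in the study.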