Improving Network Performance and Document Dissemination by Enhancing Cache Consistency on the Web Using Proxy and Server Negotiation

TR Number

Date

2005-08-08

Journal Title

Journal ISSN

Volume Title

Publisher

Virginia Tech

Abstract

Use of proxy caches in the World Wide Web is beneficial to the end user, network administrator, and server administrator since it reduces the amount of redundant traffic that circulates through the network. In addition, end users get quicker access to documents that are cached. However, the use of proxies introduces additional issues that need to be addressed. In particular, there is growing concern over how to maintain cache consistency and coherency among cached versions of documents.

The existing consistency protocols used in the Web are proving to be insufficient to meet the growing needs of the Internet population. For example, too many messages sent over the network are due to caches guessing when their copy is inconsistent. One option is to apply the cache coherence strategies already in use for many other distributed systems, such as parallel computers. However, these methods are not satisfactory for the World Wide Web due to its larger size and more diverse access patterns.

Many decisions must be made when exploring World Wide Web coherency, such as whether to provide consistency at the proxy level (client pull) or to allow the server to handle it (server push). What trade offs are inherent for each of these decisions? The relevant usage of any method strongly depends upon the conditions of the network (e.g., document types that are frequently requested or the state of the network load) and the resources available (e.g., disk space and type of cache available). Version 1.1 of HTTP is the first protocol version to give explicit rules for consistency on the Web. Many proposed algorithms require changes to HTTP/1.1. However, this is not necessary to provide a suitable solution.

One goal of this dissertation is to study the characteristics of document retrieval and modification to determine their effect on proposed consistency mechanisms. A set of effective consistency policies is identified from the investigation. The main objective of this dissertation is to use these findings to design and implement a consistency algorithm that provides improved performance over the current mechanisms proposed in the literature. Optimistically, we want an algorithm that provides strong consistency. However, we do not want to further degrade the network or cause undue burden on the server to gain this advantage. We propose a system based on the notion of soft-state and based on server push. In this system, the proxy would have some influence on what state information is maintained at the server (spatial consideration) as well as how long to maintain the information (temporal consideration). We perform a benchmark study of the performance of the new algorithm in comparison with existing proposed algorithms. Our results show that the Synchronous Nodes for Consistency (SINC) framework provides an average of 20% control message savings by limiting how much polling occurs with the current Web cache consistency mechanism, Adaptive Client Polling. In addition, the algorithm shows 30% savings on state space overhead at the server by limiting the amount of per-proxy and per-document state information required at the server.

Description

Keywords

Caching, Server, Consistency, Coherence, Proxy

Citation