Analysis and Modeling of World Wide Web Traffic
dc.contributor.author | Abdulla, Ghaleb | en |
dc.contributor.committeechair | Fox, Edward A. | en |
dc.contributor.committeemember | Kafura, Dennis G. | en |
dc.contributor.committeemember | Balci, Osman | en |
dc.contributor.committeemember | Abrams, Marc | en |
dc.contributor.committeemember | Nayfeh, Ali H. | en |
dc.contributor.department | Computer Science | en |
dc.date.accessioned | 2014-03-14T20:21:49Z | en |
dc.date.adate | 1998-04-30 | en |
dc.date.available | 2014-03-14T20:21:49Z | en |
dc.date.issued | 1998-04-27 | en |
dc.date.rdate | 1999-04-30 | en |
dc.date.sdate | 1998-04-27 | en |
dc.description.abstract | This dissertation deals with monitoring, collecting, analyzing, and modeling of World Wide Web (WWW) traffic and client interactions. The rapid growth of WWW usage has not been accompanied by an overall understanding of models of information resources and their deployment strategies. Consequently, the current Web architecture often faces performance and reliability problems. Scalability, latency, bandwidth, and disconnected operations are some of the important issues that should be considered when attempting to adjust for the growth in Web usage. The WWW Consortium launched an effort to design a new protocol that will be able to support future demands. Before doing that, however, we need to characterize current users' interactions with the WWW and understand how it is being used. We focus on proxies since they provide a good medium for caching, filtering information, payment methods, and copyright management. We collected proxy data from our environment over a period of more than two years. We also collected data from other sources such as schools, information service providers, and commercial sites. Sampling times range from days to years. We analyzed the collected data looking for important characteristics that can help in designing a better HTTP protocol. We developed a modeling approach that considers Web traffic characteristics such as self-similarity and long-range dependency. We developed an algorithm to characterize users' sessions. Finally, we developed a high-level Web traffic model suitable for sensitivity analysis. As a result of this work we developed statistical models of parameters such as arrival times, file sizes, file types, and locality of reference. We describe an approach to model long-range dependent Web traffic, and we characterize the activities of users accessing a digital library courseware server or Web search tools. Temporal and spatial locality of reference within the examined user communities is high, so caching can be an effective tool to help reduce network traffic and to help solve the scalability problem. We recommend utilizing our findings to promote a smart distribution or push model to cache documents when there is a likelihood of repeat accesses. | en |
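The abstract's claim that high temporal locality of reference makes proxy caching effective can be illustrated with a small cache-replay sketch. The following is a minimal, hypothetical example in Python (the `lru_hit_ratio` function and the sample trace are illustrative assumptions, not taken from the dissertation): it replays a sequence of requested URLs through a fixed-size LRU cache and reports the hit ratio, the kind of measure used when evaluating caching on proxy access logs.

```python
from collections import OrderedDict

def lru_hit_ratio(requests, capacity):
    """Replay requested URLs through an LRU cache of the given capacity
    and return the fraction of requests served from the cache."""
    cache = OrderedDict()
    hits = 0
    for url in requests:
        if url in cache:
            hits += 1
            cache.move_to_end(url)          # refresh recency on a hit
        else:
            cache[url] = True
            if len(cache) > capacity:
                cache.popitem(last=False)   # evict the least recently used entry
    return hits / len(requests) if requests else 0.0

# Hypothetical proxy-log trace: repeated accesses raise the hit ratio.
trace = ["/index.html", "/img/logo.gif", "/index.html",
         "/cs/home.html", "/index.html", "/img/logo.gif"]
print(lru_hit_ratio(trace, capacity=2))
```

A workload with strong temporal locality yields a high hit ratio even with a small cache, which is the effect the abstract points to when recommending caching and push-based distribution to reduce network traffic.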
dc.description.degree | Ph. D. | en |
dc.identifier.other | etd-33098-142912 | en |
dc.identifier.sourceurl | http://scholar.lib.vt.edu/theses/available/etd-33098-142912/ | en |
dc.identifier.uri | http://hdl.handle.net/10919/30470 | en |
dc.publisher | Virginia Tech | en |
dc.relation.haspart | thesis.pdf | en |
dc.rights | In Copyright | en |
dc.rights.uri | http://rightsstatements.org/vocab/InC/1.0/ | en |
dc.subject | Time Series | en |
dc.subject | Modeling | en |
dc.subject | Scalability | en |
dc.subject | World Wide Web | en |
dc.subject | Log analysis | en |
dc.subject | Caching | en |
dc.subject | Proxy | en |
dc.title | Analysis and Modeling of World Wide Web Traffic | en |
dc.type | Dissertation | en |
thesis.degree.discipline | Computer Science | en |
thesis.degree.grantor | Virginia Polytechnic Institute and State University | en |
thesis.degree.level | doctoral | en |
thesis.degree.name | Ph. D. | en |