Analysis and Modeling of World Wide Web Traffic

dc.contributor.author: Abdulla, Ghaleb
dc.contributor.committeechair: Fox, Edward A.
dc.contributor.committeemember: Kafura, Dennis G.
dc.contributor.committeemember: Balci, Osman
dc.contributor.committeemember: Abrams, Marc
dc.contributor.committeemember: Nayfeh, Ali H.
dc.contributor.department: Computer Science
dc.date.accessioned: 2014-03-14T20:21:49Z
dc.date.adate: 1998-04-30
dc.date.available: 2014-03-14T20:21:49Z
dc.date.issued: 1998-04-27
dc.date.rdate: 1999-04-30
dc.date.sdate: 1998-04-27
dc.description.abstract: This dissertation deals with monitoring, collecting, analyzing, and modeling of World Wide Web (WWW) traffic and client interactions. The rapid growth of WWW usage has not been accompanied by an overall understanding of models of information resources and their deployment strategies. Consequently, the current Web architecture often faces performance and reliability problems. Scalability, latency, bandwidth, and disconnected operations are some of the important issues that should be considered when attempting to adjust for the growth in Web usage. The WWW Consortium launched an effort to design a new protocol that will be able to support future demands. Before doing that, however, we need to characterize current users' interactions with the WWW and understand how it is being used. We focus on proxies since they provide a good medium for caching, filtering information, payment methods, and copyright management. We collected proxy data from our environment over a period of more than two years. We also collected data from other sources such as schools, information service providers, and commercial sites. Sampling times range from days to years. We analyzed the collected data looking for important characteristics that can help in designing a better HTTP protocol. We developed a modeling approach that considers Web traffic characteristics such as self-similarity and long-range dependency. We developed an algorithm to characterize users' sessions. Finally we developed a high-level Web traffic model suitable for sensitivity analysis. As a result of this work we develop statistical models of parameters such as arrival times, file sizes, file types, and locality of reference. We describe an approach to model long-range dependent Web traffic and we characterize activities of users accessing a digital library courseware server or Web search tools. Temporal and spatial locality of reference within examined user communities is high, so caching can be an effective tool to help reduce network traffic and to help solve the scalability problem. We recommend utilizing our findings to promote a smart distribution or push model to cache documents when there is likelihood of repeat accesses.
dc.description.degree: Ph. D.
dc.identifier.other: etd-33098-142912
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-33098-142912/
dc.identifier.uri: http://hdl.handle.net/10919/30470
dc.publisher: Virginia Tech
dc.relation.haspart: thesis.pdf
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Time Series
dc.subject: Modeling
dc.subject: Scalability
dc.subject: World Wide Web
dc.subject: Log analysis
dc.subject: Caching
dc.subject: Proxy
dc.title: Analysis and Modeling of World Wide Web Traffic
dc.type: Dissertation
thesis.degree.discipline: Computer Science
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: doctoral
thesis.degree.name: Ph. D.
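
The abstract above refers to modeling Web traffic characteristics such as self-similarity and long-range dependency. As a rough illustration only (not code or results from the dissertation), the short Python sketch below estimates the Hurst parameter H of a per-interval request-count series using the aggregated-variance method, one common estimator for this kind of analysis; values of H above 0.5 are usually read as evidence of long-range dependence. The function name, the default block sizes, and the `counts` input (request counts per fixed time interval, e.g. derived from proxy log timestamps) are all illustrative assumptions.

import math

def hurst_aggregated_variance(counts, block_sizes=(1, 2, 4, 8, 16, 32, 64)):
    """Estimate H from how sample variance scales under aggregation.

    For a self-similar process, Var(X^(m)) ~ m^(2H - 2), so the slope of
    log(variance) versus log(m) is 2H - 2.
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(counts) // m
        if n_blocks < 2:
            continue
        # Mean of each non-overlapping block of length m.
        blocks = [sum(counts[i * m:(i + 1) * m]) / m for i in range(n_blocks)]
        mean = sum(blocks) / n_blocks
        var = sum((b - mean) ** 2 for b in blocks) / (n_blocks - 1)
        if var > 0:
            log_m.append(math.log(m))
            log_var.append(math.log(var))
    # Least-squares slope of log(variance) on log(block size).
    n = len(log_m)
    mx = sum(log_m) / n
    my = sum(log_var) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(log_m, log_var)) / \
            sum((x - mx) ** 2 for x in log_m)
    return 1.0 + slope / 2.0

# Usage with hypothetical per-second request counts:
#   h = hurst_aggregated_variance(counts)
#   print("Estimated Hurst parameter:", h)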

Files

Original bundle
Name: thesis.pdf
Size: 2.5 MB
Format: Adobe Portable Document Format