Browsing by Author "Abdulla, Ghaleb"
Now showing 1 - 10 of 10
- Analysis and Modeling of World Wide Web Traffic. Abdulla, Ghaleb (Virginia Tech, 1998-04-27). This dissertation deals with monitoring, collecting, analyzing, and modeling World Wide Web (WWW) traffic and client interactions. The rapid growth of WWW usage has not been accompanied by an overall understanding of models of information resources and their deployment strategies. Consequently, the current Web architecture often faces performance and reliability problems. Scalability, latency, bandwidth, and disconnected operations are some of the important issues to consider when adjusting for the growth in Web usage. The WWW Consortium launched an effort to design a new protocol able to support future demands. Before doing that, however, we need to characterize current users' interactions with the WWW and understand how it is being used. We focus on proxies since they provide a good medium for caching, filtering information, payment methods, and copyright management. We collected proxy data from our environment over a period of more than two years. We also collected data from other sources such as schools, information service providers, and commercial sites. Sampling times range from days to years. We analyzed the collected data, looking for important characteristics that can help in designing a better HTTP protocol. We developed a modeling approach that considers Web traffic characteristics such as self-similarity and long-range dependence. We developed an algorithm to characterize users' sessions. Finally, we developed a high-level Web traffic model suitable for sensitivity analysis. As a result of this work, we developed statistical models of parameters such as arrival times, file sizes, file types, and locality of reference. We describe an approach to modeling long-range dependent Web traffic, and we characterize activities of users accessing a digital library courseware server or Web search tools. Temporal and spatial locality of reference within the examined user communities is high, so caching can be an effective tool to help reduce network traffic and to help solve the scalability problem. We recommend utilizing our findings to promote a smart distribution or push model that caches documents when repeat accesses are likely.
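One way to illustrate the self-similarity and long-range dependence claims above is to estimate a Hurst exponent from an hourly request-count series; for a long-range dependent series the exponent exceeds 0.5. The sketch below uses the aggregated-variance method on a synthetic series; the function name, block sizes, and data are illustrative assumptions, not code or data from the dissertation.

```python
import numpy as np

def hurst_aggregated_variance(series, block_sizes):
    """Estimate the Hurst exponent H via the aggregated-variance method.

    For a long-range dependent series, the variance of block means decays
    as m^(2H - 2) with block size m, so H follows from a log-log fit.
    """
    log_m, log_var = [], []
    for m in block_sizes:
        n_blocks = len(series) // m
        if n_blocks < 2:
            continue
        block_means = series[: n_blocks * m].reshape(n_blocks, m).mean(axis=1)
        log_m.append(np.log(m))
        log_var.append(np.log(block_means.var()))
    slope, _ = np.polyfit(log_m, log_var, 1)
    return 1.0 + slope / 2.0

# Synthetic hourly request counts; a real proxy trace would replace this.
counts = np.random.default_rng(0).poisson(100, size=4096).astype(float)
print(hurst_aggregated_variance(counts, [2, 4, 8, 16, 32, 64, 128]))
# Independent counts give H near 0.5; the dissertation reports
# long-range dependence, i.e. H above 0.5, on real traces.
```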
- Caching Proxies: Limitations and Potentials. Abrams, Marc; Standridge, Charles R.; Abdulla, Ghaleb; Williams, Stephen; Fox, Edward A. (Department of Computer Science, Virginia Polytechnic Institute & State University, 1995-07-01). As the number of World-Wide Web users grows, so does the number of connections made to servers. This increases both network load and server load. Caching can reduce both loads by migrating copies of server files closer to the clients that use those files. Caching can be done either at a client or in the network (by a proxy server or gateway). We assess the potential of proxy servers to cache documents retrieved with the HTTP protocol. We monitored traffic corresponding to three types of educational workloads over a one-semester period and used it as input to a cache simulation. Our main findings are (1) that with our workloads a proxy has a 30-50% maximum possible hit rate no matter how it is designed; (2) that when the cache is full and a document is replaced, least recently used (LRU) is a poor policy, but simple variations can dramatically improve hit rate and reduce cache size; (3) that a proxy server really functions as a second-level cache, and its hit rate may tend to decline with time after initial loading given a more or less constant set of users; and (4) that certain tuning configuration parameters for a cache may have little benefit.
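The trace-driven cache simulation described above can be approximated in a few lines. The sketch below replays (URL, size) requests through a byte-bounded LRU cache and reports the hit rate; the trace format, capacity, and sample requests are hypothetical, and this is not the simulator used in the paper.

```python
from collections import OrderedDict

def simulate_lru(requests, capacity_bytes):
    """Replay (url, size) requests through a byte-limited LRU cache
    and return the hit rate. Trace format is a hypothetical example."""
    cache = OrderedDict()   # url -> size, least recently used first
    used = hits = 0
    for url, size in requests:
        if url in cache:
            hits += 1
            cache.move_to_end(url)          # mark as most recently used
            continue
        while cache and used + size > capacity_bytes:
            _, evicted_size = cache.popitem(last=False)  # evict LRU entry
            used -= evicted_size
        if size <= capacity_bytes:          # skip objects larger than the cache
            cache[url] = size
            used += size
    return hits / len(requests)

trace = [("/a.html", 2000), ("/b.gif", 5000), ("/a.html", 2000),
         ("/c.jpg", 9000), ("/a.html", 2000)]
print(simulate_lru(trace, capacity_bytes=10_000))   # 0.2 on this toy trace
```

Finding (2) could be explored by swapping the eviction line for a size- or type-aware variant and comparing hit rates on the same trace.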
- Characterizing World Wide Web Queries. Abdulla, Ghaleb; Liu, Binzhang; Saad, Rani A.; Fox, Edward A. (Department of Computer Science, Virginia Polytechnic Institute & State University, 1997-02-01). Locating information on the WWW is a major activity for users, and Web Information Retrieval Systems (IRS) are becoming more important in supporting their endeavors. In this paper we characterize queries performed by Web users to such systems and give distributions for accesses to different Web IRS. We characterize clients' accesses, queries, and user sessions. Our purpose is to reduce network traffic and bandwidth usage by identifying ways to optimize interactions with the Web. We characterize clients' sessions as a sequence of Browsing, Searching, and Next steps, and demonstrate that more search steps correlate with a reduction in the number of bytes transferred.
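In the spirit of the Browsing/Searching/Next session characterization above, a proxy log can be labeled with simple URL heuristics. The rules below (a paging parameter marks a Next step, a query string marks a Searching step) are our own illustrative assumptions, not the paper's classifier.

```python
from urllib.parse import urlparse, parse_qs

def classify_step(url):
    """Heuristically label a request as a Next, Searching, or Browsing step."""
    params = parse_qs(urlparse(url).query)
    if "start" in params or "page" in params:
        return "Next"        # paging through an earlier result list
    if "q" in params or "query" in params:
        return "Searching"   # a fresh query string suggests a search
    return "Browsing"

session = [
    "http://search.example.com/?query=www+traffic",
    "http://search.example.com/?query=www+traffic&start=10",
    "http://www.example.edu/paper.html",
]
print([classify_step(u) for u in session])
# ['Searching', 'Next', 'Browsing']
```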
- An image processing tool for cropping and enhancing images. Abdulla, Ghaleb (Virginia Tech, 1993-12-07). An educational system called GeoSim is being developed at Virginia Tech; its purpose is to simulate processes related to several geographical subjects. The software consists of six different modules; one of these modules is designed to simulate a field trip for orienteering and position finding. This module uses a database of captured images to pan and zoom from one location to another. However, the original images have overlapping areas, which prevent simulating a continuous panoramic view. To fix this problem, a cropping tool was designed and implemented with Intel DVI ActionMedia boards to support the orienteering module of project GeoSim. The tool allows cropping of overlapped areas in the images. In addition, the tool allows the user to minimize differences in intensity and color between neighboring images. The cropping, color, and intensity values obtained from manipulating the images are saved in an ASCII file, from which they can be read and used in the orienteering module. The images used are captured from a videodisc and stored on the hard disk at 512x480 resolution in a 16-bits-per-pixel DVI compressed format.
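The workflow the abstract describes (crop the overlapped region, adjust intensity, and record the chosen values in an ASCII file) can be approximated today with Pillow; the file names, crop box, and brightness factor below are hypothetical, and this is a modern sketch rather than the original DVI ActionMedia tool.

```python
from PIL import Image, ImageEnhance

def crop_and_adjust(src, dst, box, brightness, params_file):
    """Crop an overlapping region, adjust intensity, and log the values
    to an ASCII file, loosely mirroring the tool's recorded output."""
    image = Image.open(src)
    cropped = image.crop(box)   # box = (left, top, right, bottom)
    adjusted = ImageEnhance.Brightness(cropped).enhance(brightness)
    adjusted.save(dst)
    with open(params_file, "a") as f:
        f.write(f"{src} crop={box} brightness={brightness}\n")

# Hypothetical 512x480 frame with a 40-pixel overlap on its right edge.
crop_and_adjust("frame01.png", "frame01_cropped.png",
                box=(0, 0, 472, 480), brightness=1.05,
                params_file="crop_params.txt")
```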
- Modeling Correlated Proxy Web Traffic Using Fourier Analysis. Abdulla, Ghaleb; Nayfeh, Ali H.; Fox, Edward A. (Department of Computer Science, Virginia Polytechnic Institute & State University, 1997-11-01). We analyze the arrival rate of accesses to Web proxy caching servers. The results show that the data display strong periodic autocorrelation. The examined data sets show consistent behavior in terms of having periods corresponding to daily and weekly cycles, which can be explained by the daily and weekly cyclic behavior of Web users. While these results confirm the correlation in network traffic noticed by other researchers, we emphasize that this correlation is periodic. A new approach is introduced to model data that exhibit such characteristics by a combination of Fourier and statistical analysis techniques. The source of high correlation in the data is shown to come from the periodic, and hence deterministic, part. Synthesized data resulting from this modeling approach are shown to exhibit long-range dependent and self-similar behavior.
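The Fourier half of the modeling approach can be sketched as follows: detrend the hourly arrival-rate series, take its magnitude spectrum, and look for peaks at the daily (24-hour) and weekly (168-hour) periods. The synthetic series and function below are illustrative, not the paper's analysis code.

```python
import numpy as np

def dominant_periods(hourly_counts, top_k=3):
    """Return the strongest periods (in hours) of an hourly arrival-rate
    series, found from the FFT magnitude spectrum after mean removal."""
    x = np.asarray(hourly_counts, dtype=float)
    x = x - x.mean()                            # drop the DC component
    spectrum = np.abs(np.fft.rfft(x))
    freqs = np.fft.rfftfreq(len(x), d=1.0)      # cycles per hour
    order = np.argsort(spectrum[1:])[::-1] + 1  # rank peaks, skip freq 0
    return [1.0 / freqs[i] for i in order[:top_k]]

# Synthetic trace: daily and weekly cycles plus noise over eight weeks.
t = np.arange(8 * 168)
rate = (100 + 40 * np.sin(2 * np.pi * t / 24)
            + 20 * np.sin(2 * np.pi * t / 168)
            + np.random.default_rng(1).normal(0, 5, t.size))
print(dominant_periods(rate))   # expect periods near 24 and 168 hours
```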
- Multimedia Traffic Analysis Using CHITRA95. Abrams, Marc; Williams, Stephen; Abdulla, Ghaleb; Patel, Shashin; Ribler, Randy; Fox, Edward A. (Department of Computer Science, Virginia Polytechnic Institute & State University, 1995-04-01). We describe how to investigate collections of trace data representing network delivery of multimedia information with CHITRA95, a tool that allows a user to visualize, query, statistically analyze and test, transform, and model collections of trace data. CHITRA95 is applied to characterize World Wide Web (WWW) traffic from three workloads: students in a classroom of network-connected workstations, graduate students browsing the Web, and undergraduates browsing educational and other materials, as well as traffic on a courseware repository server. We explore the inter-access time of files on a server (i.e., recency), the hit rate from a proxy server cache, and the distributions of file sizes and media types requested. The traffic study also yields statistics on the effectiveness of caching in improving transfer rates. In contrast to past WWW traffic studies, we analyze client as well as server traffic; we compare three workloads rather than drawing conclusions from one workload; and we analyze tcpdump logs to calculate the performance improvement in throughput that an end user sees due to caching.
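One statistic mentioned above, the inter-access time of files (recency), is straightforward to compute from a timestamped trace. The sketch below assumes a simple (timestamp, url) record format, not CHITRA95's own event format.

```python
from collections import defaultdict

def inter_access_times(trace):
    """Per-URL inter-access times (recency) from (timestamp_seconds, url)
    records; the record format is an assumption for illustration."""
    last_seen = {}
    gaps = defaultdict(list)
    for ts, url in trace:
        if url in last_seen:
            gaps[url].append(ts - last_seen[url])
        last_seen[url] = ts
    return gaps

trace = [(0, "/index.html"), (30, "/logo.gif"), (45, "/index.html"),
         (400, "/index.html"), (420, "/logo.gif")]
print(dict(inter_access_times(trace)))
# {'/index.html': [45, 355], '/logo.gif': [390]}
```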
- NMFS: Network Multimedia File System Protocol. Patel, Sameer H.; Abdulla, Ghaleb; Abrams, Marc; Fox, Edward A. (Department of Computer Science, Virginia Polytechnic Institute & State University, 1992). We describe an on-going project to develop a Network Multimedia File System (NMFS) protocol. The protocol allows "transparent access of shared files across networks" as Sun's NFS protocol does, but attempts to meet a real-time delivery schedule. NMFS is designed to provide ubiquitous service over networks both designed and not designed to carry multimedia traffic.
- Scaling the World-Wide Web. Abdulla, Ghaleb; Abrams, Marc; Fox, Edward A. (Department of Computer Science, Virginia Polytechnic Institute & State University, 1996-03-01)
- Web Response Time and Proxy Caching. Liu, Binzhang; Abdulla, Ghaleb; Johnson, Tommy; Fox, Edward A. (Department of Computer Science, Virginia Polytechnic Institute & State University, 1998-03-01). It is critical to understand WWW latency in order to design better HTTP protocols. In this paper we characterize Web response time and examine the effects of proxy caching on response time. We show that at least a quarter of the total elapsed time is spent setting up TCP connections. We also characterize the effect of a user's network bandwidth on response time: the average connection time from a client on a 33.6K modem is twice as long as that from a client on switched Ethernet. Contrary to the typical view of Web proxy caching, this study finds that a single stand-alone proxy cache does not always reduce response time. Implications of these results for the HTTP-NG protocol and Web application design are also discussed in the paper.
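The finding that connection setup accounts for a large share of latency can be checked with a small timing probe that separates the TCP connect from the full HTTP exchange. The host and path below are placeholders, and this sketch uses HTTP/1.0 so the server closes the connection when the response ends.

```python
import socket
import time

def time_request(host, path="/", port=80):
    """Time TCP connection setup separately from the complete
    HTTP/1.0 request-response exchange."""
    start = time.perf_counter()
    sock = socket.create_connection((host, port), timeout=10)
    t_connect = time.perf_counter() - start
    try:
        sock.sendall(f"GET {path} HTTP/1.0\r\nHost: {host}\r\n\r\n".encode())
        while sock.recv(4096):   # drain until the server closes
            pass
    finally:
        sock.close()
    return t_connect, time.perf_counter() - start

connect, total = time_request("example.com")
print(f"connect {connect:.3f}s of {total:.3f}s total "
      f"({100 * connect / total:.0f}%)")
```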
- WWW Proxy Traffic Characterization with Application to Caching. Abdulla, Ghaleb; Fox, Edward A.; Abrams, Marc; Williams, Stephen (Department of Computer Science, Virginia Polytechnic Institute & State University, 1997-02-01). Characterizing World Wide Web proxy traffic helps identify parameters that affect caching, capacity planning, and simulation studies. In this paper we identify invariants that hold across a collection of ten traces representing traffic seen by caching-proxy servers. The traces were collected from governmental, industry, university, high school, and online service provider environments, with request rates that range from a few accesses to millions of accesses per hour. We also show that the examined traffic is self-similar. We explore sources of Web self-similarity and conclude that a strong source is the periodicity in users' behavior; the tests revealed a strong correlation between access rates from hour to hour. We also report the hit rate and weighted hit rate obtained by running a trace-driven simulation on the workloads to simulate a proxy with an infinite cache; similarly, accesses to unique servers and URLs are a small portion of the total. By considering these characteristics of the traffic, we can improve the utility of caching for WWW clients.
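The infinite-cache simulation mentioned above amounts to counting repeat references: every re-access to a previously seen URL is a hit. The sketch below computes the hit rate and the weighted (byte) hit rate from a hypothetical (url, size) trace.

```python
def infinite_cache_hit_rates(requests):
    """Hit rate and weighted (byte) hit rate under an infinite cache,
    where every repeat reference is a hit. Trace format is assumed."""
    seen = set()
    hits = hit_bytes = total_bytes = 0
    for url, size in requests:
        total_bytes += size
        if url in seen:
            hits += 1
            hit_bytes += size
        else:
            seen.add(url)
    return hits / len(requests), hit_bytes / total_bytes

trace = [("/a", 2000), ("/b", 50000), ("/a", 2000), ("/a", 2000)]
print(infinite_cache_hit_rates(trace))   # (0.5, ~0.07) on this toy trace
```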