Performance Evaluation of Web Archiving Through In-Memory Page Cache

dc.contributor.author: Vishwasrao, Saket Dilip [en]
dc.contributor.committeechair: Fox, Edward A. [en]
dc.contributor.committeemember: Hou, Yiwei Thomas [en]
dc.contributor.committeemember: Xie, Zhiwu [en]
dc.contributor.department: Electrical and Computer Engineering [en]
dc.date.accessioned: 2017-06-24T08:00:59Z [en]
dc.date.available: 2017-06-24T08:00:59Z [en]
dc.date.issued: 2017-06-23 [en]
dc.description.abstract: This study proposes and evaluates a new method for Web archiving. We leverage the caching infrastructure in Web servers for archiving. Redis is used as the page cache and its persistence mechanism is exploited for archiving. We experimentally evaluate the performance of our archival technique using the Greek version of Wikipedia deployed on Amazon cloud infrastructure. We show that there is a slight increase in latencies of the rendered pages due to archiving. Though the server performance is comparable at larger page cache sizes, the maximum throughput the server can handle decreases significantly at lower cache sizes due to more disk write operations as a result of archiving. Since pages are dynamically rendered and the technology stack of Wikipedia is extensively used in a number of Web applications, our results should have broad impact. [en]
dc.description.abstractgeneral: This study proposes and evaluates a new method for Web archiving. To reduce response time for serving webpages, Web Servers store recently rendered pages in memory. This process is known as caching. We modify this caching mechanism of Web Servers for archival. We then experimentally evaluate the impact of our archival technique on Web Servers. We observe that the time to render a Web page increases slightly as long as the Web Server is under moderate load. Through our experiments, we establish limits on the maximum requests a Web Server can handle without increasing the response time. We ensure our experiments are conducted on Web Servers using technologies that are widely used today. Thus our results should have broad impact. [en]
dc.description.degree: Master of Science [en]
dc.format.medium: ETD [en]
dc.identifier.other: vt_gsexam:11101 [en]
dc.identifier.uri: http://hdl.handle.net/10919/78252 [en]
dc.publisher: Virginia Tech [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: Information Retrieval [en]
dc.subject: Transactional Web Archiving [en]
dc.subject: Caching [en]
dc.subject: Benchmarking [en]
dc.subject: Wikipedia [en]
dc.title: Performance Evaluation of Web Archiving Through In-Memory Page Cache [en]
dc.type: Thesis [en]
thesis.degree.discipline: Computer Engineering [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: Master of Science [en]

Files

Original bundle
Name: Vishwasrao_SD_T_2017.pdf
Size: 1.22 MB
Format: Adobe Portable Document Format

Name: Vishwasrao_SD_T_2017_support_2.zip
Size: 309.94 KB
Format:
Description: Supporting documents
