VT Web Archive Project

dc.contributor.author: Rinaldi, Anthony [en]
dc.contributor.author: Mehta, Dev [en]
dc.date.accessioned: 2014-05-09T14:52:20Z [en]
dc.date.available: 2014-05-09T14:52:20Z [en]
dc.date.issued: 2014-05-09 [en]
dc.description: In addition to the report and presentation files, included in this repository is a Heritrix configuration file, 'Heritrix Configuration.xml'. This file contains a customized configuration for crawling the VT.edu domain. Support has been provided through: 1) Virginia Tech's Information Technology organization; 2) Qatar National Research Fund Project No. NPRP 4-029-1-007; 3) NSF IIS - 1319578: Integrated Digital Event Archiving and Library (IDEAL) [en]
dc.description.abstract: VTWebArchive is a project to archive, organize, and make publicly available historical versions of content hosted on vt.edu domains. The system combines several open source software packages into a public tool for searching and discovering historical versions of content hosted on Virginia Tech websites. These packages include Heritrix, a highly customizable web crawler; the Apache Tomcat web server; and the Wayback Machine front end. [en]
dc.description.sponsorship: Mohamed Magdy (mmagdy@vt.edu) [en]
dc.description.sponsorship: Tarek Kanan (tarekk@vt.edu) [en]
dc.description.sponsorship: Virginia Tech's Information Technology organization [en]
dc.description.sponsorship: Qatar National Research Fund Project No. NPRP 4-029-1-007 [en]
dc.description.sponsorship: NSF IIS - 1319578: Integrated Digital Event Archiving and Library (IDEAL) [en]
dc.identifier.uri: http://hdl.handle.net/10919/47935 [en]
dc.language.iso: en_US [en]
dc.rights: Creative Commons CC0 1.0 Universal Public Domain Dedication [en]
dc.rights.uri: http://creativecommons.org/publicdomain/zero/1.0/ [en]
dc.subject: Archive [en]
dc.subject: Internet archive [en]
dc.subject: Heritrix [en]
dc.subject: Wayback [en]
dc.subject: Crawl [en]
dc.subject: Crawler [en]
dc.subject: wayback machine [en]
dc.subject: WARC [en]
dc.subject: Website archive [en]
dc.subject: vt.edu [en]
dc.subject: IDEAL [en]
dc.subject: Qatar [en]
dc.title: VT Web Archive Project [en]
dc.type: Presentation [en]

Files

Original bundle
Now showing 1 - 5 of 7
Name: VTWebArchiving - Final Report.docx
Size: 433.72 KB
Format: Microsoft Word XML
Description: Project Report (Word)
Name: VTWebArchiving - Final Report.pdf
Size: 297.44 KB
Format: Adobe Portable Document Format
Description: Project Report (PDF)
Name: VTWebArchiving - Midterm Presentation.pptx
Size: 158.45 KB
Format: Microsoft PowerPoint XML
Description: Project Presentation: 05MAR2014 (PowerPoint)
Name: VTWebArchiving - Midterm Presentation.pdf
Size: 190.21 KB
Format: Adobe Portable Document Format
Description: Project Presentation: 05MAR2014 (PDF)
Name: Heritrix Configuration.xml
Size: 29.66 KB
Format: Extensible Markup Language
Description: Heritrix Configuration File (XML)
License bundle
Now showing 1 - 1 of 1
Name: license.txt
Size: 1.5 KB
Format: Item-specific license agreed upon to submission
Description: