Crawling on the World Wide Web

Wang, Li; Fox, Edward A.

dc.contributor.author	Wang, Li	en
dc.contributor.author	Fox, Edward A.	en
dc.contributor.department	Computer Science	en
dc.date.accessioned	2013-06-19T14:36:27Z	en
dc.date.available	2013-06-19T14:36:27Z	en
dc.date.issued	2002	en
dc.description.abstract	As the World Wide Web grows rapidly, people need web search engines to search through it. The crawler is an important module of a web search engine, and its quality directly affects the search quality of the engine. Given some seed URLs, the crawler should retrieve the web pages at those URLs, parse the HTML files, add the new URLs it finds to its buffer, and return to the first phase of this cycle. While parsing the HTML files for new URLs, the crawler can also retrieve other information from them. This paper describes the design, implementation, and some considerations of a new crawler programmed as a learning exercise and for possible use in experimental studies.	en
dc.format.mimetype	application/pdf	en
dc.identifier	http://eprints.cs.vt.edu/archive/00000572/	en
dc.identifier.sourceurl	http://eprints.cs.vt.edu/archive/00000572/01/LiWangReportAccept.pdf	en
dc.identifier.trnumber	TR-02-10	en
dc.identifier.uri	http://hdl.handle.net/10919/20052	en
dc.language.iso	en	en
dc.publisher	Department of Computer Science, Virginia Polytechnic Institute & State University	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	Information retrieval	en
dc.title	Crawling on the World Wide Web	en
dc.type	Technical report	en
dc.type.dcmitype	Text	en
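
The crawl cycle summarized in the abstract (retrieve a page, parse its HTML, add new URLs to the buffer, repeat) can be illustrated with a minimal sketch. This is not the report's implementation: the names `crawl`, `LinkExtractor`, `fetch`, and `max_pages` are hypothetical, the breadth-first order and the duplicate check are assumptions, and an in-memory dictionary stands in for real HTTP retrieval so the example runs without network access.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags while the HTML is parsed."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, max_pages=100):
    """Fetch a page, parse it, queue its unseen URLs, and repeat.

    `fetch` is a hypothetical callable that returns the HTML text for a
    URL, or None if the page cannot be retrieved.
    """
    buffer = deque(seeds)   # URLs waiting to be retrieved (FIFO, assumed)
    seen = set(seeds)       # URLs already queued, so none is queued twice
    visited = []            # URLs successfully retrieved, in order
    while buffer and len(visited) < max_pages:
        url = buffer.popleft()
        html = fetch(url)
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)
        for href in parser.links:
            # Resolve relative links and drop "#fragment" suffixes.
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute.startswith(("http://", "https://")) and absolute not in seen:
                seen.add(absolute)
                buffer.append(absolute)
    return visited


# Usage with a tiny in-memory "web" in place of real HTTP requests.
PAGES = {
    "http://example.org/": '<a href="/a">A</a> <a href="b.html#top">B</a>',
    "http://example.org/a": '<a href="/">home</a>',
    "http://example.org/b.html": "<p>no links</p>",
}
order = crawl(["http://example.org/"], PAGES.get)
```

Passing the retrieval step in as `fetch` keeps the loop itself independent of any particular HTTP library; a real crawler would also need politeness delays, robots.txt handling, and error recovery, none of which are shown here.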

Files

Original bundle

Name: LiWangReportAccept.pdf
Size: 261.99 KB
Format: Adobe Portable Document Format

Collections

Computer Science Technical Reports