Virginia Tech

    Crawling on the World Wide Web

    LiWangReportAccept.pdf (261.9Kb)
    TR number
    TR-02-10
    Date
    2002
    Author
    Wang, Li
    Fox, Edward A.
    Abstract
    As the World Wide Web grows rapidly, people need web search engines to search through it. The crawler is an important module of a web search engine, and its quality directly affects the search quality of the engine. Given some seed URLs, the crawler retrieves the web pages at those URLs, parses the HTML files, adds the new URLs it finds to its buffer, and returns to the first phase of this cycle. While parsing the HTML files for new URLs, the crawler can also extract other information from them. This paper describes the design, implementation, and some considerations of a new crawler programmed as a learning exercise and for possible use in experimental studies.
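    The cycle described in the abstract (retrieve, parse, add new URLs to the buffer, repeat) can be sketched roughly as follows. This is a minimal illustration, not the implementation from the report: the names `LinkExtractor`, `crawl`, `fetch`, and `max_pages` are hypothetical, the buffer is assumed to be a simple first-in, first-out queue, and the page-retrieval step is passed in as a function so the sketch runs without network access. Politeness delays, robots.txt handling, and error recovery are deliberately left out.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urldefrag


class LinkExtractor(HTMLParser):
    """Collects href values from <a> tags while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def crawl(seeds, fetch, max_pages=100):
    """Breadth-first crawl starting from `seeds`.

    `fetch(url)` is a caller-supplied function (an assumption of this
    sketch) that returns the page's HTML as a string, or None on failure.
    Returns the list of URLs successfully retrieved, in visit order.
    """
    frontier = deque(seeds)   # the URL buffer from the abstract
    seen = set(seeds)         # URLs already queued, to avoid revisits
    visited = []
    while frontier and len(visited) < max_pages:
        url = frontier.popleft()
        html = fetch(url)                 # phase 1: retrieve the page
        if html is None:
            continue
        visited.append(url)
        parser = LinkExtractor()
        parser.feed(html)                 # phase 2: parse the HTML
        for href in parser.links:         # phase 3: buffer new URLs
            absolute, _fragment = urldefrag(urljoin(url, href))
            if absolute.startswith(("http://", "https://")) and absolute not in seen:
                seen.add(absolute)
                frontier.append(absolute)
    return visited
```

    For a quick check, `fetch` can be the `get` method of a dictionary that maps URLs to HTML strings; a real crawler would replace it with an HTTP client call. Keeping a `seen` set alongside the queue is one simple way to stop the cycle from revisiting pages that link back to each other.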
    URI
    http://hdl.handle.net/10919/20052
    Collections
    • Computer Science Technical Reports [1035]

    If you believe that any material in VTechWorks should be removed, please see our policy and procedure for Requesting that Material be Amended or Removed. All takedown requests will be promptly acknowledged and investigated.

    Virginia Tech | University Libraries | Contact Us
     

     
