Finding Computer Science Syllabi on the World Wide Web

TR Number




Journal Title

Journal ISSN

Volume Title


Department of Computer Science, Virginia Polytechnic Institute & State University


Syllabi contain information useful to students, faculty, and many other people, and given the ubiquity of the WWW many schools are now putting their syllabi online for these people and the general public to access. Even though these syllabi may be available, they might be hard to find. This means that faculty, students, and anyone else who might have an interest in viewing those syllabi might find it useful to be able to browse a collection of reliable syllabi. To build a collection of reliable syllabi it is necessary to find those syllabi on the WWW but this is made easy with a tool like the Google Web API. Once the syllabi are found on the Web it is necessary to examine those syllabi and look for desired characteristics to be sure they are desired syllabi. The syllabi that contain the desired characteristics are kept and the rest are discarded. This elimination process can be accomplished using a tool like a classification tree, more specifically, tools like the Orange Data Mining Library and C4.5. This paper describes the process of finding syllabi on the WWW using the Google Web API, retrieving those syllabi using Python, and filtering them using the Orange Data Mining Library and C4.5 so that a reliable set of syllabi can be constructed.



Digital libraries