SimFusion: A Unified Similarity Measurement Algorithm for Multi-Type Interrelated Web Objects

dc.contributor.author: Xi, Wensi
dc.contributor.author: Zhang, Benyu
dc.contributor.author: Fox, Edward A.
dc.contributor.department: Computer Science
dc.date.accessioned: 2013-06-19T14:37:10Z
dc.date.available: 2013-06-19T14:37:10Z
dc.date.issued: 2004
dc.description.abstract: In this paper, we use a Unified Relationship Matrix (URM) to represent a set of heterogeneous web objects (e.g., web pages, queries) and their interrelationships (e.g., hyperlinks, user click-through relationships). We claim that iterative computations over the URM can help overcome the data sparseness problem (a common situation on the Web) and detect latent relationships among heterogeneous web objects, and can thus improve the quality of information applications that must combine information from heterogeneous sources. To support this claim, we propose a unified similarity-calculating algorithm, the SimFusion algorithm. By iteratively computing over the URM, SimFusion can effectively integrate relationships from heterogeneous sources when measuring the similarity of two web objects. Experiments based on a real search engine query log and a large real web page collection demonstrate that SimFusion significantly improves similarity measurement of web objects over both traditional content-based similarity-calculating algorithms and the cutting-edge SimRank algorithm.
dc.format.mimetype: application/pdf
dc.identifier: http://eprints.cs.vt.edu/archive/00000696/
dc.identifier.sourceurl: http://eprints.cs.vt.edu/archive/00000696/01/SimFusion.pdf
dc.identifier.trnumber: TR-04-19
dc.identifier.uri: http://hdl.handle.net/10919/20143
dc.language.iso: en
dc.publisher: Department of Computer Science, Virginia Polytechnic Institute & State University
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: Information retrieval
dc.title: SimFusion: A Unified Similarity Measurement Algorithm for Multi-Type Interrelated Web Objects
dc.type: Technical report
dc.type.dcmitype: Text
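
The iterative computation over the URM described in the abstract can be illustrated with a minimal sketch. The abstract does not state the update rule, so everything below is an assumption made for illustration: the URM is row-normalized into a matrix L, the similarity matrix S starts as the identity, and each step applies S <- L S L^T. The function names and the toy data (two queries, two pages) are hypothetical and are not taken from the report.

```python
import numpy as np

def row_normalize(m):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    m = np.asarray(m, dtype=float)
    sums = m.sum(axis=1, keepdims=True)
    return np.divide(m, sums, out=np.zeros_like(m), where=sums != 0)

def iterate_similarity(urm, iterations=10):
    """Propagate similarity over the URM.

    Assumed update rule (not stated in the abstract): S <- L S L^T,
    starting from the identity, where L is the row-normalized URM.
    """
    L = row_normalize(urm)
    S = np.eye(L.shape[0])
    for _ in range(iterations):
        S = L @ S @ L.T
    return S

# Hypothetical toy URM: objects 0 and 1 are queries, 2 and 3 are pages.
# Both queries are related to page 2; page 3 is related to nothing else.
urm = np.array([
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 1],
])
S = iterate_similarity(urm, iterations=3)
```

In this toy case the two queries share no direct relationship, yet S[0, 1] becomes positive because both relate to page 2, while the isolated page 3 keeps zero similarity to every other object. This is only meant to show the kind of latent relationship the abstract refers to.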

Files

Original bundle
Name: SimFusion.pdf
Size: 543.08 KB
Format: Adobe Portable Document Format