SchemaMapper: A tool for visualization of schema mapping


Department of Computer Science, Virginia Polytechnic Institute & State University


The past few years have seen a strong push toward the use of digital information, and nearly every application domain now relies extensively on digital information sources. As a result, many different data representation models, or schemas, have been developed. This poses a problem when data from several sources must be integrated: diverse representations have to be merged into a single global representation. Hence there is a need for schema mapping tools that enable the amalgamation of heterogeneous data representations. That goal is difficult to achieve today because existing schema mapping tools are domain-unaware. SchemaMapper, a new tool we have developed, aims to be domain-aware and thereby speed up the schema mapping process. It also supports visualization of the mapping process using a hyperbolic tree representation, which has not previously been used in the context of schema mapping. Although the primary motivation for SchemaMapper comes from ETANA-DL (a digital library that promotes the integration of information and services from diverse archaeological sites), it could potentially be used in other similar domains or extended to different types of schema mappings. This report describes in detail the prototype developed to explore the feasibility of such a tool, including its architecture and implementation. Experiments were conducted to evaluate SchemaMapper, and the initial results are very encouraging. All schemas used in the evaluation were real-life examples taken from ETANA-DL. Analysis of the evaluation results suggests that domain awareness is extremely useful for the schema mapping process. In addition, the linear tree representation of schemas used by existing tools appears to have inherent disadvantages that must be overcome to make the process more effective.



Digital libraries