Browsing by Author "Bhasker, Bharat"
Now showing 1 - 2 of 2
Results Per Page
Sort Options
- Cluster Algebra: A Query Language for Heterogeneous DatabasesBhasker, Bharat; Egyhazy, Csaba J.; Triantis, Konstantinos P. (Department of Computer Science, Virginia Polytechnic Institute & State University, 1992)This report describes a query language based on algebra for heterogeneous databases. The database logic is used as a uniform framework for studying the heterogeneous databases. The data model based on the database logic is referred to as cluster data model in this report. Generalized Structured Query Language (GSQL) is used for expressing ad-hoc queries over the relational, hierarchical and network database uniformly. For the purpose of query optimization, a query language that can express the primitive heterogeneous database operations is required. This report describes such a query language for the clusters (i.e., heterogeneous databases). The cluster algebra consists of (a) generalized relational operations such as selection, union, intersection, difference, semi-join, rename and cross-product; (b) modified relational operations such as normal projection and normal join; and (c) new operations such as normalize, embed, and unembed.
- Query processing in heterogeneous distributed database management systemsBhasker, Bharat (Virginia Tech, 1992-06-06)The goal of this work is to present an advanced query processing algorithm formulated and developed in support of heterogeneous distributed database management systems. Heterogeneous distributed database management systems view the integrated data through an uniform global schema. The query processing algorithm described here produces an inexpensive strategy for a query expressed over the global schema. The research addresses the following aspects of query processing: (1) Formulation of a low level query language to express the fundamental heterogeneous database operations; (2) Translation of the query expressed over the global schema to an equivalent query expressed over a conceptual schema; (3) An estimation methodology to derive the intermediate result sizes of the database operations; (4) A query decomposition algorithm to generate an efficient sequence of the basic database operations to answer the query. This research addressed the first issue by developing an algebraic query language called cluster algebra. The cluster algebra consists of the following operations: (a) Selection, union, intersection and difference, which are extensions of their relational algebraic counterparts to heterogeneous databases; (b) Normal-join and normal-projection which replace their counterparts, join and projection, in the relational algebra; (c) Two new operators embed and unembed to restructure the database schema. The second issue of the query translation was addressed by development of an algorithm that translates a cluster algebra query expressed over the virtual views to an equivalent cluster algebra query expressed over the conceptual databases. A non-parametric estimation methodology to estimate the result size of a cluster algebra operation was developed to address the third issue described above. Finally, this research developed a query decomposition algorithm, applicable to the relational and non-relational databases, that decomposes a query by computing all profitable semi-join operations, followed by the determination of the best sequence of join operations per processing site. The join optimization is performed by formulating a zero-one integer linear program that uses the non-parametric estimation technique to compute the sizes of intermediate results. The query processing algorithm was implemented in the context of DAVID, a heterogeneous distributed database management system.