Query processing optimization for distributed relational database systems: an implementation of a heuristic based algorithm

Virginia Polytechnic Institute and State University


The first step of the program is to input the statistical information concerning the relations of the database. This information is stored in the log file and the file matrix data structures. Next, the query itself is read and stored in an array called the query matrix. The program examines the fields of this matrix and decides which relations in the database are needed to answer the query. For these relations, it determines which attributes should be eliminated and which should be preserved for further processing. The key attributes are identified and projected along with the other preserved attributes. After the initial projection is completed, the sizes of the new temporary relations are estimated and stored in the appropriate fields of the file matrix structure. The program then examines the part of the query that contains the restrictions on the attributes. The values of the attributes are sorted, and those values that do not match the restrictions are eliminated from the log file. Again, the sizes of the new relations are estimated according to the method described by Egyhazy et al. [6]. A second projection is performed to eliminate attributes that were required by the selection phase but are not part of the final answer to the query.
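The projection, selection, and second projection steps described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the `Relation` class, the attribute widths, and the single `selectivity` factor are hypothetical stand-ins, and the size estimate is a plain tuple-count times tuple-width product, not the estimation method of Egyhazy et al. [6].

```python
from dataclasses import dataclass


@dataclass
class Relation:
    """Hypothetical stand-in for one entry of the file matrix."""
    name: str
    attrs: dict          # attribute name -> width in bytes (assumed statistic)
    keys: set            # key attributes, retained for later joins
    cardinality: int     # estimated number of tuples

    def size(self) -> int:
        # Assumed size model: tuples times bytes per tuple.
        return self.cardinality * sum(self.attrs.values())


def project(rel: Relation, keep: set) -> Relation:
    """Keep the requested attributes plus the keys; tuple count is unchanged."""
    kept = {a: w for a, w in rel.attrs.items() if a in keep or a in rel.keys}
    return Relation(rel.name, kept, set(rel.keys), rel.cardinality)


def select(rel: Relation, selectivity: float) -> Relation:
    """Shrink the tuple count by an assumed selectivity factor in [0, 1]."""
    remaining = max(1, int(rel.cardinality * selectivity))
    return Relation(rel.name, dict(rel.attrs), set(rel.keys), remaining)


# Hypothetical relation and query: names of employees above some salary.
emp = Relation("EMP", {"id": 4, "name": 20, "dept": 4, "salary": 8}, {"id"}, 1000)

step1 = project(emp, {"name", "salary"})   # initial projection ("dept" dropped)
step2 = select(step1, 0.1)                 # restriction on "salary"
step3 = project(step2, {"name"})           # "salary" no longer needed

sizes = [r.size() for r in (emp, step1, step2, step3)]
```

Each step returns a new temporary relation, so the estimated size can be recorded after every operation, as the text describes for the file matrix.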

The remaining relations are those that must be joined to form a relation containing the required information. To decide which relations to join, a special table, the join matrix, is created. This table contains pairs of relations that have common attributes and common values and are therefore joinable. The LP algorithm is used to determine the least expensive join among all the possible joins. This process is repeated until all of the relations have been joined into a single relation that answers the query. As with projection and selection, the size of the temporary relation produced by each join is estimated. As a last step, the key attributes that were needed for joining the files but are not part of the answer to the query are removed.
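The repeated choice of the cheapest join can be illustrated with a small greedy loop. This sketch does not reproduce the LP algorithm named above; it swaps in a simple minimum-cost pick, and the cost model (sum of tuple counts), the result-size estimate (smaller tuple count), and all relation names are assumptions made for illustration. It also tests joinability on common attributes only, not on common values.

```python
from itertools import combinations


def join_matrix(rels: dict) -> dict:
    """Map each joinable pair of relation names to its common attributes."""
    pairs = {}
    for a, b in combinations(sorted(rels), 2):
        common = rels[a]["attrs"] & rels[b]["attrs"]
        if common:
            pairs[(a, b)] = common
    return pairs


def greedy_join_order(relations: dict) -> list:
    """Repeatedly join the cheapest joinable pair until one relation remains."""
    rels = {n: {"attrs": set(r["attrs"]), "card": r["card"]}
            for n, r in relations.items()}
    order = []
    while len(rels) > 1:
        candidates = join_matrix(rels)
        if not candidates:
            raise ValueError("remaining relations share no attributes")
        # Assumed cost model: total number of tuples in the two inputs.
        a, b = min(candidates,
                   key=lambda p: rels[p[0]]["card"] + rels[p[1]]["card"])
        merged = {
            "attrs": rels[a]["attrs"] | rels[b]["attrs"],
            # Assumed size estimate for the temporary relation.
            "card": min(rels[a]["card"], rels[b]["card"]),
        }
        del rels[a], rels[b]
        rels[f"({a}*{b})"] = merged
        order.append((a, b))
    return order


# Hypothetical relations left over after projection and selection.
relations = {
    "EMP": {"attrs": {"id", "dept"}, "card": 1000},
    "DEPT": {"attrs": {"dept", "city"}, "card": 50},
    "CITY": {"attrs": {"city", "region"}, "card": 20},
}
order = greedy_join_order(relations)
```

Rebuilding the join matrix after every join keeps the candidate list consistent with the temporary relations, at the cost of recomputing pairs that did not change.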



Ingres (Computer file)