A Broad Analysis of Tandemly Arrayed Genes in the Genomes of Human, Mouse, and Rat

Date

2006-11-10

Publisher

Virginia Tech

Abstract

Tandemly arrayed genes (TAGs) play important functional and physiological roles in the genome. Most previous studies have focused on individual TAG families in a few species, and a broad characterization of TAGs has not been available. We identified all the TAGs in the genomes of human, chimp, mouse, and rat and performed a comprehensive analysis of TAG distribution, TAG sizes, TAG gene orientations and intergenic distances, and TAG gene functions. TAGs account for about 14-17% of all genes and nearly one third of all duplicated genes in the four genomes, highlighting the predominant role that tandem duplication plays in gene duplication. In all species, TAG distribution is highly heterogeneous along chromosomes: some chromosomes are enriched with TAG forests while others are enriched with TAG deserts. The majority of TAGs are of size two in all genomes, similar to previous findings in C. elegans, A. thaliana, and O. sativa, suggesting that this is a rather general phenomenon in eukaryotes.
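To make the notions of a TAG and its size concrete, the following is a minimal sketch of how arrays might be collected from an ordered gene list. The abstract does not state the operational definition used in the report, so everything here is an assumption for illustration: the function name `find_tags`, the input layout (gene ID plus a precomputed family label), and the `max_spacers` parameter that allows a limited number of unrelated genes between array members.

```python
from collections import Counter

def find_tags(genes, max_spacers=0):
    """Group same-family neighbors on one chromosome into tandem arrays.

    genes: list of (gene_id, family_id) in chromosomal order; family_id is
           None for genes with no detected paralog (assumed input format).
    max_spacers: hypothetical cutoff on unrelated genes allowed between
           two consecutive members of the same array.
    Returns a list of arrays (lists of gene IDs) with at least two members.
    """
    arrays = []
    last_seen = {}  # family_id -> (index of last member, array under construction)
    for i, (gene_id, family) in enumerate(genes):
        if family is None:
            continue
        if family in last_seen and i - last_seen[family][0] - 1 <= max_spacers:
            # Close enough to the previous member: extend the existing array.
            array = last_seen[family][1]
            array.append(gene_id)
        else:
            # Too far away (or first member seen): start a new candidate array.
            array = [gene_id]
            arrays.append(array)
        last_seen[family] = (i, array)
    return [a for a in arrays if len(a) >= 2]

# Toy chromosome, invented purely for illustration.
toy = [("a1", "A"), ("a2", "A"), ("x", None), ("a3", "A"),
       ("b1", "B"), ("c1", "C"), ("b2", "B")]
strict = find_tags(toy, max_spacers=0)
relaxed = find_tags(toy, max_spacers=1)
size_distribution = Counter(len(a) for a in relaxed)
```

With no spacers allowed, only the adjacent pair counts as an array; allowing one spacer merges more distant family members, which shows how sensitive TAG counts and sizes can be to the chosen definition.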

Comparison with genome-wide patterns shows that TAG members have a significantly higher proportion of parallel gene orientation in all species, corroborating Graham's claim that parallel orientation is the preferred form of orientation in TAGs. Moreover, TAG members with parallel orientation tend to be closer to each other than neighboring genes with parallel orientation in the genome as a whole. The analysis of GO functions indicates that genes with receptor or binding activities are significantly over-represented among TAGs. Simulation reveals that random gene rearrangements have little effect on the statistics of TAGs in all genomes. Notably, gene family sizes are significantly correlated with the extent of tandem duplication, suggesting that tandem duplication is a preferred form of duplication, especially in large families.
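The orientation categories discussed above can be illustrated with a short sketch. It assumes the usual convention that two neighboring genes on the same strand are parallel, a plus-strand gene followed by a minus-strand gene is convergent, and the reverse is divergent. The function names and the 1-based inclusive coordinate convention are assumptions for illustration, not details taken from the report.

```python
def pair_orientation(strand_a, strand_b):
    """Classify two neighboring genes, with gene a upstream of gene b in
    chromosome coordinates. Strands are '+' or '-'."""
    if strand_a == strand_b:
        return "parallel"        # -> -> or <- <-
    if strand_a == "+":
        return "convergent"      # -> <-
    return "divergent"           # <- ->

def intergenic_distance(end_a, start_b):
    """Bases strictly between gene a's end and gene b's start
    (1-based inclusive coordinates assumed); overlaps count as 0."""
    return max(0, start_b - end_a - 1)

# Invented example pairs: (strand_a, end_a, strand_b, start_b)
pairs = [("+", 1000, "+", 1501), ("+", 3000, "-", 3200), ("-", 5000, "+", 4990)]
summary = [(pair_orientation(sa, sb), intergenic_distance(ea, sb_start))
           for sa, ea, sb, sb_start in pairs]
```

Tallying these labels separately for TAG pairs and for all neighboring gene pairs would give the two proportions that the comparison above refers to.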

There has not been any systematic study of the expression patterns of TAG genes in the genome. Taking advantage of recent large-scale microarray data, we were able to study expression divergence in those TAGs of size two in human and mouse for which expression data are available, and to examine the effect of sequence divergence, gene orientation, and physical proximity on the divergence of gene expression patterns. Our results show a weak negative correlation between sequence divergence and expression similarity between the two members of a TAG, and also a weak negative correlation between the physical proximity of two genes and their expression similarity. No significant relationship was detected between gene orientation and expression similarity. Moreover, we compared the expression breadth of upstream and downstream duplicate copies and found that the downstream duplicate does not show significantly narrower expression breadth. We also compared TAG gene pairs with their neighboring non-TAG pairs for both physical proximity and expression similarity. TAG gene pairs do not differ distinctly from their neighboring gene pairs in either respect, suggesting that sufficient divergence has occurred in these duplicated genes during evolution and that the original similarity conferred by duplication has decayed to a level comparable to that of their surrounding regions.
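As an illustration of the two expression measures mentioned above, the sketch below computes a similarity score between two genes' tissue profiles and a simple expression breadth. The abstract does not say which statistics the report uses; Pearson correlation across tissues and a count of tissues above a detection threshold are common choices and are assumed here. The function names, the threshold, and the profile values are all hypothetical.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles
    (one value per tissue). Assumes neither profile is constant."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def expression_breadth(profile, threshold):
    """Number of tissues in which the gene's signal reaches the threshold."""
    return sum(1 for value in profile if value >= threshold)

# Invented profiles for an upstream and a downstream copy across four tissues.
upstream = [120.0, 80.0, 15.0, 300.0]
downstream = [60.0, 40.0, 7.5, 150.0]
similarity = pearson(upstream, downstream)
breadths = (expression_breadth(upstream, 50.0), expression_breadth(downstream, 50.0))
```

Pairing such similarity scores with a sequence divergence estimate or an intergenic distance for each TAG, and correlating across pairs, would be one way to probe the relationships summarized above.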

Keywords

Gene Expression, Tandemly Arrayed Genes, Comparative Genomics, Gene Duplication
