Fralin Life Sciences Institute
Note: In 2019, the Biocomplexity Institute became part of the Fralin Life Sciences Institute.
Browsing Fralin Life Sciences Institute by Issue Date
- Method for the conversion of cephalomannine to taxol and for the preparation of n-acyl analogs of taxol (United States Patent and Trademark Office, 1995-11-28). The natural product cephalomannine can be converted to the important anticancer natural product taxol by a simple process involving the steps of dihydroxylation to give cephalomannine-diols, diol cleavage, benzoylation at the 2'-position and reaction with a 1,2-diamine. The same process when applied to mixtures of taxol and cephalomannine makes it possible to separate taxol from cephalomannine-diols by simple flash chromatography after the dihydroxylation step. If the benzoylation step is avoided in the above sequence of conversions, the process leads to the free amine (N-debenzoyltaxol). In addition, the selection of an acylating reagent other than that with the benzoyl group for acylation of the free amine (N-debenzoyltaxol) allows the preparation of taxol analogs with other N-acyl substituents.
- Towards a calculus of biological networks. Reidys, Christian Michael; Mortveit, Henning S. (2002). In this paper we present a new framework for studying the dynamics of biological networks. A specific class of dynamical systems, Sequential Dynamical Systems (SDS), is introduced. These systems allow one to investigate the interplay between structural properties of the network and its phase space. We will show in detail how to find a reduced system that captures key features of a given system. This reduction is based on a special graph-theoretic relation between the two networks. We will study the reduction of SDS over n-cubes in detail and we will present several examples.
- 2002 Annual Report. Virginia Bioinformatics Institute (Virginia Bioinformatics Institute, 2002-11-03)
- Virginia Bioinformatics Institute 2003 Annual Report. Virginia Bioinformatics Institute (Virginia Bioinformatics Institute, 2004-01-18)
- Reproducibility, bioinformatic analysis and power of the SAGE method to evaluate changes in transcriptome. Dinel, S.; Bolduc, C.; Belleau, P.; Boivin, A.; Yoshioka, M.; Calvo, E.; Piedboeuf, B.; Snyder, E. E.; Labrie, F.; St-Amand, J. (2005-01-01). The serial analysis of gene expression (SAGE) method is used to study global gene expression in cells or tissues in various experimental conditions. However, its reproducibility has not yet been definitively assessed. In this study, we have evaluated the reproducibility of the SAGE method and identified the factors that affect it. The determination coefficient (R²) for the reproducibility of SAGE is 0.96. However, there are some factors that can affect the reproducibility of SAGE, such as the replication of concatemers and ditags, the number of sequenced tags and double PCR amplification of ditags. Thus, corrections for these factors must be made to ensure the reproducibility and accuracy of SAGE results. A bioinformatic analysis of SAGE data is also presented in order to eliminate these artifacts. Finally, the current study shows that increasing the number of sequenced tags improves the power of the method to detect transcripts and their regulation by experimental conditions.
- Virginia Bioinformatics Institute 2004 Annual Report. Virginia Bioinformatics Institute (Virginia Bioinformatics Institute, 2005-01-17)
- Mitochondrial-encoded membrane protein transcripts are pyrimidine-rich while soluble protein transcripts and ribosomal RNA are purine-rich. Bradshaw, Patrick C.; Rathi, Anand; Samuels, David C. (2005-09-26). Background: Eukaryotic organisms contain mitochondria, organelles capable of producing large amounts of ATP by oxidative phosphorylation. Each cell contains many mitochondria with many copies of mitochondrial DNA in each organelle. The mitochondrial DNA encodes a small but functionally critical portion of the oxidative phosphorylation machinery, a few other species-specific proteins, and the rRNA and tRNA used for the translation of these transcripts. Because the microenvironment of the mitochondrion is unique, mitochondrial genes may be subject to different selectional pressures than those affecting nuclear genes. Results: From an analysis of the mitochondrial genomes of a wide range of eukaryotic species we show that there are three simple rules for the pyrimidine and purine abundances in mitochondrial DNA transcripts. Mitochondrial membrane protein transcripts are pyrimidine-rich, rRNA transcripts are purine-rich and the soluble protein transcripts are purine-rich. The transitions between pyrimidine and purine-rich regions of the genomes are rapid and are easily visible on a pyrimidine-purine walk graph. These rules are followed, with few exceptions, independent of which strand encodes the gene. Despite the robustness of these rules across a diverse set of species, the magnitude of the differences between the pyrimidine and purine content is fairly small. Typically, the mitochondrial membrane protein transcripts have a pyrimidine richness of 56%, the rRNA transcripts are 55% purine, and the soluble protein transcripts are only 53% purine. Conclusion: The pyrimidine richness of mitochondrial-encoded membrane protein transcripts is partly driven by U nucleotides in the second codon position in all species, which yields hydrophobic amino acids. The purine-richness of soluble protein transcripts is mainly driven by A nucleotides in the first codon position. The purine-richness of rRNA is also due to an abundance of A nucleotides. Possible mechanisms as to how these trends are maintained in mtDNA genomes of such diverse ancestry, size and variability of A-T richness are discussed.
- The distribution of SNPs in human gene regulatory regions. Guo, Yongjian; Jamison, D. Curtis (2005-10-06). Background: As a result of high-throughput genotyping methods, millions of human genetic variants have been reported in recent years. To efficiently identify those with significant biological functions, a practical strategy is to concentrate on variants located in important sequence regions such as gene regulatory regions. Results: Analysis of the most common type of variant, single nucleotide polymorphisms (SNPs), shows that in gene promoter regions more SNPs occur in close proximity to transcriptional start sites than in regions further upstream, and a disproportionate number of those SNPs represent nucleotide transversions. Additionally, the number of SNPs found in the predicted transcription factor binding sites is higher than in non-binding site sequences. Conclusion: Current information about transcription factor binding site sequence patterns may not be exhaustive, and SNPs may be actively involved in influencing gene expression by affecting the transcription factor binding sites.
- Virginia Bioinformatics Institute at Virginia Tech: Scientific Annual Report 2005. Virginia Bioinformatics Institute (Virginia Bioinformatics Institute, 2005-12-02)
- VMD: a community annotation database for oomycetes and microbial genomes. Tripathy, Sucheta; Pandey, Varun N.; Fang, Bing; Salas, Fidel; Tyler, Brett M. (2006-01-01). The VBI Microbial Database (VMD) is a database system designed to host a range of microbial genome sequences. At present, the database contains genome sequence and annotation data of two plant pathogens, Phytophthora sojae and Phytophthora ramorum. With the completion of the draft genome sequences of these pathogens in collaboration with the DOE Joint Genome Institute (JGI), we have created this resource to make the sequences publicly available. The genome sequences (95 MB for P. sojae and 65 MB for P. ramorum) were annotated with ~19 000 and ~16 000 gene models, respectively. We used two different statistical methods to validate these gene models, Fickett's and a log-likelihood method. Functional annotation of the gene models is based on results from BlastX and InterProScan screens. From the InterProScan results, we could assign putative functions to 17 694 genes in P. sojae and 14 700 genes in P. ramorum. We created an easy-to-use genome browser to view the genome sequence data, which opens to detailed annotation pages for each gene model. A community annotation interface is available for registered community members to add or edit annotations. There are ~1600 gene models for P. sojae and ~700 models for P. ramorum that have already been manually curated. A toolkit is provided as an additional resource for users to perform a variety of sequence analysis jobs. The database is publicly available at http://phytophthora.vbi.vt.edu/.
- Tomato Expression Database (TED): a suite of data presentation and analysis tools. Fei, Zhangjun; Tang, Xuemei; Alba, Rob; Giovannoni, James (2006-01-01). The Tomato Expression Database (TED) includes three integrated components. The Tomato Microarray Data Warehouse serves as a central repository for raw gene expression data derived from the public tomato cDNA microarray. In addition to expression data, TED stores experimental design and array information in compliance with the MIAME guidelines and provides web interfaces for researchers to retrieve data for their own analysis and use. The Tomato Microarray Expression Database contains normalized and processed microarray data for ten time points with nine pair-wise comparisons during fruit development and ripening in a normal tomato variety and nearly isogenic single gene mutants impacting fruit development and ripening. Finally, the Tomato Digital Expression Database contains raw and normalized digital expression (EST abundance) data derived from analysis of the complete public tomato EST collection containing 150 000 ESTs derived from 27 different non-normalized EST libraries. This last component also includes tools for the comparison of tomato and Arabidopsis digital expression data. A set of query interfaces and analysis and visualization tools has been developed and incorporated into TED, which aid users in identifying and deciphering biologically important information from our datasets. TED can be accessed at http://ted.bti.cornell.edu.
- Innovation, Connecting to the Future: 2005 Annual Report. Virginia Bioinformatics Institute (Virginia Bioinformatics Institute, 2006-01-18)
- e-Connections, vol. 1, no. 1; Spring 2006 (Virginia Bioinformatics Institute, Virginia Tech, 2006-04-14)
- The statistics of identifying differentially expressed genes in Expresso and TM4: a comparison. Sioson, Allan A.; Mane, Shrinivasrao P.; Li, Pinghua; Sha, Wei; Heath, Lenwood S.; Bohnert, Hans J.; Grene, Ruth (2006-04-20). Background: Analysis of DNA microarray data takes as input spot intensity measurements from scanner software and returns differential expression of genes between two conditions, together with a statistical significance assessment. This process typically consists of two steps: data normalization and identification of differentially expressed genes through statistical analysis. The Expresso microarray experiment management system implements these steps with a two-stage, log-linear ANOVA mixed model technique, tailored to individual experimental designs. The complement of tools in TM4, on the other hand, is based on a number of preset design choices that limit its flexibility. In the TM4 microarray analysis suite, normalization, filter, and analysis methods form an analysis pipeline. TM4 computes integrated intensity values (IIV) from the average intensities and spot pixel counts returned by the scanner software as input to its normalization steps. By contrast, Expresso can use either IIV data or median intensity values (MIV). Here, we compare Expresso and TM4 analysis of two experiments and assess the results against qRT-PCR data. Results: The Expresso analysis using MIV data consistently identifies more genes as differentially expressed, when compared to Expresso analysis with IIV data. The typical TM4 normalization and filtering pipeline corrects systematic intensity-specific bias on a per microarray basis. Subsequent statistical analysis with Expresso or a TM4 t-test can effectively identify differentially expressed genes. The best agreement with qRT-PCR data is obtained through the use of Expresso analysis and MIV data. Conclusion: The results of this research are of practical value to biologists who analyze microarray data sets. The TM4 normalization and filtering pipeline corrects microarray-specific systematic bias and complements the normalization stage in Expresso analysis. The results of Expresso using MIV data have the best agreement with qRT-PCR results. In one experiment, MIV is a better choice than IIV as input to data normalization and statistical analysis methods, as it yields a greater number of statistically significant differentially expressed genes; TM4 does not support the choice of MIV input data. Overall, the more flexible and extensive statistical models of Expresso achieve more accurate analytical results, when judged by the yardstick of qRT-PCR data, in the context of an experimental design of modest complexity.
- e-Connections, vol. 1, no. 2; Summer 2006 (Virginia Bioinformatics Institute, Virginia Tech, 2006-07-06)
- Virginia Bioinformatics Institute at Virginia Tech: Scientific Annual Report 2006. Virginia Bioinformatics Institute (Virginia Bioinformatics Institute, 2006-08-07)
- e-Connections, vol. 1, no. 3; Fall 2006 (Virginia Bioinformatics Institute, Virginia Tech, 2006-10-03)
- GenomeBlast: a web tool for small genome comparison. Lu, Guoqing; Jiang, Liying; Helikar, Resa M. K.; Rowley, Thaine W.; Zhang, Luwen; Chen, Xianfeng; Moriyama, Etsuko N. (2006-12-12). Background: Comparative genomics has become an essential approach for identifying homologous gene candidates and their functions, and for studying genome evolution. There are many tools available for genome comparisons. Unfortunately, most of them are not applicable for the identification of unique genes and the inference of phylogenetic relationships in a given set of genomes. Results: GenomeBlast is a Web tool developed for comparative analysis of multiple small genomes. A new parameter called "coverage" was introduced and used along with sequence identity to evaluate global similarity between genes. With GenomeBlast, the following results can be obtained: (1) unique genes in each genome; (2) homologous gene candidates among compared genomes; (3) 2D plots of homologous gene candidates across all pairwise genome comparisons; and (4) a table of gene presence/absence information and a genome phylogeny. We demonstrated the functions in GenomeBlast with an example of multiple herpesviral genome analysis and illustrated how GenomeBlast is useful for small genome comparison. Conclusion: We developed a Web tool for comparative analysis of small genomes, which allows the user not only to identify unique genes and homologous gene candidates among multiple genomes, but also to view their graphical distributions on genomes, and to reconstruct genome phylogeny. GenomeBlast runs on a Linux server with 4 CPUs and 4 GB memory. The online version of GenomeBlast is available to the public through a Web browser at the URL http://bioinfo-srv1.awh.unomaha.edu/genomeblast/.
- Computational prediction of host-pathogen protein–protein interactions. Dyer, Matthew D.; Murali, T. M.; Sobral, Bruno (Oxford University Press, 2007). Motivation: Infectious diseases such as malaria result in millions of deaths each year. An important aspect of any host-pathogen system is the mechanism by which a pathogen can infect its host. One method of infection is via protein–protein interactions (PPIs) where pathogen proteins target host proteins. Developing computational methods that identify which PPIs enable a pathogen to infect a host has great implications in identifying potential targets for therapeutics. Results: We present a method that integrates known intra-species PPIs with protein-domain profiles to predict PPIs between host and pathogen proteins. Given a set of intra-species PPIs, we identify the functional domains in each of the interacting proteins. For every pair of functional domains, we use Bayesian statistics to assess the probability that two proteins with that pair of domains will interact. We apply our method to the Homo sapiens – Plasmodium falciparum host-pathogen system. Our system predicts 516 PPIs between proteins from these two organisms. We show that pairs of human proteins we predict to interact with the same Plasmodium protein are close to each other in the human PPI network and that Plasmodium pairs predicted to interact with the same human protein are co-expressed in DNA microarray datasets measured during various stages of the Plasmodium life cycle. Finally, we identify functionally enriched sub-networks spanned by the predicted interactions and discuss the plausibility of our predictions.
- A syntactic model to design and verify synthetic genetic constructs derived from standard biological parts. Cai, Y.; Hartnett, B.; Gustafsson, C.; Peccoud, Jean (2007). Motivation: The sequence of artificial genetic constructs is composed of multiple functional fragments, or genetic parts, involved in different molecular steps of gene expression mechanisms. Biologists have deciphered structural rules that the design of genetic constructs needs to follow in order to ensure a successful completion of the gene expression process, but these rules have not been formalized, making it challenging for non-specialists to benefit from the recent progress in gene synthesis. Results: We show that context-free grammars (CFG) can formalize these design principles. This approach provides a path to organizing libraries of genetic parts according to their biological functions, which correspond to the syntactic categories of the CFG. It also provides a framework for the systematic design of new genetic constructs consistent with the design principles expressed in the CFG. Using parsing algorithms, this syntactic model enables the verification of existing constructs. We illustrate these possibilities by describing a CFG that generates the most common architectures of genetic constructs in Escherichia coli. Availability: A web site allows readers to experiment with the algorithms presented in this article: www.genocad.org
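
The idea in the last entry, checking a genetic construct against a context-free grammar, can be illustrated with a minimal Python sketch. The grammar below is a hypothetical toy written for this illustration; it is not the grammar from the article or from www.genocad.org. The part categories (`promoter`, `rbs`, `cds`, `terminator`) and the function name `is_valid_construct` are likewise assumptions.

```python
from functools import lru_cache

# Hypothetical toy grammar, for illustration only (NOT the published grammar).
# Non-terminals are keys; anything that is not a key is a part category.
GRAMMAR = {
    "construct": [["cassette"], ["cassette", "construct"]],
    "cassette": [["promoter", "cistrons", "terminator"]],
    "cistrons": [["cistron"], ["cistron", "cistrons"]],
    "cistron": [["rbs", "cds"]],
}


def is_valid_construct(parts, start="construct"):
    """Return True if the sequence of part categories is generated by GRAMMAR."""
    parts = tuple(parts)

    @lru_cache(maxsize=None)
    def ends(symbol, i):
        # All positions at which `symbol` can finish when parsing starts at i.
        if symbol not in GRAMMAR:  # terminal: must match the part at position i
            if i < len(parts) and parts[i] == symbol:
                return frozenset({i + 1})
            return frozenset()
        result = set()
        for production in GRAMMAR[symbol]:
            positions = {i}
            for sym in production:
                positions = {j for p in positions for j in ends(sym, p)}
                if not positions:
                    break  # this production cannot match from position i
            result |= positions
        return frozenset(result)

    # Valid only if the start symbol can consume the whole sequence.
    return len(parts) in ends(start, 0)


print(is_valid_construct(["promoter", "rbs", "cds", "terminator"]))
print(is_valid_construct(["promoter", "cds", "terminator"]))
```

Under this toy grammar the first call accepts a single expression cassette, and the second rejects a design whose coding sequence lacks a ribosome binding site. The recognizer relies on the grammar having no left recursion and no empty productions; a general-purpose parser would be needed for a grammar without those restrictions.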