GeneSieve: A Probe Selection Strategy for cDNA Microarrays

dc.contributor.author: Shukla, Maulik [en]
dc.contributor.committeechair: Heath, Lenwood S. [en]
dc.contributor.committeemember: Grene, Ruth [en]
dc.contributor.committeemember: Ramakrishnan, Naren [en]
dc.contributor.committeemember: Murali, T. M. [en]
dc.contributor.department: Computer Science [en]
dc.date.accessioned: 2011-08-06T16:06:22Z [en]
dc.date.adate: 2004-09-14 [en]
dc.date.available: 2011-08-06T16:06:22Z [en]
dc.date.issued: 2004-08-24 [en]
dc.date.rdate: 2004-09-14 [en]
dc.date.sdate: 2004-09-09 [en]
dc.description.abstract: The DNA microarray is a powerful tool for studying the expression levels of thousands of genes simultaneously. Often, cDNA libraries representing expressed genes of an organism are available, along with expressed sequence tags (ESTs). ESTs are widely used as the probes for microarrays. Designing custom microarrays, rich in genes relevant to the experimental objectives, requires selection of probes based on their sequence. We have designed a probe selection method, called GeneSieve, to select EST probes for custom microarrays. To assign annotations to the ESTs, we cluster them into contigs using PHRAP. The larger contig sequences are then used for similarity searches against known proteins in model organisms such as Arabidopsis thaliana. We have designed three different methods to assign annotations to the contigs: bidirectional hits (BH), bidirectional best hits (BBH), and unidirectional best hits (UBH). We apply these methods to pine and potato EST sets. Results show that the UBH method assigns unambiguous annotations to a large fraction of contigs in an organism. Hence, we use UBH to assign annotations to ESTs in GeneSieve. To select a single EST from a contig, GeneSieve assigns a quality score to each EST based on its protein homology (PH), cross hybridization (CH), and relative length (RL). We use this quality score to rank ESTs according to seven different measures: length, 3' proximity, 5' proximity, protein homology, cross hybridization, relative length, and overall quality score. Results for pine and potato EST sets indicate that EST probes selected by quality score are relatively long and give better values for protein homology and cross hybridization. Results of the GeneSieve protocol are stored in a database and linked with sequence databases and known functional category schemes such as MIPS and GO. The database is made available via a web interface. A biologist can quickly and easily select a large number of EST probes based on annotations or functional categories. [en]
dc.description.degree: Master of Science [en]
dc.format.medium: ETD [en]
dc.identifier.other: etd-09092004-152600 [en]
dc.identifier.sourceurl: http://scholar.lib.vt.edu/theses/available/etd-09092004-152600 [en]
dc.identifier.uri: http://hdl.handle.net/10919/10114 [en]
dc.publisher: Virginia Tech [en]
dc.relation.haspart: GeneSieve.pdf [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: EST annotation [en]
dc.subject: cDNA microarrays [en]
dc.subject: probe selection [en]
dc.title: GeneSieve: A Probe Selection Strategy for cDNA Microarrays [en]
dc.type: Thesis [en]
thesis.degree.discipline: Computer Science [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: Master of Science [en]
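
The abstract names unidirectional best hits (UBH) and bidirectional best hits (BBH) as ways of assigning a protein annotation to a contig, but this record does not define them. The sketch below illustrates the distinction under a common reading of those terms (BBH as reciprocal best hits); the thesis's exact definitions may differ. The hit tuples, scores, identifiers, and function names are all hypothetical and are not taken from GeneSieve.

```python
from typing import Dict, List, Tuple

# A hit is (query_id, subject_id, score); a higher score means a better match.
# This layout is an assumption for illustration, not the GeneSieve data model.
Hit = Tuple[str, str, float]


def best_hits(hits: List[Hit]) -> Dict[str, str]:
    """Map each query to its single highest-scoring subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def unidirectional_best_hits(contig_vs_protein: List[Hit]) -> Dict[str, str]:
    """UBH (assumed reading): each contig takes its best-scoring protein."""
    return best_hits(contig_vs_protein)


def bidirectional_best_hits(
    contig_vs_protein: List[Hit], protein_vs_contig: List[Hit]
) -> Dict[str, str]:
    """BBH (assumed reading): keep a contig-protein pair only when each
    is the other's best hit."""
    forward = best_hits(contig_vs_protein)
    reverse = best_hits(protein_vs_contig)
    return {c: p for c, p in forward.items() if reverse.get(p) == c}


# Hypothetical toy data: two contigs both match protein P1 best.
forward_hits: List[Hit] = [("c1", "P1", 90.0), ("c1", "P2", 50.0), ("c2", "P1", 80.0)]
reverse_hits: List[Hit] = [("P1", "c1", 90.0), ("P1", "c2", 80.0), ("P2", "c1", 50.0)]

ubh = unidirectional_best_hits(forward_hits)
bbh = bidirectional_best_hits(forward_hits, reverse_hits)
```

With this toy data, the unidirectional method annotates both contigs, while the reciprocal requirement keeps only the contig that is also the protein's own best hit. That difference in coverage is consistent with the abstract's remark that UBH annotates a large fraction of contigs.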

Files

Original bundle
Name: GeneSieve.pdf
Size: 1.26 MB
Format: Adobe Portable Document Format
