MCAT: Motif Combining and Association Tool

dc.contributor.author: Yang, Yanshen (en)
dc.contributor.committeechair: Heath, Lenwood S. (en)
dc.contributor.committeemember: Zhang, Liqing (en)
dc.contributor.committeemember: Hauf, Silke (en)
dc.contributor.department: Computer Science (en)
dc.date.accessioned: 2018-09-12T06:00:28Z (en)
dc.date.available: 2018-09-12T06:00:28Z (en)
dc.date.issued: 2018-07-02 (en)
dc.description.abstract: De novo motif discovery in biological sequences is an important and computationally challenging problem. A myriad of algorithms have been developed to solve this problem with varying success, but it can be difficult for even a small number of these tools to reach a consensus. Because individual tools can be better suited for specific scenarios, an ensemble tool that combines the results of many algorithms can yield a more confident and complete result. We present a novel and fast tool, MCAT (Motif Combining and Association Tool), for de novo motif discovery that combines six state-of-the-art motif discovery tools (MEME, BioProspector, DECOD, XXmotif, Weeder, and CMF). We apply MCAT to data sets of DNA sequences from various species and compare our results with those of two well-established ensemble motif finding tools, EMD and DynaMIT. The experimental results show that MCAT identifies exact-match motifs in DNA sequences efficiently and performs better in practice. (en)
dc.description.abstractgeneral: Finding hidden motifs in DNA or protein sequences is an important and computationally challenging problem. A motif is a short, patterned DNA or protein sequence that has a biological function. Motifs regulate gene expression, the fundamental biological process in which DNA is transcribed into RNA, which is then translated into protein. In the past 20 years, a myriad of algorithms have been developed to solve the motif finding problem with varying success, but it can be difficult for even a small number of these tools to reach a consensus. Because individual tools can be better suited for specific scenarios, an ensemble tool that combines the results of many algorithms can yield a more confident and complete result. I present a novel and fast tool, MCAT (Motif Combining and Association Tool), for motif discovery that combines six state-of-the-art motif discovery tools (MEME, BioProspector, DECOD, XXmotif, Weeder, and CMF). I apply MCAT to data sets of DNA sequences from various species and compare my results with those of two well-established ensemble motif finding tools, EMD and DynaMIT. The experimental results show that MCAT identifies exact-match motifs in DNA sequences efficiently and performs better in practice. (en)
dc.description.degree: Master of Science (en)
dc.format.medium: ETD (en)
dc.identifier.other: vt_gsexam:15877 (en)
dc.identifier.uri: http://hdl.handle.net/10919/84999 (en)
dc.publisher: Virginia Tech (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: Motif finding (en)
dc.title: MCAT: Motif Combining and Association Tool (en)
dc.type: Thesis (en)
thesis.degree.discipline: Computer Science and Applications (en)
thesis.degree.grantor: Virginia Polytechnic Institute and State University (en)
thesis.degree.level: masters (en)
thesis.degree.name: Master of Science (en)

Files

Original bundle
Name: Yang_Y_T_2018.pdf
Size: 1.35 MB
Format: Adobe Portable Document Format