Virginia Tech

    Named Entity Recognition for Bacterial Type IV Secretion Systems

    journal_pone_0014780.pdf (316.3Kb)
    Downloads: 301
    Date
    2011-03-29
    Author
    Ananiadou, Sophia
    Sullivan, Dan
    Black, William
    Levow, Gina-Anne
    Gillespie, Joseph J.
    Mao, Chunhong
    Pyysalo, Sampo
    Kolluru, BalaKrishna
    Tsujii, Junichi
    Sobral, Bruno
    Abstract
    Research on specialized biological systems is often hampered by a lack of consistent terminology, especially across species. In bacterial Type IV secretion systems (T4SS), genes within one set of orthologs may have over a dozen different names. Classifying research publications by biological process, cellular component, molecular function, and microorganism species should improve the precision and recall of literature searches, allowing researchers to keep up with the exponentially growing literature through resources such as the Pathosystems Resource Integration Center (PATRIC, patricbrc.org). We developed named entity recognition (NER) tools for four entities related to Type IV secretion systems: 1) bacteria names, 2) biological processes, 3) molecular functions, and 4) cellular components. These four entities are important to pathogenesis and virulence research but have received less attention than other entities, e.g., genes and proteins. Based on an annotated corpus, large domain terminological resources, and machine learning techniques, we developed recognizers for these entities. High accuracy rates (>80%) are achieved for bacteria, biological processes, and molecular functions. Contrastive experiments highlighted the effectiveness of alternate recognition strategies; results of term extraction on contrasting document sets demonstrated the utility of these classes for identifying T4SS-related documents.
    URI
    http://hdl.handle.net/10919/48981
    Collections
    • Scholarly Works, Fralin Life Sciences Institute [542]

    If you believe that any material in VTechWorks should be removed, please see our policy and procedure for Requesting that Material be Amended or Removed. All takedown requests will be promptly acknowledged and investigated.

    Virginia Tech | University Libraries | Contact Us
     

     
