Joint Biomedical Event Extraction and Entity Linking via Iterative Collaborative Training

dc.contributor.author: Li, Xiaochu [en]
dc.contributor.committeechair: Huang, Lifu [en]
dc.contributor.committeemember: Reddy, Chandan [en]
dc.contributor.committeemember: Zhang, Liqing [en]
dc.contributor.department: Computer Sciences [en]
dc.date.accessioned: 2023-07-24T18:50:40Z [en]
dc.date.available: 2023-07-24T18:50:40Z [en]
dc.date.issued: 2023-05 [en]
dc.description.abstract: Biomedical entity linking and event extraction are two crucial tasks that support text understanding and retrieval in the biomedical domain. The two tasks intrinsically benefit each other: entity linking disambiguates biomedical concepts by referring to external knowledge bases, and the resulting domain knowledge provides additional clues for understanding and extracting biological processes, while event extraction identifies the key trigger and the entities involved in each biological process, which also captures the structural context needed to better disambiguate biomedical entities. However, previous research typically solves these two tasks separately or in a pipeline, leading to error propagation. Moreover, solving the two tasks jointly is challenging because no existing dataset contains annotations for both. To address these challenges, we propose joint biomedical entity linking and event extraction, regarding the event structures and the entity references in knowledge bases as latent variables and updating the two task-specific models in an iterative training framework: (1) predicting the missing variables for each partially annotated dataset based on the current two task-specific models, and (2) updating the parameters of each model on the corresponding pseudo-completed dataset. Experimental results on two benchmark datasets, Genia 2011 for event extraction and BC4GO for entity linking, show that our joint framework significantly improves the model for each individual task and outperforms strong baselines for both tasks. We will make the code and model checkpoints publicly available once the paper is accepted. [en]
dc.description.abstractgeneral: Biomedical entity linking and event extraction are essential tasks in understanding and retrieving information from biomedical texts. These tasks mutually benefit each other, as entity linking helps disambiguate biomedical concepts by leveraging external knowledge bases, while domain knowledge provides valuable insights for understanding and extracting biological processes. Event extraction, on the other hand, identifies triggers and entities involved in describing biological processes, capturing their contextual relationships for improved entity disambiguation. However, existing approaches often address these tasks separately or in a sequential manner, leading to error propagation. Furthermore, the joint solution becomes even more challenging due to the lack of datasets with annotations for both tasks. To overcome these challenges, we propose a novel approach for jointly performing biomedical entity linking and event extraction. Our method treats the event structures and entity references in knowledge bases as latent variables and employs an iterative training framework. This framework involves predicting missing variables in partially annotated datasets based on the current task-specific models and updating the model parameters using the completed datasets. Experimental results on benchmark datasets, namely Genia 2011 for event extraction and BC4GO for entity linking, demonstrate the effectiveness of our joint framework. It significantly improves the performance of each individual task and outperforms strong baselines for both tasks. [en]
dc.description.degree: M.S. [en]
dc.format.medium: ETD [en]
dc.format.mimetype: application/pdf [en]
dc.identifier.uri: http://hdl.handle.net/10919/115831 [en]
dc.language.iso: en [en]
dc.publisher: Virginia Tech [en]
dc.subject: Information extraction [en]
dc.title: Joint Biomedical Event Extraction and Entity Linking via Iterative Collaborative Training [en]
dc.type: Thesis [en]
thesis.degree.discipline: Computer Science Application [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: M.S. [en]
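
Illustrative note: the two-step iterative training framework described in the abstract can be sketched as a simple loop. This is a minimal sketch of the control flow only, not the thesis implementation. The names Example, ToyModel, complete, and collaborative_train are hypothetical, the warm-start step is an assumption, and ToyModel is a memorizing stand-in for a real event extraction or entity linking model.

```python
from dataclasses import dataclass, replace
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Example:
    text: str
    events: Optional[Any] = None  # event structure; None when not annotated
    links: Optional[Any] = None   # knowledge-base references; None when not annotated


class ToyModel:
    """Hypothetical stand-in for a task-specific model: memorizes text -> label."""

    def __init__(self, target: str) -> None:
        self.target = target  # "events" or "links"
        self.memory: Dict[str, Any] = {}

    def predict(self, example: Example) -> Any:
        return self.memory.get(example.text, "UNKNOWN")

    def fit(self, data: List[Example]) -> None:
        # A real model would also condition on the other (pseudo) annotation;
        # this toy only stores its own target label.
        for ex in data:
            self.memory[ex.text] = getattr(ex, self.target)


def complete(data: List[Example], other: ToyModel) -> List[Example]:
    """Step (1): fill the missing variable using the other task's current model."""
    return [replace(ex, **{other.target: other.predict(ex)}) for ex in data]


def collaborative_train(
    ee_model: ToyModel,
    el_model: ToyModel,
    ee_data: List[Example],   # annotated with events only
    el_data: List[Example],   # annotated with links only
    rounds: int = 3,
) -> Tuple[List[Example], List[Example]]:
    # Assumed warm start: train each model on its own gold annotations first.
    ee_model.fit(ee_data)
    el_model.fit(el_data)
    ee_completed, el_completed = ee_data, el_data
    for _ in range(rounds):
        # Step (1): predict the missing variables for each partially annotated set.
        ee_completed = complete(ee_data, el_model)  # event data gains pseudo links
        el_completed = complete(el_data, ee_model)  # linking data gains pseudo events
        # Step (2): update each model on its pseudo-completed dataset.
        ee_model.fit(ee_completed)
        el_model.fit(el_completed)
    return ee_completed, el_completed
```

The gold annotation in each dataset is never overwritten; only the missing field is re-predicted each round, so improvements in one model can feed into the pseudo labels used by the other.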

Files

Original bundle
Name: Li_X_T_2023.pdf
Size: 784.84 KB
Format: Adobe Portable Document Format