Information Extraction of Technical Details From Scholarly Articles

Kaushal, Kulendra Kumar

Files

Kaushal_K_T_2021.pdf (2.1 MB)

Date

2021-06-16

Authors

Kaushal, Kulendra Kumar

Publisher

Virginia Tech

Abstract

In the last few years, researchers have made significant progress in information extraction from short documents, including social media interactions, news articles, and email excerpts. This research aims to extract technical entities such as hardware resources, computing platforms, compute time, programming languages, and libraries from scholarly research articles. Research articles are generally long documents that contain both salient and non-salient entities. Analyzing cross-sectional relations, filtering the relevant information, measuring the saliency of mentioned entities, and extracting novel entities are among the technical challenges involved in this research. This work presents a detailed study of the performance, effectiveness, and scalability of rule-based weakly supervised algorithms. We also develop an automated end-to-end Research Entity and Relationship Extractor (E2R Extractor). Additionally, we perform a comprehensive study of the effectiveness of existing deep learning-based information extraction tools such as DyGIE, DyGIE++, and SciREX. The research also contributes a dataset containing novel entities annotated in BILUO format and presents baseline results obtained with the E2R Extractor on the proposed dataset. The results indicate that the E2R Extractor successfully extracts salient entities from research articles.

Keywords

Information Extraction, Long Documents, Research Articles, Named Entity Recognition, Hardware Resources, Compute Platform, Programming Language and Libraries

Persistent link

http://hdl.handle.net/10919/112825

Collections

Masters Theses
