Improving Deposition Summarization using Enhanced Generation and Extraction of Entities and Keywords

dc.contributor.author: Sumant, Aarohi Milind [en]
dc.contributor.committeechair: Fox, Edward A. [en]
dc.contributor.committeemember: Meng, Na [en]
dc.contributor.committeemember: Eldardiry, Hoda [en]
dc.contributor.department: Computer Science [en]
dc.date.accessioned: 2022-08-09T06:00:07Z [en]
dc.date.available: 2022-08-09T06:00:07Z [en]
dc.date.issued: 2021-06-01 [en]
dc.description.abstract: In the legal domain, depositions help lawyers and paralegals record details and recall relevant information relating to a case. Depositions are conversations between a lawyer and a deponent and are generally in Question-Answer (QA) format. These documents can be lengthy, which creates a need for summarization. Though many automatic summarization methods are available, not all of them give good results, especially in the legal domain. It is therefore necessary to process the QA pairs and develop methods that help summarize the deposition. For downstream tasks such as summarization and insight generation, converting QA pairs to canonical or declarative form can be helpful. Since the transformed canonical sentences are not perfectly readable, we explore methods based on heuristics, language modeling, and deep learning to improve the quality of the sentences in terms of grammaticality, sentence correctness, and relevance. Further, extracting important entities and keywords from a deposition helps rank the candidate summary sentences and assists with extractive summarization. This work investigates techniques for enhanced generation of canonical sentences and for extraction of relevant entities and keywords to improve deposition summarization. [en]
dc.description.abstractgeneral: In the legal domain, depositions help lawyers and paralegals record details and recall relevant information relating to a case. Depositions are conversations between a lawyer and a deponent and are generally in Question-Answer format. These documents can be lengthy, which creates a need for summarization. Typical automatic summarization techniques perform poorly on depositions because the data format is very different from that of standard text documents such as news articles and blogs. To standardize the process of summary generation, we convert the Question-Answer pairs from the deposition document to their canonical or declarative form. We apply techniques to improve the readability of these transformed sentences. Further, we extract entities such as person names, locations, and organizations, as well as keywords, from the deposition to retrieve important sentences and help in summarization. This work describes the techniques used to correct transformed sentences and extract important entities and keywords to improve the summarization of depositions. [en]
dc.description.degree: Master of Science [en]
dc.format.medium: ETD [en]
dc.identifier.other: vt_gsexam:31313 [en]
dc.identifier.uri: http://hdl.handle.net/10919/111488 [en]
dc.publisher: Virginia Tech [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: legal deposition [en]
dc.subject: summarization [en]
dc.subject: sentence correction [en]
dc.subject: information extraction [en]
dc.title: Improving Deposition Summarization using Enhanced Generation and Extraction of Entities and Keywords [en]
dc.type: Thesis [en]
thesis.degree.discipline: Computer Science and Applications [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: Master of Science [en]

Files

Original bundle
Name: Sumant_AM_T_2021.pdf
Size: 3.21 MB
Format: Adobe Portable Document Format
