Generating Canonical Sentences from Question-Answer Pairs of Deposition Transcripts

dc.contributor.author: Mehrotra, Maanav [en]
dc.contributor.committeechair: Fox, Edward A. [en]
dc.contributor.committeechair: Hsiao, Michael S. [en]
dc.contributor.committeemember: Eldardiry, Hoda [en]
dc.contributor.department: Computer Science [en]
dc.date.accessioned: 2022-02-18T07:00:26Z [en]
dc.date.available: 2022-02-18T07:00:26Z [en]
dc.date.issued: 2020-09-15 [en]
dc.description.abstract: In the legal domain, documents of various types are created in connection with a particular case, such as testimony of people, transcripts, depositions, memos, and emails. Deposition transcripts are one such type of legal document, which consists of conversations between the different parties in the legal proceedings that are recorded by a court reporter. Court reporting has been traced back to 63 B.C. It has transformed from the initial scripts of "Cuneiform", "Running Script", and "Grass Script" to Certified Access Real-time Translation (CART). Since the boom of digitization, there has been a shift to storing these in the PDF/A format. Deposition transcripts are in the form of question-answer (QA) pairs and can be quite lengthy for common people to read. This gives us a need to develop some automatic text-summarization method for the same. The present-day summarization systems do not support this form of text, entailing a need to process them. This creates a need to parse such documents and extract QA pairs as well as any relevant supporting information. These QA pairs can then be converted into complete canonical sentences, i.e., in a declarative form, from which we could extract some insights and use for further downstream tasks. This work investigates the same, as well as using deep-learning techniques for such transformations. [en]
dc.description.abstractgeneral: In the legal domain, documents of various types are created in connection with a particular case, such as the testimony of people, transcripts, memos, and emails. Deposition transcripts are one such type of legal document, which consists of conversations between a lawyer and one of the parties in the legal proceedings, captured by a court reporter. Since the boom of digitization, there has been a shift to storing these in the PDF/A format. Deposition transcripts are in the form of question-answer (QA) pairs and can be quite lengthy. Though automatic summarization could help, present-day systems do not work well with such texts. This creates a need to parse these documents and extract QA pairs as well as any relevant supporting information. The QA pairs can then be converted into canonical sentences, i.e., in a declarative form, from which we could extract some insights and support downstream tasks. This work describes these conversions, as well as using deep-learning techniques for such transformations. [en]
dc.description.degree: Master of Science [en]
dc.format.medium: ETD [en]
dc.identifier.other: vt_gsexam:27311 [en]
dc.identifier.uri: http://hdl.handle.net/10919/108405 [en]
dc.publisher: Virginia Tech [en]
dc.rights: In Copyright [en]
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ [en]
dc.subject: Natural Language Processing [en]
dc.subject: Deep learning (Machine learning) [en]
dc.subject: Legal Tech [en]
dc.subject: Legal Depositions [en]
dc.title: Generating Canonical Sentences from Question-Answer Pairs of Deposition Transcripts [en]
dc.type: Thesis [en]
thesis.degree.discipline: Computer Science and Applications [en]
thesis.degree.grantor: Virginia Polytechnic Institute and State University [en]
thesis.degree.level: masters [en]
thesis.degree.name: Master of Science [en]

Files

Original bundle
Name: Mehrotra_M_T_2020.pdf
Size: 4.56 MB
Format: Adobe Portable Document Format

Collections