Generating Canonical Sentences from Question-Answer Pairs of Deposition Transcripts

Mehrotra, Maanav
2022-02-18
2022-02-18
2020-09-15
vt_gsexam:27311
http://hdl.handle.net/10919/108405

In the legal domain, documents of various types are created in connection with a particular case, such as testimony, transcripts, depositions, memos, and emails. Deposition transcripts are one such type of legal document; they consist of conversations between the different parties in the legal proceedings, recorded by a court reporter. Court reporting has been traced back to 63 B.C. It has transformed from the initial scripts of "Cuneiform", "Running Script", and "Grass Script" to Certified Access Real-time Translation (CART). Since the boom of digitization, there has been a shift to storing transcripts in the PDF/A format. Deposition transcripts take the form of question-answer (QA) pairs and can be quite lengthy for lay readers. This motivates automatic text summarization of such transcripts. Present-day summarization systems do not support text in this form, so the transcripts must first be processed: the documents need to be parsed to extract the QA pairs along with any relevant supporting information. These QA pairs can then be converted into complete canonical sentences, i.e., sentences in declarative form, from which insights can be extracted and which can be used in downstream tasks. This work investigates that conversion, including the use of deep-learning techniques for such transformations.

ETD
In Copyright
Natural Language Processing
Deep learning (Machine learning)
Legal Tech
Legal Depositions
Thesis