Study of Pretraining Bias and Frequencies

dc.contributor.author: Taware, Rutuja Murlidhar
dc.contributor.committeechair: Ramakrishnan, Narendran
dc.contributor.committeemember: Lourentzou, Ismini
dc.contributor.committeemember: Lu, Chang Tien
dc.contributor.department: Computer Science and Applications
dc.date.accessioned: 2023-07-11T08:00:33Z
dc.date.available: 2023-07-11T08:00:33Z
dc.date.issued: 2023-07-10
dc.description.abstract: Language models used in an in-context learning setting have been adapted to a wide range of tasks. Recent work has shown that pretraining data affects the in-context performance of language models. In this work, we experiment with numbers that occur with high and low frequency in the pretraining data to understand the impact of term frequency on the model's performance. We also experiment with random and adversarial demonstrations to understand the pretraining bias present in the model. Through these experiments, we show the importance of the pretraining frequencies of the numbers appearing in the demonstrations and explain how highly frequent terms can be used in demonstrations to achieve better task performance. We also show the impact of pretraining bias on the model's performance and explain how the model overcomes this bias when given more demonstrations.
dc.description.abstractgeneral: Recent work focuses on understanding and improving the arithmetic capabilities of state-of-the-art (SOTA) systems in Natural Language Processing (NLP). This work designs and performs novel experiments to analyze the impact of training data on the performance of such systems. Through these experiments, it showcases interesting properties of SOTA systems that will promote future research toward understanding them better and help in building better downstream applications.
dc.description.degree: Master of Science
dc.format.medium: ETD
dc.identifier.other: vt_gsexam:37728
dc.identifier.uri: http://hdl.handle.net/10919/115712
dc.language.iso: en
dc.publisher: Virginia Tech
dc.rights: In Copyright
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/
dc.subject: In-Context Learning
dc.subject: Pretraining
dc.subject: Frequency
dc.subject: Bias
dc.subject: Language Model
dc.title: Study of Pretraining Bias and Frequencies
dc.type: Thesis
thesis.degree.discipline: Computer Science and Applications
thesis.degree.grantor: Virginia Polytechnic Institute and State University
thesis.degree.level: masters
thesis.degree.name: Master of Science
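
To make the experimental setup described in the abstract concrete, below is a minimal illustrative sketch of building few-shot arithmetic prompts whose demonstrations draw operands from a high-frequency or a low-frequency pool, with an option for adversarial (wrong-answer) demonstrations. The operand pools, prompt wording, and helper names are assumptions made for illustration only; they are not the thesis's actual data, prompts, or code.

    import random

    # Hypothetical operand pools; in the thesis setting these would be chosen
    # based on measured term frequencies in the model's pretraining corpus.
    HIGH_FREQ_OPERANDS = [1, 2, 5, 10, 100]    # assumed to appear often in pretraining data
    LOW_FREQ_OPERANDS = [937, 461, 728, 853]   # assumed to appear rarely in pretraining data

    def make_demo(a, b, adversarial=False):
        """Format one addition demonstration; an adversarial demo shows a wrong answer."""
        answer = a + b
        if adversarial:
            answer += random.randint(1, 9)  # deliberately corrupt the label
        return f"Q: What is {a} plus {b}?\nA: {answer}"

    def build_prompt(pool, k=4, adversarial=False):
        """Build a k-shot prompt from the given operand pool, ending with a test query."""
        demos = [make_demo(random.choice(pool), random.choice(pool), adversarial)
                 for _ in range(k)]
        test = f"Q: What is {random.choice(pool)} plus {random.choice(pool)}?\nA:"
        return "\n\n".join(demos + [test])

    if __name__ == "__main__":
        random.seed(0)
        print(build_prompt(HIGH_FREQ_OPERANDS, k=2))                     # high-frequency demos
        print()
        print(build_prompt(LOW_FREQ_OPERANDS, k=2, adversarial=True))    # adversarial, low-frequency demos

Prompts built this way could be sent to a language model to compare accuracy across the frequency conditions and demonstration types.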

Files

Original bundle
Name: Taware_R_T_2023.pdf
Size: 2.21 MB
Format: Adobe Portable Document Format