Leverage Fusion of Sentiment Features and Bert-based Approach to Improve Hate Speech Detection

Cheng, Kai Hsiang

dc.contributor.author	Cheng, Kai Hsiang	en
dc.contributor.committeechair	Lu, Chang Tien	en
dc.contributor.committeemember	Chen, Ing Ray	en
dc.contributor.committeemember	Cho, Jin-Hee	en
dc.contributor.department	Computer Science	en
dc.date.accessioned	2022-06-24T08:01:22Z	en
dc.date.available	2022-06-24T08:01:22Z	en
dc.date.issued	2022-06-23	en
dc.description.abstract	Social media has become an important place for people to share and exchange ideas and opinions. However, not all content on social media has a positive impact. Hate speech is a kind of harmful content in which people use abusive language to attack, or promote hatred toward, a specific group or individual. With online hate speech on the rise, researchers have explored ways to recognize it automatically. Among the approaches studied, BERT-based models are promising and dominated SemEval-2019 Task 6, a hate speech detection competition. This work proposes fusing sentiment features with a BERT-based approach. The classic BERT architecture for hate speech detection is modified to incorporate additional sentiment features, provided by an extractor pre-trained on Sentiment140. The proposed model is compared with the top-3 models in SemEval-2019 Task 6 Subtask A and achieves an 83.1% F1 score, which is better than the models in the competition. In addition, to see whether the sentiment features benefit hate speech detection, the features are fused with three kinds of deep learning architectures. The results show that the models with sentiment features perform better than those without.	en
dc.description.abstractgeneral	Social media has become an important place for people to share and exchange ideas and opinions. However, not all content on social media has a positive impact. Hate speech is a kind of harmful content in which people use abusive language to attack, or promote hatred toward, a specific group or individual. With online hate speech on the rise, researchers have explored ways to recognize it automatically, and BERT is one of the promising approaches. BERT is a deep learning model for natural language processing (NLP) that builds on the Transformer architecture developed by Google in 2017. It has been applied to many NLP tasks, such as text classification, semantic similarity between sentence pairs, question answering over a given paragraph, and text summarization, and has achieved impressive results. In this study, BERT is adopted to learn the meaning of a given text and automatically distinguish hate speech among large numbers of tweets. To help BERT better capture hate speech, the approach modifies BERT to take an additional source of sentiment-related features for learning the patterns of hate speech, on the premise that emotion tends to be negative when people post abusive speech. For evaluation, the model is compared against those in SemEval-2019 Task 6, a well-known hate speech detection competition. The results show that the proposed model achieves an 83.1% F1 score, which is better than the models in the competition. In addition, to see whether the sentiment features benefit hate speech detection, the features are fused with three different kinds of deep learning architectures. The results show that the models with sentiment features perform better than those without.	en
dc.description.degree	Master of Science	en
dc.format.medium	ETD	en
dc.identifier.other	vt_gsexam:35115	en
dc.identifier.uri	http://hdl.handle.net/10919/110929	en
dc.language.iso	en	en
dc.publisher	Virginia Tech	en
dc.rights	In Copyright	en
dc.rights.uri	http://rightsstatements.org/vocab/InC/1.0/	en
dc.subject	hate speech detection	en
dc.subject	sentiment features	en
dc.subject	BERT	en
dc.title	Leverage Fusion of Sentiment Features and Bert-based Approach to Improve Hate Speech Detection	en
dc.type	Thesis	en
thesis.degree.discipline	Computer Science and Applications	en
thesis.degree.grantor	Virginia Polytechnic Institute and State University	en
thesis.degree.level	masters	en
thesis.degree.name	Master of Science	en
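
The abstract describes fusing sentiment features with a BERT-based text representation but does not say how the two are combined. The sketch below is only an illustration under stated assumptions: fusion is taken to be simple concatenation followed by a single logistic output, and every name and number in it (`fuse`, `predict_proba`, the toy vectors and weights) is hypothetical and not taken from the thesis.

```python
import math
from typing import List, Sequence


def fuse(text_vec: Sequence[float], sentiment_vec: Sequence[float]) -> List[float]:
    """Concatenate a text representation with sentiment features.

    Assumption: fusion by concatenation. The thesis may use another scheme.
    """
    return list(text_vec) + list(sentiment_vec)


def predict_proba(features: Sequence[float], weights: Sequence[float], bias: float) -> float:
    """Score the fused vector with one linear layer and a sigmoid."""
    if len(features) != len(weights):
        raise ValueError("features and weights must have the same length")
    z = sum(f * w for f, w in zip(features, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))


# Toy stand-ins: a real system would take text_vec from a BERT encoder and
# sentiment_vec from an extractor pre-trained on a sentiment corpus.
text_vec = [0.2, -0.1, 0.4]             # hypothetical text embedding
sentiment_vec = [0.9, 0.1]              # hypothetical sentiment features
weights = [0.5, -0.3, 0.8, 1.2, -0.7]   # hypothetical classifier weights
bias = -0.5

fused = fuse(text_vec, sentiment_vec)
probability = predict_proba(fused, weights, bias)
print(f"fused dimension: {len(fused)}, probability: {probability:.3f}")
```

In this sketch the classifier sees both sources at once, so its weights on the sentiment dimensions can shift the decision even when the text representation alone is ambiguous.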

Files

Original bundle

Name: Cheng_K_T_2022.pdf
Size: 3.26 MB
Format: Adobe Portable Document Format

Collections

Masters Theses