Weakly Supervised Machine Learning for Cyberbullying Detection
The advent of social media has revolutionized human communication, significantly improving individuals' lives. It brings people closer to each other, provides access to enormous amounts of real-time information, and eases marketing and business. Despite its countless benefits, however, we must consider some of its negative implications, such as online harassment and cyberbullying. Cyberbullying is becoming a serious, large-scale problem that damages people's online lives. This phenomenon creates a need for automated, data-driven techniques for analyzing and detecting such behaviors. In this research, we aim to address the computational challenges associated with harassment-based cyberbullying detection in social media by developing a machine-learning framework that requires only weak supervision. We propose a general framework that trains an ensemble of two learners, each of which looks at the problem from a different perspective. One learner identifies bullying incidents by examining the language content of the message; the other considers the social structure to discover bullying.
Each learner uses a different body of information, and the individual learners co-train one another to come to an agreement about the bullying concept. The models estimate whether each social interaction is bullying by optimizing an objective function that maximizes the consistency between these detectors.
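As a minimal illustration of what a consistency objective between two detectors can look like, the sketch below measures the mean squared disagreement between a language-based score and a user-based score for each interaction. The function name, the squared-difference form, and the example scores are assumptions made for illustration; they are not taken from the dissertation and should not be read as its exact formulation.

```python
import numpy as np


def consistency_loss(language_scores, user_scores):
    """Mean squared disagreement between two detectors.

    Both arguments hold one bullying score per social interaction,
    assumed here to lie in [0, 1]. The squared-difference form is an
    illustrative assumption, not the dissertation's exact objective.
    """
    language_scores = np.asarray(language_scores, dtype=float)
    user_scores = np.asarray(user_scores, dtype=float)
    if language_scores.shape != user_scores.shape:
        raise ValueError("both detectors must score the same interactions")
    return float(np.mean((language_scores - user_scores) ** 2))


# Hypothetical scores for three interactions.
agreeing = consistency_loss([0.9, 0.1, 0.5], [0.9, 0.1, 0.5])
disagreeing = consistency_loss([1.0, 0.0, 0.5], [0.0, 0.0, 0.5])
print(agreeing, disagreeing)
```

Minimizing such a loss with respect to the parameters of both detectors pushes them toward agreement, which is the sense in which the learners co-train one another. On its own this objective has trivial solutions (for example, both detectors outputting a constant), which is why some form of supervision signal is also needed.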
We first developed a model we refer to as participant-vocabulary consistency, which is an ensemble of two linear models, one language-based and one user-based. The model is trained from a provided set of seed key-phrases that are indicative of bullying language. The results were promising, demonstrating the model's effectiveness in recovering known bullying words, recognizing new bullying words, and discovering users involved in cyberbullying. We then extended this co-trained ensemble approach with two complementary goals: (1) using nonlinear embeddings as model families, and (2) building a fair language-based detector. For the first goal, we drew on the efficacy of distributed representations of words and nodes and of deep, nonlinear models. We represent words and users as low-dimensional real-valued vectors, which serve as the input to the language-based and user-based classifiers, respectively. The models are trained by optimizing an objective function that balances a co-training loss with a weak-supervision loss. Our experiments on Twitter, Ask.fm, and Instagram data show that deep ensembles outperform non-deep methods for weakly supervised harassment detection. For the second goal, we geared this research toward a very important topic in any automated online harassment detection: fairness toward particular targeted groups, including those defined by race, gender, religion, and sexual orientation. Our goal is to decrease the sensitivity of models to language describing particular social groups. We encourage the learning algorithm to avoid discrimination in its predictions by adding an unfairness penalty term to the objective function. We quantitatively and qualitatively evaluate the effectiveness of our proposed general framework on synthetic data and on data from Twitter using post-hoc, crowdsourced annotation.
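To make the structure of such an objective concrete, the sketch below combines a co-training loss with a weak-supervision term and an unfairness penalty. Everything specific in it is an assumption for illustration: the weak-supervision term simply pushes scores toward 1 on messages containing a seed key-phrase, and the unfairness penalty compares each message's score with the score of a hypothetical counterfactual copy in which the group-describing term has been swapped. The weights `ws_weight` and `fair_weight` are likewise hypothetical; the dissertation's actual terms may differ.

```python
import numpy as np


def weak_supervision_loss(scores, has_seed_phrase):
    """Penalize low scores on messages that contain a seed key-phrase."""
    scores = np.asarray(scores, dtype=float)
    mask = np.asarray(has_seed_phrase, dtype=bool)
    if not mask.any():
        return 0.0
    return float(np.mean((1.0 - scores[mask]) ** 2))


def unfairness_penalty(scores, counterfactual_scores):
    """Penalize score changes caused only by swapping a group term.

    counterfactual_scores are the detector's scores on copies of the
    same messages with the group-describing term replaced (an assumed,
    illustrative notion of sensitivity).
    """
    scores = np.asarray(scores, dtype=float)
    counterfactual_scores = np.asarray(counterfactual_scores, dtype=float)
    return float(np.mean((scores - counterfactual_scores) ** 2))


def objective(co_training_loss, scores, has_seed_phrase,
              counterfactual_scores, ws_weight=1.0, fair_weight=1.0):
    """Weighted sum of the three terms (weights are hypothetical)."""
    return (co_training_loss
            + ws_weight * weak_supervision_loss(scores, has_seed_phrase)
            + fair_weight * unfairness_penalty(scores, counterfactual_scores))


# Hypothetical values: two messages, the first contains a seed phrase.
value = objective(
    co_training_loss=0.1,
    scores=[0.5, 0.2],
    has_seed_phrase=[True, False],
    counterfactual_scores=[0.5, 0.6],
)
print(value)
```

Raising `fair_weight` trades detection flexibility for lower sensitivity to group-describing language, which mirrors the balance described above between accuracy and avoiding discrimination.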
In summary, this dissertation introduces a weakly supervised machine learning framework for harassment-based cyberbullying detection using both messages and user roles in social media.