Anomalous Information Detection in Social Media

dc.contributor.author: Tao, Rongrong (en)
dc.contributor.committeechair: Ramakrishnan, Narendran (en)
dc.contributor.committeemember: Lu, Chang-Tien (en)
dc.contributor.committeemember: Chen, Feng (en)
dc.contributor.committeemember: Reddy, Chandan K. (en)
dc.contributor.committeemember: North, Christopher L. (en)
dc.contributor.department: Computer Science (en)
dc.date.accessioned: 2021-03-12T09:00:53Z (en)
dc.date.available: 2021-03-12T09:00:53Z (en)
dc.date.issued: 2021-03-10 (en)
dc.description.abstract: This dissertation focuses on identifying various types of anomalous information patterns in social media and news outlets. We consider three types of anomalous information: (1) media censorship in news outlets, where information that should be published is missing, (2) fake news in social media, where unreliable information is shown to the public, and (3) media propaganda in news outlets, where trustworthy information is over-populated. For the first problem, existing approaches to censorship detection mostly rely on monitoring posts in social media. Media censorship in news outlets has received far less attention, mostly because it is difficult to detect systematically. The contributions of our work include: (1) a hypothesis testing framework to identify and evaluate censored clusters of keywords, (2) a near-linear-time algorithm to identify the highest-scoring clusters as indicators of censorship, and (3) extensive experiments on six Latin American countries for performance evaluation. For the second problem, existing approaches to studying fake news in social media primarily focus on topic-level modeling or on prediction based on a set of aggregated features from a collection of posts. However, the credibility of different information components within the same topic can vary considerably. The contributions of our work in this space include: (1) a new benchmark dataset for fake news research, (2) a cluster-based approach to improve instance-level prediction of information credibility, and (3) extensive experiments for performance evaluation. For the last problem, existing approaches to media propaganda detection primarily focus on investigating patterns of information shared over social media or on evaluation by domain experts. However, these approaches do not generalize to large-scale analysis of media propaganda in news outlets. The contributions of our work include: (1) non-parametric scan statistics to identify clusters of over-populated keywords, (2) a near-linear-time algorithm to identify the highest-scoring clusters as indicators of propaganda, and (3) extensive experiments on two Latin American countries for performance evaluation. (en)
dc.description.abstractgeneral: A massive amount of information is now available through a variety of social media platforms. However, the information that reaches the audience may be inaccurate in different ways. To help the audience access correct information, we develop various machine learning algorithms that uncover anomalous information patterns in social media and explain the reasons behind them. Our algorithms can be used to learn which information patterns exist in open data sources. (en)
dc.description.degree: Doctor of Philosophy (en)
dc.format.medium: ETD (en)
dc.identifier.other: vt_gsexam:29181 (en)
dc.identifier.uri: http://hdl.handle.net/10919/102665 (en)
dc.publisher: Virginia Tech (en)
dc.rights: In Copyright (en)
dc.rights.uri: http://rightsstatements.org/vocab/InC/1.0/ (en)
dc.subject: Social Media (en)
dc.subject: Anomalous Information (en)
dc.subject: Event Detection (en)
dc.title: Anomalous Information Detection in Social Media (en)
dc.type: Dissertation (en)
thesis.degree.discipline: Computer Science and Applications (en)
thesis.degree.grantor: Virginia Polytechnic Institute and State University (en)
thesis.degree.level: doctoral (en)
thesis.degree.name: Doctor of Philosophy (en)

Files

Original bundle
Name: Tao_R_D_2021.pdf
Size: 2.2 MB
Format: Adobe Portable Document Format