Addressing Challenges of Modern News Agencies via Predictive Modeling, Deep Learning, and Transfer Learning

dc.contributor.author | Keneshloo, Yaser | en
dc.contributor.committeechair | Ramakrishnan, Naren | en
dc.contributor.committeechair | Reddy, Chandan K. | en
dc.contributor.committeemember | Yao, Danfeng (Daphne) | en
dc.contributor.committeemember | Prakash, B. Aditya | en
dc.contributor.committeemember | Han, Eui-Hong | en
dc.contributor.department | Computer Science | en
dc.description.abstract | Today's news agencies are moving from traditional journalism, where publishing just a few news articles per day was sufficient, to modern content generation mechanisms that create thousands of news pieces every day. With the growth of these modern news agencies comes the arduous task of properly handling the massive amount of data generated for each news article. News agencies are therefore constantly seeking solutions to facilitate and automate tasks that were previously done by humans. In this dissertation, we provide solutions for two broad problems, which help a news agency not only gain a wider view of the behaviour of readers around an article but also obtain automated tools that ease the job of editors in summarizing news articles. These two disjoint problems aim at improving the users' reading experience by helping content generators monitor and focus on poorly performing content while allowing them to promote well-performing content. We first focus on the task of predicting the popularity of news articles via a combination of regression, classification, and clustering models. We next focus on the problem of generating automated text summaries for long news articles using deep learning models. The first problem aims at helping content developers understand how a news article is performing over the long run, while the second provides automated tools for content developers to generate summaries for each news article. | en
dc.description.abstractgeneral | Nowadays, each person is exposed to an immense amount of information from social media, blog posts, and online news portals. Among these sources, news agencies are one of the main content providers for people around the world. Contemporary news agencies are moving from traditional journalism to modern techniques on several fronts, either by building smart tools that track readers’ reactions to a specific news article or by providing automated tools that help editors deliver higher-quality content to readers. These systems should not only scale well with the growth in readership but also process ad-hoc requests precisely, since most of the policies and decisions in these agencies are based on the results of these analytical tools. As part of this movement towards adopting new technologies for smart journalism, we have worked with The Washington Post on various problems, building tools for predicting the popularity of a news article and a model for automated text summarization. We develop a model that monitors each news article after its publication and predicts the number of views the article will receive within the next 24 hours. This model helps content creators not only promote potentially viral articles on the main page of the web portal or on social media, but also gives editors insight into potentially poorly performing articles so that they can edit the content of those articles for better exposure. In addition, current news agencies generate more than a thousand news articles per day, and producing three to four summary sentences for each of these news pieces will not only become infeasible in the near future but is also very expensive and time-consuming. Therefore, we also develop a separate model for automated text summarization, which generates summary sentences for a news article. Our model generates summaries by selecting the most salient sentences in the news article and paraphrasing them into shorter sentences that can serve as a summary of the entire document. | en
dc.description.degree | Doctor of Philosophy | en
dc.publisher | Virginia Tech | en
dc.rights | In Copyright | en
dc.subject | Text Summarization | en
dc.subject | Predictive Modeling | en
dc.subject | Deep learning (Machine learning) | en
dc.subject | Transfer Learning | en
dc.subject | Reinforcement Learning | en
dc.title | Addressing Challenges of Modern News Agencies via Predictive Modeling, Deep Learning, and Transfer Learning | en
dc.type | Dissertation | en
… Science and Applications | en
… Polytechnic Institute and State University | en
… of Philosophy | en

