CS 6604 Digital Libraries
Spring 2017 Final Report

Sentiment and Topic Analysis

Team Members: Abigail Bartolome, Matthew Bock, Radha Krishnan Vinayagam, Rahul Krishnamurthy
Project Advisor: Prof. Edward A. Fox
May 3, 2017
Virginia Tech, Blacksburg, VA 24061

Abstract

The IDEAL (Integrated Digital Event Archiving and Library) and Global Event and Trend Archive Research (GETAR) projects have collected over 1.5 billion tweets and webpages from social media and the World Wide Web, and indexed them to be easily retrieved and analyzed. This gives researchers an extensive library of documents that reflect the interests and sentiments of the public in reaction to an event. By applying topic analysis to collections of tweets, researchers can learn the topics of most interest or concern to the general public. Adding a layer of sentiment analysis to those topics illustrates how the public felt in relation to the topics that were found.

The Sentiment and Topic Analysis team has designed a system that joins topic analysis and sentiment analysis for researchers who are interested in learning more about public reaction to global events. The tool runs topic analysis on a collection of tweets; the user can then select a topic of interest and assess the sentiments with regard to that topic (i.e., positive vs. negative). This report covers the background, requirements, design, and implementation of our contributions to this project. Furthermore, we include a user manual and a developer manual to assist in any future work.

Contents

Abstract
List of Figures
List of Tables
1 Introduction
2 Literature Review
  2.1 Textbook
  2.2 Papers
    2.2.1 Topic Analysis
    2.2.2 Sentiment Analysis
    2.2.3 Lexicon-Based Methods
    2.2.4 Sentiment-LDA Model
3 Requirements, Design, and Implementation
  3.1 Requirements
    3.1.1 System Requirements
    3.1.2 User Interface Requirements
  3.2 Design
    3.2.1 System Flow
    3.2.2 User Interaction Flow
    3.2.3 Design Deliverables
  3.3 Implementation
4 Lexicon-Based Sentiment Analysis
  4.1 Lexicon-Based Sentiment Analysis
    4.1.1 Analysis of Tweet Structure
    4.1.2 Calculation of Sentiment Using Lexicon
    4.1.3 Lexicon Used
    4.1.4 Experiments
    4.1.5 Our Approach
5 User Manual
  5.1 Scala User Interface
6 Developer Manual
  6.1 Scala Interface
    6.1.1 Scala UI
  6.2 Interfacing with Twitter Data
    6.2.1 Submitted Files
7 Plan
  7.1 Team Member Specializations and Responsibilities
  7.2 Timeline
8 Future Work
Bibliography
Appendix A: Labeled Emojis

List of Figures

3.1 The planned flow of our system's tools
3.2 The planned flow of user interaction with our system's tools
4.1 Dependency tree of an English sentence
4.2 Tree data structure of dependency tree
5.1 Topic analysis UI: start screen
5.2 Topic analysis UI: first round results
5.3 Topic analysis UI: second round results
5.4 Topic analysis UI: third round results
5.5 Topic analysis UI: fourth round results
5.6 Updated topic results window design
6.1 Project in Scala Eclipse IDE
6.2 Creating a run configuration in Eclipse

List of Tables

3.1 Emojis before and after being processed
3.2 Sample lookup table for positive and negative emojis
6.1 The fields represented in the Tweet data structure
6.2 The functions provided by the Tweet data structure
6.3 The cleaning done in each step of the system
6.4 Breakdown of submitted files
7.1 Breakdown of roles and interests of each team member
7.2 Weekly breakdown of work
A.1 Labeled emojis
A.2 Labeled emojis, continued

Chapter 1
Introduction

Over the last decade, social networking has played a major role in archiving global events. Nowadays, many members of diverse communities contribute to archiving events by microblogging (e.g., on Twitter).
Collections of Internet archives have grown over the last decade with the increase of stream-oriented communication on Twitter. With such large collections of text, linguists need a tool that will help them analyze language trends. The Digital Library Research Laboratory has a collection of webpages and over 1.5 billion tweets that were collected for the Integrated Digital Event Archiving and Library (IDEAL) and Global Event and Trend Archive Research (GETAR) projects [7, 6]. This collection began in 2007 and is continuously updated as global events occur and become the subject of tweets. The tweets are grouped into collections based on the real-world events about which they were tweeted.

Linguists can assess topics of interest within a community by running topic analysis on a collection of tweets written by that community. Sentiment analysis can then be performed on the tweets corresponding to each topic to determine whether the community has, for example, more positive or more negative sentiments associated with the topic. The Sentiment Team has built a user-friendly tool that allows linguists and sociological researchers to find topics of interest within a collection of tweets. The team has produced a workflow that uses Latent Dirichlet Allocation (LDA) to extract topics of interest from a collection of tweets. The user can interact with the topic analysis results in such a way that if LDA yields a topic that the researcher knows is of little interest, the researcher can omit the topic and re-run LDA. The user is also able to select a topic of interest, read through the tweets, and filter out specific sentiments that they would like to study. For example, in a collection of tweets regarding the Newtown school shooting, the user can select a topic such as "gun control" and read through the more positive or more negative tweets about gun control.

This semester, we completed an extensive literature review, developed tools to clean tweets, extracted key features from each tweet, ran LDA on a collection of tweets, used self-labeled data to build training sets for sentiment classification, built a binary sentiment classifier, performed lexicon-based sentiment analysis, and created a GUI for the topic analysis component of our tool. For our final product, we integrated all of the aforementioned components into a system with a user interface.

Chapter 2
Literature Review

2.1 Textbook

Chapter 17 of Text Data Management and Analysis [25] focuses on topic analysis and was a very helpful guide to Latent Dirichlet Allocation. We define a topic as a main idea discussed in a document, which is represented as a distribution of words. We can look at the task of topic analysis as having two distinct parts: discovering topics and seeing which documents fit which topics. Topics can be defined as terms such as "science" or "sports"; when defined this way, we can see the occurrence of the terms within the document. We can score the terms we choose as topics by using TF-IDF weighting, so that the topics chosen will be terms that are fairly frequent but not too frequent. Another way to represent a topic is as a word distribution, which allows topics to be more expressive and to handle more complicated subjects. This is normally done by making a probability distribution over the words. By using LDA we represent the topics that are representative of this set as a probability distribution. Additionally, LDA can be used as a generative model to apply to new unseen documents.
Chapter 18 of Text Data Management and Analysis [25] gives an overview of opinion mining and sentiment analysis, as well as two sentiment mining algorithms that can be used in different situations. The first mines for sentiment polarity (on a 1-to-k scale from negative to positive) and the second looks for latent aspects of opinionated text where the sentiment polarity is already known. The chapter also gives good definitions for what constitutes an opinion and which of its components are most important. Specifically, it says that opinions consist of three components: the holder of the opinion, the subject of the opinion, and the content of the opinion, from which context and sentiment can be mined. The chapter also explains an approach to sentiment analysis that treats it as a classification problem, where you build classifiers for each level of polarity and analyze from most to least positive. The other useful piece of information is the discussion of feature selection in the context of human-written text. It gives a few examples of features like n-grams of varying sizes, syntactic and semantic word classes, and recognized named entities.

2.2 Papers

2.2.1 Topic Analysis

Latent Dirichlet Allocation is a popular topic modeling approach in natural language processing, so we decided to read "Latent Dirichlet Allocation" by David M. Blei, Andrew Y. Ng, and Michael I. Jordan [3]. It was useful in giving us an in-depth understanding of how LDA works. We define a topic to be a distribution of words. LDA is a probabilistic model of a corpus and treats the documents as random mixtures over latent topics. LDA uses a three-level model with a topic node that is repeatedly sampled, thus allowing documents to be associated with multiple topics.
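To make the mechanics concrete, the following is a minimal sketch of topic extraction with Spark ML's LDA, the same implementation our system drives through a wrapper (Section 6.2). It is illustrative rather than our production code: the input path, vocabulary size, and parameter values are assumptions, and the tweets are assumed to be pre-cleaned, one per line.

    import org.apache.spark.ml.clustering.LDA
    import org.apache.spark.ml.feature.{CountVectorizer, RegexTokenizer, StopWordsRemover}
    import org.apache.spark.sql.SparkSession

    object LdaSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("LdaSketch").getOrCreate()

        // Hypothetical input: one cleaned tweet per line.
        val tweets = spark.read.textFile("tweets.txt").toDF("text")
        val tokenized = new RegexTokenizer()
          .setInputCol("text").setOutputCol("tokens").transform(tweets)
        val filtered = new StopWordsRemover()
          .setInputCol("tokens").setOutputCol("filtered").transform(tokenized)

        // Turn each tweet into a term-frequency vector over the vocabulary.
        val cvModel = new CountVectorizer()
          .setInputCol("filtered").setOutputCol("features")
          .setVocabSize(10000).fit(filtered)
        val vectorized = cvModel.transform(filtered)

        // Fit k topics; each topic is a distribution over the vocabulary.
        val model = new LDA().setK(5).setMaxIter(50).fit(vectorized)
        model.describeTopics(10).collect().foreach { row =>
          val terms = row.getAs[Seq[Int]]("termIndices").map(i => cvModel.vocabulary(i))
          println(s"Topic ${row.getInt(0)}: ${terms.mkString(", ")}")
        }
        spark.stop()
      }
    }

Printing the ten most likely terms per topic, as done here, mirrors how our tool presents each topic's word distribution to the user.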
2.2.2 Sentiment Analysis

Aspect extraction is an important component of sentiment analysis. Because we would like to find the sentiments of specific topics within collections of tweets, we looked to "Aspect Extraction with Automated Prior Knowledge Learning" by Zhiyuan Chen, Arjun Mukherjee, and Bing Liu [12]. Many unsupervised topic models extract latent topics in collections of documents, where a topic is a distribution of words. Unfortunately, these unsupervised topic models do not always produce coherent topics that are intuitive to humans. This paper describes a method of automating prior knowledge learning by mining prior knowledge from a large corpus of relevant data. The research hypothesized that this prior knowledge could be attained by focusing on the aspects shared across multiple domains.

Since our project focuses on processing documents from Twitter, our documents are no longer than 140 characters. Because authors of tweets have to communicate their ideas in 140 characters or fewer, it can be difficult for them to use standard language practices (e.g., slang and abbreviations are more common in tweets than in traditional documents). This adds a challenge to tweet sentiment classification, because it is difficult to use standard natural language processing techniques. This challenge led us to read "Enhanced Twitter Sentiment Classification Using Contextual Information" [22]. This paper studied the relationship between tweet metadata (e.g., author, time the tweet was written, location) and tweet sentiment. The study hypothesized that "people are generally happier on weekends and certain hours of the day, more depressed at the end of summer holidays, and happier in certain states of the United States" [22]. While many of our tweets are reactions to real-world events, it could be interesting to use our tweet metadata to explore correlations between sentiment and tweet metadata, and compare our results to this hypothesis.

Sentiment Word Identification (SWI) is a sentiment analysis technique that supports tasks such as determining critic opinion, classifying tweeters, and summarizing reviews. Most sentiment analysis strategies employ seed words to determine positive or negative sentiments within a document. However, studies have shown that using seed words can be unreliable across multiple domains, and that missing a key word in a set of seed words can inhibit sentiment analysis performance. "Identifying Sentiment Words Using an Optimization-based Model without Seed Words" proposed a method of SWI, called WEED, that identifies sentiment words without seed words [24]. It exploits a phenomenon referred to as "sentiment matching": the polarity of a document and the polarities of most of its component sentiment words tend to be the same. That is to say, if a word is found mostly in positive documents, it is likely a positive word; if a word is found mostly in negative documents, it is a negative word. This process does require pre-labeled documents, so it could be difficult to apply across tweet collections of different domains.

Since tweets cannot be longer than 140 characters, it would be beneficial to have a large-scale sentiment lexicon from Twitter that could be adapted to any collection of tweets in the GETAR project. We read "Building Large-Scale Twitter-Specific Sentiment Lexicon: A Representation Learning Approach" to understand how to handle the niche language used in Twitter documents [20]. A sentiment lexicon is a list of terms, such as "excellent" and "terrible", where each term is "assigned a positive or negative score reflecting its sentiment polarity and strength" [20]. The paper proposes employing a representation learning approach to build a large-scale sentiment lexicon from Twitter. The proposed method involves finding the continuous representation of phrases and using them as features for sentiment classification. The method then adds these phrases to a smaller list of seed words to collect training data for sentiment classification.

Another particularly interesting approach was to use emoticon-annotated tweets as a self-labeled training set. "Positive, Negative, or Neutral: Learning an Expanded Opinion Lexicon from Emoticon-annotated Tweets" suggested a supervised framework for expanding a sentiment lexicon for tweets [4]. This methodology treats tweets with emoticons as self-labeled data, where a tweet with ":-)" is a positive document and a tweet with ":-(" is a negative document. Each term in the lexicon has a probability distribution that describes how positive, negative, or neutral the term is. For example, "terrible" is more negative than "unsatisfactory". Each entry in the lexicon is associated with a corresponding part of speech so as to differentiate homographs with different parts of speech.

2.2.3 Lexicon-Based Methods

Lexicon-based techniques take advantage of annotated lexicons. Entries in lexicons are preassigned a polarity score. Lexical entries form the building blocks of tweets, so from the known polarity scores in lexicons we can predict the polarity of tweets or any target entities within a tweet. One of the drawbacks of lexicon-based techniques is that they do not consider the context in which the word is used.
The sentiment polarity of a word may be different in the presence of other terms, so the contextual polarity of the word should be considered when deciding the sentiment of the overall tweet. SentiCircle [17] is a lexicon-based technique that builds a dynamic representation of words that considers contextual semantics. The scheme uses external lexical resources such as SentiWordNet or MPQA [23]. However, the sentiment value of each term is not static; it changes according to the contextual vector, which comprises all terms that co-occur with the target term. The paper introduces a degree-of-correlation metric that influences the sentiment values of terms. The scheme allows measurement of the impact of context words on the sentiment orientation and on the sentiment strength of a target word separately. The approach handles negation by reversing the sentiment value of terms that are near a negation term. Finally, to compute entity-level sentiment polarity, a median of all the sentiment values of terms that occur together with the target entity in tweets is calculated. To measure tweet-level sentiment, two approaches are proposed. In the median method, the sentiment median of each term that appears in the tweet is computed first; then the median of all those sentiment medians is computed, and this final median value is used to predict the sentiment of the tweet. The second technique proposed in the paper to find a tweet's sentiment value is called the pivot method. In the pivot method, the terms tagged as common noun, proper noun, or pronoun are considered pivot terms. One of these terms is the target of the sentiment expressed in the tweet. For each sentiment label, the sentiment impact that each pivot term receives from the others is accumulated. The sentiment label with the highest value is assigned as the sentiment of the tweet.

Natural language processing (NLP) is another field which has proven effective in extracting information from tweets. In [15], work has been done to leverage Twitter to provide people with necessary information during a natural disaster. NLP was used to create a system that enabled quick response during natural disasters: important information such as person names and locations was filtered from a highly unorganized and difficult-to-process information source using NLP techniques. To make effective use of Twitter, a labeled corpus of tweets was created to help with information extraction from unlabeled tweets. NLP techniques such as word segmentation, morphological analysis, part-of-speech tagging, and named entity recognition were very important in building this Twitter-based disaster response system. This work shows that NLP has the potential to help understand tweets, and it would be interesting to explore its potential in the field of Twitter sentiment analysis.

The performance of supervised sentiment classifiers depends on the amount of training data, so they require a large amount of annotated data to perform well. Moreover, these classifiers are domain dependent: they need to be re-trained on a different data set in order to do well in a separate domain. This forms the motivation to explore unsupervised classifiers. One such unsupervised classification scheme, proposed in [21], employs a metric called the semantic orientation of a phrase to identify the polarity of movie reviews. A movie review is classified as positive if the average semantic orientation of the phrases in the review that contain an adjective or adverb is positive; otherwise, the review is predicted to be negative. The scheme makes use of reference words, one each for positive and negative reference. The semantic orientation of a word is then found to be either closer to the positive reference word or to the negative reference word. More specifically, semantic orientation was computed by using the equation given below:

    SO(p) = \mathrm{PMI}(p, \textit{excellent}) - \mathrm{PMI}(p, \textit{poor})        (2.1)

Here p is the phrase whose semantic orientation is being computed; the positive seed word is "excellent" and the negative seed word is "poor". PMI refers to pointwise mutual information. The technique was applied to 410 reviews and obtained an accuracy of 74%.
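For reference, the pointwise mutual information of two phrases is defined in the standard way; this definition is textbook background we add here, not notation taken from the report:

    \mathrm{PMI}(p, w) = \log_2 \frac{\Pr(p, w)}{\Pr(p)\,\Pr(w)}

In [21] these probabilities were estimated from search-engine hit counts, so SO(p) is positive when p co-occurs with "excellent" more strongly than it does with "poor", and negative otherwise.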
2.2.4 Sentiment-LDA Model

Sentiment analysis of text data, especially Twitter content, is a well-researched topic. Since the sentiments of words are also domain specific, in the past decade many researchers have explored how to combine the approaches of topic analysis and sentiment analysis. Li et al. [11] developed a joint Sentiment-LDA model by adding a sentiment layer to the regular LDA model. The regular LDA model has three layers: a document layer, a topic layer, and a word layer. The Sentiment-LDA model is a four-layer topic model which contains a sentiment layer between the topic layer and the word layer. In this way, words in the documents are associated with both topics and sentiment labels. The authors claim that this model will also be helpful in determining the sentiment of the sub-topics in a document or a collection.

Chapter 3
Requirements, Design, and Implementation

3.1 Requirements

This section outlines the requirements that we laid out for our project after much planning and discussion with our team members, Dr. Fox, and the rest of the class. We divided our system requirements into three components: a tool that successfully performs topic analysis, a tool that successfully performs sentiment analysis, and a system that efficiently uses the results from topic analysis to yield focused sentiment analysis within topics at the tweet level. We also identified requirements related to user interaction with our system. These requirements included a way to interface with the system without needing programming knowledge, the ability for the user to refine results if they are looking for something specific, and the ability to let the system run automatically and find general results if the researcher does not know what topics to look for or is generally surveying the collection of tweets.

3.1.1 System Requirements

Topic Analysis

The first of the three requirements for our system was to be able to run topic analysis on an arbitrary collection of tweets. We were able to use LDA [3] to extract the most popularly discussed topics in a collection. This tool has successfully been used to extract topics from several collections, and it was refined to handle collections of larger size. The next major step in satisfactorily extracting topic models from a collection of tweets was that the user be able to intervene if they encountered a topic that they believed was not meaningful or reflective of the collection. We successfully delivered this requirement, providing the user the ability to eliminate the topic(s) that are not meaningful and rerun LDA so that the results are more useful to the user.
Sentiment Analysis

The second of the three requirements for our system was to have a tool that could perform sentiment analysis on an arbitrary collection of tweets about a specific event or topic. Our primary goal for this requirement was to be able to run polarity analysis on all of the tweets and determine whether they expressed positive or negative emotion. We would have liked to move on to full emotion analysis, where we would identify emotions being expressed in the tweets (such as happiness, anger, disappointment, etc.). This would have been an area for users to interact with the system: they could specify emotions that they are interested in analyzing and tailor the results accordingly. However, during our work on this particular effort, we found some interesting results in calculating sentiment scores using dependency trees. Understanding this technique for sentiment analysis became the focal point of our efforts, and we chose to maintain our initial focus on positive and negative sentiments. This is discussed further in Chapter 4.

Combined Sentiment-Topic Analysis

A key component of our system was to have topic and sentiment analysis working together. The goal here was for researchers to be able to search for sentiments within specific topics. We set the input to be a tweet collection where every tweet is tagged as belonging to a certain topic (as determined by the LDA approach explained above). The output is sentiment-tagged sub-collections of tweets belonging to each topic. Our data flow model is presented in Figure 3.1.

Figure 3.1: The planned flow of our system's tools

We provided functionality for users to interact with the system to guide the process along, either by focusing the results towards something specific that they are looking for or by applying their domain knowledge to help enrich the process. More specifically, the users are able to intervene throughout the topic analysis process by disregarding topic words that they determine are not meaningful to the domain. The user may also choose not to intervene in the system, and allow LDA to determine topics uninterrupted. This flexibility enables users to search for specific topics and sentiments that they are interested in, or to get a broad set of results if they are interested in general behavior. Figure 3.2 shows a modified version of our tool flow which includes optional user interaction.

Figure 3.2: The planned flow of user interaction with our system's tools

3.1.2 User Interface Requirements

One of the other goals for this project was to provide a way for users who do not have programming knowledge to run topic analysis on data sets that interest them. This made the tool very useful to researchers without computational backgrounds who were still interested in leveraging the GETAR and IDEAL projects to perform research in their own areas.

3.2 Design

Our design aimed to meet all of our requirements effectively and efficiently. Section 3.2.1 explains how our system flow works, and it is illustrated in Figure 3.1. Additionally, we drafted how our users can interact with the system, as described in Section 3.2.2 and illustrated in Figure 3.2.

3.2.1 System Flow

1. Perform topic analysis on a cleaned collection of tweets.
2. Optionally refine topic analysis results by choosing topics to discard (re-run with more focus). At this stage the user is able to interact with the workflow.
3. Perform sentiment analysis within sub-groupings of tweets that are within topics of interest.

3.2.2 User Interaction Flow

1. The user selects a collection to analyze with our linguistic analysis tool.
2. Optionally reject resulting topics produced by the tool and rerun. Note that the user can be satisfied with the resulting topics and continue on with the tool's flow.
3. View the tweets by topic and sentiment.

3.2.3 Design Deliverables

We successfully built a tool that processes datasets and performs topic analysis using LDA. The tool also allows users to exclude words that are not of interest. We are able to produce k topics, where k is determined by the user, as well as a result file containing each tweet's ID and its corresponding topic probability distributions.

We have also developed a sentiment analysis tool that can extract tweets that contain the ":)" emoticon, label those tweets as a positive training dataset, and use them to classify tweets that are positive. Similarly, we have a sentiment analysis tool that can extract tweets that contain the ":(" emoticon, label those tweets as a negative training dataset, and use them to classify tweets that are negative. We learned about this process of using self-labeled data from emoticon-annotated tweets through Bravo-Marquez, Frank, and Pfahringer's paper [4] on expanding lexicons through emoticon-annotated tweets. Our efforts built on this idea by using tweets with emojis as self-labeled tweets. We labeled selected emojis as being positive or negative, as demonstrated in Appendix A. We have a script that extracts tweets with the emojis of interest and places them into respective CSV files for positive and negative tweets. We integrated the workflow used in this script with the sentiment analysis tool that we created.

We implemented a basic Scala user interface for topic analysis. Using the GUI, a user can input all the required parameters for LDA. We also designed a basic interface to present the LDA results to the user. Through this interface, the user is able to tweak the model to exclude irrelevant topics, if any. We were able to successfully integrate it with our current LDA wrapper code.

3.3 Implementation

Our project was implemented using the tools provided by the Apache Hadoop project [1]. Specifically, we used Spark for massively parallel data processing, and HDFS for distributed data storage. We tried to restrict ourselves to these tools because they are supported by the cluster maintained by the Digital Library Research Lab (DLRL). We had intended to use the cluster for all of our data processing, and we had used it for testing our topic analysis and one of our sentiment classifiers. However, because we wanted to include a GUI component, we used a Cloudera virtual machine for our GUI development and testing.

Our project is partly based on API code that was developed for Matthew's thesis project. A large part of his thesis focuses on enabling developers to more easily access data that is stored on the cluster. The tool provides us with data structures that abstract away the complex Spark code that is involved in reading data off of the cluster, decoding it from binary format into readable text, parsing the fields, and storing them in a collection to be processed. In addition, the tool provides us with a large suite of functions that we can use to clean, filter, and otherwise modify the data set to suit our needs. Finally, it also includes wrappers for, among other things, Spark's LDA implementation that take the tweet collection data structure and run LDA analysis on it. These three components are the parts that are of use to us.

We built tools that could stand on their own before integrating them with each other: a standalone sentiment classification component, a standalone topic analysis component, and a prototype user interface. The sentiment classification component works by extracting tweets which contain a positive or negative emoticon (currently ":)" or ":(", respectively) and building a word2vec vector space with them using Spark's word2vec implementation. We then use the word2vec vectors as training data for a logistic regression classification model. This model uses the training data to classify new tweets as either positive or negative, and returns the tweet collection with each tweet labeled as positive or negative.
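A minimal sketch of this pipeline with Spark ML is shown below. It is a simplified stand-in for our component, not the component itself: the input path and parameter values are illustrative, tweets are plain strings rather than our Tweet data structures, and a tweet containing both emoticons is simply labeled positive.

    import org.apache.spark.ml.classification.LogisticRegression
    import org.apache.spark.ml.feature.Word2Vec
    import org.apache.spark.sql.SparkSession

    object EmoticonClassifierSketch {
      def main(args: Array[String]): Unit = {
        val spark = SparkSession.builder().appName("EmoticonClassifierSketch").getOrCreate()
        import spark.implicits._

        // Self-labeled training data: ":)" marks positive tweets, ":(" negative ones.
        val training = spark.read.textFile("tweets.txt")
          .filter(t => t.contains(":)") || t.contains(":("))
          .map(t => (if (t.contains(":)")) 1.0 else 0.0, t.split("\\s+").toSeq))
          .toDF("label", "tokens")

        // word2vec averages a tweet's word vectors into a single feature vector.
        val w2vModel = new Word2Vec()
          .setInputCol("tokens").setOutputCol("features")
          .setVectorSize(100).setMinCount(2).fit(training)
        val featurized = w2vModel.transform(training)

        // Logistic regression trained over the word2vec features; the fitted model
        // can then label new, emoticon-free tweets as positive or negative.
        val lrModel = new LogisticRegression().fit(featurized)
        lrModel.transform(featurized).select("label", "prediction").show(5)
        spark.stop()
      }
    }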
The topic analysis component uses Matthew's library to run LDA topic analysis on the tweet collection, and returns a list of topics and the most likely terms in each of those topics. The user interface is implemented in Scala using the Swing library. We are currently working on integrating the wireframe GUI with the functional back-end code we have developed.

The tweets extracted from the database had emojis that were encoded in a UTF-8 format that contained non-alphanumeric codes. We wrote a script to convert emojis in that format to standard UTF-8. Both the raw and formatted emojis are shown in Table 3.1. Table 3.2 shows part of a lookup table that was used to identify positive and negative emojis.

Table 3.1: Emojis before and after being processed
Table 3.2: Sample lookup table for positive and negative emojis

The intent was that we would incorporate this extraction method with our sentiment classifier. However, it became evident that the self-labeled data would not suffice for developing "not negative" and "not positive" training sets. Our tests showed an alarming number of false negatives, so we took another approach in our sentiment analysis, which we discuss in Chapter 4.
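To illustrate the lookup-table idea, the sketch below self-labels tweets from a tiny emoji table. The two entries and their polarities are stand-ins for the full table in Appendix A; the string literals encode the emojis as UTF-16 surrogate pairs.

    object EmojiLabeler {
      // Illustrative excerpt of a lookup table like Table 3.2: +1 positive, -1 negative.
      val lookup: Map[String, Int] = Map(
        "\uD83D\uDE00" -> 1,  // U+1F600 grinning face
        "\uD83D\uDE22" -> -1  // U+1F622 crying face
      )

      // Some(1) labels a positive tweet and Some(-1) a negative one; a tweet with
      // equally many of both kinds sums to Some(0), and None means no labeled
      // emoji occurs, so the tweet is left out of the training set.
      def label(tweet: String): Option[Int] = {
        val hits = lookup.collect { case (emoji, pol) if tweet.contains(emoji) => pol }
        if (hits.isEmpty) None else Some(hits.sum.signum)
      }

      def main(args: Array[String]): Unit = {
        println(label("so happy today \uD83D\uDE00")) // Some(1)
        println(label("no emoji here"))               // None
      }
    }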
Chapter 4
Lexicon-Based Sentiment Analysis

4.1 Lexicon-Based Sentiment Analysis

In lexicon-based sentiment analysis, we employ one or more sentiment dictionaries to perform sentiment analysis. The performance of this technique depends not only on the quality of the sentiment lexicon used, but also on how the lexicon is used in performing sentiment analysis. The task of performing sentiment analysis using a lexicon can be subdivided into two subtasks:

1. Analysis of tweet structure
2. Calculation of sentiment using the lexicon

4.1.1 Analysis of Tweet Structure

In this task, we look for specific features in the tweet that influence the sentiment of the overall tweet. After these features are identified, the lexicon is used to find the sentiment type and raw score of each feature. The scores are raw and can be further modified depending on the neighboring words in the tweet. The following are the features that we extracted from a tweet's structure:

• Negation words (e.g., "couldn't", "wouldn't")
• Polarity reversal words (e.g., "prevent", "diminish", "subside")
• The syntactic structure of the tweet, obtained by generating its parse tree:
  – Identify head words
  – Identify the modifiers of head words
• Emoticons

4.1.2 Calculation of Sentiment Using Lexicon

In this task, we find the scores for the n-grams that were extracted from the tweet. The way we combine the individual scores to compute the overall score determines the accuracy of sentiment analysis. A naive approach to this task is to find the overall sentiment of the tweet by adding the sentiment scores of the individual unigrams. This approach has the least accuracy, since it does not consider the structure in which different words in the tweet are connected to each other. In a sentence, the sentiment scores of individual words vary depending on each word's proximity to special categories of words; examples of such special categories include negation words and intensifiers. Once we determine the technique to compute an overall sentiment, we then have to set thresholds for positive and negative sentiments. A tweet is classified as positive if its overall score is above the positive threshold, and as a negative-polarity tweet if its overall score is below the negative threshold. A neutral tweet has a score between the positive and negative thresholds. We followed two score computation approaches:

• Add the sentiment scores of each token in a tweet; the result of this addition gives the sentiment score of the overall tweet.
• Compute the score based on the dependency of words in a parse tree.

Sentiment Score Using Dependency Tree

A dependency tree of a sentence represents the connections between words in the sentence. The tree consists of two types of nodes: one is called the head, and the other is called the modifier of the head node (the modifier node is also called the dependent node). Figure 4.1 shows an example of a dependency tree. In Figure 4.1, the word "bombings" is the head node of the words "the" and "marathon".

Figure 4.1: Dependency tree of an English sentence

After forming the dependency tree, we compute the sentiment score of the sentence based on the connections of words in the tree. In the literature, we found two approaches which have been used to compute sentiment scores from a dependency tree:

1. Voting with polarity reversal [14]. In this rule we obtain the polarity of each word from a lexicon. We then reverse the polarity score of a word if there is an odd number of polarity reversal words in the ancestor list of that word. For example, in Figure 4.1, to compute the polarity score of the word "bombings", we first look up "bombings" in a lexicon. We then compute its list of ancestors, which in this case consists of the words {"by", "effected", "to", "go"}. If any of the ancestor words is a polarity reversal word, we reverse the polarity score of "bombings". Mathematically this rule can be written as:

    \mathrm{polarity} = \operatorname{sgn}\left( \sum_{i=1}^{n} score_i \prod_{j \in H_i} (-1)^{r_j} \right)        (4.1)

Here, polarity refers to the polarity of the whole tweet, score_i refers to the polarity score of the i-th word node in the tree, H_i denotes the ancestor list of word node i, and r_j is 1 if the j-th word in the ancestor list is a polarity reversal word; in all other cases r_j is 0.

2. Deterministic rule [14]. The polarity of a subjective sentence is deterministically decided based on rules, by considering the sentiment polarities of dependency subtrees. The polarity of the dependency subtree whose root is the i-th word node is decided by voting the prior polarity of the i-th word node and the polarities of the dependency subtrees whose root nodes are the modifiers of the i-th word node. Also, the polarities of the modifiers are reversed if their head node has a reversal word. Based on this rule we compute the polarity of subtrees recursively, starting from the leaf nodes until we reach the root node. The polarity of the tweet is then determined by the polarity of the root node. This rule is mathematically represented as follows:

    polarity\_score_i = \operatorname{sgn}\left( root_i + \sum_{j : h_j = i} polarity\_score_j \cdot (-1)^{r_i} \right)        (4.2)

Here we compute the polarity of the i-th subtree, denoted polarity_score_i. The term root_i represents the polarity score from the lexicon for node i, which is the root of this subtree, and h_j represents the head of the j-th node.

Figure 4.2: Tree data structure of dependency tree

Figure 4.2 represents the data structure that we use to store the dependency tree. Each node in the tree stores the word, a pointer to the head node, a list of ancestors, and a polarity score for that node. This polarity score is the score of the dependency subtree rooted at this node.
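The sketch below implements the recursion of Equation 4.2 over a simplified version of the node structure in Figure 4.2. The prior scores and the reversal-word list are illustrative stand-ins for the VADER and General Inquirer entries described in Section 4.1.3, and our actual implementation is in Python (parse_tree.py, Section 6.2.1).

    // Simplified node: in Figure 4.2 each node also keeps a head pointer and ancestor list.
    case class DepTreeNode(word: String, priorScore: Double, children: List[DepTreeNode])

    object DependencyPolarity {
      // Illustrative subset of the polarity reversal words from the General Inquirer.
      val reversalWords = Set("reduce", "prevent", "diminish", "abate")

      // Equation 4.2: vote the node's prior score together with its children's
      // subtree polarities, reversing the children when the head is a reversal word.
      def subtreePolarity(node: DepTreeNode): Double = {
        val flip = if (reversalWords.contains(node.word)) -1.0 else 1.0
        math.signum(node.priorScore + node.children.map(c => flip * subtreePolarity(c)).sum)
      }

      def main(args: Array[String]): Unit = {
        // "reduce" heads the negative word "anxiety", so the child's polarity is flipped.
        val tree = DepTreeNode("reduce", 0.0, List(DepTreeNode("anxiety", -1.0, Nil)))
        println(subtreePolarity(tree)) // 1.0: reducing anxiety reads as positive
      }
    }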
4.1.3 Lexicon Used

In order to create a sentiment analysis tool for Twitter, we needed a social-media-specific lexicon. Content observed on Twitter and Facebook poses serious challenges to practical applications of sentiment analysis. Contextual sparseness resulting from the shortness of the text, and a tendency to use abbreviated language conventions to express sentiments, make sentiment analysis using standard English lexicons difficult. In our initial experiments, we employed SentiWordNet [2] and found that most of the tweets were classified as neutral. We got this result because the words that are used on Twitter were not found in SentiWordNet, and hence the polarity score of such words was considered neutral (a score of zero). This observation prompted us to search for lexicons which have Twitter-specific words. We found VADER [10], which uses a high-quality lexicon specifically made for sentiment analysis of text in social media.

4.1.4 Experiments

Our objective was to determine whether or not we could use dependency parse trees to determine the sentiment of tweets. We started our work by going through the rule-based sentiment analysis given in [14]. That paper focuses on polarity reversal words, and hence we started our experiment by analyzing such words. We used polarity reversal words that were given in the General Inquirer [8]. We integrated VADER's lexicon, the polarity reversal words, and the sentiment analysis rules with dependency parse trees to determine the sentiment polarity of tweets.

We focused on a very limited set of words because we were testing the polarity of tweets manually. We focused on words such as "depression", "anxiety", and "stress", which have negative sentiment scores. Then we started looking for tweets which had these negative words along with polarity reversal words like "abate", "diminish", "reduce", and "decrease". The following are some of the tweets and the corresponding output of VADER:
"study shows a significant decrease in depression after taking psilocybin" • Vader score = -.4404 2. "CBD help reduce depression" • Vader score = -.25 3. "Learning math games to decrease math anxiety." • Vader score = -.1779 4. "When people are doing what they love, depression and anxiety will decrease." • Vader score = -.0516 5. "Singing helps reduce feelings of depression and anxiety, increases oxygen to your lungs." • Vader score = -.4215 6. Escape to nature, even if just for a 30 minute walk.. it will greatly lower your stress levels and reduce risk of depression • Vader score = -.309 7. Listening to music for an hour every day can reduce chronic pain by up to 21% and depression by up to 25% • Vader score = -.7906 The observation that we found in the above tweets was that when we have only polarity reversal words in tweets with a negative sentiment word, then the output of VADER is different from the expected value. This list of tweets is not sufficient to draw any concrete conclusion, and hence we cannot make any claims 4.1. Lexicon Based Sentiment Analysis 15 about the accuracy of VADER. Our objective of such a test was to find a category of tweets for which it is difficult to predict sentiment. The following are tweets that we tested further and found that when we have a negation word in close proximity or higher positive polarity word close to a polarity reversal word, then these words dominated in deciding the polarity of tweet. 1. "I know but that doesn’t help abate the anxiety spiral that has me nauseous and upset." 2. My brain hurts. Anxiety probably won’t abate until after Monday. 3. At School for the Dogs, Halloween cone craft workshop helps abate doggy costume anxiety " 4. My claustrophobic and lacerating loops of extreme anxiety, fear, and worry typically do not ever abate. 5. How my gratitude journal helped abate my ANXIETY! 6. "Thanks to the @united woman who helped me get a window seat on an almost full flight to help abate my flight anxiety." 7. Bless Frank Ocean’s music for its ability to abate my anxiety. 8. Can’t wait to get to the gym. Hoping it will abate this anxiety. 9. Even free cookies can not abate the anxiety of startups in post-Brexit London http://wapo.st/29lga1E @ylanmui 10. Making Weight: Exercise, medicine abate anxiety 11. U.S. Gas Pipeline Capacity Remains Short, But Anxiety Levels abate 12. I hate anxiety 13. You will fall asleep faster, if you sleep next to someone you love. This way you can also reduce depression. We found that when we applied the dependency tree generated by Syntaxnet [9], the rules failed to give accurate sentiments in certain tweets. This may be due to the parse tree structure that is generated by Syntaxnet. The parse tree that was used in research paper [14] to test these rules was different. For tweets we can have more than one parse tree possible and hence we have probabilities associated with each parse tree that is generated. In this work we focused on using parse trees generated by Syntaxnet and hence we modified these rules so that they can be applied with our parse tree. The following section gives details of our approach to identify sentiment polarity of tweet using Syntaxnet parse trees. 4.1.5 Our Approach We found that the sentiment of tweets which have negation in leaf nodes of a parse tree could not be predicted correctly by these rules because the rules only search for polarity reversal word and negation words in the parent node. 
Chapter 5
User Manual

5.1 Scala User Interface

The Scala user interface for topic analysis provides users with a way to run LDA without using the command line. Additionally, it helps users interact with the underlying topic model. To run the LDA program, the user needs to select the input data file and specify various parameters (e.g., number of topics, number of iterations). All of these parameters can be specified through the UI window shown in Figure 5.1. Results from the topic analysis are presented to the user through a separate window, as shown in Figure 5.2.

In the results window, the user can also select words which are irrelevant and re-run LDA to modify the underlying topic model. To do this, the user should highlight the words that are not meaningful and press the "»" button, thereby moving the words out of the topic words box (the bottom left box). Once the user is satisfied with the words in the topic words box, the user can re-run LDA by clicking the "Re-Run" button. Once the user is satisfied with the LDA results, the user can click the "Finish" button. Thus, using our topic analysis interface, a user can interactively create a topic model according to his/her preferences. Figures 5.3, 5.4, and 5.5 illustrate this process of topic refinement.

Since demoing the tool for the screenshots shown in the aforementioned figures, we have made a design enhancement to label the bottom two boxes in the results window. The bottom left box shows the topic words that are included in the word distributions for each topic. The bottom right box is for the words that should not be included in the topics' word distributions. Figure 5.6 shows this enhancement.

Figure 5.1: Topic analysis UI: start screen
Figure 5.2: Topic analysis UI: first round results
Figure 5.3: Topic analysis UI: second round results
Figure 5.4: Topic analysis UI: third round results
Figure 5.5: Topic analysis UI: fourth round results
Figure 5.6: Updated topic results window design
Chapter 6
Developer Manual

6.1 Scala Interface

To implement a Scala interface, we could have used either the ScalaFX library [19] or the Scala Swing package. We chose the Scala Swing framework to implement the topic analysis UI. Scala Swing is a GUI toolkit written in Scala and based on the Java Swing library; it provides wrappers around the Java Swing classes. Thus, for someone who has basic knowledge of Scala and exposure to the Java Swing library, the Scala Swing framework is an ideal choice for developing GUIs in Scala. For GUI programming in Scala, it is beneficial to use an IDE, such as the Scala Eclipse plugin, which enables the developer to run the interfaces interactively. Here is a list of resources that provide a good introduction to Scala GUI programming:

• Introduction to scala.swing [13]
• Programming in Scala, Chapter 32 [16]
• Scala Swing library functions [18]

6.1.1 Scala UI

Before running the Scala UI project given in our code repository, follow these preparation steps:

• Ensure that JDK 8 is installed on your system. If not, you can download it from: http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html
• Install the Scala Eclipse IDE from http://scala-ide.org/download/sdk.html

The Scala UI code was developed and tested with Scala 2.11 and a JDK 1.8 compiler. To run the code from our repository, import the Scala interface project into your Eclipse projects. Then, open this project in your Scala Eclipse IDE (Figure 6.1). Before running the project, create a run configuration that points to the .scala file/class name (Figure 6.2). Once everything is set up, you can run the code by pressing the run button in Eclipse.

Figure 6.1: Project in Scala Eclipse IDE
Figure 6.2: Creating a run configuration in Eclipse

6.2 Interfacing with Twitter Data

Matthew's thesis framework provides us with a simple and straightforward way to interface with Twitter data stored on the cluster, clean the data, filter the data, and run several different analyses on the data. Excerpts from his technical documentation that were relevant to our project are replicated below, with some discussion about how we used the framework to our advantage.

Data I/O

We made use of the framework to handle reading data out of the Avro files stored on the cluster and our own local test files. The framework handles pulling the data, decoding it from the binary format, and turning it into collections of Tweet data structures. The Tweet data structure is designed to represent a tweet pulled from Twitter by the tools used in the Digital Library Research Lab. It contains fields to represent each of the types of data stored on the cluster in the Avro files, as well as an extra payload field to hold arbitrary extra data (e.g., topic labels, sentiment tags, etc.). It also breaks the tweet text into an array of tokens for future convenience. The fields are detailed in Table 6.1, and the functions are detailed in Table 6.2. Future developers may wish to leverage the other fields and functions provided here when designing a new approach to the system.

Table 6.1: The fields represented in the Tweet data structure
Table 6.2: The functions provided by the Tweet data structure

The framework handles creating these Tweet data structures and storing them in TweetCollection data structures. There is no need for the developer to interface with Spark at all here. Instead, he or she can simply specify a data source to the appropriate TweetCollectionFactory function, and it will handle the construction of the data structure.
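As a hypothetical illustration of the shape of this structure (the authoritative field and function lists are those in Tables 6.1 and 6.2), a Tweet might look like the following; the field names here are our own stand-ins:

    case class Tweet(
      id: String,                   // tweet ID from the Avro record
      text: String,                 // raw tweet text
      tokens: Array[String],        // text pre-split into tokens for convenience
      payload: Map[String, String]  // arbitrary extra data, e.g., topic or sentiment tags
    ) {
      // Returns a copy carrying an extra payload entry, leaving the original intact.
      def withPayload(key: String, value: String): Tweet =
        copy(payload = payload + (key -> value))
    }

    object TweetExample {
      def main(args: Array[String]): Unit = {
        val t = Tweet("123", "gun control debate", Array("gun", "control", "debate"), Map.empty)
        println(t.withPayload("topic", "gun control").payload) // Map(topic -> gun control)
      }
    }

The payload field is what lets the topic and sentiment stages tag tweets without changing the core schema.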
Data Pre-Processing

The framework provides a suite of ways to clean, filter, and otherwise modify the tweet collection in ways that are beneficial to the kinds of digital library research that need this data structure. Cleaning is a very broad topic, and it is likely that future developers may need to clean the data differently to suit their needs. Currently, we have separate cleaning functionalities for the topic analysis and sentiment analysis portions, due to the needs of the implementations. These functions are defined in small container classes at the top of the GUI classes in the MainWindow.scala file. All of the cleaning functionalities utilized by these cleaning functions are provided to us by the framework. Our cleaning approaches are summarized in Table 6.3.

Table 6.3: The cleaning done in each step of the system
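The sketch below shows the kind of cleaning Table 6.3 summarizes (lowercasing and stripping URLs, punctuation, RT markers, and stopwords), written against plain strings. It is illustrative only; the real functions wrap the framework's cleaning utilities and live in the container classes mentioned above.

    object TweetCleaner {
      // Illustrative stopword list; the framework supplies its own.
      private val stopwords = Set("the", "a", "an", "and", "of", "to", "in", "is")

      def clean(text: String): Seq[String] =
        text.toLowerCase
          .replaceAll("""https?://\S+""", " ")   // strip URLs
          .replaceAll("""[^a-z0-9\s#@]""", " ")  // strip punctuation, keep hashtags/mentions
          .split("""\s+""")
          .filter(tok => tok.nonEmpty && tok != "rt" && !stopwords(tok))
          .toSeq

      def main(args: Array[String]): Unit = {
        // List(@user, gun, control, debate, heats, up)
        println(clean("RT @user: Gun control debate heats up http://t.co/xyz"))
      }
    }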
Topic Analysis

For the topic analysis portion of our project, we are using LDA. We make use of the framework's LDAWrapper to run Spark's built-in LDA tool on the tweet collection we have created and cleaned. LDAWrapper runs LDA topic analysis on the collection using a set of default parameters. These parameters include the number of iterations to run, the number of topics to search for, which optimizer to use, and a set of terms to ignore. The parameters can be customized before LDA is run; while the defaults are used otherwise, our user interface makes use of this flexibility by changing the parameters based on user input. If future developers choose to use a different algorithm for topic analysis, it would require a lot of re-implementation, including radically changing the user interface to accommodate interfacing with the new approach.

6.2.1 Submitted Files

See Table 6.4 for a description of each file submitted with the project in 6604Files.tar.

Table 6.4: Breakdown of submitted files

Sentiment Analysis/Final_sentiment_analysis.py: Reads the tweet collection and calls Syntaxnet using "first3.sh".
Sentiment Analysis/first3.sh: Takes a tweet as input and passes it to Syntaxnet.
Sentiment Analysis/parse_tree.py: Data structure to represent the file returned by Syntaxnet.
Sentiment Analysis/reverse_polarity_file: Polarity reversal and negation words from the General Inquirer.
Topic Analysis/MainWindow.scala: Contains the code for our Scala-based user interface.
Topic Analysis/pom.xml: The Maven build file to launch our project in Eclipse.
Word2VecSentimentAnalysis.scala: Our initial attempt at sentiment analysis with emojis.
AT0412.txt: An example test file of tweet data.
topics1: Example output file from the topic analysis tool.
topics2: Example output file from the topic analysis tool.
topics3: Example output file from the topic analysis tool.
topics4: Example output file from the topic analysis tool.

Chapter 7
Plan

7.1 Team Member Specializations and Responsibilities

Table 7.1 lists the overall roles that each team member played in the project, as well as the areas of interest and specialization that they lent to the project.

Table 7.1: Breakdown of roles and interests of each team member

Abigail (interests: NLP): sentiment analysis, LDA
Matthew (interests: Spark, Scala): implementation, sentiment analysis
Rahul (interests: NLP): sentiment analysis
Radha (interests: Spark, Scala): LDA, UI

7.2 Timeline

Table 7.2 gives a rough weekly breakdown of our work over the semester.

Table 7.2: Weekly breakdown of work

Week 1 (ending 21 Jan.): Initial class meetings, discussing potential projects.
Week 2 (28 Jan.): Finalization of projects and assignment into teams.
Week 3 (4 Feb.): Discussions within the team and with the class about the scope of the project.
Week 4 (11 Feb.): Literature review and identification of available tools.
Week 5 (18 Feb.): Literature review and refinement of scope and approach to the project.
Week 6 (25 Feb.): Literature review and refinement of approach to the project.
Week 7 (4 March): Summarization of literature review into Interim Report 1. Planning for how to divide up project responsibilities moving into the implementation/prototyping phase.
Week 8 (11 March): Initial prototyping and implementation. Work towards segregating positive and negative tweets.
Week 9 (18 March): Spring break. Initial prototyping and implementation.
Week 10 (25 March): First working prototypes/wireframes ready for evaluation and refinement.
Week 11 (1 April): Work towards combining individual functional components (sentiment tool, topic tool, GUI).
Week 12 (8 April): Continued work towards combining individual functional components. First end-to-end prototype functional by end of week.
Week 13 (15 April): Prototype refinement.
Week 14 (22 April): Evaluation and preparation for final report and presentation.
Week 15 (29 April): Evaluation and preparation for final report and presentation.

Chapter 8
Future Work

Our topic analysis cleans the tweets by filtering out stopwords, URLs, punctuation, and RT markers. Adding stemming and lemmatization to our data cleaning for the system's topic analysis would probably produce more meaningful topics for the researchers using our system.

In regard to our sentiment analysis component, adding more labeled data to the emoji-based sentiment classifier could improve its results. This could be done by incorporating event-specific hand-labeled data into the system (e.g., hand-labeled data for hurricanes could be used for collections about hurricanes). Additionally, supplementing the emoji lookup table with a pre-defined dictionary that has sentiments associated with specific words could also improve the emoji-based sentiment classifier. For our dependency-tree-based sentiment classifier, combining named entity recognition with the parse trees to get the polarity of entities could improve its performance.
Adding more granular controls and integrating sentiment analysis into the user interface would be the next steps in helping researchers without a computing background analyze tweets collected for the DLRL. In the future, a more robust user interface could also be built with a modern framework such as Play, a web framework written in Scala for building web-friendly applications.

Acknowledgements

Our team would like to thank and acknowledge Dr. Edward Fox for giving us the opportunity to work on a project that aligned with our research interests, as well as for offering advice and guidance throughout. We would like to acknowledge the Digital Library Research Laboratory, especially those who were involved with the Integrated Digital Event Archiving and Library (IDEAL) and Global Event and Trend Archive Research (GETAR) projects, and we express our appreciation to the National Science Foundation for funding the IDEAL and GETAR projects (grants IIS-1319578 and IIS-1619028, respectively). Furthermore, we thank the Topic Analysis teams in CS 5604 in the Spring 2016 and Fall 2016 semesters; their LDA tool was extremely useful to the success of this project.

Bibliography

[1] Apache Software Foundation. Apache Hadoop project. http://hadoop.apache.org/. Accessed Jan. 2017.
[2] Stefano Baccianella, Andrea Esuli, and Fabrizio Sebastiani. "SentiWordNet 3.0: An Enhanced Lexical Resource for Sentiment Analysis and Opinion Mining." In: LREC. Vol. 10. 2010, pp. 2200-2204.
[3] David M. Blei, Andrew Y. Ng, and Michael I. Jordan. "Latent Dirichlet Allocation." In: Journal of Machine Learning Research 3 (Jan. 2003), pp. 993-1022.
[4] Felipe Bravo-Marquez, Eibe Frank, and Bernhard Pfahringer. "Positive, negative, or neutral: Learning an expanded opinion lexicon from emoticon-annotated tweets." In: IJCAI 2015. AAAI Press, 2015, pp. 1229-1235.
[5] Emojipedia. http://emojipedia.org/. Accessed Mar. 2017.
[6] Edward Fox et al. "Global Event Trend and Archive Research (GETAR)." Nov. 2015. NSF grants IIS-1619028 and IIS-1619371. URL: http://www.eventsarchive.org/sites/default/files/GETARsummaryWeb.pdf.
[7] Edward Fox et al. "Integrated Digital Event Archiving and Library (IDEAL)." Sept. 2013. NSF grant IIS-1319578. URL: https://www.nsf.gov/awardsearch/showAward?AWD_ID=1319578.
[8] Roger Hurwitz. General Inquirer. 2002. URL: http://www.wjh.harvard.edu/~inquirer/ (visited on 05/03/2017).
[9] SyntaxNet. URL: https://github.com/tensorflow/models/tree/master/syntaxnet (visited on 05/03/2017).
[10] C.J. Hutto and Eric Gilbert. "VADER: A Parsimonious Rule-based Model for Sentiment Analysis of Social Media Text." In: Eighth International Conference on Weblogs and Social Media (ICWSM-14). 2014. URL: http://comp.social.gatech.edu/papers/icwsm14.vader.hutto.pdf.
[11] Fangtao Li, Minlie Huang, and Xiaoyan Zhu. "Sentiment Analysis with Global Topics and Local Dependency." In: AAAI. Vol. 10. 2010, pp. 1371-1376.
[12] Zhiyuan Chen, Arjun Mukherjee, and Bing Liu. "Aspect extraction with automated prior knowledge learning." In: Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics. 2014, pp. 347-358.
[13] Ingo Maier. "The Scala.Swing package." In: Scala Improvement Process (SID) 8 (2009). URL: https://www.scala-lang.org/old/sites/default/files/sids/imaier/Mon,%202009-11-02,%2008:55/scala-swing-design.pdf.
[14] Tetsuji Nakagawa, Kentaro Inui, and Sadao Kurohashi. "Dependency tree-based sentiment classification using CRFs with hidden variables." In: Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the Association for Computational Linguistics. Association for Computational Linguistics, 2010, pp. 786-794.
[15] Graham Neubig et al. "Safety Information Mining: What can NLP do in a Disaster." In: IJCNLP. Vol. 11. 2011, pp. 965-973.
[16] Martin Odersky, Lex Spoon, and Bill Venners. Programming in Scala: A Comprehensive Step-by-step Guide. USA: Artima Inc., 2008. ISBN: 9780981531601.
[17] Hassan Saif et al. "Contextual semantics for sentiment analysis of Twitter." In: Information Processing & Management 52.1 (2016), pp. 5-19.
[18] Scala Swing Library. http://www.scala-lang.org/api/2.11.2/scala-swing/. Accessed Mar. 2017.
[19] ScalaFX Library. https://github.com/scalafx/scalafx. Accessed Mar. 2017. BSD open source.
[20] Duyu Tang et al. "Building Large-Scale Twitter-Specific Sentiment Lexicon: A Representation Learning Approach." In: COLING. 2014, pp. 172-182.
[21] Peter D. Turney. "Thumbs up or thumbs down? Semantic orientation applied to unsupervised classification of reviews." In: Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics. Association for Computational Linguistics, 2002, pp. 417-424.
[22] Soroush Vosoughi, Helen Zhou, and Deb Roy. "Enhanced Twitter sentiment classification using contextual information." In: arXiv preprint arXiv:1605.05195 (2016).
[23] Theresa Wilson, Janyce Wiebe, and Paul Hoffmann. "Recognizing contextual polarity in phrase-level sentiment analysis." In: Proceedings of the Conference on Human Language Technology and Empirical Methods in Natural Language Processing. Association for Computational Linguistics, 2005, pp. 347-354.
[24] Hongliang Yu, Zhi-Hong Deng, and Shiyingxue Li. "Identifying Sentiment Words Using an Optimization-based Model without Seed Words." In: ACL (2). 2013, pp. 855-859.
[25] ChengXiang Zhai and Sean Massung. Text Data Management and Analysis. Association for Computing Machinery and Morgan & Claypool Publishers, 2016.

Appendix A: Labeled Emojis

Tables A.1 and A.2 are screenshots of the table that we used to label emojis. The Unicode values and CLDR short names were found on the Unicode, Inc. website [5].

Table A.1: Labeled emojis
Table A.2: Labeled emojis, continued