Browsing by Author "Zhang, Xuan"
Now showing 1 - 5 of 5
- High Precision Dynamic Power System Frequency Estimation Algorithm Based on Phasor Approach
  Zhang, Xuan (Virginia Tech, 2004-01-16)
  An internet-based, real-time, Global Positioning System (GPS)-synchronized wide-area frequency-monitoring network (FNET) has been developed at Virginia Tech. In this FNET system, an algorithm that employs the relationship between phasor angles and deviated frequency [13] is used to calculate both frequency and its rate of change. Tests of the algorithm show that non-pure sinusoidal input, as compared to pure sinusoidal input, produces significant errors in the output frequency. Three approaches for increasing the accuracy of the output frequency were compared. The first, increasing the number of samples per cycle N, proved ineffective. The second, using the average of the first estimated frequencies rather than the instantaneous first estimated frequency as the resampling frequency, produced a moderate increase in the accuracy of the frequency estimation. The third, multiple resampling, significantly increased accuracy. However, both the second and the third become ineffective to the extent that the input is not purely sinusoidal. From a practical standpoint, attention needs to be paid to eliminating noise in the input data from the power grid so as to make it more purely sinusoidal. Therefore, it will be worthwhile to test more sophisticated digital filters for processing the input data before feeding it to the algorithm.
- A Hybrid Model for Role-related User Classification on Twitter
  Li, Liuqing; Song, Ziqian; Zhang, Xuan; Fox, Edward A. (Virginia Tech, 2018-11-15)
  To aid a variety of research studies, we propose TWIROLE, a hybrid model for role-related user classification on Twitter, which detects male-related, female-related, and brand-related (i.e., organization or institution) users. TWIROLE leverages features from tweet contents, user profiles, and profile images, and then applies our hybrid model to identify a user’s role. To evaluate it, we used two existing large datasets about Twitter users, and conducted both intra- and inter-comparison experiments. TWIROLE outperforms existing methods and obtains more balanced results over the several roles. We also confirm that user names and profile images are good indicators for this task. Our research extends prior work that does not consider brand-related users, and is an aid to future evaluation efforts relative to investigations that rely upon self-labeled datasets.
- Identifying Product Defects from User Complaints: A Probabilistic Defect Model
  Zhang, Xuan; Qiao, Zhilei; Tang, Lijie; Fan, Weiguo Patrick; Fox, Edward A.; Wang, Gang Alan (Department of Computer Science, Virginia Polytechnic Institute & State University, 2016-03-02)
  The recent surge in the use of social media has created a massive amount of unstructured textual complaints about products and services. However, discovering and quantifying potential product defects from large amounts of unstructured text is a nontrivial task. In this paper, we develop a probabilistic defect model (PDM) that simultaneously identifies the most critical product issues and the corresponding product attributes. We use domain-oriented key attributes of a product (e.g., product model, year of production, defective components, symptoms) to identify and acquire integral information about defects. We conduct comprehensive quantitative and qualitative evaluations to ensure the quality of the discovered information. Experimental results demonstrate that our proposed model outperforms an existing unsupervised method (K-Means clustering) and can find more valuable information. Our research has significant managerial implications for managers, manufacturers, and policy makers.
- Named Entity Recognition for IDEAL
  Du, Qianzhou; Zhang, Xuan (2015-05-10)
  The term “Named Entity”, which was first introduced by Grishman and Sundheim, is widely used in Natural Language Processing (NLP). The researchers were focusing on the information extraction task, that is, extracting structured information about company activities and defense-related activities from unstructured text, such as newspaper articles. The essential part of “Named Entity” is to recognize information elements, such as location, person, organization, time, date, money, percent expression, etc. The sub-task of information extraction that identifies these entities in unstructured text is called “Named Entity Recognition” (NER). NER technology has now become mature, and there are good tools to implement this task, such as the Stanford Named Entity Recognizer (SNER), Illinois Named Entity Tagger (INET), Alias-i LingPipe (LIPI), and OpenCalais (OCWS). Each of these has some advantages and is designed for some special data. In this term project, our final goal is to build an NER module for the IDEAL project based on a particular NER tool, such as SNER, and to apply NER to the Twitter and web page data sets. This project report presents our work towards this goal, including literature review, requirements, algorithm, development plan, system architecture, implementation, user manual, and development manual. Further, results are given with regard to multiple collections, along with discussion and plans for the future.
- Product Defect Discovery and Summarization from Online User Reviews
  Zhang, Xuan (Virginia Tech, 2018-10-29)
  Product defects concern various groups of people, such as customers, manufacturers, government officials, etc. Thus, defect-related knowledge and information are essential. In keeping with the growth of social media, online forums, and Internet commerce, people post a vast amount of feedback on products, which forms a good source for the automatic acquisition of knowledge about defects. However, considering the vast volume of online reviews, how to automatically identify critical product defects and summarize the related information from the huge number of user reviews is challenging, even when we target only the negative reviews. As a kind of opinion mining research, existing defect discovery methods mainly focus on how to classify the type of product issues, which is not enough for users. People expect to see defect information in multiple facets, such as product model, component, and symptom, which are necessary to understand the defects and quantify their influence. In addition, people are eager to seek problem resolutions once they spot defects. These challenges cannot be solved by existing aspect-oriented opinion mining models, which seldom consider the defect entities mentioned above. Furthermore, users also want to better capture the semantics of review text, and to summarize product defects more accurately in the form of natural language sentences. However, existing text summarization models including neural networks can hardly generalize to user review summarization due to the lack of labeled data. In this research, we explore topic models and neural network models for product defect discovery and summarization from user reviews. Firstly, a generative Probabilistic Defect Model (PDM) is proposed, which models the generation process of user reviews from key defect entities including product Model, Component, Symptom, and Incident Date. Using the joint topics in these aspects, which are produced by PDM, people can discover defects which are represented by those entities. Secondly, we devise a Product Defect Latent Dirichlet Allocation (PDLDA) model, which describes how negative reviews are generated from defect elements like Component, Symptom, and Resolution. The interdependency between these entities is modeled by PDLDA as well. PDLDA answers not only what the defects look like, but also how to address them using the crowd wisdom hidden in user reviews. Finally, the problem of how to summarize user reviews more accurately, and better capture the semantics in them, is studied using deep neural networks, especially Hierarchical Encoder-Decoder Models. For each of the research topics, comprehensive evaluations are conducted to justify the effectiveness and accuracy of the proposed models, on heterogeneous datasets. Further, on the theoretical side, this research contributes to the research stream on product defect discovery, opinion mining, probabilistic graphical models, and deep neural network models. Regarding impact, these techniques will benefit related users such as customers, manufacturers, and government officials.
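
As an illustration of the first entry in this list, the relationship between phasor angles and deviated frequency can be sketched as follows. This is a minimal sketch and not the thesis's algorithm: it omits the resampling steps the abstract describes, and the function names, the 60 Hz nominal frequency, and the 24 samples per cycle are assumptions chosen for the example. It computes a one-cycle DFT phasor at the nominal frequency and converts the angle change between phasors one nominal cycle apart into a frequency deviation.

```python
import cmath
import math


def phasor(samples, start, n_per_cycle):
    """One-cycle DFT phasor at the nominal frequency for the window at `start`."""
    acc = 0j
    for k in range(n_per_cycle):
        acc += samples[start + k] * cmath.exp(-2j * math.pi * k / n_per_cycle)
    return 2.0 * acc / n_per_cycle


def estimate_frequency(samples, f_nominal, n_per_cycle, n_cycles):
    """Average the phasor angle change per nominal cycle and map it to Hz.

    For windows spaced one nominal cycle apart, the angle advances by
    2*pi*(f - f_nominal)/f_nominal, so f = f_nominal * (1 + dtheta / (2*pi)).
    Needs at least (n_cycles + 1) * n_per_cycle samples.
    """
    total = 0.0
    prev = phasor(samples, 0, n_per_cycle)
    for c in range(1, n_cycles + 1):
        cur = phasor(samples, c * n_per_cycle, n_per_cycle)
        # Angle of cur relative to prev, wrapped into (-pi, pi].
        total += cmath.phase(cur * prev.conjugate())
        prev = cur
    mean_dtheta = total / n_cycles
    return f_nominal * (1.0 + mean_dtheta / (2.0 * math.pi))


def make_signal(freq, f_nominal, n_per_cycle, n_samples, phase=0.3):
    """Pure cosine at `freq`, sampled at n_per_cycle * f_nominal (hypothetical test input)."""
    fs = n_per_cycle * f_nominal
    return [math.cos(2.0 * math.pi * freq * n / fs + phase) for n in range(n_samples)]


if __name__ == "__main__":
    signal = make_signal(60.5, 60.0, 24, 24 * 7)
    print(estimate_frequency(signal, 60.0, 24, 6))
```

For off-nominal input the estimate carries a small error from spectral leakage in the fixed-length window, which is one reason an approach like the resampling described in the abstract may be used.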