
Data Mining Academic Emails to Model Employee Behaviors and Analyze Organizational Structure

Date

2016-06-06

Publisher

Virginia Tech

Abstract

Email correspondence has become the predominant method of communication for businesses. If not for the inherent privacy concerns, this electronically searchable data could be used to better understand how employees interact. After the Enron dataset was made available, researchers were able to provide great insight into employee behaviors despite the many challenges with that dataset. The work in this thesis applies a suite of methods to an appropriately anonymized academic email dataset created from volunteers' email metadata. This new dataset, drawn from an internal email server, is first used to validate feature extraction and machine learning algorithms and then to generate insight into the interactions within the center. Based solely on email metadata, a random forest approach models behavior patterns and predicts employee job titles with 96% accuracy. This result reflects classifier performance not only on participants in the study but also on other members of the center who were connected to participants through email. Furthermore, the data revealed relationships not present in the center's formal operating structure. The culmination of this work is an organic organizational chart, which conveys a fuller picture of the center's internal structure than the official organizational chart does.
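
The abstract does not list the features the thesis extracts, so the following is only a minimal sketch of what metadata-only feature extraction could look like. The record layout (sender, recipient, hour sent), the participant IDs, the working-hours cutoff, and the four features are all assumptions made for illustration, not the thesis's method. Per-person vectors of this kind are what a classifier such as a random forest would take as input.

```python
from collections import defaultdict

# Hypothetical anonymized metadata records: (sender, recipient, hour_sent).
# No subject lines or message bodies are used.
MESSAGES = [
    ("p1", "p2", 9),
    ("p1", "p3", 10),
    ("p2", "p1", 14),
    ("p3", "p1", 22),
    ("p1", "p2", 11),
]


def extract_features(messages):
    """Return a per-person feature dict derived only from email metadata."""
    sent = defaultdict(int)
    received = defaultdict(int)
    contacts = defaultdict(set)
    after_hours = defaultdict(int)

    for sender, recipient, hour in messages:
        sent[sender] += 1
        received[recipient] += 1
        contacts[sender].add(recipient)
        contacts[recipient].add(sender)
        # Assumed working day of 08:00-18:00; anything else is "after hours".
        if hour < 8 or hour >= 18:
            after_hours[sender] += 1

    people = sorted(set(sent) | set(received))
    return {
        person: {
            "sent": sent[person],
            "received": received[person],
            "distinct_contacts": len(contacts[person]),
            "after_hours_sent": after_hours[person],
        }
        for person in people
    }


features = extract_features(MESSAGES)
```

Each value in `features` can be flattened into a fixed-order numeric row and paired with a job-title label for training. The same sender and recipient counts could also serve as edge weights when sketching an email-derived organizational chart.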

Keywords

Data analytics, Machine learning, Social computing
