Face De-identification of Drivers from NDS Data and Its Effectiveness in Human Factors





National Surface Transportation Safety Center for Excellence


Advancements in artificial intelligence (AI) and the Internet of Things (IoT) have made data the foundation for most technological innovations. As we embark on the era of big data analysis, broader access to quality data is essential for consistent advancement in research. Data sharing has therefore become increasingly important in all fields, including transportation safety research. It can accelerate research by providing access to more data and the ability to replicate experiments and validate results. However, data sharing also poses challenges, such as the need to protect the privacy of research participants and to address ethical and safety concerns when data contains personally identifiable information (PII). This report focuses on the problem of sharing drivers’ face videos for transportation research. Driver video collected through naturalistic driving studies (NDS) or simulator-based experiments contains useful information for driver behavior and human factors research. The report first gives an overview of the problems associated with sharing driver videos. It then demonstrates possible directions for data sharing by de-identifying drivers’ faces using AI-based techniques. We show that recent developments in generative adversarial networks (GANs) can effectively de-identify a person by swapping their face with that of another person. The results achieved through the proposed techniques were evaluated qualitatively and quantitatively to demonstrate the validity of such a system. Specifically, the report shows how face-swapping algorithms can effectively de-identify faces while preserving attributes important to human factors research, including eye movements, head movements, and mouth movements. Experiments were conducted to assess the validity of GAN-based face de-identification on faces with varied anthropometric measurements.
The participants in the data also had varied physical features, and the dataset spanned lighting conditions ranging from normal to extreme, which helped assess the robustness of the GAN-based techniques. The experiment covered over 115,000 frames to account for most naturalistic driving conditions. Error metrics for head movements, such as differences in roll, pitch, and yaw angles, were calculated. Similarly, errors in eye aspect ratio, lip aspect ratio, and pupil circularity were calculated, as these measures are important for assessing drivers’ secondary behaviors while driving. Additional error measures were computed to compare de-identified and original frame pairs more quantitatively. Next, we showed that a face can be swapped with artificially generated faces. We used GAN-based techniques to generate faces that were not present in the model’s training dataset and did not exist before the generation process; these synthetic faces were then swapped with the original faces in our experiments. This gives researchers additional flexibility in choosing the type of face they want to swap in. The report concludes by discussing possible measures for sharing such de-identified videos with the greater research community. Data sharing across disciplines helps build collaboration and advance research, but it is important to ensure that ethical and safety concerns are addressed when data contains PII. The techniques proposed in this report provide a way to share driver face videos while protecting the privacy of research participants; however, we recommend that such sharing still be done under proper guidance from institutional review boards and with a proper data use license.
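The per-frame metrics mentioned above can be illustrated with a minimal sketch. The eye aspect ratio below follows the common six-landmark convention from blink-detection work (Soukupová and Čech), and circularity uses the standard 4πA/P² definition; the function names, landmark ordering, and error comparison are illustrative assumptions, not the report's actual implementation.

```python
import numpy as np

def eye_aspect_ratio(eye):
    """EAR from six eye landmarks p1..p6 (outer corner, two upper-lid
    points, inner corner, two lower-lid points). Drops toward 0 when
    the eye closes, so the original/de-identified difference captures
    whether blinks survive face swapping."""
    a = np.linalg.norm(eye[1] - eye[5])   # vertical distance p2-p6
    b = np.linalg.norm(eye[2] - eye[4])   # vertical distance p3-p5
    c = np.linalg.norm(eye[0] - eye[3])   # horizontal distance p1-p4
    return (a + b) / (2.0 * c)

def pupil_circularity(area, perimeter):
    """4*pi*area / perimeter**2 -- equals 1.0 for a perfect circle and
    decreases as the visible pupil becomes occluded or elongated."""
    return 4.0 * np.pi * area / perimeter ** 2

def metric_error(original, deidentified):
    """Absolute per-frame error between a metric computed on the
    original frame and on its de-identified counterpart."""
    return abs(original - deidentified)

# Example: an open-eye hexagon of landmarks (x, y).
eye = np.array([[0, 0], [1, 1], [2, 1], [3, 0], [2, -1], [1, -1]], dtype=float)
print(eye_aspect_ratio(eye))        # (2 + 2) / (2 * 3) = 0.666...
print(metric_error(0.30, 0.28))     # per-frame EAR error of 0.02
```

For the hexagon above, both vertical distances are 2 and the horizontal distance is 3, giving EAR ≈ 0.667; aggregating `metric_error` over all frame pairs yields the kind of quantitative assessment the report describes. Lip aspect ratio follows the same vertical-over-horizontal pattern with mouth landmarks.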



naturalistic driving study, PII, AI, machine vision, privacy