Deepfake Videos in the Wild: Analysis and Detection

Authors: Pu, Jiameng; Mangaokar, Neal; Kelly, Lauren; Bhattacharya, Parantapa; Sundaram, Kavya; Javed, Mobin; Wang, Bolun; Viswanath, Bimal
Dates: 2023-03-07, 2023-03-07, 2021-04
Identifier: 978-1-4503-8312-7/21/04
URI: http://hdl.handle.net/10919/114052

Abstract: AI-manipulated videos, commonly known as deepfakes, are an emerging problem. Recently, researchers in academia and industry have contributed several (self-created) benchmark deepfake datasets and deepfake detection algorithms. However, little effort has gone towards understanding deepfake videos in the wild, leading to a limited understanding of the real-world applicability of research contributions in this space. Even if detection schemes are shown to perform well on existing datasets, it is unclear how well the methods generalize to real-world deepfakes. To bridge this gap in knowledge, we make the following contributions: First, we collect and present the largest dataset of deepfake videos in the wild, containing 1,869 videos from YouTube and Bilibili, and extract over 4.8M frames of content. Second, we present a comprehensive analysis of the growth patterns, popularity, creators, manipulation strategies, and production methods of deepfake content in the real world. Third, we systematically evaluate existing defenses using our new dataset, and observe that they are not ready for deployment in the real world. Fourth, we explore the potential for transfer learning schemes and competition-winning techniques to improve defenses.

Format: application/pdf
Language: en
License: Creative Commons Attribution 4.0 International
Keywords: Deepfake videos; Deepfake detection; Deepfake datasets
Type: Conference proceeding
DOI: https://doi.org/10.1145/3442381.3449978