Paper published on a website (Scientific congresses, symposiums and conference proceedings)
Statistics-aware Audio-visual Deepfake Detector
ASTRID, Marcella; GHORBEL, Enjie; AOUADA, Djamila
2024 IEEE International Conference on Image Processing (ICIP 2024)
Peer reviewed
 

Files


Full Text
2407.11650v2.pdf
Author preprint (773.33 kB) Creative Commons License - Attribution




Details



Keywords :
Computer Science - Computer Vision and Pattern Recognition; Computer Science - Multimedia; Computer Science - Sound; Electrical Engineering and Systems Science - Audio and Speech Processing (eess.AS)
Abstract :
In this paper, we propose an enhanced audio-visual deepfake detection method. Recent methods in audio-visual deepfake detection mostly assess the synchronization between audio and visual features. Although they have shown promising results, they are based on the maximization/minimization of isolated feature distances without considering feature statistics. Moreover, they rely on cumbersome deep learning architectures and are heavily dependent on empirically fixed hyperparameters. Herein, to overcome these limitations, we propose: (1) a statistical feature loss to enhance the discrimination capability of the model, instead of relying solely on feature distances; (2) using the waveform to describe the audio, in place of frequency-based representations; (3) a post-processing normalization of the fakeness score; (4) the use of a shallower network to reduce the computational complexity. Experiments on the DFDC and FakeAVCeleb datasets demonstrate the relevance of the proposed method.
Research center :
ULHPC - University of Luxembourg: High Performance Computing
Disciplines :
Computer science
Author, co-author :
ASTRID, Marcella; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2
GHORBEL, Enjie; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust > CVI2 > Team Djamila AOUADA
AOUADA, Djamila; University of Luxembourg > Interdisciplinary Centre for Security, Reliability and Trust (SNT) > CVI2
External co-authors :
no
Language :
English
Title :
Statistics-aware Audio-visual Deepfake Detector
Publication date :
October 2024
Event name :
IEEE International Conference on Image Processing (ICIP 2024)
Event place :
Abu Dhabi, United Arab Emirates
Event date :
27-30 October 2024
Audience :
International
Peer reviewed :
Peer reviewed
FnR Project :
FNR16353350 - Deepfake Detection Using Spatio-temporal-spectral Representations For Effective Learning, 2021 (01/03/2022-28/02/2025) - Djamila Aouada
Name of the research project :
U-AGR-7133 - BRIDGES2021/IS/16353350/FakeDeTeR_Post - AOUADA Djamila
Funders :
FNR - Luxembourg National Research Fund
Funding number :
BRIDGES2021/IS/16353350/FaKeDeTeR
Commentary :
Accepted in ICIP 2024
Available on ORBilu :
since 18 July 2024

Statistics


Number of views
141 (16 by Unilu)
Number of downloads
45 (2 by Unilu)
