Detecting audio copy-move forgery with an artificial neural network


AKDENİZ F., BECERİKLİ Y.

SIGNAL IMAGE AND VIDEO PROCESSING, vol.18, no.3, pp.2117-2133, 2024 (SCI-Expanded)

  • Publication Type: Article / Full Article
  • Volume: 18 Issue: 3
  • Publication Date: 2024
  • DOI: 10.1007/s11760-023-02856-w
  • Journal Name: SIGNAL IMAGE AND VIDEO PROCESSING
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, Compendex, INSPEC, zbMATH
  • Page Numbers: pp.2117-2133
  • Keywords: Artificial neural network, Audio copy-move forgery, Audio forensics, Audio tampering, Digital multimedia forensics, Digital multimedia security
  • Kocaeli University Affiliated: Yes

Abstract

Given how easily audio data can be obtained, audio recordings are subject to both malicious and non-malicious tampering and manipulation that can compromise their integrity and reliability. Because audio recordings are used in many strategic areas, detecting such tampering and manipulation is critical. Although the literature lacks an accurate, integrated system for detecting copy-move forgery, the field shows great promise for research. Our proposed method therefore supports the detection of the passive technique of audio copy-move forgery. For our study, forged audio data were generated from the TIMIT dataset, and 4378 audio recordings were used: 2189 original recordings and 2189 recordings created by copy-move forgery. After the voiced and unvoiced regions of each audio signal were determined with the Yet Another Algorithm for Pitch Tracking (YAAPT), features were extracted using Mel frequency cepstrum coefficients (MFCCs), delta MFCCs, and delta-delta MFCCs together, along with linear prediction coefficients (LPCs). Those features were then classified with artificial neural networks. Our experimental results show detection accuracies of 75.34% with MFCCs, 73.97% with delta MFCCs, 72.37% with delta-delta MFCCs, 76.48% with the combined MFCC + delta MFCC + delta-delta MFCC features, and 74.77% with LPCs. With the combined MFCC + delta MFCC + delta-delta MFCC features, in which all three feature types are used together, the models give far superior results even with relatively few epochs. The proposed method is also more robust than other methods in the literature because it does not rely on threshold values.
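The delta and delta-delta coefficients mentioned above are time derivatives of the MFCC trajectories, and the combined feature set stacks all three. The following is a minimal sketch of that combination step, not the paper's implementation: the MFCC matrix here is a random placeholder (in the study the MFCCs come from voiced segments selected by YAAPT), and the regression formula with half-window N=2 is the common textbook definition of delta features, assumed rather than taken from the paper.

```python
import numpy as np

def delta(feat, N=2):
    """Delta (time-derivative) coefficients of a feature trajectory.

    feat: (n_coeff, n_frames) matrix, e.g. MFCCs over time.
    Standard regression formula:
        d_t = sum_{n=1..N} n * (c_{t+n} - c_{t-n}) / (2 * sum_{n=1..N} n^2)
    Edges are handled by repeating the first/last frame.
    """
    n_frames = feat.shape[1]
    denom = 2 * sum(n * n for n in range(1, N + 1))
    padded = np.pad(feat, ((0, 0), (N, N)), mode="edge")
    d = np.zeros_like(feat, dtype=float)
    for t in range(n_frames):
        acc = np.zeros(feat.shape[0])
        for n in range(1, N + 1):
            acc += n * (padded[:, t + N + n] - padded[:, t + N - n])
        d[:, t] = acc / denom
    return d

# Placeholder MFCC matrix: 13 coefficients x 100 frames (random stand-in
# for MFCCs extracted from YAAPT-selected voiced regions).
rng = np.random.default_rng(0)
mfcc = rng.standard_normal((13, 100))

d1 = delta(mfcc)                      # delta MFCC
d2 = delta(d1)                        # delta-delta MFCC
combined = np.vstack([mfcc, d1, d2])  # MFCC + delta + delta-delta
print(combined.shape)                 # (39, 100)
```

Each 39-dimensional frame vector (or a summary of it) would then be fed to the neural-network classifier; the classifier architecture itself is not specified in this abstract.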