Deepfake technologies, which allow malicious actors to produce fake images, videos, and audio clips, are reaching an unprecedented convergence of quality, scalability, and ease of use. It will soon be possible to mass-produce highly realistic synthetic content that is generated and spread faster than fake media detectors can manage. The proliferation of these technologies poses clear threats to society and democracy (for example, consider the dangers of widely shared videos in which politicians appear to give speeches they never made). The future of the information channels that we rely on when forming our beliefs and opinions appears to be on shaky ground unless detection technology can gain the upper hand. Synthetic audio detection is one key element of managing this threat.
Chengzhe Sun, Ehab AlBadawy, Siwei Lyu, Timothy Davison, Sarah R Robinson, Nathaniel Kavaler