COMMSEC: Breaking Fake Voice Detection with Speaker-Irrelative Features

Voice is a vital medium for transmitting information. Advances in speech synthesis technology now produce high-quality synthetic voices that the human ear cannot distinguish from real speech. These fake voices have been widely used in realistic Deepfake production and other malicious activities, raising serious security and privacy concerns. In response, many studies have proposed fake voice detectors and reported excellent performance. However, is the story really over?

In this research, we propose SiFDetectCracker, a black-box adversarial attack framework against fake voice detection based on Speaker-Irrelative Features (SiFs), that is, features of the audio that do not depend on the speaker. We select background noise and the silent segments before and after the speaker's voice as the primary attack features. Manipulating these features in synthesized speech causes the fake voice detector to make erroneous judgments. Experimental results indicate that SiFDetectCracker achieves a success rate exceeding 80% in circumventing existing state-of-the-art fake voice detection systems. We also analyze why current detectors are sensitive to this adversarial attack.
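To make the two attack features concrete, the following is a minimal sketch of where such a perturbation touches the signal. It is not the SiFDetectCracker implementation, whose perturbation and search strategy are not described in this abstract. The function name `perturb_sifs`, its parameters, and the use of Gaussian noise are all assumptions chosen for illustration. The sketch adds low-level background noise to the spoken region at a chosen signal-to-noise ratio and pads the clip with near-silent segments before and after the voice.

```python
import math
import random


def perturb_sifs(speech, sample_rate, lead_s=0.5, trail_s=0.5,
                 snr_db=30.0, pad_noise_std=1e-3, seed=0):
    """Hypothetical illustration: alter only speaker-independent parts of a clip.

    speech        -- non-empty list of float samples in [-1, 1]
    lead_s/trail_s -- seconds of near-silence added before/after the voice
    snr_db        -- signal-to-noise ratio of the added background noise
    pad_noise_std -- standard deviation of the noise inside the silent padding
    """
    if not speech:
        raise ValueError("speech must contain at least one sample")

    rng = random.Random(seed)

    # Scale the background noise so that signal power / noise power = 10^(snr_db / 10).
    signal_power = sum(x * x for x in speech) / len(speech)
    noise_std = math.sqrt(signal_power / (10 ** (snr_db / 10)))
    noisy_speech = [x + rng.gauss(0.0, noise_std) for x in speech]

    # Near-silent segments placed before and after the speaker's voice.
    lead = [rng.gauss(0.0, pad_noise_std) for _ in range(int(lead_s * sample_rate))]
    trail = [rng.gauss(0.0, pad_noise_std) for _ in range(int(trail_s * sample_rate))]

    return lead + noisy_speech + trail


# Usage with a one-second synthetic tone standing in for synthesized speech.
sample_rate = 16000
speech = [0.1 * math.sin(2 * math.pi * 220 * n / sample_rate)
          for n in range(sample_rate)]
perturbed = perturb_sifs(speech, sample_rate)
```

In a black-box setting, an attacker would presumably vary parameters such as these while observing only the detector's output. How the framework actually selects its perturbations is covered in the talk, not in this sketch.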

Xuan Hai

PhD Student

Lanzhou University

Xin Liu

Associate Professor

Lanzhou University

Yuan Tan

Master Student

Lanzhou University

Song Li

Professor

Zhejiang University

See you in Bangkok!

HACK IN THE BOX PTE LTD