Xinghui Wu, Xi'an Jiaotong University; Shiqing Ma, University of Massachusetts Amherst; Chao Shen and Chenhao Lin, Xi'an Jiaotong University; Qian Wang, Wuhan University; Qi Li, Tsinghua University; Yuan Rao, Xi'an Jiaotong University
Prior researchers show that existing automatic speech recognition (ASR) systems are vulnerable to adversarial examples. Most existing adversarial attacks against ASR systems are either white- or gray-box, limiting their practical usage in the real world. Some black-box attacks also assume the knowledge of output probability vectors to infer output distribution. Other black-box attacks leverage inefficient heavyweight processes, i.e., training auxiliary models or estimating gradients. Moreover, they require input-specific and manual hyperparameter tuning to improve the attack success rate against a specific ASR system. Despite such a heavyweight tuning process, nearly or even more than half of the generated adversarial examples are perceptible to humans.
This paper designs KENKU, an efficient and stealthy black-box adversarial attack framework against ASRs, supporting hidden voice command and integrated command attacks. It optimizes the novel acoustic feature loss and perturbation loss, based on Mel-frequency Cepstral Coefficients (MFCC). Both loss values can be calculated locally, avoiding training auxiliary models or estimating gradients, making the attack efficient. Furthermore, we introduce a hyperparameter in optimization that balances the attack effectiveness and imperceptibility automatically. KENKU uses the binary search algorithm to find its optimal value. We evaluated our prototype on eight real-world systems (including five digital and three physical attacks) and compared KENKU with five state-of-the-art works. Results show that KENKU can outperform existing works in the attack performance.
Open Access Media
USENIX is committed to Open Access to the research presented at our events. Papers and proceedings are freely available to everyone once the event begins. Any video, audio, and/or slides that are posted after the event are also free and open to everyone. Support USENIX and our commitment to Open Access.
author = {Xinghui Wu and Shiqing Ma and Chao Shen and Chenhao Lin and Qian Wang and Qi Li and Yuan Rao},
title = {{KENKU}: Towards Efficient and Stealthy Black-box Adversarial Attacks against {ASR} Systems},
booktitle = {32nd USENIX Security Symposium (USENIX Security 23)},
year = {2023},
isbn = {978-1-939133-37-3},
address = {Anaheim, CA},
pages = {247--264},
url = {https://www.usenix.org/conference/usenixsecurity23/presentation/wu-xinghui},
publisher = {USENIX Association},
month = aug
}