Artificial intelligence (AI) technology has been widely deployed and has made human life far more convenient. It underpins many applications, such as object detection, automatic speech recognition (ASR), natural language processing, and autonomous driving. However, prior work has shown that most AI algorithms based on deep neural networks are vulnerable to adversarial examples: by adding carefully crafted, human-imperceptible perturbations to the input data, an attacker can cause the model to make a highly confident but wrong prediction. To address this problem, adversarial training was proposed to improve the robustness of AI models, and a large body of research has shown it to be an effective defense against adversarial attacks.
However, is adversarial training an unqualified good? Unfortunately, the answer is no. Adversarial training typically requires more training time and more training data than standard training. More interestingly, we find that it also makes models more vulnerable to privacy attacks.
In this work, we show that training data can be reconstructed from gradient information alone, and that federated learning (FL) is therefore not as secure as commonly assumed. Moreover, we present the first investigation of the relationship between model robustness and data privacy. Our experimental results indicate that when adversarial training is used to enhance the robustness of AI models, it simultaneously increases the risk of data leakage.
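To make the FL threat setting concrete, the following sketch shows the information an honest-but-curious server actually observes: per-batch gradients, never raw data. The model (a single linear layer), input sizes, and learning rate here are illustrative assumptions, not the paper's experimental setup.

```python
import torch
import torch.nn.functional as F

# Hypothetical FedSGD round: the client uploads a gradient computed on
# its private batch; the raw images never leave the device.
torch.manual_seed(0)
W = torch.zeros(4, 64, requires_grad=True)  # shared linear-model weights

def client_update(x, y):
    """Return the gradient a client would upload for one private batch."""
    logits = x.flatten(1) @ W.t()
    loss = F.cross_entropy(logits, y)
    (g,) = torch.autograd.grad(loss, [W])
    return g

x_private = torch.rand(2, 1, 8, 8)   # client's private images (assumed 8x8)
y_private = torch.tensor([1, 3])
g = client_update(x_private, y_private)

# Server side: aggregate uploaded gradients and update the global model.
# This shared gradient g is exactly what a reconstruction attack exploits.
W.data -= 0.1 * g
print(g.shape)  # torch.Size([4, 64])
```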
We adopt a gradient-matching approach: starting from random noise, we iteratively update a dummy image until it resembles the original data. In effect, we treat the "data" as a trainable "model" and optimize it, which lets us extract substantial information from the model parameters. We compare the full reconstruction process on a normally trained model and on a robust model, using PSNR and MSE to evaluate the quality of the stolen data. The results show that data stolen from the robust model is much clearer, i.e., the robust model is far more vulnerable than the normal model.
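The gradient-matching idea can be sketched as follows, in the style of deep-leakage-from-gradients attacks: a dummy image and soft label are themselves treated as the optimization variables, and their gradient is pushed toward the victim's shared gradient. The tiny linear classifier, image size, and iteration count are assumptions for illustration, not the paper's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)
model = nn.Sequential(nn.Flatten(), nn.Linear(64, 4))  # stand-in victim model

# Victim side: the gradient that would be shared in federated learning.
x_true = torch.rand(1, 1, 8, 8)
y_true = torch.tensor([2])
loss = F.cross_entropy(model(x_true), y_true)
true_grads = torch.autograd.grad(loss, model.parameters())

# Attacker side: a random-noise image and label are the "parameters"
# being optimized so that their gradient matches the shared one.
x_dummy = torch.rand_like(x_true, requires_grad=True)
y_dummy = torch.randn(1, 4, requires_grad=True)
opt = torch.optim.LBFGS([x_dummy, y_dummy])
mse0 = F.mse_loss(x_dummy.detach(), x_true).item()  # quality before attack

def closure():
    opt.zero_grad()
    pred = model(x_dummy)
    # Cross-entropy against the (softmaxed) dummy label.
    dummy_loss = -(y_dummy.softmax(-1) * pred.log_softmax(-1)).sum(1).mean()
    dummy_grads = torch.autograd.grad(dummy_loss, model.parameters(),
                                      create_graph=True)
    # Gradient-matching objective: squared distance between gradients.
    diff = sum(((dg - tg) ** 2).sum()
               for dg, tg in zip(dummy_grads, true_grads))
    diff.backward()
    return diff

for _ in range(30):
    opt.step(closure)

# Reconstruction quality: lower MSE / higher PSNR means clearer stolen data.
mse = F.mse_loss(x_dummy.detach(), x_true).item()
psnr = 10 * torch.log10(torch.tensor(1.0 / mse)).item()  # pixel peak = 1.0
print(f"MSE before={mse0:.4f}  after={mse:.4f}  PSNR={psnr:.1f} dB")
```

Running the same loop against a normally trained and an adversarially trained model, and comparing the resulting PSNR/MSE, is exactly the comparison described above.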
To the best of our knowledge, we are the first to identify, through the lens of privacy leakage from gradients, a trade-off between data privacy and model robustness. Our findings deliver an important message: focusing only on model robustness may lead to a false sense of security. They open the door to future work that jointly pursues adversarial training and privacy preservation.