Authors: Bakary Badjie, José Cecílio, António Casimiro
Date: March 2023
DOI Bookmark: https://doi.org/10.48550/arXiv.2303.15901
The paper titled “Denoising Autoencoder-based Defensive Distillation as an Adversarial Robustness Algorithm” was accepted to the Work-in-Progress Track at the 27th Ada-Europe International Conference on Reliable Software Technologies (AEiC 2023).
The paper is part of the safety work package of the VEDLIoT project. It addresses the lack of robustness of deep neural networks against data poisoning adversarial attacks.
This work is an essential contribution towards ensuring the safety and reliability of deep learning models in real-world settings.
Adversarial attacks significantly threaten the robustness of deep neural networks (DNNs). Despite the many defensive methods that have been proposed, DNNs remain vulnerable to poisoning attacks, in which attackers tamper with the initial training data. To defend DNNs against such attacks, this work proposes a novel method that combines the defensive distillation mechanism with a denoising autoencoder (DAE). The technique aims to lower the sensitivity of the distilled model to poisoning attacks by detecting and reconstructing poisoned adversarial inputs in the training data. To assess the proposed method’s performance, we added carefully crafted adversarial samples to the initial training data. Our experimental findings show that the method successfully identified and reconstructed the poisoned inputs while also improving the DNN’s resilience. The proposed approach provides a robust defense mechanism for DNNs in applications where data poisoning attacks are a concern, and thereby overcomes the limitation of defensive distillation with respect to poisoning attacks.
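To make the two ingredients named in the abstract more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It assumes the standard formulation of defensive distillation (a softmax softened by a temperature, with the student trained on the teacher's soft labels) and, purely for illustration, a reconstruction-error threshold as the rule for flagging suspicious training samples; the abstract does not state how detection is done. All function names, the `denoise` callable (a stand-in for a trained DAE), and the `threshold` parameter are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over a list of logits, softened by a temperature T > 0."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_label_cross_entropy(student_logits, teacher_logits, temperature):
    """Cross-entropy of the student's softened output against the
    teacher's soft labels, both computed at the same temperature."""
    teacher_probs = softmax_with_temperature(teacher_logits, temperature)
    student_probs = softmax_with_temperature(student_logits, temperature)
    return -sum(p * math.log(q) for p, q in zip(teacher_probs, student_probs))


def reconstruction_error(x, x_reconstructed):
    """Mean squared error between an input and its reconstruction."""
    return sum((a - b) ** 2 for a, b in zip(x, x_reconstructed)) / len(x)


def sanitize(samples, denoise, threshold):
    """Assumed detection rule: a sample whose reconstruction error exceeds
    `threshold` is flagged and replaced by its reconstruction.

    `denoise` is a hypothetical callable standing in for a trained DAE.
    Returns the cleaned samples and the indices of the flagged ones.
    """
    cleaned, flagged = [], []
    for i, x in enumerate(samples):
        x_hat = denoise(x)
        if reconstruction_error(x, x_hat) > threshold:
            cleaned.append(x_hat)
            flagged.append(i)
        else:
            cleaned.append(x)
    return cleaned, flagged
```

In such a pipeline, the training set would first pass through `sanitize`, the teacher would then be trained on the cleaned data at a temperature above 1, and the student would be trained to minimise `soft_label_cross_entropy` against the teacher's outputs. The temperature and threshold values are tuning choices that this sketch leaves open.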