Exploring the difficulty of hiding keys in neural networks

Supervisor(s): Dr. Cecilia Pasquini


The emerging sub-field of adversarial machine learning (more precisely: machine learning in adversarial environments) has established a taxonomy of attacks that are performed during the training or inference phase of machine learning tasks and violate various protection goals. Deep neural networks (DNNs) appear particularly vulnerable to adversarial examples, i.e., inputs crafted to cause misclassification.

To defend neural networks against malicious attacks, recent approaches propose the use of secret keys in the training or inference pipelines of learning systems. However, these proposals often do not discuss whether the key actually remains secret.

The goal of this thesis is to explore this issue for the case of a recently proposed key-based DNN. The thesis should experimentally measure the leakage of key information under selected attacker models.


  • Papernot, N., McDaniel, P., Sinha, A., and Wellman, M.P. SoK: Security and Privacy in Machine Learning. In IEEE European Symposium on Security and Privacy (EuroS&P). 2018, pp. 399–414.
  • Shumailov, I., Zhao, Y., Mullins, R., and Anderson, R. The Taboo Trap: Behavioural Detection of Adversarial Samples. CoRR, abs/1811.07375. 2018.