István Fábián and Gábor György Gulyás

De-anonymizing Facial Recognition Embeddings

Advances in machine learning and increasingly cheap hardware have made smart cameras equipped with facial recognition unprecedentedly widespread worldwide. While this undeniably has great potential for a wide spectrum of uses, it also bears novel risks. In our work, we consider one specific risk related to face embeddings, which are metric values created by machine learning that describe a person's face. While embeddings seem to be arbitrary numbers to the naked eye and are hard for humans to interpret, we argue that some basic demographic attributes can be estimated from them, and that these values can then be used to look up the original person on social networking sites. We propose an approach for creating synthetic, life-like datasets consisting of embeddings and demographic data of several people. We show over these ground truth datasets that the aforementioned re-identification attacks do not require expert skills in machine learning to be executed. In our experiments, we find that even with simple machine learning models the proportion of successfully re-identified people varies between 6.04% and 28.90%, depending on the population size of the simulation.

DOI: 10.36244/ICJ.2020.2.7


Please cite this paper as follows:

István Fábián and Gábor György Gulyás, "De-anonymizing Facial Recognition Embeddings", Infocommunications Journal, Vol. XII, No 2, July 2020, pp. 50-56. DOI: 10.36244/ICJ.2020.2.7
