Abstract: We describe a powerful new family of attacks that recover sensitive information about individuals using only simple summary statistics computed on a dataset. Notably, our attacks succeed under minimal assumptions on the distribution of the data, even if the attacker has very limited information about this distribution, and even if the summary statistics are significantly distorted. Our attacks build on and generalize the method of fingerprinting codes for proving lower bounds in differential privacy, and also extend the practical attacks on genomic datasets demonstrated by Homer et al. Surprisingly, the amount of noise that our attacks can tolerate is nearly matched by the amount of noise required to achieve differential privacy, meaning that the robust privacy guarantees of differential privacy come at almost no cost in our model.
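To give a concrete feel for this style of attack, the sketch below simulates a simple tracing (membership-inference) test against noisy per-attribute means, in the spirit of the attacks described above and of Homer et al.'s test. All of the specifics (the parameters n, d, and sigma, the uniform prior on attribute frequencies, and the exact correlation score) are illustrative assumptions for this sketch, not details from the talk.

```python
import numpy as np

# Toy simulation of a tracing attack from noisy summary statistics.
# Every parameter here is an illustrative assumption, not taken from the talk.
rng = np.random.default_rng(0)

n, d = 20, 20_000        # n individuals in the dataset, d binary attributes
sigma = 0.05             # noise added to each released per-attribute mean

# Population attribute frequencies; note the attacker never uses these directly.
p = rng.uniform(0.05, 0.95, size=d)

def sample_people(k):
    """Draw k fresh individuals; attribute j equals 1 with probability p[j]."""
    return (rng.random((k, d)) < p).astype(float)

dataset = sample_people(n)

# The released "simple summary statistics": per-attribute means, deliberately distorted.
released = dataset.mean(axis=0) + rng.normal(0.0, sigma, size=d)

def score(target, reference):
    """Correlate (target - reference) with the released means.

    A member's attributes pull the dataset's means toward their own values,
    so members tend to score higher than fresh draws from the population.
    The attacker needs only the target's data and a single reference sample."""
    return float(np.dot(target - reference, released))

reference = sample_people(1)[0]      # the attacker's one reference sample
members = dataset                    # people who are in the dataset
non_members = sample_people(n)       # fresh people from the same population

member_scores = [score(x, reference) for x in members]
non_member_scores = [score(x, reference) for x in non_members]

print("mean member score:    ", np.mean(member_scores))
print("mean non-member score:", np.mean(non_member_scores))
```

In this toy setting the member scores separate visibly from the non-member scores even though each released mean is perturbed; how much distortion such attacks can tolerate, and how that compares to the noise needed for differential privacy, is what the talk makes precise.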
Based on joint work with Cynthia Dwork, Adam Smith, Thomas Steinke, and Salil Vadhan.
Bio: Jon Ullman is an assistant professor in the College of Computer and Information Science at Northeastern University. His research addresses questions like "when and how can we analyze sensitive datasets without compromising privacy" and "how can we prevent false discovery in the empirical sciences" using tools from cryptography, machine learning, algorithms, and game theory. Prior to joining Northeastern, he completed his PhD at Harvard University and was in the inaugural class of junior fellows in the Simons Society of Fellows.