Ashwin Machanavajjhala's talk at the DB seminar

Pufferfish: A Semantic Approach to Customizable Privacy

 

Abstract

Tremendous amounts of personal data about individuals are being collected and shared online. Legal requirements and an increase in public awareness due to egregious breaches of individual privacy have made statistical privacy an important field of research. Recent research, culminating in the development of a powerful notion called differential privacy, has transformed this field from a black art into a rigorous mathematical discipline.

This talk highlights a key open challenge in statistical privacy: applications for data collection and analysis operate on diverse kinds of data, and have diverse requirements for the information that must be kept secret and the adversaries that must be tolerated. Thus application domain experts, who are frequently not privacy experts, cannot directly use an existing, general-purpose privacy definition. They must instead develop a new definition or customize an existing one. Currently there exist no rigorous techniques to customize privacy to applications.

I will motivate this challenge using a general impossibility result, called the No Free Lunch Theorem, which states that one can't simultaneously guarantee both utility and privacy for all types of data. In the context of differential privacy, I will show (i) a theoretical bound on the maximum utility in applications like social recommendations, and (ii) that differential privacy does not limit the ability of an attacker to learn sensitive information when the data are correlated (e.g., social networks). I will then present Pufferfish, an alternative rigorous semantic approach to defining privacy that allows us to customize privacy to the needs of an application. Pufferfish explicitly specifies which information is kept secret and which adversaries are tolerated, and provides statistical bounds on the information disclosed about each secret to each adversary. I will conclude with ongoing work on developing privacy mechanisms and definitions (using Pufferfish) for handling correlated data.

 

Bio

Ashwin Machanavajjhala is an Assistant Professor in the Department of Computer Science, Duke University. Previously, he was a Senior Research Scientist in the Knowledge Management group at Yahoo! Research. His primary research interests lie in data privacy, systems for massive data analytics, and statistical methods for information extraction and entity resolution. Ashwin graduated with a Ph.D. from the Department of Computer Science, Cornell University. His thesis work on defining and enforcing privacy was awarded the 2008 ACM SIGMOD Jim Gray Dissertation Award Honorable Mention. He has also received an M.S. from Cornell University and a B.Tech in Computer Science and Engineering from the Indian Institute of Technology, Madras.