Title: Differential Privacy for Statistics

Abstract

Much of the prior work on privacy focuses on classifying attributes as sensitive or non-sensitive; we focus instead on privacy-preserving statistical analysis of data. The whole point of a statistical database is to teach general truths, for example, that smoking causes cancer. However, learning this fact can potentially reveal whether certain individuals will develop cancer, even if they are not in the database. Differential privacy arose in this context, aiming to constrain a computation so that an adversary's ability to inflict harm (or good) on any individual is essentially the same whether that individual opts in to, or opts out of, the database.
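As a reference point for the informal description above, one standard formal statement of the guarantee (not necessarily the exact formulation used in the presentation) is the following: a randomized mechanism M is \epsilon-differentially private if, for all pairs of databases D and D' differing in at most one record, and for every set of outputs S,

    \Pr[M(D) \in S] \le e^{\epsilon} \cdot \Pr[M(D') \in S].

The parameter \epsilon controls the privacy loss: smaller \epsilon means the output distribution changes less when any single individual's data is added or removed.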
In this presentation, we motivate and review the definition of differential privacy and discuss how differentially private algorithms add random noise to the output of a computation without significantly distorting any individual answer. There are also settings in which adding noise to achieve privacy makes no sense. We consider the potential of applying probabilistic inference to improve the accuracy of existing approaches. We then show that the algorithms can be applied to personalized recommender systems, and that they can be adapted to protect the internal states of clickstream data.
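To make the noise-addition idea concrete, the following is a minimal Python sketch of the standard Laplace mechanism, which calibrates noise to a query's sensitivity; the names laplace_mechanism, sensitivity, and epsilon are illustrative assumptions and do not come from the presentation itself.

    import numpy as np

    def laplace_mechanism(true_answer, sensitivity, epsilon, rng=None):
        # sensitivity: the maximum change in the query answer when one
        # individual's record is added to or removed from the database.
        # epsilon: the privacy parameter; smaller epsilon means more
        # noise and a stronger privacy guarantee.
        rng = np.random.default_rng() if rng is None else rng
        # Laplace noise with scale sensitivity / epsilon is the
        # standard calibration for epsilon-differential privacy.
        noise = rng.laplace(loc=0.0, scale=sensitivity / epsilon)
        return true_answer + noise

    # Example: a counting query ("how many people in the database smoke?")
    # has sensitivity 1, since one person changes the count by at most 1.
    smokers = 412  # hypothetical true count, for illustration only
    private_count = laplace_mechanism(smokers, sensitivity=1.0, epsilon=0.1)
    print(private_count)

For an aggregate such as a count over many individuals, the added noise is small relative to the true answer, which is why each released statistic is not significantly distorted.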