A mathematical guarantee that the of privacy wherein data is partitioned so that not everything is leaked.

Formal Definition

  • - datasets (one with individual data)
  • - training algorithm
  • - any subset of trained models (i.e, models with >90% accuracy)
  • - privacy loss (how much one person is used)
  • - slack allowing epsilon bound not to hold

Intuition

Pick any property of the trained model you care about (i.e performance, accuracy, etc). If you retrain the model many times, the model having that property doesn’t change much (at most factor when someone’s data is added or removed)

Concepts