From f1a36ef966cecc434f2aa1b74f627e2faef390ec Mon Sep 17 00:00:00 2001
From: fria <138676274+friadev@users.noreply.github.com>
Date: Tue, 1 Jul 2025 08:06:55 -0500
Subject: [PATCH] add info about noise

---
 blog/posts/differential-privacy.md | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/blog/posts/differential-privacy.md b/blog/posts/differential-privacy.md
index 12d7c972..f554e33f 100644
--- a/blog/posts/differential-privacy.md
+++ b/blog/posts/differential-privacy.md
@@ -23,6 +23,11 @@ It's useful to collect data from a large group of people. You can see trends in
 
 Latanya Sweeney in a [paper](https://dataprivacylab.org/projects/identifiability/paper1.pdf) from 2000 used U.S. Census data to try and re-identify people solely based on the metrics available to her. She found that 87% of Americans could be identified based on only 3 metrics: ZIP code, date of birth, and sex.
 
+Obviously, being able to identify individuals based on publicly available data is a huge privacy issue.
+
 ## History
 
-Most of the concepts I write about seem to come from the 70's and 80's, but differential privacy is a relatively new concept. It was first introduced in a paper from 2006 called [*Calibrating Noise to Sensitivity in Private Data Analysis*](https://desfontain.es/PDFs/PhD/CalibratingNoiseToSensitivityInPrivateDataAnalysis.pdf)
\ No newline at end of file
+Most of the concepts I write about seem to come from the '70s and '80s, but differential privacy is a relatively new concept. It was first introduced in a 2006 paper called [*Calibrating Noise to Sensitivity in Private Data Analysis*](https://desfontain.es/PDFs/PhD/CalibratingNoiseToSensitivityInPrivateDataAnalysis.pdf).
+
+The paper introduces the idea of adding noise to data to achieve privacy. Of course, adding noise to the dataset reduces its accuracy. The parameter ε controls how much noise is added: a small ε means more noise, which gives more privacy but less accurate data, and vice versa.
+
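
To make the ε trade-off in the added paragraph concrete: the 2006 paper calibrates Laplace-distributed noise to a query's sensitivity divided by ε. Below is a minimal Python sketch of that idea, assuming a simple counting query whose sensitivity is 1; the function name `private_count` and the example numbers are illustrative, not from the paper or the patch.

```python
import numpy as np

def private_count(true_count: int, epsilon: float) -> float:
    """Illustrative sketch of the Laplace mechanism for a counting query.

    A count changes by at most 1 when any one person is added to or
    removed from the dataset, so the query's sensitivity is 1. The
    mechanism adds noise drawn with scale = sensitivity / epsilon.
    """
    sensitivity = 1.0
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return true_count + noise

# A small epsilon means a larger noise scale (more privacy, less accuracy);
# a large epsilon means a smaller noise scale (less privacy, more accuracy).
print(private_count(10_000, epsilon=0.1))   # heavily noised
print(private_count(10_000, epsilon=10.0))  # close to the true count
```

Note that the noise scale depends only on the query's sensitivity and ε, not on the dataset's contents or size, which is the calibration the paper's title refers to.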