
Commit

Merge pull request #57 from vantage6/49-change-request-chapter-1-needs-more-pictures

49 change request chapter 1 needs more pictures
dsmits authored Aug 13, 2024
2 parents 97cdd4f + ee193ed commit e99ed1f
Showing 10 changed files with 16 additions and 4 deletions.
20 changes: 16 additions & 4 deletions episodes/chap1_introduction_FL.md
@@ -38,7 +38,7 @@ In classic data analysis, all data is copied over into a single place. This make
use
conventional data analysis software and tools to gain insights.

![In classic analysis all the data is brought together](fig/classic%20analysis.jpg){alt='Two tables
![In classic analysis all the data is brought together](fig/chapter1/classic_analysis.jpg){alt='Two tables
with data are moved to a central location'}

Unfortunately this way of working does not respect the privacy of the people contained within the
@@ -74,7 +74,7 @@ will in this case be removed so that individuals stay anonymous. Data _pseudonim
similar process, but in this case, the records will be assigned an id that will make it
possible to link individuals across datasets.

![Data anonymization](fig%2Fdata_anonymization.jpg){alt='Names are censored before the datasets
![Data anonymization](fig/chapter1/data_anonymization.jpg){alt='Names are censored before the datasets
are sent to a central place'}
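
The linking idea behind pseudonymization can be sketched in a few lines. This is a minimal illustration, not the lesson's method: it assumes the pseudonym is derived from a keyed hash (HMAC-SHA256), and the key, field names, and helper functions are all hypothetical.

```python
import hashlib
import hmac

# Hypothetical shared secret; in practice a trusted party would manage this key.
SECRET_KEY = b"example-key-not-for-production"


def pseudonymize(identifier, key=SECRET_KEY):
    """Map an identifier to a stable pseudonym using a keyed hash."""
    normalized = identifier.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()[:16]


def pseudonymize_records(records, id_field="name"):
    """Replace the identifying field of each record with a pseudonym."""
    result = []
    for record in records:
        cleaned = {k: v for k, v in record.items() if k != id_field}
        cleaned["pseudo_id"] = pseudonymize(record[id_field])
        result.append(cleaned)
    return result


# Two made-up datasets that mention the same person.
hospital = [{"name": "Alice", "blood_pressure": 120}, {"name": "Bob", "blood_pressure": 135}]
pharmacy = [{"name": "alice", "prescriptions": 2}]

hospital_pseudo = pseudonymize_records(hospital)
pharmacy_pseudo = pseudonymize_records(pharmacy)
```

Both datasets now carry the same `pseudo_id` for Alice, so her records can be linked without storing her name. Anyone who holds the key can recompute the mapping, which is one reason pseudonymized data is still considered sensitive.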

While data anonymization and pseudonymization are often a good first step, there is no guarantee
@@ -93,11 +93,14 @@ over it. In this case, it is not the data that travels, but the analysis itself.
a query or instruction to the data and only the results will get back to the user.
The results are often akin to a form of _aggregation_ of the data. This can be in the shape of
traditional
statistics like the mean, or it can be more intricate like a machine learning model.
statistics like the mean, or it could be a combination of aggregations to form a more complicated
analysis.
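
As a rough sketch of such an aggregation, the mean over two data stations can be computed from local sums and counts alone. The station names, values, and function names below are invented for illustration.

```python
def local_aggregate(values):
    """Runs at a data station: return only the sum and count, never the rows."""
    return {"sum": sum(values), "count": len(values)}


def combine(aggregates):
    """Runs at the central point: combine local aggregates into a global mean."""
    total = sum(a["sum"] for a in aggregates)
    count = sum(a["count"] for a in aggregates)
    return total / count


# Made-up ages held by two separate data stations.
station_a = [34, 45, 29]
station_b = [51, 38]

partials = [local_aggregate(station_a), local_aggregate(station_b)]
global_mean = combine(partials)
```

Only `partials` leaves the stations. The central point obtains the same mean it would have computed on the pooled data.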

Aggregating the data does not ensure complete protection of person-level information, but it
certainly makes it less likely that such information is exposed.

![In federated analysis, an aggregation of the local data is sent to a central point. At the central point the local aggregations are combined. This can also be a small step in a larger analysis.](fig/chapter1/federated_sum.jpg)

Check warning on line 102 in episodes/chap1_introduction_FL.md

GitHub Actions / Build Full Site

[image missing alt-text]: fig/chapter1/federated_sum.jpg

TODO: Example of data leakage in simple aggregated case

## Federated learning
@@ -109,11 +112,13 @@ traditional federated learning, the clients train machine learning models, and o
the models are sent back to the central server. The central server combines the updates from all the
individual clients into one final machine learning model.
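
The train-locally, combine-centrally loop can be sketched with a toy model. This example assumes a one-parameter linear model and a sample-weighted average of the client weights (in the spirit of federated averaging). The data, learning rate, and function names are illustrative only.

```python
def local_update(w, data, lr=0.01, epochs=5):
    """Client side: refine the global weight on local data with gradient descent."""
    for _ in range(epochs):
        # Gradient of the mean squared error for the model y = w * x.
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w, len(data)


def federated_average(updates):
    """Server side: average the client weights, weighted by local sample count."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


# Two clients holding made-up samples that follow y = 3x.
client_data = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(3.0, 9.0), (4.0, 12.0), (5.0, 15.0)],
]

w_global = 0.0
for _ in range(20):
    updates = [local_update(w_global, data) for data in client_data]
    w_global = federated_average(updates)
```

The server only ever sees `updates`, yet `w_global` approaches the slope that fits the combined data.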

![In federated learning only the gradients of models are shared](fig/chapter1/federated_learning.jpg)

Check warning on line 115 in episodes/chap1_introduction_FL.md

GitHub Actions / Build Full Site

[image missing alt-text]: fig/chapter1/federated_learning.jpg

There are caveats to using this type of data analysis though. Although the data transmitted from the
clients to the server are an aggregation of the raw data, researchers have found a way to use this
data to reconstruct the original data. This vulnerability is called _gradient leakage_.
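
A toy case shows why gradients can leak data. The sketch below assumes a linear model with a bias term, trained on a single sample with squared-error loss. In that setting each weight gradient equals the bias gradient times the corresponding input value, so the input can be read off directly. Published attacks on deep networks are more involved and typically rely on iterative optimization.

```python
def gradients(w, b, x, target):
    """Gradients of the loss 0.5 * (w.x + b - target)^2 for one sample."""
    prediction = sum(wi * xi for wi, xi in zip(w, x)) + b
    error = prediction - target
    grad_w = [error * xi for xi in x]  # derivative with respect to each weight
    grad_b = error                     # derivative with respect to the bias
    return grad_w, grad_b


def reconstruct_input(grad_w, grad_b):
    """Recover the sample by dividing each weight gradient by the bias gradient."""
    return [gw / grad_b for gw in grad_w]


# Made-up private sample held by a client.
private_x = [0.2, 0.7, 0.1]
grad_w, grad_b = gradients(w=[0.5, -0.3, 0.8], b=0.1, x=private_x, target=1.0)

# The server only receives the gradients, yet it can recover the sample.
recovered = reconstruct_input(grad_w, grad_b)
```

The reconstruction fails only when the bias gradient is exactly zero, that is, when the model already predicts the target perfectly.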

![An example of gradient leakage](fig/gradient_leakage.jpg)
![An example of gradient leakage. The order might not be correct but the images are still very close to the originals.](fig/chapter1/gradient_leakage.jpg)

Check warning on line 121 in episodes/chap1_introduction_FL.md

GitHub Actions / Build Full Site

[image missing alt-text]: fig/chapter1/gradient_leakage.jpg

## Secure Multiparty Computation

@@ -133,6 +138,8 @@ retrieved.
When combining multiple values using secret sharing, this will result in the parties owning new
puzzle pieces that when put together will reveal the result of the computation.
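
The puzzle-piece idea can be sketched with additive secret sharing, one common scheme. The choice of scheme, the modulus, and the salary example are all assumptions made for illustration.

```python
import secrets

PRIME = 2**61 - 1  # all arithmetic is done modulo this prime (illustrative choice)


def share(secret, n_parties=3):
    """Split a secret into n random-looking shares that sum to it modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % PRIME)
    return shares


def reconstruct(shares):
    """Put the puzzle pieces back together."""
    return sum(shares) % PRIME


# Two made-up private values, each split across three parties.
salary_a, salary_b = 4200, 3100
shares_a = share(salary_a)
shares_b = share(salary_b)

# Each party adds the two shares it holds; no party sees either salary.
sum_shares = [(a + b) % PRIME for a, b in zip(shares_a, shares_b)]
total = reconstruct(sum_shares)
```

Each entry of `sum_shares` is a new puzzle piece. On its own it looks like random noise, and only the combination of all three reveals the total of 7300.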

![In secure multiparty computation parties collaboratively perform an analysis while holding only encrypted pieces of the data](fig/chapter1/mpc.jpg)

Check warning on line 141 in episodes/chap1_introduction_FL.md

GitHub Actions / Build Full Site

[image missing alt-text]: fig/chapter1/mpc.jpg

::: callout

### Secret sharing, an example
@@ -173,6 +180,9 @@ selection of inputs with random noise. A single individual will then always be a
their data has contributed to the final result. An individual has _plausible deniability_ with
regard to whether they were part of the dataset.

![Differential privacy replaces a subset of the data with random values](fig/chapter1/differential_privacy.jpg)
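
One way to picture replacing inputs with noise is randomized response, a classic mechanism that provides a local form of differential privacy. The sketch below is an illustration under assumed parameters: the 75% truth probability, the survey data, and the function names are invented.

```python
import random


def randomized_response(true_answer, p_truth=0.75, rng=random):
    """With probability p_truth report the truth, otherwise report a coin flip."""
    if rng.random() < p_truth:
        return true_answer
    return rng.random() < 0.5


def estimate_true_rate(observed_rate, p_truth=0.75):
    """Undo the expected effect of the noise: observed = p*true + (1-p)*0.5."""
    return (observed_rate - (1 - p_truth) * 0.5) / p_truth


rng = random.Random(42)  # fixed seed so the sketch is repeatable
true_answers = [i < 300 for i in range(1000)]  # made-up survey: 30% answer "yes"

reported = [randomized_response(a, rng=rng) for a in true_answers]
observed_rate = sum(reported) / len(reported)
estimate = estimate_true_rate(observed_rate)
```

Any single reported answer may be a coin flip, which gives each respondent plausible deniability. The aggregate `estimate` still lands close to the true rate.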

Check warning on line 183 in episodes/chap1_introduction_FL.md

GitHub Actions / Build Full Site

[image missing alt-text]: fig/chapter1/differential_privacy.jpg


## Blocks upon blocks

The previously mentioned techniques are not used in isolation, but are usually stacked on top of
@@ -204,6 +214,8 @@ of privacy enhancing algorithms than horizontal partitioning.
Data can even be horizontally and vertically partitioned at the same time. In these cases, it may be
necessary to combine multiple techniques.
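
The two kinds of partitioning can be made concrete with a small made-up table. The party names and columns below are hypothetical.

```python
records = [
    {"id": 1, "age": 34, "diagnosis": "A"},
    {"id": 2, "age": 51, "diagnosis": "B"},
    {"id": 3, "age": 45, "diagnosis": "A"},
]

# Horizontal partitioning: each party holds all columns for some of the rows.
hospital_1 = records[:2]
hospital_2 = records[2:]

# Vertical partitioning: each party holds some columns for all rows,
# with a shared key ("id") that links them.
registry = [{"id": r["id"], "age": r["age"]} for r in records]
clinic = [{"id": r["id"], "diagnosis": r["diagnosis"]} for r in records]


def join_on_id(left, right):
    """Recombine vertically partitioned rows by their shared id."""
    right_by_id = {row["id"]: row for row in right}
    return [{**row, **right_by_id[row["id"]]} for row in left]


horizontal_union = hospital_1 + hospital_2
vertical_join = join_on_id(registry, clinic)
```

Stacking the horizontal parts or joining the vertical parts both give back the original table. In a privacy-preserving setting neither operation would be done in the clear. The sketch only shows how the data is laid out.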

![Horizontal and vertical partitioning refers to how data is separated](fig/chapter1/horizontal_vertical_partitioning.jpg)

Check warning on line 217 in episodes/chap1_introduction_FL.md

GitHub Actions / Build Full Site

[image missing alt-text]: fig/chapter1/horizontal_vertical_partitioning.jpg

## Technology doesn't solve everything

You have now learned about various technologies for analyzing data while preserving privacy of
Binary file added episodes/fig/chapter1/classic_analysis.jpg
Binary file added episodes/fig/chapter1/data_anonymization.jpg
Binary file added episodes/fig/chapter1/differential_privacy.jpg
Binary file added episodes/fig/chapter1/federated_learning.jpg
Binary file added episodes/fig/chapter1/federated_sum.jpg
File renamed without changes
Binary file added episodes/fig/chapter1/mpc.jpg
Binary file removed episodes/fig/data_anonymization.jpg
Binary file not shown.