Personal data anonymization: key concepts & how it affects machine learning models

By tryolabs - 2021-01-28

Description

An introduction to the essential aspects of personal data anonymization: formats, techniques, and process. Finally, we summarize how data anonymization affects Machine Learning models.

Summary

  • Data anonymization is the process of altering personally identifiable information (PII) in a dataset so that individuals cannot be identified.
  • The values that are suppressed are those that appear only a few times in the original dataset, because they represent a high disclosure risk for the records that contain them.
  • The original dataset is blended with a fully synthetic one.
  • “If we don’t manage to figure out how to build Machine Learning systems that have good security properties and that protect the privacy of information, that would really limit the usefulness of Machine Learning for many applications that we care about.” Martín Abadi, a researcher at Google, said this at the Khipu conference in 2019 while delivering an excellent overview of privacy and security in Machine Learning.
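
The suppression idea in the second summary point can be illustrated with a minimal sketch. It is not the article's implementation: the function name `suppress_rare_values`, the `min_count` threshold, the `"*"` placeholder, and the sample records are all assumptions made for illustration. The sketch counts how often each value appears in one column and replaces the rare ones, since those are the values that carry a high disclosure risk.

```python
from collections import Counter

SUPPRESSED = "*"  # hypothetical placeholder marking a suppressed value


def suppress_rare_values(records, column, min_count=2):
    """Return copies of `records` where values of `column` that occur
    fewer than `min_count` times are replaced by SUPPRESSED."""
    counts = Counter(row[column] for row in records)
    return [
        {**row, column: row[column] if counts[row[column]] >= min_count else SUPPRESSED}
        for row in records
    ]


# Made-up sample data: one zip code appears only once.
records = [
    {"zip": "11300", "age": 34},
    {"zip": "11300", "age": 41},
    {"zip": "11300", "age": 29},
    {"zip": "90210", "age": 52},  # unique value, so it is suppressed
]

anonymized = suppress_rare_values(records, "zip", min_count=2)
for row in anonymized:
    print(row)
```

The input list is left unchanged, so the original and the anonymized records can be compared side by side. Choosing `min_count` is a trade-off: a higher threshold suppresses more values and lowers disclosure risk, but it also removes more information that a Machine Learning model could have used.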

 

Topics

  1. Backend (0.35)
  2. Machine_Learning (0.24)
  3. Database (0.17)

Similar Articles

30 Most Asked Machine Learning Questions Answered

By Medium - 2021-03-18

Machine Learning is the path to a better and more advanced future. Machine Learning Developer is the most in-demand job in 2021, and demand is expected to increase by 20–30% in the upcoming 3–5 years. Machine…

Introduction to Active Learning

By KDnuggets - 2020-12-15

An extensive overview of Active Learning, with an explanation of how it works and how it can assist with data labeling, as well as its performance and potential limitations.

K-fold Cross Validation with PyTorch

By MachineCurve - 2021-02-02

Explanations and code examples showing you how to use K-fold Cross Validation for Machine Learning model evaluation/testing with PyTorch.