1. Personal Data

It is crucial to understand the concepts of “personal data”, “purpose”, and “processing” to ensure that software complies with legislation when processing user data. Pay particular attention to the difference between “anonymization” and “pseudonymization”, as these have very precise and different definitions in GDPR.

Definition

The General Data Protection Regulation (GDPR) defines personal data as “any information relating to an identified or identifiable natural person (‘data subject’)”. This covers a broad spectrum of information, including both directly identifying data (such as first and last name) and indirectly identifying data (such as phone number, license plate, or device ID).

Any operation involving this type of data (collection, recording, transfer, modification, publication, etc.) constitutes processing under GDPR and must therefore comply with the regulation’s requirements. Such data processing must be lawful and have a specific purpose. The personal data collected and processed must be relevant and limited to what is strictly necessary to fulfill the purpose.
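The requirement to limit data to what the purpose needs can be sketched in code as an allow-list of fields per purpose. This is a minimal illustration, not a prescribed method: the purposes, the field names, and the `minimize` helper are all hypothetical.

```python
# Hypothetical mapping from a documented processing purpose
# to the only fields that purpose is assumed to need.
ALLOWED_FIELDS = {
    "newsletter": {"email"},
    "shipping": {"first_name", "last_name", "postal_address"},
}


def minimize(record: dict, purpose: str) -> dict:
    """Return only the fields allowed for the stated purpose."""
    if purpose not in ALLOWED_FIELDS:
        # Refuse to process data for a purpose that has not been defined.
        raise ValueError(f"No documented purpose: {purpose!r}")
    allowed = ALLOWED_FIELDS[purpose]
    return {key: value for key, value in record.items() if key in allowed}


signup = {
    "email": "anna@example.com",
    "first_name": "Anna",
    "date_of_birth": "1990-04-12",
}

# Only the email address is kept for the newsletter purpose.
newsletter_record = minimize(signup, "newsletter")
```

Dropping unneeded fields at the point of collection, rather than later in the pipeline, keeps them out of logs, backups, and downstream systems.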

Examples of Personal Data

If they relate to natural persons, the following data are personal data:

  • First name, last name, alias, date of birth;
  • Photos, audio recordings of voices;
  • Landline or mobile number, postal address, email address;
  • IP address, computer login, cookie ID;
  • Fingerprints, palm structure or vein pattern, retina scan;
  • License plate, social security number, ID number;
  • Usage data from an application, comments, etc.

Identification of natural persons can be done:

  • from a single piece of information (example: first and last name);
  • by combining several pieces of information (example: a woman living at a certain address, born on a certain date, and member of a certain association).
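The effect of combining several pieces of information can be illustrated with a small sketch. The records and the `unique_fraction` helper below are invented for illustration; the point is only that attributes which identify few people on their own can isolate individuals when combined.

```python
from collections import Counter

# Invented sample records; none of these fields is a name or an ID.
records = [
    {"gender": "F", "postcode": "2100", "birth_year": 1985},
    {"gender": "F", "postcode": "2100", "birth_year": 1985},
    {"gender": "F", "postcode": "2200", "birth_year": 1985},
    {"gender": "M", "postcode": "2100", "birth_year": 1990},
]


def unique_fraction(rows, fields):
    """Fraction of rows whose combination of `fields` occurs exactly once."""
    counts = Counter(tuple(row[f] for f in fields) for row in rows)
    unique = sum(1 for row in rows if counts[tuple(row[f] for f in fields)] == 1)
    return unique / len(rows)


# With gender alone, one of four rows is unique.
gender_only = unique_fraction(records, ["gender"])

# With all three attributes combined, two of four rows are unique.
combined = unique_fraction(records, ["gender", "postcode", "birth_year"])
```

A check like this can be run on a real dataset to see which combinations of columns act as indirect identifiers.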

Some data is considered particularly sensitive. GDPR generally prohibits the processing of such data unless a specific exception applies, the most common being that the data subject has given explicit consent (an active, preferably written statement that is freely given, specific, and informed).

These requirements apply to the following data:

  • information about a person’s health;
  • data concerning sex life or sexual orientation;
  • data revealing a person’s ethnic or racial origin;
  • political opinions, religious beliefs, philosophical beliefs, or trade union membership;
  • genetic and biometric data used for unique identification of a person.

Anonymization of Personal Data

An anonymization process for personal data aims to make it impossible to identify individuals in a dataset. It is therefore an irreversible process. When anonymization is effective, the data is no longer considered personal data, and GDPR requirements no longer apply.

As a general rule, never treat a raw dataset as anonymous. Anonymization is achieved by processing personal data in a way that irreversibly prevents identification. An anonymized dataset must satisfy all three of the following criteria:

  • individual identification: it is not possible to isolate individual records that can identify a person in the dataset;
  • linkage: the dataset does not allow linking two or more records to the same person or a group of persons;
  • inference: it is not possible to infer a value with high probability from other data in the dataset.
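The first criterion can be checked mechanically. The sketch below uses generalization and a k-anonymity style group count, one of the techniques discussed in the Article 29 Working Party opinion. The sample data, the column names, and the `smallest_group` and `generalize` helpers are hypothetical, and passing this check alone does not make a dataset anonymous.

```python
from collections import Counter


def smallest_group(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    counts = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return min(counts.values())


def generalize(row):
    """Coarsen values: keep the decade of birth and the first two postcode digits."""
    return {
        "birth_decade": row["birth_year"] // 10 * 10,
        "postcode_area": row["postcode"][:2],
        "diagnosis": row["diagnosis"],
    }


# Invented sample data.
raw = [
    {"birth_year": 1984, "postcode": "2100", "diagnosis": "asthma"},
    {"birth_year": 1987, "postcode": "2150", "diagnosis": "diabetes"},
    {"birth_year": 1991, "postcode": "2200", "diagnosis": "asthma"},
    {"birth_year": 1995, "postcode": "2250", "diagnosis": "asthma"},
]

# In the raw data every record can be isolated (smallest group is 1).
k_raw = smallest_group(raw, ["birth_year", "postcode"])

# After generalization each record shares its values with one other record.
coarse = [generalize(row) for row in raw]
k_coarse = smallest_group(coarse, ["birth_decade", "postcode_area"])
```

Note that the coarsened data still fails the inference criterion: both records in the 1990 / "22" group have the same diagnosis, so that value can be inferred for anyone known to be in the group. This is why the three criteria have to be assessed together.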

In most cases these processes degrade data quality. In its Opinion 05/2014 on anonymisation techniques, the Article 29 Working Party (Art. 29 WP) describes the most commonly used techniques and gives examples of datasets that were mistakenly considered anonymous. Anonymization techniques have limitations, so the decision to anonymize and the choice of method must be made case by case, based on the context (data types, usability, risk to individuals, etc.).

Pseudonymization of Personal Data

Pseudonymization is a compromise between preserving raw data and producing anonymized datasets.

It refers to processing personal data in such a way that the data can no longer be attributed to a specific person without additional information. GDPR requires that this additional information be stored separately and protected by technical and organizational measures to prevent re-identification. Unlike anonymization, pseudonymization can be a reversible process.

In practice, a pseudonymization process replaces directly identifying data (first name, last name, etc.) with indirectly identifying data (alias, number in a register, etc.) to reduce the sensitivity of the dataset. One example is a cryptographic hash of a person’s data, such as their IP address, user ID, or email address.
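A hash-based pseudonym might look like the sketch below. It uses a keyed hash (HMAC-SHA256) rather than a plain hash, which is a design choice of this example and not something the text above prescribes: values such as IP addresses or email addresses come from a small, guessable space, so an unkeyed hash can be reversed by hashing candidate values. The `pseudonymize` function and the key handling are hypothetical.

```python
import hashlib
import hmac
import secrets


def pseudonymize(value: str, key: bytes) -> str:
    """Return a stable pseudonym for `value` using HMAC-SHA256 with a secret key."""
    # Normalize so that the same identifier always yields the same pseudonym.
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


# Assumption: in a real system the key is loaded from a secret store that is
# kept separately from the pseudonymized data. It is generated here for the demo.
secret_key = secrets.token_bytes(32)

token_a = pseudonymize("Anna@example.com", secret_key)
token_b = pseudonymize("anna@example.com ", secret_key)

# The same identifier maps to the same pseudonym, so records can still be joined.
same_person = token_a == token_b
```

The secret key plays the role of the additional information that must be stored separately: anyone who holds it can recompute pseudonyms for known identifiers, so the output remains personal data.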

Pseudonymized data is still personal data and therefore remains subject to GDPR. The regulation nevertheless encourages pseudonymization, treating it as a way to reduce risks to data subjects and to facilitate compliance.
