AI & GDPR: When is training data anonymous?

AI training is not possible without data. The data required for training and validation is often personal, so the GDPR must be observed - or must it? This compact online event looks at both the technical and the legal aspects. In particular, it examines which data counts as personal in the first place and whether, and if so how, it can be anonymized. This applies to health-related data as well as industrial data.

Anonymization means the GDPR does not apply

If the data is anonymous, many things are easier, because the GDPR does not apply. If, on the other hand, the data is personal, a legal basis for the processing must be found, and this is often consent. AI training based on consent is possible - but only if various requirements are met. Among other things, it must be possible to withdraw consent.

Legal consequences of prematurely assuming anonymization

If data is assumed to be anonymous but is in fact personal, the GDPR still applies to its processing. The resulting breach may mean that all data, including the trained neural network and the trained AI, must be deleted - even long after the training.

It should be noted that the legal concept of anonymity is dynamic. Data that is considered anonymous today may be considered personal data next year. This can be the case, for example, if "re-identification" suddenly becomes more realistic because of new developments, e.g. new legal claims for information, new technical possibilities or the publication of metadata.

Exemplary scenarios

Blood count data from 100,000 people is processed to train a medical AI. The AI provider only has the blood count data without any metadata, i.e. without information on the donor, origin of the data, etc. Is the data anonymous for the AI provider - or does it have to comply with the GDPR?

A company collects data on the heart rate of drivers in order to identify "hot spots" in road traffic. The company says that it does not know the names of the drivers and that no one can be identified by their heart rate alone. Is this data anonymous - or does the company have to comply with the GDPR?

The provider of a smart home solution collects data on the temperature in its customers' homes. The provider wants to store this data in a data pool and offer it to third parties for AI training. Is the temperature data anonymous - or does the provider have to comply with the GDPR?

A call center wants to optimize its internal processes using AI and records the telephone number and duration of each call for this purpose. To achieve anonymization, the call center deletes the last three digits of the phone number and rounds the call duration up or down. Is this sufficient for anonymization?
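The call center scenario can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names, the 60-second rounding bucket and the sample phone numbers are hypothetical and not part of the event material. The sketch applies the two steps described above and then counts how many records share each generalized value pair. A pair that occurs only once still singles out one caller, which is one reason why the question cannot be answered by the transformation alone.

```python
from collections import Counter

def generalize_call(phone_number, duration_seconds, bucket_seconds=60):
    """Drop the last three digits and round the duration to the nearest bucket.

    The 60-second bucket is an assumption chosen for illustration.
    """
    digits = "".join(ch for ch in phone_number if ch.isdigit())
    masked = digits[:-3] + "XXX"  # last three digits removed
    # Round half up to the nearest multiple of bucket_seconds.
    rounded = (duration_seconds + bucket_seconds // 2) // bucket_seconds * bucket_seconds
    return masked, rounded

def group_sizes(records):
    """Count how many records share each generalized (number, duration) pair."""
    return Counter(generalize_call(number, duration) for number, duration in records)

# Hypothetical sample records: (phone number, call duration in seconds).
calls = [
    ("+49 711 5550101", 95),
    ("+49 711 5550199", 125),
    ("+49 30 5550123", 610),
]

sizes = group_sizes(calls)
# Pairs that occur exactly once still point to a single record.
singled_out = [pair for pair, count in sizes.items() if count == 1]
print(singled_out)
```

Whether the generalized data is anonymous in the legal sense depends on more than such counts, for example on which additional information is available to the call center or to third parties.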

An excerpt from the topics

At the event, Prof. Dr. Christian Thies will present the technical processes and the use of data in the training of an AI. Dr. Gerrit Hötzel will explain which data is personal and which is anonymous, and Mr. Marius Adler will present the current status of the EU Commission's draft AI regulation. Questions are welcome during and after the presentations. The topics include:

  • Technical process of AI training

    • Modeling of medical information

    • Collection and management of clinical data

    • Training and validation in AI

    • Development and evaluation of AI-based medical applications

  • Data protection

    • Which data is personal?

    • When is anonymization achieved?

    • Methods of anonymization

    • Legal consequences of AI training with personal data

    • Withdrawal of consent before, during and after AI training

  • EU AI Regulation

    • Rough overview of the draft

    • Current status of the AI Regulation

    • Special regulations on data processing for the purpose of AI training?

  • Afterwards: opportunity for questions

Date & costs

The event will take place on Tuesday, 22 March 2022 (16:30 - 18:30). Participation in the event is free of charge. Please register as soon as possible.

Online event

The event will be held online this year. The technical details for participation will be announced shortly.

Speaker

Prof. Dr. rer. medic. Christian Thies
Medical Information Systems, Faculty of Computer Science
Reutlingen University of Applied Sciences

Prof. Dr. Christian Thies has been involved in the development of digital applications in the healthcare sector for almost 30 years. He is currently working with clinical partners to implement and evaluate pilot projects for the digital support of medical care. Previously, as a developer, project manager, development manager and product manager, he developed and operated methods in a wide variety of application areas. These include AI-based biomedical image analysis, medical device development, integration of medical information systems and decentralized patient monitoring. Application reality, benefit analysis, data protection and security form a consistent basis for his practice-oriented work.