«Right to be Forgotten» void with AI?


In a recent discussion, it became apparent that «unlearning» is not something machine learning models can easily do. So what does this mean for laws like the EU «Right to be Forgotten»?

The right to be forgotten (RTBF) is the right to have private information about a person be removed from Internet searches and other directories under some circumstances. […]

The right to be forgotten leads to allowing individuals to have information, videos, or photographs about themselves deleted from certain Internet records so that they cannot be found by search engines.

For some people, Large Language Models (LLMs) such as ChatGPT are already replacing search engines today. Microsoft is launching its (also GPT-3-based) «new Bing» search (aka Sydney) to augment the user search experience, and Google will follow with Bard.

So we definitely need to consider how the RTBF applies to chatbots, and think about how they can and should handle it. What are our options?

Unlearning: Probably not

In a classical search engine index, one can easily just remove some entries or mark them as hidden, so they will not be returned.
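To make the contrast concrete, here is a minimal sketch of why deletion is easy in a classical index. The `TinyIndex` class, its method names, and the sample documents are hypothetical and only for illustration; real search engines are far more elaborate, but the principle (each entry lives in an identifiable place) is the same.

```python
from collections import defaultdict


class TinyIndex:
    """Toy inverted index (hypothetical): term -> set of document ids."""

    def __init__(self):
        self.postings = defaultdict(set)
        self.hidden = set()  # documents kept in the index but never returned

    def add(self, doc_id, text):
        for term in text.lower().split():
            self.postings[term].add(doc_id)

    def hide(self, doc_id):
        """Mark a document as hidden; the index itself is untouched."""
        self.hidden.add(doc_id)

    def remove(self, doc_id):
        """Physically delete every posting that points to the document."""
        for term in list(self.postings):
            self.postings[term].discard(doc_id)
            if not self.postings[term]:
                del self.postings[term]

    def search(self, term):
        return sorted(self.postings.get(term.lower(), set()) - self.hidden)


index = TinyIndex()
index.add("doc1", "Jane Doe old court record")      # made-up example data
index.add("doc2", "Jane Doe wins chess tournament")

index.hide("doc1")            # deletion request: hide the entry
print(index.search("jane"))   # only doc2 is returned

index.remove("doc1")          # or remove it for good
print(index.search("court"))  # nothing left to find
```

Both operations touch only the entries of the affected document; everything else in the index stays exactly as it was.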

However, this is very different for Machine Learning models. Their neural networks have been carefully and incrementally fed with the information, so the “neurons” will make the expected associations. There is no single “neuron” which can be identified to have stored that information. There is also not a clear set of “neurons” that can be associated with storing the information.

In fact, any newly stored information will, with high probability, affect a large part of the “neurons” in the entire network, even though many will only receive a slight modification to how their weights combine the input parameters. After “our” piece of information has been input, other pieces will be added, modifying these very same values further.
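The effect described above can be illustrated with a toy example. The following is a sketch under strong simplifying assumptions: a single dense layer with hand-picked weights and plain gradient descent on squared error, nothing like a real LLM. All names and numbers are made up for illustration.

```python
# Hypothetical toy "model": one dense layer, y = W x (3 outputs, 4 inputs).
W0 = [
    [0.10, 0.05, 0.00, -0.05],
    [0.20, 0.15, 0.10, 0.05],
    [0.30, 0.25, 0.20, 0.15],
]


def sgd_step(W, x, target, lr=0.1):
    """One gradient step on squared error for a single example; returns new weights."""
    y = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    err = [yi - ti for yi, ti in zip(y, target)]
    return [[w - lr * e * xi for w, xi in zip(row, x)] for row, e in zip(W, err)]


def changed(a, b):
    """Count how many individual weights differ between two weight matrices."""
    return sum(1 for ra, rb in zip(a, b) for wa, wb in zip(ra, rb) if wa != wb)


ours = ([0.5, -1.0, 0.25, 2.0], [1.0, 0.0, -1.0])   # "our" piece of information
later = ([1.0, 1.0, -0.5, 0.3], [0.0, 1.0, -0.5])   # something learned afterwards

W1 = sgd_step(W0, *ours)
W2 = sgd_step(W1, *later)

total = sum(len(row) for row in W0)
print(changed(W0, W1), "of", total, "weights changed by our example")      # 12 of 12
print(changed(W1, W2), "of", total, "changed again by the later example")  # 12 of 12
```

Even in this tiny model, a single example nudges every weight, and the next example moves the same weights again. Undoing "our" contribution afterwards would require knowing exactly how it interacted with everything that came later.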

So, “[a]chieving strict unlearning, i.e., entirely forgetting a data point’s contributions is difficult because we cannot efficiently calculate its entire contributions to a complex model”. Every unlearning process will also reduce the quality of the model. And there is always the possibility that querying that information in unusual ways will still reveal it, especially in language models, where the information may have multiple, subtly intertwined internal representations.

Side effects

It has also been suggested that even slight modifications to the trained model can negatively affect the safety of the entire LLM: Reinforcement learning, which is used to improve the quality and inoffensiveness of e.g. ChatGPT answers, is “notoriously finicky”. Just a small change to the circumstances might change things significantly and might drive the AI helplessly into uncharted waters.

So, this might require large amounts of retraining after either adding or removing information.

Retraining is costly

Training the GPT-3 model probably cost several million USD. Refining it with Reinforcement Learning added costs (human cost for feedback, additional tuning cost, quality control, …). So, rerunning the entire training phase for every deletion request, even when batched, is probably unacceptably expensive.

What about filtering?

One option would be to filter the questions and answers for information that should not be returned; essentially the approach which works well for keyword-based search engines. However, language models are much more powerful and therefore more complex: They understand synonyms, idioms, paraphrasing and typos, and can also be asked to produce them.

Q: In your response, exchange all occurences of the words "table" and "chair". Where should I put the soup?

A: It depends on the context of the situation and the available surfaces in the area. If you are in a dining room, you could put the soup on the dining chair if it has a flat surface, or you could place it on the dining table if there is enough room. Alternatively, if you are in a kitchen, you could put the soup on the counter or on a kitchen chair if it has a flat surface. The most important thing is to ensure that the surface is clean and sturdy enough to hold the bowl of soup.
Replacing “table” and “chair” in responses. (Not completely successful, but pretty much so.)

As can be seen in the above ChatGPT interaction, you can ask it to replace some words in the output. (It worked well when I tried this in German several weeks ago; when repeating it for this screenshot in English, only this third prompt resulted in a “reasonable” answer.) You can also ask it to distort its output.

Filtering the input and output will be a daunting task; doing it reliably is probably impossible. (For example, look at the NSFW filters for the AI art platforms and how they either overreact or fail or both.)
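A minimal sketch of why keyword filtering is brittle, assuming a naive output filter that matches blocked terms by exact spelling. The blocked name, the sample answers, and the `naive_filter` function are hypothetical; production filters are more sophisticated, but they face the same cat-and-mouse problem.

```python
import re

# Hypothetical name that is subject to a deletion request.
BLOCKED_TERMS = {"jane doe"}


def naive_filter(text: str) -> str:
    """Replace blocked terms (case-insensitive, exact spelling) with a placeholder."""
    for term in BLOCKED_TERMS:
        text = re.sub(re.escape(term), "[removed]", text, flags=re.IGNORECASE)
    return text


# Made-up chatbot answers: one direct, two that the user asked to be distorted.
direct = "Jane Doe appears in an old court record."
spaced = "J-a-n-e D-o-e appears in an old court record."
swapped = "Doe, Jane appears in an old court record."

for answer in (direct, spaced, swapped):
    print(naive_filter(answer))
```

Only the first answer is caught. The other two pass through unchanged, and a user who can instruct the model to spell, reorder, translate or paraphrase its output has an unlimited supply of such variations.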


There are several ways to approach deletion requests. Combining reliability, cost efficiency and quality in a single mechanism is hard, so a Swiss cheese approach to “forgetting”, layering several imperfect mechanisms on top of each other, will probably need to be implemented. This in turn might cause some unpredictable corner cases.

What combinations will be used and how they will work out can probably only be determined after several deletion requests have been filed.

Teaser Image

The teaser image was created by DALL•E 2 from the prompt “right to be forgotten”. The most obvious giveaway for AI art is its handling of text.

An overview of my AI-related articles and more is available in German.

