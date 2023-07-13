

Generative artificial intelligence (AI) recently burst onto the scene, producing text, images, sound, and video content that closely resemble human-made content. After being trained on publicly available data, ChatGPT, DALL-E, Bard and other AI models were unleashed to an eager public. The rate of adoption of this technology far surpasses the speed with which legislative bodies can pass the laws needed to ensure safety, reliability and fairness.

Norwegians are trying to get ahead of the game, raising questions about consumer protection and data privacy. The Norwegian Consumer Council published a report in June 2023 to address the harm generative AI might inflict on consumers. The report, “Ghost in the machine – addressing the consumer harms of generative AI”, presents overarching principles that would ensure generative AI systems are developed and used in a way that protects human rights.

The Norwegian data protection authority, Datatilsynet, is also raising awareness about the ways generative AI violates the General Data Protection Regulation (GDPR). Generative AI models train on large amounts of data taken from many different sources, usually without the knowledge or consent of the originator of the data.

“There are a few issues with generative AI in terms of data collection,” says Tobias Judin, head of the international section at Norway’s data protection authority. “The first questions are around what these companies do to train their models.”

Data privacy may be violated during the training phase

Most of the models used for generative AI are foundational models, which means they are general purpose enough to be used by a variety of applications. The people who train the foundational models compile massive amounts of data from open sources on the internet, including a huge quantity of personal data. The first concern raised by data protection authorities is whether organisations are entitled to collect all that personal data. Many data privacy experts think the data collection is unlawful.

“Another issue is transparency,” says Judin. “Are people made aware that their personal data will be used to train a model? Probably not. One of the legal principles concerning data collection is data minimisation, which says you shouldn’t collect more data than what is necessary.”

“Companies developing a foundational model will invariably say that, since the model is used for just about anything, it is necessary to collect all available data. This approach doesn’t sit well with GDPR. Another set of issues are around data accuracy and quality. Some of the data may be from web forums, including information that is contested – personal data. Those data will also be part of the training of this model.”

Once a model is trained, the training data is no longer needed. Many organisations assume that, since the data is no longer needed, they can delete it and make all the data privacy issues go away. But that thinking has now been challenged. A new type of attack, called a model inversion attack, involves making certain kinds of queries to an AI model to re-identify the data that went into training the model. Some of these attacks specifically target generative AI.