Can you stop your data being used to train AI systems?

When you post anything online, from a picture to a social media comment, there is a good chance it is being used to 'train' artificial intelligence (AI).

Generative AI systems require huge amounts of data to 'train' on, learning patterns from vast numbers of images or texts so that they can recognise faces or form sentences, for example.

Your data has almost certainly been ingested and used by AI systems – and it's fairly difficult to prevent your data being used at all.

Companies from large internet giants to small start-ups routinely 'scrape' the web for any publicly available information, and there's little you can do about it.

Ben Winters, who leads the Electronic Privacy Information Center's AI and Human Rights Project, said earlier this year: "In the absence of meaningful privacy regulations, that means that people can scrape really widely all over the internet, take anything that is 'publicly available' – that top layer of the internet, for lack of a better term – and just use it in their product."

But some companies, including Facebook's parent company Meta, do offer controls to prevent your posts being used to train AI.

It's a good idea to do so, as having your data used in this way potentially puts you at risk of an AI system accidentally revealing information about you – or hackers accessing a database and stealing it.

Can you stop Meta (Facebook) from using your personal data?

Meta (the parent company of Facebook and Instagram) offers a limited way to stop AI systems being trained on your data.

How limited? Well, you can't stop it from training on things you post on Facebook, but you can ask Meta not to use data about you that has been scraped from other websites.

To do so, go to the Generative AI Data Subject Rights forms, and you'll see three options at the bottom. Select 'Delete any personal information from third parties used for generative AI'.

You then need to enter information such as your name and email address – and also 'relevant prompts' and 'further information' (enter placeholder text in these fields if you have nothing to add).

The form also asks you to attach a JPG, although it will accept any file as long as it is under 4MB.

Once submitted, Facebook promises an email response, although the company says: "We don't automatically fulfil requests sent using this form. We review them consistent with your local laws."

Can you stop OpenAI and ChatGPT from using your data?

Unfortunately, you can't stop OpenAI using publicly available information from the internet (its GPT-4 model was reportedly trained on a petabyte of information – that's 1,000,000,000,000,000 bytes).

But what you can do is stop OpenAI from using anything you type into ChatGPT itself.

First, sign in to chat.openai.com and click on your name in the bottom-left corner.

Click on 'settings', then 'data controls', and then click 'show'.

Find the 'Chat History & Training' toggle and, if it is enabled, switch it off.

If the setting is off, your data will not be used to train OpenAI's models.