
Social media sites are training their AI products on your posts. Here’s how you can opt out.

LinkedIn isn’t the only one doing it, but the professional networking site is the latest online platform to automatically opt users in to its artificial intelligence effort.

LinkedIn’s AI training opt-out screen. (Technical.ly/Holly Quinn)

If you use social media, you’re probably inadvertently training multiple generative AI models with every post.

This landed on our radar last week when LinkedIn updated its data privacy settings with a new caveat. By default, your LinkedIn account is set to allow the platform to use your shared information to train its AI. That includes text, photos, videos and links.

For LinkedIn, that means a broad pool of content to train the generative AI model it uses to stay competitive. But the idea of having one’s posts fed to an AI bot isn’t a popular one, and social media users have been spreading posts advising people to opt out. (See details below.)

LinkedIn allows you to opt out of sharing your information in the data privacy section of its settings. LinkedIn has gotten a fair amount of flak for the default opt-in, but at least it has an easy opt-out feature, unlike other sites.


Facebook, Instagram, TikTok and Reddit don’t have easy opt-out options for public posts in the United States, and they don’t have to inform you that they’re using your data. 

Because the EU has stronger data privacy laws, users there are informed about such usage and may request that their data not be used by a site for any reason. Because a national data privacy law isn’t yet in place in the US, overseas word-of-mouth is sometimes the only way many US users learn about data privacy changes to major social media platforms.

Here’s what to know about the websites using your data to train AI, and how to opt out. 

How can I use social media without training AI?

Opt-out settings vary from platform to platform, and not all major social media sites have the option. 

With so much happening with AI training on social media platforms, now is not the time to just blindly accept updates to privacy policies and terms of service when they pop up on your screen. Reading these sometimes dense documents when they update is the best way to know how your data is being used and how to opt out of data sharing.

LinkedIn

Following the default opt-in news, LinkedIn quietly removed the AI-written prompts it gives users underneath posts. Previously, a sparkle icon near the comment section offered follow-up questions that users might have after reading a post.

On LinkedIn, once logged in, you can visit this link and simply toggle the button to “off”: https://www.linkedin.com/mypreferences/d/settings/data-for-ai-improvement

Meta

If you don’t want your Meta social media posts to train AI, but you also don’t want to abandon your cousin’s Facebook group or local business Instagram updates, your options are more limited. 

Meta — Facebook, Instagram, Threads and WhatsApp — doesn’t have an easy opt-out for public posts, at least for US users. Private posts, including posts shared with certain users only and posts to private groups, are not used for AI training.

You can go private — Meta’s privacy policy says it doesn’t use any private posts for AI training — or be a perpetual lurker and not post at all. 

Here’s how to adjust other privacy settings on Facebook (https://www.facebook.com/help/193677450678703), Instagram (https://help.instagram.com/285881641526716), Threads (https://help.instagram.com/225222310104065) and WhatsApp (https://faq.whatsapp.com/3307102709559968). 

TikTok

TikTok’s focus is public, not private, engagement, and the user content on the platform has always been used to train the TikTok algorithm. It now even uses generative AI, including teaching a generative AI model to visually “see.”

There is no easy opt-out and no way to engage with TikTok without interacting heavily with AI. 

Adjust other privacy settings here: https://www.tiktok.com/safety/en/account-settings 

Reddit

Earlier this year, Reddit inked a $60 million deal with Google that lets Google use Reddit content to train its generative AI.

There is no easy opt-out. Because Reddit is a public platform, it does not stop AI from crawling its forums. Change other privacy settings here: https://support.reddithelp.com/hc/en-us/categories/360003246511-Privacy-Security  

“Content and information may also be available in search results on Internet search engines like Google or in responses provided by an AI chatbot like OpenAI’s ChatGPT,” the privacy policy says. “You should take the public nature of the Services into consideration before posting.”

Tumblr

Tumblr shares user data with third parties for research and AI model training via its parent company Automattic, which also owns WordPress. While it discourages “crawlers,” the information is openly available to its partners like OpenAI and Midjourney.

To opt out, go to your blog settings and click “visibility,” then toggle “prevent third-party sharing.” If you have multiple blogs, you’ll have to do this with each one.

X

X, formerly known as Twitter, uses post content to train its Grok generative AI chatbot, and you can opt out through settings under privacy and safety.

The subsection titled “allow your posts as well as your interactions, inputs, and results with Grok to be used for training and fine-tuning” under “data sharing and personalization” can be toggled off.

YouTube

Google trains its AI services such as Gemini on data scraped from the web and its products like YouTube. Google previously said that video creators agree to this when they upload content.

Across tech companies, YouTube Subtitles, a dataset containing information from more than 100,000 videos, has been a rich trove of AI training data. While that dataset technically goes against terms of service, it’s still used by companies like Apple and Salesforce, so you can’t really “opt out.”

Users can adjust other data and privacy settings here: https://myaccount.google.com/data-and-privacy  

Discord

By default, messaging platform Discord does not collect user data to train its AI programs, like its chatbot Clyde. Other platforms like OpenAI do not have access to the data to train their own models. 

To limit data use, go to Privacy & Safety under User Settings and scroll down to “how we use your data.” Toggle “use data to improve Discord” and “use data to customize my Discord experience” off. 

Twitch

Twitch, owned by Amazon, plays a role in training its parent company’s AI platform. 

While its privacy center details ways to limit other types of data sharing, there does not appear to be an easy “opt-out” option for AI data training.

Snapchat

Snapchat has an AI chatbot, and may even use your likeness in its My Selfie feature to serve up personalized ads with your face on them. Fortunately, My Selfie is an opt-in-only service.

To review My Selfie settings, go to “settings,” then “my account” and “My Selfie.”

The chatbot, My AI, may “be used by Snap to improve Snap’s products and personalize your experience, including ads.” Users can clear that data in settings under the “clear data” button in “privacy controls.” 

Pinterest 

Pinterest is exploring generative AI use for its advertising center. There does not appear to be an easy “opt-out” setting. Change other Pinterest privacy settings here: https://www.pinterest.com/settings/privacy

Why do social media platforms need user content to train AI?

Social media platforms see benefits when they use user content for training purposes.

The data that feeds generative AI can boost profitability by making ads even more targeted and by making recommendations more on point. But one big reason social media has become such an AI training ground is that it’s massive, up to date and contains human interactions. 

AI can learn about trends, slang and regional dialects in real time — all based on your posts. It can also “think” from diverse points of view and, more generally, it can learn to interact with humans more realistically, keeping these platforms viable in the generative AI race.

More unsettlingly, it can use your photos to train AI to create realistic virtual people and use artwork you post to create “new” works of art. 

If you want to know how your social media platforms handle AI training, a good place to start is their privacy policies. 

Since most privacy policies are long and full of legal jargon, an easy way to pinpoint the topic is to search online for the platform name plus the phrase “opt-in AI training.” That should lead you to the platform’s terms, what kind of posts the platform has access to and whether you can opt out.

Will opting out actually work?

Even if you opt out of data sharing on your social media accounts, other companies may be scraping your information.

Apple, for example, crawls the web for data to train its AI model. Early in the summer, the company released the Applebot-Extended tool to allow publishers to opt their sites and platforms out. 

While many did opt out, including Meta, Tumblr, the New York Times and the USA Today network, such web scraping bots are common. The bots are legal as long as the data is public and not protected, and they don’t always give publishers the option to opt out.
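For site owners, a crawler opt-out like this is usually expressed as a rule in the site’s robots.txt file that names the crawler’s user agent. The sketch below is only an illustration: the sample robots.txt, the example.com URL and the helper name allows_agent are hypothetical, and the check uses Python’s standard-library parser rather than any crawler’s own logic. It tests whether a given rule set would turn away a bot identifying itself as Applebot-Extended while leaving other bots alone.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for an example publisher site (not a real one).
SAMPLE_ROBOTS_TXT = """\
User-agent: Applebot-Extended
Disallow: /

User-agent: *
Allow: /
"""


def allows_agent(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() accepts an iterable of lines, so no network request is made.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    page = "https://example.com/news/story"  # placeholder URL
    for agent in ("Applebot-Extended", "SomeOtherBot"):
        verdict = "allowed" if allows_agent(SAMPLE_ROBOTS_TXT, agent, page) else "blocked"
        print(f"{agent}: {verdict}")
```

Keep in mind that robots.txt is a request, not an enforcement mechanism. It only works for bots that choose to honor it, which is part of why opting out does not guarantee your content stays out of training data.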

Your only option for posting on social media, if you choose to do so, is to make every post as if it’s educating a robot, because it probably already is.
