Brendanjones, (edited)
@Brendanjones@fosstodon.org

Alright, so many companies are using user or customer data for training without consent that I think I'm going to have to make an ongoing thread to document them all. 🤖

Here we go! 1/x

Starting out with those who have sold user data to another company that will use it to train AI:

https://fosstodon.org/@Brendanjones/111964241353263058

Brendanjones,
@Brendanjones@fosstodon.org

#Docusign have too much juicy data to be left out of the fun:

"Docusign just admitted that they use customer data (i.e., all those contracts, affidavits, and other confidential documents we send them) to train AI"

2/x

https://mastodon.social/@gvwilson/112012277852906749

Brendanjones,
@Brendanjones@fosstodon.org

#Tumblr and #Wordpress in on the act:

"Tumblr and WordPress.com are preparing to sell user data to Midjourney and OpenAI"

3/x

https://www.404media.co/tumblr-and-wordpress-to-sell-users-data-to-train-ai-tools/ #OpenAI

otto42,
@otto42@fosstodon.org
Brendanjones,
@Brendanjones@fosstodon.org

@otto42 omfg

"We already discourage AI crawlers from gathering content from WordPress.com and will continue to do so, save for those with which we partner."

Brendanjones,
@Brendanjones@fosstodon.org

16 years of user-generated Stack Overflow content is going to be fed into AI to “provide Stack Overflow content directly within Google Cloud”.

4/x

https://meta.stackexchange.com/questions/398127/our-partnership-with-google-and-commitment-to-socially-responsible-ai?cb=1

Brendanjones,
@Brendanjones@fosstodon.org

Here's one from last year that just came to my attention: #Elsevier have packaged up millions of scientific papers and author profiles for anyone to use for "AI and digital transformation" (their words from https://www.elsevier.com/solutions/datasets).

The complete opposite of #OpenScience.

5/x

https://www.elsevier.com/about/press-releases/elsevier-introduces-authoritative-scientific-datasets-to-fuel-innovation-and

#Science #AI

Brendanjones,
@Brendanjones@fosstodon.org

The dataset includes:

  • 19 million full-text articles from peer-reviewed journals
  • 17 million author profiles
  • 1.8 billion cited references
  • 333 million chemical substances and reactions
  • 86 million bioactivities and biomedical records
  • 35 million chemical patents

Whoa. That's humanity's knowledge, for sale if you have the money.
