Search - kbin.social

DAIR, 5 months ago to random

404 Media reports that "Largest Dataset Powering AI Images Removed After Discovery of Child Sexual Abuse Material" 🧵

However, in 2021, a preprint by @abebab, Vinay Uday Prabhu & Emmanuel Kahembwe found a number of issues in the dataset, including "troublesome and explicit images and text pairs of rape, pornography, malign stereotypes, racist and ethnic slurs, and other extremely problematic content."

The preprint can be found here: https://arxiv.org/abs/2110.01963

https://www.404media.co/laion-datasets-removed-stanford-csam-child-abuse/

tarkowski, 5 months ago

@ed @DAIR @abebab I would be surprised to learn that there is a patching culture for datasets like LAION.
The story shared by 404 shows that dataset maintenance standards are badly needed. I think it’s also a cultural change that’s needed: from a culture of data dumps to one of data care.
