After all, they were not interested in the cultural and religious significance of these items. They just wanted the gold, which they could easily sell to others.
By reducing them to raw material, they destroyed the cultural and historical context. We are all poorer for it.
When #AI techbros train their #LLMs on people's works, these works are similarly removed from their individual context, "melted down" into the raw material — training data. Just disembodied words and phrases and sentences for the model to parrot later.
Historical and cultural and social context of these works is destroyed, melted away.
And it also happens without consent.
And it also happens on a gigantic scale.
And it also happens because techbros can sell the output.
Is there anyone serious who is saying this? Or is this just another way to make the tech seem more powerful than it is?
I don't get this "we're all gonna die" thing at all.
I do get the "we are too disorganized and greedy to integrate new technology well without the economy getting screwed up and people suffering... but that's another matter..."
@msh
Not true. All the #benchmarks say otherwise. You have to look past the hyped #LLMs to the bread-and-butter BERT and BART models, but the trend is undeniable:
If it's true that LLM-generated content is going to kill user-generated content platforms because it's DDOSing their moderation systems, one would expect to see a rise in consumption of higher production value content. Good news for streaming platforms, I guess?
LLM-generated content doesn't have to DDOS a moderation system. Moderators are pretty good at distinguishing generated content from other stuff. My accuracy rate was about 80% for the most problematic stuff, the stuff that's churned out en masse to scam revenue sharing programs. The problem is that it's hard to say exactly how you know something is generated, and the people who run moderation systems think moderation systems have to be perfectly systematic in order to be fair. #ai #LLMs
It is, in fact, impossible to be perfectly systematic, but for about 70-80% of the population, this is really hard to understand. So the only route for survival for user-generated content platforms, meta-systematic moderation (needs a whole post, but basically doing what needs doing in the gaps between explicit rules), is going to be something that seems unfair and arbitrary to most people. I don't see how they get past that. #ai #LLMs #artificialintelligence #moderation
In your case I'm less worried. It's probably also the only way left to do anything at all. :)
I fear, however, that as a society we have already failed. It should never have been allowed to get this far without us being prepared, as a society, for #GAI / #LLMs and #AGI.
Next was an excellent talk by @diyiyang on socially responsible #NLP at the #Stanford Cyber Policy center. This is a great overview of the various problems with #LLMs and also how approaches can be adapted to make systems more culturally aware. You should definitely skip the Q&A here though https://www.youtube.com/watch?v=PrVWEdVfvIQ (7/8)
This, via @emilymbender, is so good: a paper opposing "gratuitous anthropomorphic features." I've been arguing that LLMs should not use first-person human but third-person* machine and not use brain verbs but machine verbs (i.e., instead of "I write," "the program assembled").
Corrected. I had said first-person machine but I stand corrected by the expert, linguist Dr. Bender.
“LLMs not only fail to properly generate correct Python code when default function names are swapped, but some of them even become more confident in their incorrect predictions as the model size increases, an instance of the recently discovered phenomenon of Inverse Scaling, which runs contrary to the commonly observed trend of increasing prediction quality with increasing model size” https://arxiv.org/abs/2305.15507
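The swapped-function-names setup the paper describes can be sketched in a few lines (a hypothetical illustration, not taken from the paper): after the swap, code that looks familiar no longer does what it appears to do, which is exactly the kind of surface-pattern trap the models fall into.

```python
# Minimal sketch of a swapped-builtins prompt: `len` and `sum` trade
# names, so a reader (or model) pattern-matching on the familiar
# identifiers will predict the wrong output.
len, sum = sum, len  # swap the two builtin names at module level

xs = [1, 2, 3, 4]
print(len(xs))  # actually calls the original sum -> 10
print(sum(xs))  # actually calls the original len -> 4
```

A model that has memorized "len of a 4-element list is 4" answers confidently and incorrectly, and per the paper that confidence can grow with model size.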
I had a pretty busy day, but at least I was able to go for a nice walk and listen to some nice talks for my #AcademicRunPlaylist while I was waiting for my car at the garage! (1/7)
First was a fantastic panel on using #LLMs for science at the Alan Turing Institute with @abebab, Atoosa Kasirzadeh, and @SandraWachter. The panelists are incisive and withering in their criticism of blindly applying LLMs to fields where the truth matters, as well as the tendency of industry and academia to center benefits rather than harms. Highly recommend https://www.youtube.com/watch?v=FgoT1Jygf1k (2/7)
Looking at other people’s initial impressions of Google’s new AI search beta, and I feel like a lot of the discourse is kind of missing the point?
Yes, all the usual questions about #AI and #LLMs are still important. But what makes Google’s experiment in particular interesting is that it’s NOT a chatbot.
It’s a different UX for generative AI. And it’s an experience grounded in search, versus Bing, which felt like just a search-tinted coat of paint on top of ChatGPT.
One of the decisive moments in my understanding of #LLMs and their limitations was when, last autumn, @emilymbender walked me through her Thai Library thought experiment.
She's now written it up as a Medium post, and you can read it here. The value comes from really pondering the question she poses, so take the time to think about it. What would YOU do in the situation she outlines?
@ct_bergstrom @emilymbender The frustrating thing about this topic is that once one has understood the basic workings of #LLMs, everyone is a pundit. Because at this point everything is just hand-waving and speculation, there is no basis for gaining any scientific knowledge, IMO.
@ct_bergstrom @emilymbender No, I read it.
From your writing I think I know that you also see the importance of the mis-/disinformation aspect.
My concern is that the focus on what it "is or is not" lets the public easily dismiss the whole debate around #LLMs as "academic" or, at the other extreme, sensationalise it.
@ct_bergstrom @emilymbender That's very good. One difference going forward between #LLMs and Emily's "stuck in a Thai library with only words" is that Bing-style ChatGPT gets to make up answers and see how real people respond. If you were smart, couldn't speak Thai, were stuck in a Thai library, and could try out sentences on Thai people to see their responses, could you gradually build up some concepts of what words mean? Or would you still need some external context to apply meaning?
Another crucial aspect in this matter of #AI rewriting your code:
You should make your tools reusable, so that others can benefit from them as well. This is not possible with #LLMs. Sure, you can share the prompt, but the output is all wishy-washy.
Use a proper tool for this kind of task, e.g. ast-grep - ⚡ A fast and polyglot tool for code searching, linting, rewriting at large scale. Written in #Rust:
On that CNET thing in the last boost, my first thought was "this is gonna make search even more useless" and… yeeeep "They are clearly optimized to take advantage of Google’s search algorithms, and to end up at the top of peoples’ results pages"
@willoremus digs into the compute cost of #LLMs and boy does that not look like good news for all the startups cramming #AI into everything (gift link) https://wapo.st/3WTCK8Q
$20/month isn't a ton of money, but if you use a free UI with the #OpenAI API, you can get away with paying less than $5/month for #ChatGPT with relatively heavy usage and it's a lot more stable than the free version #AI #LLM #LLMs
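The arithmetic behind the "< $5/month" claim checks out under plausible assumptions; both figures below (the per-token rate and the usage volume) are my assumptions, not numbers from the post, so check the current pricing page before relying on them.

```python
# Back-of-the-envelope cost estimate for pay-per-token API usage.
PRICE_PER_1K_TOKENS = 0.002   # USD; assumed flat rate, roughly the
                              # gpt-3.5-turbo price at the time
tokens_per_month = 2_000_000  # assumed "relatively heavy" usage

monthly_cost = tokens_per_month / 1000 * PRICE_PER_1K_TOKENS
print(f"${monthly_cost:.2f}/month")  # -> $4.00/month
```

Even at double that usage you'd land around $8/month, still well under the $20/month subscription.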
Nvidia is now a trillion-dollar company thanks to the surge in demand for their chips to train LLMs. I still wonder about the unit economics, but hopefully costs will come down significantly with time. #LLMs #AI #gpu #nvidia
“Democracy depends on the informed (not misinformed) consent of the governed. By allowing the most economically and politically powerful people, corporations, and governments to control our attention, these systems will control us.” - Daniel C. Dennett
Very interesting study on the impact of bias in #AI leading to biased thinking.
“It turned out that anyone who received #AI assistance was twice as likely to go with the bias built into the #AI, even if their initial opinion had been different.”
imo this decade in tech is about use cases that are in the real world, not on a computer. #LLMs let you interface with a computer without touching or looking at it. #AR lets you overlay compute on top of the real world. #VR lets you stand in someone else’s shoes and experience what they’re seeing. all of this needs a lot of work, but that’s where it’s going
#AI #GenerativeAI #LLMs #OpenSource #BigTech: "To be clear, examples like BLOOM and GPT-J are still far from the proverbial “start-up in a garage,” and were not developed for deployments comparable to other commercial models and their benchmarks. Big Tech, and large, well-capitalized companies more generally, still have advantages.
But the extent of that advantage depends on a key question: even if larger companies can build the highest performing models, will a variety of entities still be able to create models that are good enough for the vast majority of deployed use cases? Bigger might always be better; however, it’s also possible that the models that smaller entities can develop will suit the needs of consumers (whether individuals or companies) well enough, and be more affordable. Segments of the market and different groups of users may operate differently. There may be some sets of use cases in which competition is strongly a function of relative model quality, while in other instances competition depends on reaching some threshold level of model quality, and then differentiation occurs through other non-AI factors (like marketing and sales). Users might in many cases need outputs that reach a given quality threshold, without necessarily being best in class; or, a model might serve a subset of users at very high quality levels and thus be sufficient even if it doesn’t hit performance benchmarks that matter to others."
With #WWDC around the corner, I've cleared the decks and written up my thoughts on Microsoft's Build conference. Interesting guidance for adding #LLMs to apps, and the upcoming tools in Azure do look like promising building blocks.