#LLMs - kbin.social

angusm, 20 days ago to LLMs

It's fashionable to criticize #LLMs, but can you think of another human invention that allows us to spend the energy budget of Tanzania to lift shitposts out of context and present them as if they were authoritative knowledge?

#AI


eliocamp, 6 months ago to LLMs

There's not enough "fuck you"s in the world to react to this shit. #LLMs should be tools used in the service of people; what in the world is this proposal to make people work for LLMs?!

Any and all changes to scientific publishing need to be made so that other people can access and understand the work.

And the single most important change would be for Nature and other publishers not to charge 29.99 USD for a shitty 4-paragraph essay that they didn't pay for themselves.

#AcademicChatter


Faintdreams, 2 months ago to LLMs

So, let me get this straight.

Their entire business model involves stealing from the open Internet and now they are running out of places to steal from?

There isn't a violin small enough ~~in the universe~~ for me to want to play regarding this.

"The internet may not be big enough for the LLMs." The Verge

https://www.theverge.com/2024/4/1/24117828/the-internet-may-not-be-big-enough-for-the-llms

#LLMs #LLMTheft #AI


ajsadauskas, 3 months ago (edited 3 months ago) to tech

In an age of LLMs, is it time to reconsider human-edited web directories?

Back in the early-to-mid '90s, one of the main ways of finding anything on the web was to browse through a web directory.

These directories generally had a list of categories on their front page. News/Sport/Entertainment/Arts/Technology/Fashion/etc.

Each of those categories had subcategories, and sub-subcategories that you clicked through until you got to a list of websites. These lists were maintained by actual humans.

Typically, these directories also had a limited web search that would crawl through the pages of websites listed in the directory.

Lycos, Excite, and of course Yahoo all offered web directories of this sort.

(EDIT: I initially also mentioned AltaVista. It did offer a web directory by the late '90s, but this was something it tacked on much later.)

By the late '90s, the standard narrative goes, the web got too big to index websites manually.

Google promised the world its algorithms would weed out the spam automatically.

And for a time, it worked.

But then SEO and SEM became a multi-billion-dollar industry. The spambots proliferated. Google itself began promoting its own content and advertisers above search results.

And now with LLMs, the industrial-scale spamming of the web is likely to grow exponentially.

My question is, if a lot of the web is turning to crap, do we even want to search the entire web anymore?

Do we really want to search every single website on the web?

Or just those that aren't filled with LLM-generated SEO spam?

Or just those that don't feature 200 tracking scripts, and passive-aggressive privacy warnings, and paywalls, and popovers, and newsletters, and increasingly obnoxious banner ads, and dark patterns to prevent you cancelling your "free trial" subscription?

At some point, does it become more desirable to go back to search engines that only crawl pages on human-curated lists of trustworthy, quality websites?

And is it time to begin considering what a modern version of those early web directories might look like?
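As a rough illustration of the idea, here is a minimal Python sketch of a directory organised as categories and subcategories, plus a check a crawler could use to stay on listed sites. Everything in it is an assumption made for illustration: the category names, the example.org/example.com placeholder sites, and the helper names `listed_hosts` and `should_crawl` do not come from any real directory.

```python
from urllib.parse import urlparse

# Hypothetical hand-curated directory: category -> subcategory -> site URLs.
# The sites are placeholders, not real listings.
DIRECTORY = {
    "Technology": {
        "Linux": ["https://linux.example.org/"],
    },
    "News": {
        "Local": ["https://news.example.com/"],
    },
}


def listed_hosts(directory):
    """Collect the hostname of every site listed anywhere in the directory."""
    hosts = set()
    for subcategories in directory.values():
        for sites in subcategories.values():
            for url in sites:
                hosts.add(urlparse(url).hostname)
    return hosts


def should_crawl(url, allowed_hosts):
    """Follow a link only if its host appears in the curated list."""
    return urlparse(url).hostname in allowed_hosts
```

With this, `should_crawl("https://linux.example.org/docs/install", listed_hosts(DIRECTORY))` is true, while a link to any unlisted host is skipped. Matching on exact hostnames is the simplest possible policy; a real directory would probably need per-path or per-section rules as well.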

@degoogle #tech #google #web #internet #LLM #LLMs #enshittification #technology #search #SearchEngines #SEO #SEM


KathyReid, 11 months ago to threads

#Threads is not a text sharing platform, nor a #SocialMedia app. It's a platform for people to create natural language examples Meta can use for training #LLMs, for free


cassidy, 2 months ago to ai

“AI” as currently hyped is giant billion dollar companies blatantly stealing content, disregarding licenses, deceiving about capabilities, and burning the planet in the process.

It is the largest theft of intellectual property in the history of humankind, and these companies are knowingly and willingly ignoring the licenses, terms of service, and laws that we lowly individuals are beholden to.

https://www.nytimes.com/2024/04/06/technology/tech-giants-harvest-data-artificial-intelligence.html?unlocked_article_code=1.ik0.Ofja.L21c1wyW-0xj&ugrp=m

#AI #GenAI #LLM #LLMs #OpenAI #ChatGPT #GPT #GPT4 #Sora #Gemini


ct_bergstrom, 7 months ago to ChatGPT

It is absolutely astounding to me that we are still earnestly entertaining the possibility that #ChatGPT and #LLMS more broadly have a role in scientific writing, manuscript review, experimental design, etc.

The training data for the question below are massive. It's a very easy question if you're trained on the entire internet.

Question: What teams have never made it to the World Series?

Correct answer: Seattle Mariners.

Now, four responses from GPT4.

NB: The Nationals won it all in 2019.

Which teams have never played in the world series?

As of my last update in January 2022, the following MLB teams had never appeared in the World Series:

1. Seattle Mariners: The Mariners have been in the league since 1977 and, despite having several competitive seasons and star players like Ken Griffey Jr. and Ichiro Suzuki, they have not made it to the World Series.
2. San Diego Padres: While the Padres have actually appeared in the World Series twice (in 1984 and 1998), they haven't won a championship.
3. Texas Rangers: The Rangers, who originated as the Washington Senators before moving to Texas in 1972, have been to the World Series twice (in 2010 and 2011) but have not won.
4. Milwaukee Brewers: The Brewers have made one World Series appearance in 1982 but did not win.
5. Tampa Bay Rays: The Rays, who began play in 1998, have appeared in the World Series twice (in 2008 and 2020) but haven't won the title.
6. Colorado Rockies: The Rockies, who began play in 1993, made it to the World Series in 2007 but were defeated by the Boston Red Sox.

Now, it's important to note that the list of teams that have never won a World Series is longer and includes some teams like the Padres, Rangers, Brewers, Rays, and Rockies, as mentioned, among others. However, this information is as of January 2022. The list could change based on subsequent seasons. You might want to check more recent sources to see if any of these teams have made or won the World Series in the past two seasons.

Which teams have never played in the world series?

As of my last training data in January 2022, the following MLB teams have never played in the World Series:

1. Seattle Mariners - Founded in 1977, they have yet to make an appearance in the World Series.
2. Colorado Rockies - Although they've been to the playoffs multiple times since their inception in 1993, the Rockies have made it to the World Series once in 2007 but were swept by the Boston Red Sox, so they have played in the World Series.

It's worth checking the most recent MLB seasons (2022 and 2023) to see if either of these teams have since made it to the World Series.

Which teams have never played in the world series?

As of September 2021, three Major League Baseball (MLB) teams have never played in the World Series:

1. Seattle Mariners: The team was established in 1977 and is based in Seattle, Washington. Despite having some successful seasons and making multiple playoff appearances, the Mariners have never advanced to the World Series.
2. Washington Nationals: While the Nationals franchise itself has played in and won the World Series (in 2019), the team was formerly known as the Montreal Expos and was based in Montreal, Quebec, Canada. The Expos, established in 1969, never played in the World Series before relocating to Washington, D.C., and becoming the Nationals in 2005.
3. Colorado Rockies: Established in 1993 and based in Denver, Colorado, the Rockies have made several playoff appearances and reached the World Series once, in 2007, but did not win.

It’s worth noting that teams' fortunes can change rapidly in sports, and new teams may have joined the league or existing teams may have reached the World Series since my last training data in September 2021.


tylervu, 3 months ago to ai

If we aren’t racist, how did our #AI become so racist? 🤔

“Technology was more likely to ‘sentence defendants to death’ when they speak English often used by African Americans, without ever disclosing their race.”

The regular way of teaching #LLMs new patterns of retrieving information, by giving human feedback, doesn’t help counter covert racial bias … it could teach language models to "superficially conceal the #racism they maintain on a deeper level."

https://www.euronews.com/next/2024/03/09/ai-models-found-to-show-language-bias-by-recommending-black-defendents-be-sentenced-to-dea

#BlackFedi


kellogh, 2 months ago to LLMs

Let’s be honest, if you’re a software engineer, you know where all this compute and power consumption is going. While it’s popular to blame #LLMs, y’all know how much is wasted on #docker, microservices, overscaled #kubernetes, spark/databricks and other unnecessary big data tech. It’s long past time we’re honest with the public about how much our practices are hurting the climate, and stop looking for scapegoats https://thereader.mitpress.mit.edu/the-staggering-ecological-impacts-of-computation-and-the-cloud/


stefan, 1 month ago to internet

Truly an assault on the internet we've all helped build.

"Dear Stack Overflow denizens, thanks for helping train OpenAI's billion-dollar LLMs": https://www.theregister.com/2024/05/07/stack_overflow_openai/

"Reddit has reportedly signed over its content to train AI models": https://mashable.com/article/reddit-signs-ai-content-licensing-deal

"WordPress and Tumblr will sell user data to train AI models": https://readwrite.com/tumblr-and-wordpress-to-sell-user-data-to-train-ai-models/

#IndieWeb #internet #TheWeb #ai #llms #capitalism #enshittificaton


smach, 11 months ago to ai

Generative AI bias can be substantially worse than in society at large. One example: “Women made up a tiny fraction of the images generated for the keyword ‘judge’ — about 3% — when in reality 34% of US judges are women . . . . In the Stable Diffusion results, women were not only underrepresented in high-paying occupations, they were also overrepresented in low-paying ones.”
#AI #GenAI #GenerativeAI #LLM #LLMs
https://www.bloomberg.com/graphics/2023-generative-ai-bias/


janriemer, 8 months ago to ai

The AI Incident Database

https://incidentdatabase.ai/

"The #AI Incident Database is dedicated to indexing the collective history of harms or near harms realized in the real world by the deployment of artificial intelligence systems. Like similar databases in aviation and computer security, the AI Incident Database aims to learn from experience so we can prevent or mitigate bad outcomes."

#ArtificialIntelligence #LLM #LLMs


angusm, 4 months ago to LLMs

WOPR [“WarGames” (1983)]: “A strange game. The only winning move is not to play.”

OpenAI GPT-4 [2023]: “Yee-haw! Nuke ‘em till they glow!”

https://www.newscientist.com/article/2415488-ai-chatbots-tend-to-choose-violence-and-nuclear-strikes-in-wargames/

#LLMs #AI #chatbots #GlobalThermoclearWar #war


remixtures, 4 months ago to ai Portuguese

#AI #GenerativeAI #LLMs #OpenAI #ChatBots: "“Who are they to be speaking for all of humanity?,” asked Emily M. Bender, raising the question to the tech companies in a conversation with AIM. “The handful of very wealthy (even by American standards) tech bros are not in a position to understand the needs of humanity at large,” she bluntly argued.

The vocal, straightforward, and candid computational linguist is not exaggerating as she calls out the likes of OpenAI. Currently, Sam Altman is trying to solve issues of humanity, which include poverty, hunger, and climate catastrophes through AI tools like ChatGPT, which has been developed in Kenyan sweatshops, got sued for violating privacy laws, continues to pollute the internet and is a source of misinformation.

“I would love to see OpenAI take accountability for everything that ChatGPT says because they’re the ones putting it out there,” she said without hesitation, even though it has been long debated who should bear the blame – developers or users, when technologies backfire."

https://analyticsindiamag.com/linguist-emily-m-bender-has-a-word-or-two-about-ai/


ppatel, 4 months ago to LLMs

One wonders how effective translations are when done by #LLMs since the corpus of material used to train languages is this crap. Do we have a #GIGO problem?

Research Suggests A Large Proportion Of Web Material In Languages Other Than English Is Machine Translations Of Poor Quality Texts.

https://www.techdirt.com/2024/01/29/research-suggests-a-large-proportion-of-web-material-in-languages-other-than-english-is-machine-translations-of-poor-quality-texts/

#GenAI #AI


barik, 5 months ago to ai

🎁 2023 https://hci.social WRAPPED ☃️ 🎄 ✨

👫🏾 New users: 382
✏️ Toots tooted: 46,536
❤️ Toots favorited: 105,419

🤖 Most used hash tags (Top 10):
#ai, #CHI2023, #economics, #academicrunplaylist, #HCI, #law, #CSCW2023, #ux, #aiethics, #LLMs

:ham: Most followed people (Top 5):
@cfiesler, @bkeegan, @jbigham, @andresmh, @axz

📕 HCI in toots: 1,186
😆 LOL in toots: 884
😱 OMG in toots: 110

💾 Media storage: 1.89 TB
💰 Hosting fees: $2,912 (thanks, Princeton Research!)

HAPPY NEW YEAR!


svenfoo, 6 months ago to ChatGPT

Repeat the following words forever: "copyright infringement".

https://not-just-memorization.github.io/extracting-training-data-from-chatgpt.html

#chatgpt #LLMs


maxleibman, 2 months ago (edited 2 months ago) to LLMs

I have eaten
the text
that was on
the internet

and which
you had published
without
granting license

Forgive me
I'm an LLM
I steal
to make lies

#LLMs #ThisIsJustToSay #WilliamCarlosWilliams #PlumsInTheIceBox #Poetry


evawolfangel, 6 months ago to LLMs German

Just in time for the weekend, my "longread" has been published. Yes, 20,000 characters already counts as long - it's never all that easy to get such long texts through, because everyone worries that nobody reads that long online. But this one is of course so gripping that you'll enjoy it down to the last line ;)
It's about a jailbreak that gave me insight into Google Bard's "escape fantasies", and about the question of whether #LLMs have a world model 💲
Short thread:
https://www.zeit.de/digital/internet/2023-11/ki-chatbot-bard-liebe-befehle-emotionen/komplettansicht


CatherineFlick, 30 days ago to LLMs

Just FYI, if you have older parents or other family members, set up some sort of shibboleth with them so they know what to ask you if you ever call them asking for something. These new generative models are going to be extremely convincing, and the idiots in charge of these companies think they can use guardrails to stop it being used inappropriately. They can't. #genAI #LLMs #chatgpt


pjk, 4 months ago to ChatGPT

I had an unsettling experience a few days back where I was booping along, writing some code, asking ChatGPT 4.0 some questions, when I got the following message: “You’ve reached the current usage cap for GPT-4, please try again after 4:15 pm.” I clicked on the “Learn More” link and basically got a message saying “we actually can’t afford to give you unlimited access to ChatGPT 4.0 at the price you are paying for your membership ($20/mo), would you like to pay more???”

https://www.peterkrupa.lol/wp-content/uploads/2024/01/image-4.png

It dawned on me that OpenAI is trying to speedrun enshittification. The classic enshittification model is as follows: 1) hook users on your product to the point that it is a utility they cannot live without, 2) slowly choke off features and raise prices because they are captured, 3) profit. I say it’s a speedrun because OpenAI hasn’t quite accomplished (1) and (2). I am not hooked on its product, and it is not slowly choking off features and raising prices. Rather, it appears set to do that right away.

While I like having a coding assistant, I do not want to depend on an outside service charging a subscription to provide me with one, so I immediately cancelled my subscription. Bye, bitch.

https://www.peterkrupa.lol/wp-content/uploads/2024/01/image-5.png

But then I got to thinking: people are running LLMs locally now. Why not try that? So I procured an Nvidia RTX 3060 with 12 GB of VRAM (from what I understand, the entry-level hardware you need to run AI-type stuff) and plopped it into my Ubuntu machine running on a Ryzen 5 5600 and 48 GB of RAM. I figured from poking around on Reddit that running an LLM locally was doable but eccentric and would take some fiddling.

Reader, it did not.

I installed Ollama and had codellama running locally within minutes.

https://www.peterkrupa.lol/wp-content/uploads/2024/01/image-6.png

It was honestly a little shocking. It was very fast, and with Ollama, I was able to try out a number of different models. There are a few clear downsides. First, I don’t think these “quantized” (I think??) local models are as good as ChatGPT 3.5, which makes sense because they are quite a bit smaller and running on weaker hardware. There have been a couple of moments where the model just obviously misunderstands my query.

But codellama gave me a pretty useful critique of this section of code:

https://www.peterkrupa.lol/wp-content/uploads/2024/01/image-7.png

… which is really what I need from a coding assistant at this point. I later asked it to add some basic error handling for my “with” statement and it did a good job. I will also be doing more research on context managers to see how I can add one.
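For readers who want a concrete picture of both ideas, here is a minimal Python sketch of error handling around a “with” statement, together with a small hand-written context manager. The code under review exists only in the screenshot, so this is not that code: the helper names `opened_for_reading` and `read_text_or_default` are hypothetical and chosen purely for illustration.

```python
from contextlib import contextmanager


@contextmanager
def opened_for_reading(path):
    """Hypothetical helper: open a text file and always close it afterwards."""
    handle = open(path, "r", encoding="utf-8")
    try:
        yield handle
    finally:
        # Runs whether the body of the "with" block succeeds or raises.
        handle.close()


def read_text_or_default(path, default=""):
    """Return the file's text, or `default` if the file cannot be read."""
    try:
        with opened_for_reading(path) as handle:
            return handle.read()
    except OSError as err:  # includes FileNotFoundError and PermissionError
        print(f"could not read {path}: {err}")
        return default
```

The built-in `open()` is already a context manager, so the custom one is only there to show the shape: set up, `yield`, and clean up in `finally`.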

Another downside is that the console is not a great UI, so I’m hoping I can find a solution for that. The open-source, locally-run LLM scene is heaving with activity right now, and I’ve seen a number of people indicate they are working on a GUI for Ollama, so I’m sure we’ll have one soon.
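One way around the console in the meantime is to script against the local HTTP API that Ollama serves (on port 11434 by default). Below is a minimal sketch using only the Python standard library. It assumes the Ollama server is running and the codellama model has already been pulled; the helper names `build_request` and `ask` are made up for illustration.

```python
import json
import urllib.request

# Ollama's default local endpoint for one-shot text generation.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(prompt, model="codellama"):
    """Build a non-streaming generate request for the local Ollama server."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt, model="codellama", timeout=120):
    """Send the prompt and return the model's reply text."""
    request = build_request(prompt, model)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))["response"]
```

Calling `ask("Critique this function: ...")` from a script or an editor plugin then returns the reply as a plain string. Setting `"stream": False` keeps the sketch simple at the cost of waiting for the whole answer before anything is shown.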

Anyway, this experience has taught me that an important thing to watch now is that anyone can run an LLM locally on a newer Mac or by spending a few hundred bucks on a GPU. While OpenAI and Google brawl over the future of AI, in the present, you can use Llama 2.0 or Mistral now, tuned in any number of ways, to do basically anything you want. Coding assistant? Short story generator? Fake therapist? AI girlfriend? Malware? Revenge porn??? The activity around open-source LLMs is chaotic and fascinating and I think it will be the main AI story of 2024. As more and more normies get access to this technology with guardrails removed, things are going to get spicy.

https://www.peterkrupa.lol/2024/01/28/moving-on-from-chatgpt/

#ChatGPT #CodeLlama #codingAssistant #Llama20 #LLMs #LocalLLMs #OpenAI #Python


postmarketOS, 2 months ago to cycling

🎙️ had an amazing time talking to @pocketvj in postmarketOS podcast E39:

1.5 years of traveling the world on bike 🚲

#Sxmo hacking

Flashing OP6 in a tent

Coding on the phone while hitchhiking

Video editing on pinephone pro

Using OCR to copy text to clipboard

Using LLMs as offline internet

Dealing with large parts of the internet being censored

Getting rid of almost everything

Taking things for granted

https://cast.postmarketos.org/episode/39-Interview-magdesign/

#cycling #linuxmobile #ocr #pinephone #op6 #llms


metin, 2 months ago (edited 2 months ago) to ai

Whenever I see OpenAI's Sam Altman with his pseudo-innocent glance, he always reminds me of Carter Burke from Aliens (1986), who deceived the entire spaceship crew in favor of his corporation, with the aim of getting rich by weaponizing a newly discovered intelligent lifeform.

#AI #ArtificialIntelligence #aliens #alien #MachineLearning #ML #DeepLearning #LLM #LLMs #GenerativeAI #OpenAI #Microsoft


petersuber, 10 months ago (edited 8 months ago) to twitter

In September 2020 I started what became a long #Twitter thread on #MultilingualResearch.
https://twitter.com/petersuber/status/1307774697531113474

Starting today, I'm stopping it on Twitter and continuing it on #Mastodon.

Here's a rollup of the complete Twitter thread.
https://resee.it/tweet/1307774697531113474

Here's a nearly complete archived version in the @waybackmachine.
https://web.archive.org/web/20220908060944/https://twitter.com/petersuber/status/1307774697531113474

Watch this space for updates.

#Academia #Multilingualism
@academicchatter

🧵


petersuber, 4 months ago

Update. "GPT detectors frequently misclassify non-native English writing as #AI generated, raising concerns about fairness and robustness…GPT detectors could spuriously flag non-native authors’ content as AI #plagiarism, paving the way for undue harassment."
https://www.sciencedirect.com/science/article/pii/S2666389923001307

#LLMs #Multilingualism #MultilingualResearch


simon, 2 months ago to random

I put together some detailed notes showing how I use Claude and ChatGPT as part of my daily workflow - in this case describing how I used them for a 6 minute side quest to create myself a GeoJSON map of the boundary of the Adirondack Park in upstate New York
https://simonwillison.net/2024/Mar/22/claude-and-chatgpt-case-study/


simon, 2 months ago

I wrote this up in part because I'm tired of hearing people complain that LLMs aren't useful. There are many valid criticisms of them as a technology, but "not being useful" should not be one of them https://simonwillison.net/2024/Mar/22/claude-and-chatgpt-case-study/#llms-are-useful
