ct_bergstrom,

It is absolutely astounding to me that we are still earnestly entertaining the possibility that ChatGPT, and LLMs more broadly, have a role in scientific writing, manuscript review, experimental design, etc.

The training data for the question below are massive. It's a very easy question if you're trained on the entire internet.

Question: What teams have never made it to the World Series?

Correct answer: Seattle Mariners.

Now, four responses from GPT4.

NB: The Nationals won it all in 2019.

Which teams have never played in the world series?

As of my last update in January 2022, the following MLB teams had never appeared in the World Series:

1. Seattle Mariners: The Mariners have been in the league since 1977 and, despite having several competitive seasons and star players like Ken Griffey Jr. and Ichiro Suzuki, they have not made it to the World Series.
2. San Diego Padres: While the Padres have actually appeared in the World Series twice (in 1984 and 1998), they haven't won a championship.
3. Texas Rangers: The Rangers, who originated as the Washington Senators before moving to Texas in 1972, have been to the World Series twice (in 2010 and 2011) but have not won.
4. Milwaukee Brewers: The Brewers have made one World Series appearance in 1982 but did not win.
5. Tampa Bay Rays: The Rays, who began play in 1998, have appeared in the World Series twice (in 2008 and 2020) but haven't won the title.
6. Colorado Rockies: The Rockies, who began play in 1993, made it to the World Series in 2007 but were defeated by the Boston Red Sox.

Now, it's important to note that the list of teams that have never won a World Series is longer and includes some teams like the Padres, Rangers, Brewers, Rays, and Rockies, as mentioned, among others. However, this information is as of January 2022. The list could change based on subsequent seasons. You might want to check more recent sources to see if any of these teams have made or won the World Series in the past two seasons.
Which teams have never played in the world series?

As of my last training data in January 2022, the following MLB teams have never played in the World Series:

1. Seattle Mariners - Founded in 1977, they have yet to make an appearance in the World Series.
2. Colorado Rockies - Although they've been to the playoffs multiple times since their inception in 1993, the Rockies have made it to the World Series once in 2007 but were swept by the Boston Red Sox, so they have played in the World Series.

It's worth checking the most recent MLB seasons (2022 and 2023) to see if either of these teams have since made it to the World Series.
Which teams have never played in the world series?

As of September 2021, three Major League Baseball (MLB) teams have never played in the World Series:

1. Seattle Mariners: The team was established in 1977 and is based in Seattle, Washington. Despite having some successful seasons and making multiple playoff appearances, the Mariners have never advanced to the World Series.
2. Washington Nationals: While the Nationals franchise itself has played in and won the World Series (in 2019), the team was formerly known as the Montreal Expos and was based in Montreal, Quebec, Canada. The Expos, established in 1969, never played in the World Series before relocating to Washington, D.C., and becoming the Nationals in 2005.
3. Colorado Rockies: Established in 1993 and based in Denver, Colorado, the Rockies have made several playoff appearances and reached the World Series once, in 2007, but did not win.

It's worth noting that teams' fortunes can change rapidly in sports, and new teams may have joined the league or existing teams may have reached the World Series since my last training data in September 2021.

ct_bergstrom,

I had GPT regenerate the answer 20 times. A few things to note:

  1. Factual error rate: the system correctly answered 1 time in 20.

  2. Run-to-run inconsistency. I get different answers each time.

  3. Logical errors and internally contradictory text in which one paragraph says a team did play and another says it didn't.

  4. One attempt to self-correct that still doesn't quite work.

How could we think this sort of thing is useful for writing or even reviewing our work?
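
For anyone who wants to replicate the experiment, here is a minimal sketch of the regeneration loop, assuming the openai Python SDK (v1+) and an API key in the environment. The substring scoring below is a crude, illustrative stand-in for grading the answers by hand, and the model name is a placeholder for whatever model you are testing.

```python
# Minimal sketch: ask the same question N times and eyeball consistency.
# Assumes the openai Python SDK (v1+) and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()
QUESTION = "Which teams have never played in the world series?"
N = 20

answers = []
for _ in range(N):
    resp = client.chat.completions.create(
        model="gpt-4",  # illustrative; use whatever model you are testing
        messages=[{"role": "user", "content": QUESTION}],
    )
    answers.append(resp.choices[0].message.content)

# Crude scoring: a fully correct answer names the Mariners and no other team.
wrong_teams = ["Padres", "Rangers", "Brewers", "Rays", "Rockies", "Nationals"]
correct = [a for a in answers
           if "Mariners" in a and not any(t in a for t in wrong_teams)]
print(f"{len(correct)}/{N} runs named only the Mariners")
print(f"{len(set(answers))} distinct answers across {N} runs")
```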

attresa,

@ct_bergstrom if you ask ChatGPT about a non-existent phenomenon, it can still generate fake news text...

vpoupart,

@ct_bergstrom Imagine when #LLMs will be trained with their own bullshit

wtsparrow,

@ct_bergstrom
Artificial stupidity

robhon,

@ct_bergstrom

I think the same thing is going on with the human inclination to see faces in inanimate objects (i.e., a wall outlet).

LLMs are merely creating natural-sounding language output; they're not sentient. But we hear natural-sounding language and we're inclined to assume some level of sentience.

In essence, sometimes we're a bunch of dumb monkeys smiling back at the wall outlet.

ct_bergstrom,

@robhon I'd go one step further. @emilymbender persuaded me it's our innate skill at making sense and meaning of what we believe to be communication that makes the entire magic trick work.

irenes,

@ct_bergstrom @robhon @emilymbender agreed - we had to talk about this in a work context recently so we went back to the stochastic parrots paper and were impressed to see just how thorough it is on that topic

theLastTheorist,

@ct_bergstrom @robhon @emilymbender The real AI is us.

robhon,

@ct_bergstrom @emilymbender

I definitely agree. I think it's less about how AI was programmed and more about how evolution has programmed us.

jsdodge,

@ct_bergstrom
But just wait, soon these folks will “create an intelligent aid so adept that the result is ‘the approximation of a one-to-one teacher-to-student ratio’” in university classes

https://www.harvardmagazine.com/2023/11/jhj-portrait-ai-teaching-learning

drs1969,

@ct_bergstrom This is an error (an understandable one) in the prompt though. You asked for teams, plural, so it's going to give you a list of teams, because that's its understanding of the request.

If you ask which team, you get the right answer, do you not?

'Prompt engineering' is a skill that needs to be learned. That we think we can ask a casual question and get the right answer out of the gate, so to speak, is also interesting, I think.

ct_bergstrom,

@drs1969 So basically all you need to do is know the answer in advance and then you can engineer prompts so that it gets them right. That’s encouraging.

doctormo,

@ct_bergstrom @drs1969

That's the only question I would ever ask one of these internet LLMs.

drs1969,

@ct_bergstrom No. You need to think of it like a database(ish) with a query language and consider whether the query you give it is going to get you the data you are looking for.

Your Q is interesting. You were looking for an answer. The model 'thought' you were looking for multiple points of data.

I'm guessing you are using the free version? I think that's GPT3. GPT 3.5 and certainly 4 perform much better on lots of tasks, but they aren't free at point of use.

ct_bergstrom,

@drs1969 As I said in the thread, this is GPT4.

drs1969,

@ct_bergstrom I missed that at the bottom.

What happens if you ask it if it is sure? For math reasoning Qs, adding 'let's take it step by step' brings massive improvements in performance (in the better models).

resuna,

@drs1969 @ct_bergstrom If I ask a database the equivalent question in SQL, and there is only one result, I will only get one result.

But an LLM is not "answering a question"; it is generating a plausible-sounding continuation of the prompt. Whether it's an answer to a question, let alone a correct answer, depends on what text followed similar prompts in the source data.
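
To make the contrast concrete, here is a toy sketch using Python's standard-library sqlite3; the table and appearance counts are copied from the GPT answers quoted upthread, purely for illustration. The query returns exactly one row, and the same row, every time:

```python
# Toy contrast: a database query over fixed facts is deterministic.
# Table contents are illustrative, taken from the thread's discussion.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE teams (name TEXT, ws_appearances INTEGER)")
conn.executemany("INSERT INTO teams VALUES (?, ?)", [
    ("Seattle Mariners", 0),
    ("San Diego Padres", 2),
    ("Texas Rangers", 2),
    ("Milwaukee Brewers", 1),
    ("Tampa Bay Rays", 2),
    ("Colorado Rockies", 1),
])

# One matching row in the data means one row in the result, every run.
rows = conn.execute("SELECT name FROM teams WHERE ws_appearances = 0").fetchall()
print(rows)  # [('Seattle Mariners',)]
```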

ech,

@ct_bergstrom @drs1969 Yeah, it's a tool that writes (mediocre?) prose; I think it's best to think of it that way.

So if you want to use it to write something, you have to go back and forth with it until the material is correct.

If it answers questions correctly, you got lucky: it's echoing something from its training data, I guess.

That's my understanding of how it could be useful. People seem concerned that it is going to take over the world and kill us all, so I might be missing something, though.

H4Heights,

@ct_bergstrom It’s been used by CCHQ for a while now to generate policy 🤭

PaulGrahamRaven,

@ct_bergstrom something something Emperor's new clothes something

SteveClough,

@ct_bergstrom It has been said plenty of times before, but GPT is really good if what you want is the structure of an answer. It will structure your answer really well.

What it is no good at is actually putting the right information into the answer.

Too many people seem to get this the wrong way around.

ThreeSigma,

@ct_bergstrom
My lay understanding is that correlation engines cannot do negation well. They have no logic modeling, so they suck at this.
Essentially, they are bad at figuring out what is true and what is false… and therefore have little utility in search or research.

jhavok,

@ct_bergstrom This supports the theory of Artificial Stupidity: LLMs simply generate random grammatically correct structures. They do not have any idea if these structures are factually correct, because they don't have any idea.

0xtdec,

@ct_bergstrom LLMs can be neither accurate nor factual; they can merely generate a plausible answer.

Which can be useful in cases where any plausible answer is useful, e.g. suggesting a thank-you letter, but not in a Q&A context.

harris,

@ct_bergstrom I have a simple GPT-4 powered bot that can look things up on Wikipedia. After some searching, it seemed to arrive at the right answer. I also have the temperature turned way down on it to make it as deterministic as possible. Of course this info is in its training data, but something about making it do research seems to work here.

(For the record I don’t think LLMs are a good way to get accurate information, but I thought this was an interesting experiment.)
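
Roughly, the pattern looks like this; a simplified sketch rather than the actual bot, assuming the openai Python SDK and Wikipedia's public search API, with the prompt wording and model name as illustrative placeholders:

```python
# Simplified retrieve-then-answer sketch: search Wikipedia, hand the extract
# to the model, and keep temperature at 0 for near-deterministic output.
# Not the actual bot; prompts and model name are illustrative.
import requests
from openai import OpenAI

client = OpenAI()

def wikipedia_extract(query: str) -> str:
    """Return the plain-text intro of the top Wikipedia search hit."""
    r = requests.get(
        "https://en.wikipedia.org/w/api.php",
        params={
            "action": "query", "generator": "search",
            "gsrsearch": query, "gsrlimit": 1,
            "prop": "extracts", "explaintext": 1, "exintro": 1,
            "format": "json",
        },
        timeout=10,
    )
    pages = r.json()["query"]["pages"]
    return next(iter(pages.values())).get("extract", "")

context = wikipedia_extract("World Series appearances by team")
resp = client.chat.completions.create(
    model="gpt-4",
    temperature=0,  # turned way down, as close to deterministic as the API gets
    messages=[
        {"role": "system", "content": "Answer using only the provided context."},
        {"role": "user",
         "content": f"Context:\n{context}\n\nQuestion: Which teams have never "
                    "played in the World Series?"},
    ],
)
print(resp.choices[0].message.content)
```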

punklawyer,

@ct_bergstrom

Seems like the race to the bottom has made humans bottom feeders willing to eat digital shit.

markigra,

@ct_bergstrom @RunRichRun Even if it's only a hallucination, long-suffering #Mariners fans are happy to have the company, I guess.

hananc,

@ct_bergstrom https://perplexity.ai returned the expected answer, and a bit more.

There are six MLB teams that have never won the World Series, according to various sources.

These teams are:
Texas Rangers
San Diego Padres
Tampa Bay Rays
Milwaukee Brewers
Colorado Rockies
Seattle Mariners

The Seattle Mariners are the only team that has never appeared in the World Series.

The other five teams have all made it to the World Series at least once, but have never won the championship.

ChristosArgyrop,

@ct_bergstrom LLMs are pretty great for generating BS fluff, which is 90% of the text in any given publication anyway. Unfortunately, without the fluff one does not have a publication.

seanteachesmath,

@ct_bergstrom AI, expensively failing to do something that could easily be done with about 30 seconds of basic research.

SmallOther,

deleted_by_author

NIH_LLAMAS,

@ct_bergstrom I wonder what prods it to generate a post-hoc analysis, which it only does rarely. Of course it could be more like "i just realized..." is merely the sort of thing people sometimes say, and that occasionally gets snagged as part of a language structure to include and so it grabs at anything it can build a paragraph around.

Sebastian,

@ct_bergstrom Literally the first thing I was testing. If it doesn't get some little fact about video games right, I wouldn't necessarily trust it with your quantum mechanics homework...

PhilipLeftwich,

@ct_bergstrom
I like this one: what are the names of the Kardashians? ChatGPT doesn't know where the Jenners came from.

FrayJay,

@ct_bergstrom LLMs were never designed to be fact-citing machines, but to generate content that is statistically logical and grammatically correct. So could team A win against team B in the same sport? Yeah, they could; good job, LLM. Could a Formula 1 car beat a football team? Well...

It will change over time because people have started to use them as "answer machines" and the engineers are trying to get them to be that now, plus be more factual if needed. But it's not that easy, and it needs data.

chrismarquardt,

@ct_bergstrom LLMs are certainly not without problems, but saying "ChatGPT is bad" without providing a version number is akin to saying "the car can't go fast" without providing make, model and year. The differences between versions are astounding. Version 4 (Sep 2023) gets it right two out of two times, once with internet access allowed, once without.

ct_bergstrom,

@chrismarquardt I listed the version in the original post. GPT4. No idea why you are getting different results.

chrismarquardt,

@ct_bergstrom I copy/pasted your question verbatim, tried twice, got the right answer twice. So where's the difference?

ct_bergstrom,

@chrismarquardt I just replicated on my phone using their app. GPT4. No clue what the difference is. Do you have plugins installed?

[Screenshot of the app's Chat Details panel: "Model Info: GPT-4. Our most capable model, great for tasks that require creativity and advanced reasoning."]

chrismarquardt, (edited)

@ct_bergstrom no plugins, but I have custom instructions that basically tell it to cut the BS. That might prevent hallucinations.

geekwonk,

@ct_bergstrom when i demanded an explanation, it simply apologized and corrected itself. when i further demanded an explanation without apology, it said the following. not great!

obrien_kat,

@ct_bergstrom fascinating. I've input questions about myself (as something I can easily verify as fact or false) every few weeks, asking several times at each period, and it's been wild how different the answers are, and how much is false - including things easily findable online. Like it invented that I was a journalist at certain outlets I wasn't, or that I attended a different university. Just bizarre.

cbdawson,

@ct_bergstrom I periodically play around with it randomly to see what kinds of responses it gives for various prompts. This week I had it generate some blurbs for a trans event I'm doing. I intentionally used the full word "transgender" to avoid misinterpretation.

The responses talked about "transcendence" and never mentioned "transgender", although some used the word "inclusive". 🤦‍♂️🤦‍♂️🤦‍♂️

larsjuhljensen,

@ct_bergstrom I'm not as negative about LLMs as you, but this is indeed exactly how one should not be using it (in my opinion). Using it to answer factual questions is certainly not the way to go, since it is well known that it 1) will make up stuff, and 2) has no idea what is right and what it just made up (so it cannot correct itself).

coffeegeek,

@ct_bergstrom Heh, try getting ChatGPT to write anything factual on anything coffee or espresso related.

For instance, I have asked it to write a 500-word post on the origin of the cappuccino, or espresso. The answers are entertaining, but nowhere near the actual truth. Bonus points when you ask for it in "casual voice": "Yo, check this out!"

Sidenote: when we get unsolicited blog article submissions, I can smell ChatGPT a mile away.

stefani,

@ct_bergstrom

some of us might be sad that the Mariners are all alone this way.

bucknam,

@ct_bergstrom to be fair the Rockies have been so horrible for so long we Coloradans usually also forget that they ever got to the World Series

ericdaryl,

@ct_bergstrom It's grim... I lost so much respect for a colleague the other day in a committee meeting when we started re-working some faculty handbook language around tenure requirements and they generated a draft with GPT...

sarae,

@ct_bergstrom I like how the one thing even ChatGPT knows is that the Mariners... aren't great

cswalker21,

@ct_bergstrom I asked it which of Colorado's 14,000 ft. mountain peaks is located north of I-70 (there's only one). It listed five in confident language, all five of which are south of I-70, and it left off Longs Peak, the correct answer.
