LLMs

kellogh,
@kellogh@hachyderm.io avatar

it’s a little disingenuous to refer to #LLMs as #opensource because you can really only open source an LLM in roughly the same way you open source a microprocessor. RISC-V is open source (the plans for it, anyway), but it still costs millions to riff off it and make your own custom version; same with LLMs. that’s not exactly what open source was going for

wagesj45,
@wagesj45@mastodon.jordanwages.com avatar

@kellogh I get what you're saying. But barrier to entry has always been "high" depending on who you're talking to. The average person doesn't have the skill and knowledge to set up and compile open source software. I still think it fits the original spirit of the term and movement. The more knowledge out there the better for us smelly nerds. :smug:

kellogh,
@kellogh@hachyderm.io avatar

@wagesj45 oh wow, i get what you’re saying, but that screenshot rubs me the wrong way. nooot a good example

happyborg,
@happyborg@fosstodon.org avatar

#LLMs =
Large
Lamentable
Mishaps

moorejh,
@moorejh@mastodon.online avatar

Our KRAGEN paper is out! This method combines LLMs & RAG with Graph of Thoughts for asking complex questions of a knowledge graph or any vector DB https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/btae353/7687047 #llms #artificialintelligence #bioinformatics #datascience
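
Purely as an illustration of the retrieval-augmented side of a pipeline like that (not KRAGEN itself, which adds Graph-of-Thoughts prompting and a real embedding index), here is a toy Python sketch; word-overlap scoring stands in for the vector DB, and the triples and question are made up:

    # Toy sketch of the "retrieve, then prompt" step only: score
    # knowledge-graph triples against a question with crude word overlap
    # (a stand-in for a vector-DB similarity search), then build a prompt.
    triples = [
        "BRCA1 -- associated_with --> breast cancer",
        "metformin -- treats --> type 2 diabetes",
        "TP53 -- regulates --> cell cycle",
    ]

    def overlap_score(question: str, triple: str) -> int:
        q_words = set(question.lower().replace("?", " ").split())
        t_words = set(triple.lower().replace("-->", " ").replace("--", " ").split())
        return len(q_words & t_words)

    question = "Which gene is associated with breast cancer?"
    context = sorted(triples, key=lambda t: overlap_score(question, t), reverse=True)[:2]
    prompt = "Answer using only this context:\n" + "\n".join(context) + f"\n\nQuestion: {question}"
    print(prompt)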

kellogh,
@kellogh@hachyderm.io avatar

@moorejh that’s super cool. so it both uses a KG to ground it and outputs a graph for interpretability? how slow/expensive is it?

timbray,
@timbray@cosocial.ca avatar

I see that openai.com/gptbot is crawling my blog, top to bottom, side to side. I’m sure OpenAI has consulted the “Rights” link clearly displayed on every page, invoking a Creative Commons license that freely grants rights to reuse and remix but not for commercial purposes.

#genAI #llms
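
For anyone wondering what GPTBot does respect: OpenAI identifies the crawler by the GPTBot user agent and says it honors robots.txt (license terms like CC non-commercial are a separate question). A minimal stdlib check, with example.com standing in for your own domain:

    import urllib.robotparser

    # Does this site's robots.txt allow OpenAI's crawler? (example.com is a placeholder)
    rp = urllib.robotparser.RobotFileParser()
    rp.set_url("https://example.com/robots.txt")
    rp.read()
    print("GPTBot allowed:", rp.can_fetch("GPTBot", "https://example.com/"))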

kellogh,
@kellogh@hachyderm.io avatar

one thing i love about #LLMs is asking it “how tf do i do X” and it responds with 5 ideas, four of which are terrible but one is far better than anything i’d thought of. or they’re all terrible but one makes me realize i’ve been thinking about the problem wrong

kellogh,
@kellogh@hachyderm.io avatar

also, #LLMs cause me to think a lot about the multifaceted nature of intelligence. we used to over-weight language skill, but now that LLMs have that in spades, it’s apparent that there’s more going on

for example, spontaneity. if it had an ounce of spontaneity, it could suggest approaching the problem differently

mamund,
@mamund@mastodon.social avatar

Evaluating Large Language Models Using “Counterfactual Tasks”

https://aiguide.substack.com/p/evaluating-large-language-models?utm_source=post-email-title&publication_id=1273940&post_id=144603950&utm_campaign=email-post-title&isFreemail=true&r=4pxfn&triedRedirect=true&utm_medium=email

"In [the counterfactual task] paradigm, models are evaluated on pairs of tasks that require the same types of abstraction and reasoning, but for each pair, the content of the first task is likely to be similar to training data, whereas the content of the second task (a “counterfactual task”) is designed to be unlikely to be similar to training data." --

albertcardona,
@albertcardona@mathstodon.xyz avatar

@mamund

"Pray, Mr. Babbage, if you put into the machine wrong figures, will the right answers come out?"

To expect a priori from an LLM more than an answer equivalent to an interpolated data point as obtained from Monte Carlo data augmentation seems unwarranted.

Or likewise, to expect a Random Forest classifier to correctly classify an input whose parameter values fall outside the learned parameter ranges.

As expected, this study shows that LLMs fail at "counterfactual" tasks – at providing answers to questions outside the training set.

albertcardona, (edited )
@albertcardona@mathstodon.xyz avatar

@mamund

"Embers of autoregression" indeed: the average person dramatically underestimates the amount of knowledge others have, collectively, and hence expresses surprise at the "right" answers an LLM spits out – failing to comprehend the sheer size of the data sets that went into training the LLM in the first place. But query outside that corpus of knowledge, even query about that corpus but in an unexpected way, and the LLM unwittingly responds with nonsense.

markhughes,
@markhughes@mastodon.social avatar

Using #LLMs for technical design is like asking Picasso to design aircraft.

Engineer: more wings please. No, not that many, and make them symmetrical this time. No, no, no...

#engineering #ai

josemurilo,
@josemurilo@mato.social avatar

"the worst part is that when they [#LLMs] can’t complete a task confidently, they don’t give you an error or tell you they’re unable to finish. They make something up and serve you incorrect information.
…companies like #Google are pretending this isn’t a problem and pushing these systems toward taking over as our phones’ virtual assistants and the brains behind our online searches."

https://www.computerworld.com/article/2117752/google-gemini-ai.html

angusm,
@angusm@mastodon.social avatar

It's fashionable to criticize #LLMs, but can you think of another human invention that allows us to spend the energy budget of Tanzania to lift shitposts out of context and present them as if they were authoritative knowledge?

#AI

prefec2,
@prefec2@norden.social avatar

@LouisIngenthron @angusm very good argument.

prefec2,
@prefec2@norden.social avatar

@failedLyndonLaRouchite @angusm I checked the article. It refers, for example, to cancer detection, but it does not state that it uses LLMs. The described setup rather hints at some neural network processing a few measurements as inputs.

kellogh,
@kellogh@hachyderm.io avatar

i’m very excited about the interpretability work that #anthropic has been doing with #LLMs.

in this paper, they used classical machine learning algorithms to discover concepts. if a concept like “golden gate bridge” is present in the text, then they discover the associated pattern of neuron activations.

this means that you can monitor LLM responses for concepts and behaviors, like “illicit behavior” or “fart jokes”

https://www.anthropic.com/research/mapping-mind-language-model
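
To make the “monitor responses for a concept” idea a bit more concrete, here is a loose numpy sketch of just that monitoring step, not of how the features are learned; both the feature direction and the per-token activations are random placeholders, not real model internals:

    import numpy as np

    rng = np.random.default_rng(0)
    d_model = 512
    # Stand-in for a learned concept direction (e.g. "illicit behavior") in activation space.
    feature_direction = rng.normal(size=d_model)
    feature_direction /= np.linalg.norm(feature_direction)

    # Stand-in for the activations recorded while generating a response (one row per token).
    response_activations = rng.normal(size=(40, d_model))

    # Project each token's activation onto the concept direction and flag
    # the response if any token exceeds a chosen threshold.
    scores = response_activations @ feature_direction
    threshold = 3.0  # arbitrary; would be calibrated on known examples
    flagged = bool((scores > threshold).any())
    print("max concept score:", scores.max(), "flagged:", flagged)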

kellogh,
@kellogh@hachyderm.io avatar

this is great work. i’m excited to see where this goes next

i hope #anthropic exposes this via their API. at this point in time, most of the promising interpretability work is only available on open source models that you can run yourself. it would be great to also have them available from #AI vendors

Lobrien,

@kellogh This does, of course, imply vastly easier subversion of guardrails. Bad actors will have an easier time manipulating bias.

openwebsearcheu,
@openwebsearcheu@suma-ev.social avatar

💭 Dreaming of an open search engine in Europe
👉 The German science journal „Spektrum.de“ writes about the OWS.EU project & the challenge of creating a European open web index as a foundation for search, AI & special interest applications.

„So far, 1.3 billion URLs in 185 languages, totaling 60 terabytes, have been crawled and indexed“ says project lead Michael Granitzer in the article.

Find out more about potential future applications & OWS.EU's unique approach:

https://openwebsearch.eu/the-dream-of-an-open-search-engine-i-spektrum-de/

kellogh,
@kellogh@hachyderm.io avatar

if i had more time, i'd love to investigate PII coming from #LLMs. i've seen it generate phone numbers and secrets, but i wonder if these are real or not. i imagine you could look at the logits to figure out if phone number digits were randomly chosen or if the sequence is meaningful to the LLM. anyone aware of researchers who have already done this?
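
A rough sketch of that logit check, assuming an open-weights model you can run locally (gpt2 below is just a placeholder) and a toy prompt that ends mid-number: restrict the next-token distribution to the ten digit tokens and look at its entropy. Near log(10) suggests the digit is being filled in at random; a sharply peaked distribution suggests the sequence means something to the model.

    import math
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    model_name = "gpt2"  # placeholder; any causal LM you can run locally
    tok = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForCausalLM.from_pretrained(model_name)
    model.eval()

    prompt = "You can reach our office at 555-01"  # toy prompt ending mid-number
    digit_ids = [tok.encode(d, add_special_tokens=False)[0] for d in "0123456789"]

    with torch.no_grad():
        logits = model(**tok(prompt, return_tensors="pt")).logits[0, -1]

    # Renormalize over the ten digit tokens only, then measure entropy.
    digit_probs = torch.softmax(logits[digit_ids], dim=-1)
    entropy = -(digit_probs * digit_probs.log()).sum().item()
    print(f"entropy over digits: {entropy:.2f} (max {math.log(10):.2f} = uniform/random)")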

kellogh,
@kellogh@hachyderm.io avatar

i would guess that phone numbers are probably mostly random, since so many phone numbers are found online, whereas AWS keys are less common, so you're probably more likely to get partial or even full real keys

Lobrien,

@kellogh Someone claimed that a long magic number used in their highly-optimized (FFT?) code was spit out by Copilot. (This was soon after release.) The constant was arrived at by long fine-tuning, not conceptual in any way.

scottjenson,
@scottjenson@social.coop avatar

Saying "LLMs will eventually do every job" is a bit like:

  1. Seeing Wifi wireless data
  2. Then predicting "Wireless" Power saws (no electrical cord or battery) are just around the corner

It's a misapplication of the tech. You need to understand how #LLMs work and extrapolate that capability. It's all text, people. Summarizing, collating, template matching. All fair game. But stray outside of that box and things get much harder.

mattwilcox,
@mattwilcox@mstdn.social avatar

@scottjenson Tbh I’m not convinced on any of those either. Again; because it’s a bias machine to expose the average. It’s useless at anything else. It won’t give you modern css. It won’t look at your own code base to avoid duplication. If you’re less than average skill it may give you beneficial things; maybe. There are no “smarts” at all, just stats. In the medical field it’s diagnosed scans of plastic toys as having cancer. I wouldn’t want a dr to be leaning on this stuff.

miki,
@miki@dragonscave.space avatar

@scottjenson Saying "we will once be able to fly from New York to Paris" is like seeing the contraption that the Wright brothers have just designed and extrapolating a jet engine.

fizise,
@fizise@sigmoid.social avatar

Nice example of how important emphasis can be for language understanding. Depending on which word in the sentence below is emphasized, it completely changes its meaning.
For #LLMs (and for our lecture) this means that learning to understand language purely from written text is probably not an "easy" task....

Picture from Brian Sacash, via LinkedIn, cf. https://www.linkedin.com/feed/update/urn:li:activity:7195767258848067584/

@sourisnumerique @enorouzi @shufan @lysander07
