LLMs

markhughes,
@markhughes@mastodon.social avatar

Using for technical design is like asking Picasso to design aircraft.

Engineer: more wings please. No, not that many, and make them symmetrical this time. No, no, no...

josemurilo,
@josemurilo@mato.social avatar

"the worst part is that when they [] can’t complete a task confidently, they don’t give you an error or tell you they’re unable to finish. They make something up and serve you incorrect information.
…companies like are pretending this isn’t a problem and pushing these systems toward taking over as our phones’ virtual assistants and the brains behind our online searches."

https://www.computerworld.com/article/2117752/google-gemini-ai.html

angusm,
@angusm@mastodon.social avatar

It's fashionable to criticize , but can you think of another human invention that allows us to spend the energy budget of Tanzania to lift shitposts out of context and present them as if they were authoritative knowledge?

prefec2,
@prefec2@norden.social avatar

@LouisIngenthron @angusm very good argument.

prefec2,
@prefec2@norden.social avatar

@failedLyndonLaRouchite @angusm I checked the article. It refers for example.to cancer detection, but it does not state that it uses LLMs. The described setup rather hints for some neural network processing a few measurements as inputs.

kellogh,
@kellogh@hachyderm.io avatar

i’m very excited about the interpretability work that has been doing with .

in this paper, they used classical machine learning algorithms to discover concepts. if a concept like “golden gate bridge” is present in the text, then they discover the associated pattern of neuron activations.

this means that you can monitor LLM responses for concepts and behaviors, like “illicit behavior” or “fart jokes”

https://www.anthropic.com/research/mapping-mind-language-model

kellogh,
@kellogh@hachyderm.io avatar

this is great work. i’m excited to see where this goes next

i hope exposes this via their API. at this point in time, most of the promising interpretability work is only available on open source models that you can run yourself. it would be great to also have them available from vendors

Lobrien,

@kellogh This does, of course, imply vastly easier subversion of guardrails. Bad actors will have an easier time manipulating bias.

openwebsearcheu,
@openwebsearcheu@suma-ev.social avatar

💭 Dreaming of #OpenWebSearch in Europe
👉 The German Science Journal „Spektrum.de“ writes about the OWS.EU project & the challenge of creating a European #OpenWebIndex as a foundation for #WebSearch, #LLMs & special interest applications.

„So far, 1.3 billion URLs in 185 languages, totaling 60 terabytes, have been crawled and indexed“ says project lead Michael Granitzer in the article.

Find out more about potential future applications & OWS.EU´s unique approach:

https://openwebsearch.eu/the-dream-of-an-open-search-engine-i-spektrum-de/

kellogh,
@kellogh@hachyderm.io avatar

if i had more time, i'd love to investigate PII coming from . i've seen it generate phone numbers and secrets, but i wonder if these are real or not. i imagine you could look at the logits to figure out if phone number digits were randomly chosen or if the sequence is meaningful to the LLM. anyone aware of researchers who have already done this?

kellogh,
@kellogh@hachyderm.io avatar

i would guess that phone numbers are probably mostly random, since so many phone numbers are found online, whereas AWS keys are less common, so you're probably more likely to get partial or even full real keys

Lobrien,

@kellogh Someone claimed that a long magic number used in their highly-optimized (FFT?) code was spit out by Copilot. (This was soon after release.) The constant was arrived at by long fine-tuning, not conceptual in any way.

scottjenson,
@scottjenson@social.coop avatar

Saying "LLMs will eventually do every job" is a bit like:

  1. Seeing Wifi wireless data
  2. Then predicting "Wireless" Power saws (no electrical cord or battery) are just around the corner

It's a misapplication of the tech. You need to understand how work and extrapolate that capability. It's all text people. Summarizing, collating, template matching. All fair game. But stray outside of that box and things get much harder.

mattwilcox,
@mattwilcox@mstdn.social avatar

@scottjenson Tbh I’m not convinced on any of those either. Again; because it’s a bias machine to expose the average. It’s useless at anything else. It won’t give you modern css. It won’t look at your own code base to avoid duplication. If you’re less than average skill it may give you beneficial things; maybe. There are no “smarts” at all, just stats. In the medical field it’s diagnosed scans of plastic toys as having cancer. I wouldn’t want a dr to be leaning on this stuff.

miki,
@miki@dragonscave.space avatar

@scottjenson Saying "we will once be able to fly from New York to Paris" is like seeing the contraption that the Wright brothers have just designed and extrapolating a jet engine.

fizise,
@fizise@sigmoid.social avatar

Nice example of how important emphasis can be for language understanding. Depending on which word in the sentence below is emphasized, it completely changes its meaning.
For #LLMs (and for our #ise2024 lecture) this means that learning to understand language purely from written text is probably not an "easy" task....

Picture from Brian Sacash, via LinkedIn, cf. https://www.linkedin.com/feed/update/urn:li:activity:7195767258848067584/

#nlp #languagemodel #computationallinguistics @sourisnumerique @enorouzi @shufan @lysander07

tayarndt,
@tayarndt@techopolis.social avatar
CatherineFlick,
@CatherineFlick@mastodon.me.uk avatar

Just FYI, if you have older parents or other family members, set up some sort of shibboleth with them so they know what to ask you if you ever call them asking for something. These new generative models are going to be extremely convincing, and the idiots in charge of these companies think they can use guardrails to stop it being used inappropriately. They can't. #genAI #LLMs #chatgpt

vicki,
@vicki@jawns.club avatar

The most interesting stuff in right now (to me) is:

  • figuring out how to do it small
  • figuring out how to do it on CPU
  • figuring out how to do it well for specific tasks
webology,
@webology@mastodon.social avatar

@vicki I think this is why Ollama has appealed to me. I can run it on my Macs and when paired with Tailscale, I can access it from anywhere.

faassen,
@faassen@fosstodon.org avatar

@janriemer

@vicki

That's funny!

Nonetheless LLMs can do things with language that are interesting that other algorithms struggle with. And getting that behavior smaller and more reliable is useful - even though the small & reliable of classic algorithms may never be equalled

smach,
@smach@masto.machlis.com avatar

“The general problem of mixing data with commands is at the root of many of our computer security vulnerabilities.” Great explainer by security researcher Bruce Schneier on why large language models may not be a great choice for tasks like processing your emails.
https://cacm.acm.org/opinion/llms-data-control-path-insecurity/

kellogh,
@kellogh@hachyderm.io avatar

@smach yay! i had the same thought a while ago. if you can separate the data & control, you can make it safe

https://timkellogg.me/blog/2024/01/11/application-phishing

kellogh,
@kellogh@hachyderm.io avatar

@smach after writing that, i found out about control vectors, which is sort of close, but the control still goes through the same channel as data https://vgel.me/posts/representation-engineering/#Control_Vectors_v.s._Prompt_Engineering

ai6yr,
kellogh,
@kellogh@hachyderm.io avatar

i used an analogy yesterday, that #LLMs are basically system 1 (from Thinking Fast and Slow), and system 2 doesn’t exist but we can kinda fake it by forcing the LLM to have an internal dialog.

my understanding is that system 1 was more tuned to pattern matching and “gut reactions”, while system 2 is more analytical

i think it probably works pretty well, but curious what others think

Lobrien,

@kellogh I use that exact analogy. And emphasize that we certainly do use and need System 2 at least occasionally. At some point, human-like reasoning must use symbols with definite, not probabilistic, outcomes. Can that arise implicitly within attention heads? Similar to embeddings being kinda-sorta knowledge representation? I mean, maybe? But it still seems hugely wasteful to me. I still tend towards neuro-symbolic being the way.

kellogh,
@kellogh@hachyderm.io avatar

@Lobrien i would have written the same thing but you beat me to it

kellogh,
@kellogh@hachyderm.io avatar

has anyone made a successor to fuckit.js that uses #LLMs?

(fuckit.js ran the script in a loop, randomly deleting lines until it runs successfully)

kellogh,
@kellogh@hachyderm.io avatar

@wagesj45 right, but it’s gotta be haphazard

wagesj45,
@wagesj45@mastodon.jordanwages.com avatar

@kellogh I wonder how it could be done... 🤔

Just randomly zero out a vector component maybe. We should ask ChatGPT lol.

smach,
@smach@masto.machlis.com avatar

The TinyChart-3B LLM answers questions about data visualizations. It can also generate underlying data from a dataviz and Python code to re-create a similar chart.

Demo on Hugging Face: https://huggingface.co/spaces/mPLUG/TinyChart-3B

Code: https://github.com/X-PLUG/mPLUG-DocOwl/tree/main/TinyChart

Paper: https://arxiv.org/abs/2404.16635 8 authors from the Alibaba Group and Renmin University of China

smach,
@smach@masto.machlis.com avatar

@hrbrmstr @eliocamp That's what I get for posting before testing it myself beyond the examples (it's been a busy weekend, out-of-town family were visiting). I thought the Alibaba group made it worth sharing. Lesson learned!

hrbrmstr,
@hrbrmstr@mastodon.social avatar

@smach @eliocamp No, it was a good share. It's going to get better, and it's going to be super cool for folks who have vision issues.

kellogh,
@kellogh@hachyderm.io avatar

this is such a puzzling perspective

  1. #LLMs will be useful
  2. i will judge you for trying to use them

i generally regard, “i will think less of you” type comments as a joke, because of how ridiculous the sentiment is, but this sort of stuff is perverse on the fedi

kellogh,
@kellogh@hachyderm.io avatar

the full article is decent though.

“The long-term popularity of any given tool for software development is proportional to how much labour arbitrage it enables.”

i’d say that’s one of the core tenants of software development in general. “keep it simple”, “automate yourself out of a job”, etc., the list goes on

all software development practices, including tools, are oriented toward making it cheaper and easier to acquire and use engineering talent

kellogh,
@kellogh@hachyderm.io avatar
kellogh,
@kellogh@hachyderm.io avatar

alright, i have to declare this as a strong opinion — #LLMs are better at alt-text than people are

the goal of alt text is to let a person “without eyes” see the picture, to get the same experience as someone who can see fine

but often, almost always, human-written alt text is either too succinct to be helpful, or just an extension of the post itself, and so doesn’t help an impaired person understand what’s in it

kellogh,
@kellogh@hachyderm.io avatar

the one big problem is that i don’t give the LLM context when i ask for alt text, so its interpretation is worse than a typical human would be. That’s easy to solve, ofc, just paste the post into the prompt. But it takes time and i get lazy. I wish mastodon clients automated this process better

kellogh,
@kellogh@hachyderm.io avatar

it took a long time to get to this point. One issue i had is that humans are clearly smarter than these LLMs. But after thinking about it, i’m not sure that’s a good thing, typically a dumb but detailed description is a lot more vivid and helpful to someone who can’t see well

kellogh,
@kellogh@hachyderm.io avatar

i've been getting into the things #LLMs can't do well, because i think it says a lot about what they're useful for, and it helps build a mental model around how they work

epchris,
@epchris@hachyderm.io avatar

@kellogh What are some of your favorite findings so far?

doctorambient,
@doctorambient@mastodon.social avatar

People: stop asking to explain their behavior.

We already know that LLMs don't have the introspection necessary to explain their behavior, and their explanations are often fanciful or "just wrong."

For instance, Gemini claims it reads your emails for training, Google says it doesn't.

(BTW, if it turns out Gemini is right and Google is lying, that might be another example of an LLM convincing me it's actually "intelligent.")

doctorambient,
@doctorambient@mastodon.social avatar

A Llama3 example:


>>> Do you happen to know what your context window length is?

Llama: I'm an AI model, and I don't have a fixed "context window" in the classical sense. My training data consists of a massive corpus of text, which I use to generate responses.

The correct answer is 8k.

#llama3 #lackOfIntrospection #introspection #LLMs #LLM

kellogh,
@kellogh@hachyderm.io avatar

@doctorambient i am not a language model but those are mutually exclusive hashtags

hrheingold,
@hrheingold@mastodon.social avatar

Like words, molecular sequences in biological components are tokens that can be manipulated by :

“Here, using large language models (LLMs) trained on biological diversity at scale, we demonstrate the first successful precision editing of the human genome with a programmable gene editor designed with AI.”

https://www.biorxiv.org/content/10.1101/2024.04.22.590591v1

PaulGrahamRaven,
@PaulGrahamRaven@assemblag.es avatar

@hrheingold Let's hope that works out better than the "novel materials" model, or at least that someone checks the sums before going anywhere near sn actual human patient, eh?

https://www.chemistryworld.com/news/new-analysis-raises-doubts-over-autonomous-labs-materials-discoveries/4018791.article

savvykenya,
@savvykenya@famichiki.jp avatar

If you have documents with the answers you're looking for, why not search the documents directly? Why are you embedding the documents then using #RAG (Retrieval Augmenter Generation) to make a large language model give you answers? An LLM generates text, it doesn't search a DB to give you results. So just search the damn DB directly, we already have great search algorithms with O(1) retrieval speeds! #LLMs are so stupid.

gimulnautti,
@gimulnautti@mastodon.green avatar

@savvykenya Most jobs are fine with mediocre answers. If a new employee needs to know something and doesn’t know where to look, that’s a pretty good use-case for RAG.

smach,
@smach@masto.machlis.com avatar

“But this doesn’t save any time!” 3 useful questions when trying #LLMs:

  • Is there another way to get results I want? Don't give up right away.
  • Does AI make this task less or more annoying? Sometimes supervising drudge work feels better even if it's not faster; other times you'd still rather do it yourself.
  • Are results likely to improve as LLMs get better? If so, add a calendar reminder to try again in a few months. Or, keep a list of things you want to re-try post GPT-5 class models.
    #GenAI
kellogh,
@kellogh@hachyderm.io avatar

@smach i’ve actually found it useful to be vague at times. just give it a general direction and see what happens. a lot of times too much direction yields worse results

vick21,
@vick21@mastodon.social avatar

Here is an example of how bad are with math. I asked about velocity in the context of Agile process. The answer?
“Sure! Let's say that an Agile development team has completed four iterations, each lasting two weeks. In the first iteration, they delivered 12 user stories; in the second, they delivered 10; in the third, they delivered 9; and in the fourth, they delivered 8. The total number of user stories completed by the end of the fourth iteration is 49 (12 + 10 + 9 + 8)”.

daniel_js_craft,
@daniel_js_craft@mastodon.social avatar

Google Gemini aims for a 10 mil tokens context. It's so large that you can put books, docs, videos. They all fit in this context size. Will this replace RAG?

Don't think so because:
-💸 money; you still pay per token
-🐢 slow response time
-🐞 a huge context is hard to debug

#LLMs #AI #langchain

kellogh,
@kellogh@hachyderm.io avatar

@daniel_js_craft long context can’t ever replace RAG because regular boring computer I/O that’s already been optimized to kingdom come isn’t fast enough to send megabytes or gigabytes to an API on a per-request basis, no matter how fast LLMs get

but it does open up some very interesting use cases, so it’s absolutely worth paying attention to

  • All
  • Subscribed
  • Moderated
  • Favorites
  • LLMs
  • tester
  • DreamBathrooms
  • khanakhh
  • ngwrru68w68
  • Youngstown
  • magazineikmin
  • mdbf
  • slotface
  • thenastyranch
  • rosin
  • kavyap
  • tacticalgear
  • GTA5RPClips
  • osvaldo12
  • JUstTest
  • cubers
  • ethstaker
  • everett
  • Durango
  • InstantRegret
  • Leos
  • normalnudes
  • modclub
  • anitta
  • cisconetworking
  • megavids
  • provamag3
  • lostlight
  • All magazines