Lazarou, to stackoverflow
@Lazarou@mastodon.social

This just makes me want to delete everything of mine on corporate social media, and I pretty much have tbh

#StackOverflow #AI #LLMs

LChoshen, to llm
@LChoshen@sigmoid.social

Do LLMs learn foundational concepts required to build world models? (less than expected)

We address this question with 🌐🐨EWoK (Elements of World Knowledge)🐨🌐

a flexible cognition-inspired framework to test knowledge across physical and social domains

https://ewok-core.github.io

#llm #llms #evaluation #ml #machinelearning

ai6yr, to ai
@ai6yr@m.ai6yr.org

Giant sucking sounds from over there on Reddit https://www.bbc.com/news/articles/cxe92v47850o

remixtures, to ai
@remixtures@tldr.nettime.org

#AI #GenerativeAI #LLMs #ParetoCurves: "Which is the most accurate AI system for generating code? Surprisingly, there isn’t currently a good way to answer questions like these.

Based on HumanEval, a widely used benchmark for code generation, the most accurate publicly available system is LDB (short for LLM debugger). But there’s a catch. The most accurate generative AI systems, including LDB, tend to be agents, which repeatedly invoke language models like GPT-4. That means they can be orders of magnitude more costly to run than the models themselves (which are already pretty costly). If we eke out a 2% accuracy improvement for 100x the cost, is that really better?

In this post, we argue that:

  • AI agent accuracy measurements that don’t control for cost aren’t useful.

  • Pareto curves can help visualize the accuracy-cost tradeoff.

  • Current state-of-the-art agent architectures are complex and costly but no more accurate than extremely simple baseline agents that cost 50x less in some cases.

  • Proxies for cost such as parameter count are misleading if the goal is to identify the best system for a given task. We should directly measure dollar costs instead.

  • Published agent evaluations are difficult to reproduce because of a lack of standardization and questionable, undocumented evaluation methods in some cases."

https://www.aisnakeoil.com/p/ai-leaderboards-are-no-longer-useful
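
As a rough illustration of the Pareto-curve idea in the quoted post, here is a minimal sketch that picks out the agents not beaten on both cost and accuracy by some other agent. The agent names and all figures are hypothetical, made up for the example; they are not taken from the linked article.

```python
from typing import NamedTuple


class Agent(NamedTuple):
    name: str
    cost_usd: float  # dollar cost per benchmark run (hypothetical)
    accuracy: float  # fraction of tasks solved (hypothetical)


def pareto_frontier(agents: list[Agent]) -> list[Agent]:
    """Return the agents that no other agent beats on both cost and accuracy."""
    frontier = []
    best_accuracy = float("-inf")
    # Walk from cheapest to most expensive; at equal cost, try the more
    # accurate agent first. An agent earns a place only if it is strictly
    # more accurate than everything that costs the same or less.
    for agent in sorted(agents, key=lambda a: (a.cost_usd, -a.accuracy)):
        if agent.accuracy > best_accuracy:
            frontier.append(agent)
            best_accuracy = agent.accuracy
    return frontier


# Hypothetical numbers, for illustration only.
agents = [
    Agent("simple-baseline", 0.02, 0.90),
    Agent("retry-baseline", 0.10, 0.93),
    Agent("complex-agent-a", 1.00, 0.92),
    Agent("complex-agent-b", 5.00, 0.95),
]

for agent in pareto_frontier(agents):
    print(f"{agent.name}: ${agent.cost_usd:.2f}, {agent.accuracy:.0%}")
```

In this made-up data, complex-agent-a drops off the frontier because retry-baseline is both cheaper and more accurate, which is the kind of comparison an accuracy-only leaderboard hides.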

leanpub, to ai
@leanpub@mastodon.social

AI for Efficient Programming: Harnessing the Power of Large Language Models http://leanpub.com/courses/fredhutch/ai_for_software is the featured online course on the Leanpub homepage! https://leanpub.com #AI #courses #programming #LLMs

doctorambient, to ai
@doctorambient@mastodon.social

"The biggest question raised by a future populated by unexceptional A.I., however, is existential. Should we as a society be investing tens of billions of dollars, our precious electricity that could be used toward moving away from fossil fuels, and a generation of the brightest math and science minds on incremental improvements in mediocre email writing?" (From an NYT article. See original thread.)

@peter https://thepit.social/@peter/112445916259675495

AccordionGuy, to ai
@AccordionGuy@mastodon.cloud

Do you REALLY want to get a feel for how GPT-4o does what it does? Just complete this poem — by doing so, you’ll have performed a computation similar to the one it does when you feed it a text-plus-image prompt.

#AI #ArtificialIntelligence #LLM #LLMs #LargeLanguageModel #LargeLanguageModels

https://www.globalnerdy.com/2024/05/15/the-simplest-way-to-illustrate-how-gpt-4o-works/

iammannyj, to opensource
@iammannyj@fosstodon.org

IBM open-sources its Granite AI models - and they mean business

Many companies claim to have open-sourced their LLMs, but IBM actually did it.

https://www.zdnet.com/article/ibm-open-sources-its-granite-ai-models-and-they-mean-business/

ai6yr, to ai
@ai6yr@m.ai6yr.org

IEEE Spectrum: telling it like it is.

ai6yr,
@ai6yr@m.ai6yr.org

Alas, they wimped out and changed the title online, probably after a bunch of tech-bros or engineers-in-love-with-AI complained. #Ai #llms

CatherineFlick, to LLMs
@CatherineFlick@mastodon.me.uk

Just FYI, if you have older parents or other family members, set up some sort of shibboleth with them so they know what to ask you if you ever call them asking for something. These new generative models are going to be extremely convincing, and the idiots in charge of these companies think they can use guardrails to stop it being used inappropriately. They can't. #genAI #LLMs #chatgpt

ai6yr, to random
@ai6yr@m.ai6yr.org

OpenAI or Science Fiction Movie?

https://www.youtube.com/watch?v=ne6p6MfLBxc

ai6yr,
@ai6yr@m.ai6yr.org

On one hand, the technology advances!

On the other hand: it'll burn down the planet faster with all that energy use

Also, the complete decline of civilization itself, if you think Futurama had it right here: https://www.youtube.com/watch?v=IrrADTN-dvg

#ai #openai #llms #dating

vicki, to LLMs

The most interesting stuff in #LLMs right now (to me) is:

  • figuring out how to do it small
  • figuring out how to do it on CPU
  • figuring out how to do it well for specific tasks

Seirdy, to react
@Seirdy@pleroma.envs.net

New bookmark: React, Electron, and LLMs have a common purpose: the labour arbitrage theory of dev tool popularity.

“React and the component model standardises the software developer and reduces their individual bargaining power excluding them from a proportional share in the gains”. An amazing write-up by @baldur about the de-skilling of developers to reduce their ability to fight back against their employers.


Originally posted on seirdy.one: See Original (POSSE). #GenAI #llms #webdev

ceoln, to Bitcoin
@ceoln@qoto.org

I feel like it would be very consistent if the next thing after and The and and , turned out to be .

I don't know if it will actually attract and support tons of scams and media bros and think pieces, but if it did it would feel right somehow.

changelog, to LLMs
@changelog@changelog.social

💥 New episode of Changelog & Friends!

🎙️ with @anniesexton

🎧 https://changelog.com/friends/43

#career #llms #culture #podcast

happyborg, to ai
@happyborg@fosstodon.org

The first thing we taught #AI is how to lie convincingly.

WTF could go wrong and who TF decided this was a good way to start?

#LLMs

smach, to LLMs
@smach@masto.machlis.com

“The general problem of mixing data with commands is at the root of many of our computer security vulnerabilities.” Great explainer by security researcher Bruce Schneier on why large language models may not be a great choice for tasks like processing your emails.
https://cacm.acm.org/opinion/llms-data-control-path-insecurity/
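
A minimal sketch of what "mixing data with commands" looks like in practice, using SQL injection as a familiar stand-in. The table, the names, and the untrusted input are all hypothetical. The first query splices untrusted text into the command itself; the second passes it through a separate parameter channel.

```python
import sqlite3

# In-memory database with a hypothetical users table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, is_admin INTEGER)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", 1), ("bob", 0)])

# Untrusted input that tries to smuggle extra logic into the query.
untrusted = "nobody' OR '1'='1"

# Unsafe: the data is pasted into the command text, so the database
# cannot tell which part was meant as data and which as instruction.
unsafe_sql = f"SELECT name FROM users WHERE name = '{untrusted}'"
unsafe_rows = conn.execute(unsafe_sql).fetchall()

# Safer: the command and the data travel separately, so the input is
# only ever treated as a value to compare against.
safe_rows = conn.execute(
    "SELECT name FROM users WHERE name = ?", (untrusted,)
).fetchall()

print("spliced into command:", unsafe_rows)
print("passed as parameter:", safe_rows)
```

The spliced query matches every row, while the parameterized one matches none. An LLM prompt generally has no comparable separation between instructions and the text being processed, which is the concern the linked piece raises about tasks like handling email.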

sohkamyung, to singapore
@sohkamyung@mstdn.io

"When the Singaporean government asked local writers if they would agree to having their work used to train a large language model, it probably did not expect the country’s tiny literary community to react so fiercely."

https://restofworld.org/2024/singapore-writers-reject-ai-training/

KathyReid, to stackoverflow
@KathyReid@aus.social

I just issued a data deletion request to #StackOverflow to erase all of the associations between my name and the questions, answers and comments I have on the platform.

One of the key ways in which #RAG works to supplement #LLMs is based on proven associations. Higher ranked Stack Overflow members' answers will carry more weight in any #LLM that is produced.

Asking for my name to be disassociated from the textual data removes a semantic relationship that is helpful for determining which tokens of text to use in an #LLM.

If you sell out your user base without consultation, expect a backlash.
