Apparently one of the most common uses of LLMs in #academia is copy editing: cleaning up your #writing on points of spelling, grammar and style. This is wildly unattractive to me. I love writing. My personal style, my personal voice, are extremely high priorities to me. It annoys me no end when a journal editor replaces one of my unconventional style choices with something bland. If an editor ran a paper of mine through an #LLM I would scream bloody murder.
@mrundkvist I agree, but every spell-check, every grammar-check etc. in Microsoft Word is a result of AI. I would like to find a word-processor that does NOT use AI at all.
Weekend discovery. An intermediate step in the RAG process is document chunking. Determining the appropriate chunk size can become a trial & error game. James Briggs does a great job of explaining how to use Semantic Chunking to get better results.
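The core idea of semantic chunking can be sketched in a few lines: embed each sentence, then start a new chunk wherever the similarity between neighbouring sentences drops. A minimal, self-contained sketch — the bag-of-words `embed` here is a toy stand-in for a real sentence-embedding model, and the 0.2 threshold is an arbitrary illustration, not a recommended value:

```python
import math
import re
from collections import Counter

def embed(text):
    # Toy bag-of-words "embedding"; a real pipeline would use a
    # sentence-embedding model (e.g. sentence-transformers) here.
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a, b):
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def semantic_chunks(sentences, threshold=0.2):
    """Merge consecutive sentences while they stay similar;
    start a new chunk when similarity to the previous one drops."""
    chunks, current = [], [sentences[0]]
    for prev, sent in zip(sentences, sentences[1:]):
        if cosine(embed(prev), embed(sent)) >= threshold:
            current.append(sent)
        else:
            chunks.append(" ".join(current))
            current = [sent]
    chunks.append(" ".join(current))
    return chunks
```

The point of the approach is that chunk boundaries fall at topic shifts rather than at a fixed character count, which is exactly the trial-and-error knob the post mentions.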
"Humans are a social species down to our core; the more modern life erodes our opportunities for actual human companionship – whether it's by interposing technology as an intermediary into every interaction, or sucking up all our time with the capitalist/consumerist grind – the more desperate we'll become for friendly-sounding volleyball substitutes."
"Here's the paradox: even while the language fluency fools us into imagining these chatbots are 'persons', we simultaneously place far more trust in their accuracy than we would with any real human. That's because the fact that we still know it's a computer activates another, more modern rule of thumb: that computer-generated information is accurate and trustworthy."
It's a brilliant essay; I highly recommend you take the time to read it.
and there are so fucking many things wrong with it
one of the most amazingly wrong things is that... they're already throwing "ai" bullshit at these screencaps they're doing every five seconds, right? that's what does the OCR and also does the LLM-driven description for the search functionality later
and yet no one
NO. ONE.
thought to tell it
"and don't save screens with the word 'password' on them."
YOU COULD DO THIS WITH GREP, YOU STUPID FUCKS, WHAT THE HELL IS WRONG WITH YOU?! IT'S NOT HARD!
@joncruz yes but those are localisation issues which have been solved already for decades by a company the size of Microsoft, and I say that from experience of having been a dev there.
("grep" was hyperbole, not a serious suggestion. As I said, they're already using LLM shit upon which the entire feature depends. if their LLM can't find it to exclude it, then their LLM won't be able to find it to store it in plain text, either.)
While you're thinking about what to submit to the Call for Problems for the #ALTA2024 Shared Task (link below), we're sharing with you the 2nd-place winner of the #ALTA2023 Shared Task, where participants distinguished between #LLM-generated and human-generated text.
This post by @maggie has some great ideas on how #LLM tech can help enable #LocalFirst applications for regular folks. I've been wanting to do something similar within @agregore some day with local LLMs helping people author p2p web apps.
@mauve
I've yet to get a better response from a local LLM to a code question than I get from a web search or going to StackExchange etc. Are you finding good uses yet?
I confess I haven't tried too hard, but then most people won't, and that's really the point anyway. 🤷
I expect they should be good for accessibility, such as speech in/out, but I'm not seeing those apps. Why not?! 🤦
Although I see Mozilla have put a local LLM in Firefox to generate alt text for images.
@maggie Nice read. I'm a bit skeptical about LLMs, but you might be right 🤔 time will tell. As a dev, I'm glad to see non-techie people getting the point of local-first apps: you did well to introduce the concept.
Just in time for #bibliocon24, we're launching a new, experimental service at the VÖBB: the VÖBB chatbot. As (to my knowledge) the first (?) German library to do so, we're combining the linguistic fluency and "knowledge" of a large language model (#LLM) with the complete metadata of our #VÖBB catalogue (as a so-called embedding).
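For illustration, "catalogue metadata as an embedding" roughly means: embed each record, and at query time retrieve the records nearest to the query to ground the LLM's answer. A toy sketch — the character-frequency `embed` and the sample records are invented stand-ins for illustration, not the VÖBB implementation:

```python
import math

def embed(text):
    # Toy character-frequency vector standing in for a real
    # embedding model's output.
    vec = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            vec[ord(ch) - ord("a")] += 1.0
    return vec

def cosine(a, b):
    # Cosine similarity between two dense vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve(query, records, k=2):
    """Return the k catalogue records most similar to the query;
    these would then be passed to the LLM as grounding context."""
    q = embed(query)
    ranked = sorted(records, key=lambda r: cosine(q, embed(r)), reverse=True)
    return ranked[:k]
```

In a real deployment the record embeddings would be precomputed and stored in a vector index rather than re-embedded per query.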
I have a newly graduated SW Eng (BS in CS) who is struggling to find a job and getting advice to go back and get a Master's degree in #LLM in order to be more marketable.
I've always heard that grad degrees aren't strictly necessary in SWE to start, but is this changing? Are there other time investments that make more sense (open source contributions, certifications, personal projects, etc.)?
CAI is now also providing cannabis education on Twitch. He knows the law, knows the risks of handling THC, and shares information about hemp in general and about different strains. Did you know that CBD-rich cannabis without THC has no psychoactive effects, is relaxing, and can help you fall asleep?
"I speak to a lot of businesses around #AI, and particularly #GenAI, and I'm sensing #hype fatigue. Part of this is due to the challenge of bridging the gap from PoC to production."
I can't imagine working without GenAI any more. I often write quick bash scripts to automate things, but for some reason, the syntax always falls out of my head and I'm constantly looking things up.
Now I just hit ChatGPT and ask it to write the script for me. With the latest version, it usually works perfectly the first time, so long as I craft a good prompt. This is a huge productivity boost.
With #LLM applications more abundant, have researchers been using them to assist their writing? We know they have when writing peer reviews [1], but how about doing so in writing their published papers?
Liang et al. come back to answer this question in [3]. They applied the same corpus-based methodology proposed in [2] to 950k papers published between 2020 and 2024, and the answer is a resounding YES, especially in CS (up to 17.5%) (screenshot 1).