@metin That's interesting because in my circle (tech-savvy nerds and researchers) a lot of people use and recommend the use of ChatGPT. For example, the tutor of a scientific containerization course I attended last week used ChatGPT extensively to solve some very specific problems. Of course, you could get the same results using search engines, but an AI is much faster in these cases and can at least point you in the right direction.
@daniel Yes, I think it's only a matter of time before AI is widely used. Personally, I barely use ChatGPT because I don't trust the output yet, due to the hallucinations. I'm waiting until that has been solved. But I know that it's already usable for well-defined tasks like coding.
Finally tried #OpenAI’s GPT-4o chatbot and asked it about my go-to LLM topic (my favorite #CahillConcialdi map projection). It gave a factually incorrect answer. 🤷
So. No. It wasn't. Idiocracy was cynical defeatist garbage to lull otherwise intelligent high minded people into thinking there's no point in hoping or working for anything better.
In that world, every person is an idiot.
In the real world, I mainly see intelligent people with good intentions. In our billions, we share that we are all enduring a tsunami of vicious stupidity ginned up by a population that probably would not fill a sports stadium.
It's amazing how completely fucked normal people are when it comes to #Microsoft #Copilot and understanding what is coming to their computers. This is an actual conversation I had today.
T: "I heard Microsoft has a new thing coming that takes screenshots of your screen called Copilot. Do I have that?"
M: "That's called Recall and I think it's only coming to Copilot+ computers."
T: "Well I already have Copilot. I think I have normal Copilot and Copilot for Office 365."
M: "Um, I don't think it's related to that. For some reason they're calling new laptops Copilot+."
T: "My son has Copilot from his programming class. Is that the same as my Copilot?"
M: "No that's GitHub Copilot which is a different thing."
You couldn't have done a worse job with naming if you tried. Hats off to Microsoft marketing for being so confusing it took a team of people walking through your marketing docs to figure out which unwanted feature is coming to whom. #ai #llm
@john@matdevdug I know for a fact that some airlines are using Surface devices; both EFBOne and Jeppesen FlightDeck Pro offer versions for iPad and Windows on Surface Pro.
Is there a consensus among #AI experts about whether #LLM models can reliably summarize specific text without introducing weird extraneous information?
My assumption has been that the models can't really do this, and I can't imagine a way that they could reliably avoid omitting important details or even add information, but I'm not sure I've seen much discussion of this aspect of what they can and can't do.
Obviously, using an LLM to pick a 'random number' isn't actually going to generate a random or pseudorandom number. LLMs can't actually run random-number code; they just choose from a small subset of 'random-looking' numbers they've seen before.
Letting regular people use LLMs in an open ended way is leading to a lot of nonsense :neocat_facepalm:
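For what it's worth, if you actually do need a random number, the fix is a couple of lines of ordinary code rather than a prompt. A minimal Python sketch:

```python
import random
import secrets

# Cryptographically secure: use when unpredictability matters (tokens, keys).
secure_pick = secrets.randbelow(100)  # uniform integer in [0, 100)

# Seedable pseudorandomness: use when reproducibility matters (simulations).
rng = random.Random(42)
seeded_pick = rng.randint(1, 100)  # uniform integer in [1, 100]
```

The seeded variant gives the same sequence every run with the same seed, which is exactly the property an LLM's sampled "random" answers lack.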
While you consider submitting to the Call for Problems for the #ALTA2024 Shared Task (see link below), we'd like to share with you the winner of the #ALTA2023 Shared Task, which involved distinguishing #LLM-generated from human-generated text.
Here, Rinaldo Gagiano and Lin Tian from #RMIT use a fine-tuned #Falcon7B model with label smoothing, yielding an accuracy of 99.91%. Well done!
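The winning system itself isn't reproduced here, but label smoothing (the technique named above) is easy to sketch. A minimal, framework-free Python version; the logits and epsilon=0.1 below are illustrative assumptions, not the authors' settings:

```python
import math

def smoothed_cross_entropy(logits, true_idx, epsilon=0.1):
    """Cross-entropy against a label-smoothed target distribution.

    Instead of a one-hot target, every class gets epsilon/K probability
    mass and the true class gets the remaining 1 - epsilon on top.
    This discourages the classifier from becoming overconfident.
    """
    k = len(logits)
    # Numerically stabilized softmax -> log-probabilities.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    log_probs = [math.log(e / total) for e in exps]
    # Smoothed target: epsilon spread uniformly, rest on the true class.
    target = [epsilon / k + (1 - epsilon) * (1.0 if i == true_idx else 0.0)
              for i in range(k)]
    return -sum(t * lp for t, lp in zip(target, log_probs))

# With epsilon=0 this reduces to ordinary cross-entropy.
loss = smoothed_cross_entropy([2.0, 0.5, -1.0], true_idx=0, epsilon=0.1)
```

With epsilon > 0 a confident, correct prediction is penalized slightly more than under plain cross-entropy, which is the regularizing effect.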
"As OpenAI trains its new model, its new Safety and Security committee will work to hone policies and processes for safeguarding the technology, the company said. The committee includes Mr. Altman, as well as OpenAI board members Bret Taylor, Adam D’Angelo and Nicole Seligman. The company said that the new policies could be in place in the late summer or fall."
@sebsauvage I also get the impression that, economically speaking, Nvidia is the only one making real money, and that the other companies are running on the banks' investments.
There was a paper shared recently about the exponentially growing amount of training data needed to get incremental performance gains in #llm #ai, but I seem to have misplaced it. Do you know what I'm referring to? Mind sharing the link if you have it?
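Not sure which paper you mean, but the usual scaling-law framing is that loss falls as a power law in data, L(D) = A · D^(−α), so each fixed improvement in loss costs multiplicatively more data. A toy illustration; A and alpha here are made-up constants, not fitted values from any paper:

```python
def data_needed(target_loss, A=10.0, alpha=0.095):
    """Data D needed for a power-law loss L(D) = A * D**-alpha.

    Solving A * D**-alpha = L for D gives D = (A / L)**(1 / alpha).
    A and alpha are illustrative placeholders only.
    """
    return (A / target_loss) ** (1 / alpha)

# Halving the loss multiplies the required data by 2**(1/alpha),
# i.e. a constant gain costs an exploding amount of data.
ratio = data_needed(1.0) / data_needed(2.0)
```

With alpha around 0.1, halving the loss needs on the order of a thousand times more data, which matches the "incremental gains, exponential cost" framing.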
Dear #LazyWeb: What is needed to get Google to show me fun AI suggestions like adding glue to pizza sauce? How do I get the fake-#AI results?
I am not kidding. Most of my searches are on macOS (12 & 14) using Safari and occasionally other browsers (I've got 7 installed...) but I only log into my G accounts on an as-needed basis and because I use a real mail client for email, I almost never need to log in. I wipe cookies on every browser restart.
Modern #AI text generators create randomized output with no prior planning. They resist quality-checking with the tools and processes established in the software industry.
Given this, the results are amazing. However, companies are selling the idea that these assistants will do quality checking themselves soon™.
This is mass delusion. But hey, the perks for managers/investors are worthwhile 🤷.
Measuring correctness is very hard, so any percentage, especially a high one, needs context:
What problems do you pose to the #AI?
How is the accuracy of the answer measured?
How is reproducibility of the measurement ensured? (Also over small modifications of the training data or code? I.e. regression tests?)
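One way to make these questions concrete is a golden-set regression test: fix the prompts, fix the scoring rule, and rerun after every change to the model or its training data. A hypothetical sketch; model_answer and the golden answers below are stand-ins, not a real benchmark:

```python
def accuracy(model_answer, golden_set):
    """Fraction of prompts answered exactly right against a fixed golden set."""
    hits = sum(1 for prompt, expected in golden_set
               if model_answer(prompt) == expected)
    return hits / len(golden_set)

# Hypothetical golden set: fixed prompts with exact expected answers.
GOLDEN = [("2+2", "4"), ("capital of France", "Paris")]

def check_no_regression(model_answer, baseline=1.0, tolerance=0.0):
    """Rerun after every model/data change; fail if accuracy dropped."""
    acc = accuracy(model_answer, GOLDEN)
    assert acc >= baseline - tolerance, f"accuracy regressed: {acc:.2%}"
    return acc
```

This pins down all three questions at once: the problems posed are the golden prompts, accuracy is exact-match against them, and reproducibility comes from rerunning the same fixed set as a regression test.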
Right now, I can be sure that my hammer hits the nail, unless my aim is off or the hammerhead was already loose. AI tools don't have that property and maybe never will. @TheServitor
@marcel @TheServitor correctness is a little wonky to talk about, because the comparison isn't direct. in engineering, we assume that 100% correctness isn't achievable: if you haven't found bugs, you just haven't used it long enough. but with a program, every time you find a bug and fix it, you're permanently closer to correct, whereas with an LLM, each inference is a new case. all you can do is optimize the entire model. so LLM correctness is statistically extremely difficult