Dark Visitors - A List of Known AI Agents on the Internet
Insight into the hidden ecosystem of autonomous chatbots and data scrapers crawling across the web. Protect your website from unwanted AI agent access.
#LLMs have really created a paradigm shift in machine learning. It used to be that you trained an #ML model to perform a task by collecting a dataset reflecting the task, complete with task output labels, and then using supervised learning to learn the task by example.
Now a new paradigm has emerged: train by reading about the task. Models are now general enough that we can let them learn about a domain by reading all the books and other content about it, and then apply that learned knowledge to perform the task. Note that task labels are absent: you may need them to measure performance, but you don't need them for training.
Of course, if you have both example performances as task labels and lots of general material about the topic, you can use both to get even better performance.
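The split between the two paradigms can be made concrete with a toy sketch. Everything below is hypothetical scaffolding (no real model or training, just trivial stand-ins); the point is only where labels are and are not required.

```python
# Toy contrast of the two paradigms: supervised learning needs (input, label)
# pairs, while "learning by reading" needs only unlabeled domain text.
# Labels reappear solely at evaluation time, to measure performance.

def train_supervised(labeled_pairs):
    """Old paradigm: learn a task from (input, label) examples."""
    return dict(labeled_pairs)  # a trivial memorizing "model"

def learn_by_reading(documents):
    """New paradigm: absorb knowledge from unlabeled text about the domain."""
    knowledge = {}
    for doc in documents:
        subject, _, fact = doc.partition(" is ")
        knowledge[subject.lower()] = fact.rstrip(".")
    return knowledge

def evaluate(model, test_set):
    """Labels appear only here: needed to *measure*, not to train."""
    correct = sum(model.get(q.lower()) == a for q, a in test_set)
    return correct / len(test_set)

# Train purely from reading; no labels involved.
docs = ["Paris is the capital of France.", "Oxygen is element number 8."]
reader = learn_by_reading(docs)

# Labels enter only as an evaluation set.
test = [("Paris", "the capital of France"), ("Oxygen", "element number 8")]
print(evaluate(reader, test))  # → 1.0
```

The same `evaluate` works on either kind of model, which is the practical upshot: the measurement machinery stays supervised even when the training no longer is.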
Here is a good example of training a model not on example performances but on general written knowledge about the topic: #GPT4 surpasses the previous state of the art despite never having been trained for this task.
This is the power of generalist models: they unlock new ways to train them, which, for example, allow us to surpass human-level performance by side-stepping imitative objectives. Nor is this the only new way to train skills that these models enable; there are countless others, but this is uncharted territory.
The classic triad of supervised learning, unsupervised learning, and reinforcement learning is going to see an explosion of new training methodologies rise to become its peers because of this.
A new paper offers a system to correct misinformation using an #LLM. The approach seems solid, and the results seem strong. I haven’t dug in deep yet, but I’m hopeful about this one.
I've been thinking for a long time about tools to help people learn to be better writers. The latest experiment wasn't a resounding success, nor did I really expect that. But it feels promising, and I'm interested to compare notes with fellow travelers. I know wattenberger@bird.makeup is one, who else?
#AI #GenerativeAI #LLMs #Emergence: "A new paper by a trio of researchers at Stanford University posits that the sudden appearance of these abilities is just a consequence of the way researchers measure the LLM’s performance. The abilities, they argue, are neither unpredictable nor sudden. “The transition is much more predictable than people give it credit for,” said Sanmi Koyejo, a computer scientist at Stanford and the paper’s senior author. “Strong claims of emergence have as much to do with the way we choose to measure as they do with what the models are doing.”"
It seems to me that the main problem with #ChatGPT and other #LLMs is context. Each new conversation with them is a clean slate, and the longer a conversation goes on, the slower and more confused they seem to get. I presume taking the context into account means extra processing time and storage on their part, but moreover, they just don't provide a very good interface for communicating with the #AI about a long-lived project. This is critical for #softwareDevelopment.
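The clean-slate behavior described here can be sketched: the client keeps the transcript itself and resends only a recent window that fits a token budget, so earlier project context silently falls off. The `Chat` class and the four-characters-per-token estimate below are illustrative assumptions, not any vendor's actual API.

```python
# Minimal sketch of why long chats degrade: the model only "sees" the
# messages resent each turn, trimmed to fit a token budget.

class Chat:
    def __init__(self, token_budget=50):
        self.history = []              # full transcript, kept client-side
        self.token_budget = token_budget

    @staticmethod
    def estimate_tokens(text):
        return max(1, len(text) // 4)  # crude heuristic, not a real tokenizer

    def context_window(self):
        """Most recent messages that fit the budget; older ones fall off."""
        window, used = [], 0
        for msg in reversed(self.history):
            cost = self.estimate_tokens(msg)
            if used + cost > self.token_budget:
                break
            window.append(msg)
            used += cost
        return list(reversed(window))

    def say(self, text):
        self.history.append(text)
        return self.context_window()   # what the model would actually see

chat = Chat(token_budget=10)
for turn in ["set up the project repo", "add CI config", "now refactor module A"]:
    visible = chat.say(turn)

# The earliest project decision has already dropped out of the window:
print(visible)  # → ['add CI config', 'now refactor module A']
```

Real clients add summarization or retrieval on top of this rolling window, but the underlying constraint is the same: anything not resent is, from the model's point of view, forgotten.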
I’m skeptical of this paper. It’s hard enough to decide on a good evaluation metric, or to decide whether the right one was chosen. This paper rides on the idea that you can just switch to a new metric and get different results, which, yeah, is a well-known phenomenon called bullshit. https://arxiv.org/abs/2304.15004 #LLMs
#AI #GenerativeAI #Research #Science #Chatbots #LLMs: "This new review, led by William Agnew, who studies AI ethics and computer vision at Carnegie Mellon University, cites 13 technical reports or research articles and three commercial products; all of them replace or propose replacing human participants with LLMs in studies on topics including human behavior and psychology, marketing research or AI development. In practice, this would involve study authors posing questions meant for humans to LLMs instead and asking them for their “thoughts” on, or responses to, various prompts.
One preprint, which won a best paper prize at CHI last year, tested whether OpenAI’s earlier LLM GPT-3 could generate humanlike responses in a qualitative study about experiencing video games as art. The scientists asked the LLM to produce responses that could take the place of answers written by humans to questions such as “Did you ever experience a digital game as art? Think of ‘art’ in any way that makes sense to you.” Those responses were then shown to a group of participants, who judged them as more humanlike than those actually written by humans."
I doubt it's coincidence that “GPT-5 is on the way!” news cropped up after some key #AI industry analysts praised Anthropic's Claude Opus as better than GPT-4. Large language models at this scale may be new, but tech vendor strategies are not.
This month, I’ve attended four hour-long webinars on Copilot and other LLM-based technologies and their potential knowledge-work applications, and it is v-e-r-y telling that not a single one has shown a single actual demo of an actual application.
Not a single response to a single prompt.
Not even a pre-recorded snippet that they were certain didn’t go wrong.
I don't think the tech nerds out there understand how upsetting generative AI is to artists. Not because it will replace them, but because there will be a generation of soulless creation devoid of humanity.
Also, how many children are looking at the progress and thinking 'what's the point of becoming an artist?'. Or how many school directors are thinking 'what's the point of a fine art budget?'.
I keep seeing this link posted with “gotcha!” comments, like “see, #LLMs can be trained without copyrighted data”. Honestly, I’d love to believe that’s true, but it’s still detached from reality. This dataset is only 500B words yet claims to be the largest; by comparison, Falcon was trained on 2T, and even Falcon hasn’t been competitive for 6-12 months. https://huggingface.co/blog/Pclanglais/common-corpus
Fairly Trained certifies KL3M, an #LLM built by legal tech consultancy startup 273 Ventures and claimed to be trained without the permissionless use of copyrighted materials.
Here’s Proof You Can Train an #AI Model Without Slurping Copyrighted Content
Fantastic paper! Detecting #AI generated text is hard. We’ve had disappointing results so far. So the obvious (well, it should be obvious) thing to do is to tackle the problem at a higher level, e.g. at the journal level.
Thought-provoking research: #LLMs that are trained predominantly on English will also “think” in English. When translating German to Japanese, the text first gets converted to something closer to English in between.
"To use, or not to use #LLMs": Workers' emotions range from joy to contempt when faced with #LLM systems like #ChatGPT. Acceptance or rejection hinges on human factors. My M.Sc. studies involved a systematic literature review on this topic, which I have now published on #arXiv, highlighting the sparse business informatics research on LLMs. This area is expected to gain attention as early hype projects become failures, prompting the question "why?" https://arxiv.org/abs/2403.09743 #AI #GenAI #GenerativeAI
Let’s be honest: if you’re a software engineer, you know where all this compute and power consumption is going. While it’s popular to blame #LLMs, y’all know how much is wasted on #docker, microservices, overscaled #kubernetes, Spark/Databricks, and other unnecessary big data tech. It’s long past time we were honest with the public about how much our practices are hurting the climate, and stopped looking for scapegoats. https://thereader.mitpress.mit.edu/the-staggering-ecological-impacts-of-computation-and-the-cloud/