#Tech giants have been partnering w/ up-&-coming #AI start-ups, like #Microsoft backing #OpenAI, but Amazon has not been as active as rivals until now.
Looking at the set of principles by which #Anthropic tries to train its #AI, I found that the model does not always live up to them.
Anthropic, an AI startup founded by former OpenAI staff that has raised $1.3B, including $300M from #Google, details its “constitutional AI” approach to safer #chatbots.
After months of work and $10 million, Databricks has unveiled DBRX, which it bills as the world's most powerful open-source large language model.
DBRX outperforms open models like Meta's Llama 2 across benchmarks and even approaches the abilities of OpenAI's closed GPT-4. Architectural choices such as a "mixture-of-experts" design boosted DBRX's training efficiency by 30-50%.
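The efficiency gain comes from the routing trick at the heart of mixture-of-experts layers: a gating network scores all experts per token, but only the top-k actually run, so compute per token stays small even as total parameters grow. A toy NumPy sketch of that routing step (illustrative only; shapes and the gating scheme are simplified assumptions, not DBRX's actual implementation):

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, top_k = 8, 4, 2
x = rng.standard_normal(d_model)             # one token's hidden state
W_gate = rng.standard_normal((n_experts, d_model))
experts = [rng.standard_normal((d_model, d_model)) for _ in range(n_experts)]

scores = W_gate @ x                          # one gating score per expert
chosen = np.argsort(scores)[-top_k:]         # keep only the top-k experts
weights = np.exp(scores[chosen])
weights /= weights.sum()                     # softmax over the chosen experts

# Output is the weighted sum of only the chosen experts' outputs;
# the other experts are never evaluated for this token.
y = sum(w * (experts[i] @ x) for w, i in zip(weights, chosen))
print(y.shape)  # (8,)
```

With 4 experts and top-2 routing, each token pays for 2 expert forward passes while the layer holds 4 experts' worth of parameters, which is the rough intuition behind the reported efficiency gains.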
Tried Claude.ai from #Anthropic -
Its UI has an ivory background with black and violet text. Not sure if it's a conscious choice to signal exclusivity and trust, but it works.
The chat responses have embedded options to 'copy' and give feedback, which is helpful for both users and the product.
It says “no” more often than its competitors when it is not sure of an answer.
It has small touches, like the option to delete the security code sent via SMS once it has been used. #ai #chatgpt
Anthropic researchers find that AI models can be trained to deceive
The models acted deceptively when fed their respective trigger phrases, and removing these behaviors from the models proved nearly impossible.
The most commonly used AI safety techniques had little to no effect on the models' deceptive behavior.
Sounds like it can replace/augment people at certain experience levels. #lmgt4y #StackOverflow #StackExchange
But actual specialists? They now have -1 incentive to write down their experience. 📉 trends ensue.
Back in 2022, Anthropic CEO Dario Amodei chose not to release the super-powerful AI chatbot, Claude, that his company had just finished training, opting instead to focus on further internal safety testing. That move likely cost the company billions — three months later, OpenAI launched ChatGPT.
Having a reputation for credibility and caution in an industry that appears to have thrown a large chunk of both to the wind is not a bad thing, though. Claude is now in its third iteration, and that caution remains: the company has pledged not to release AIs above certain capability levels until it can develop sufficiently robust safety measures.
TIME’s interview with Amodei gives an insight into what the AI industry might look like when safety is considered a core part of the strategy.
i’m very excited about the interpretability work that #anthropic has been doing with #LLMs.
in this paper, they used classical machine learning algorithms to discover concepts: if a concept like “golden gate bridge” is present in the text, they can find the associated pattern of neuron activations.
this means that you can monitor LLM responses for concepts and behaviors, like “illicit behavior” or “fart jokes”.
this is great work. i’m excited to see where this goes next
i hope #anthropic exposes this via their API. at this point in time, most of the promising interpretability work is only available on open source models that you can run yourself. it would be great to also have them available from #AI vendors
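to make the idea concrete, here's a hedged sketch of what concept monitoring could look like, assuming you already have a learned "feature direction" for a concept (e.g. from dictionary learning over a model's hidden activations). the real method and any vendor API are not public here; all the names and the threshold below are hypothetical:

```python
import numpy as np

rng = np.random.default_rng(1)

d = 16
# Hypothetical learned feature direction for a concept, normalized to unit length.
golden_gate_feature = rng.standard_normal(d)
golden_gate_feature /= np.linalg.norm(golden_gate_feature)

def concept_activation(hidden_state: np.ndarray, feature: np.ndarray) -> float:
    """Project a hidden state onto the concept's feature direction."""
    return float(hidden_state @ feature)

def concept_present(hidden_state: np.ndarray, feature: np.ndarray,
                    threshold: float = 2.0) -> bool:
    """Flag the concept when its activation exceeds a chosen threshold."""
    return concept_activation(hidden_state, feature) > threshold

# A hidden state with the feature strongly mixed in trips the flag;
# an unrelated random state usually does not.
h_with = 3.0 * golden_gate_feature + 0.1 * rng.standard_normal(d)
h_without = rng.standard_normal(d)
print(concept_present(h_with, golden_gate_feature))  # True
```

the appeal of exposing this via an API is exactly this simplicity: per-token concept scores are just projections, cheap enough to run as a monitoring layer on every response.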
Big news in the #AI world: Current and former employees of #OpenAI and other AI companies like #DeepMind and #Anthropic warn of ethical and safety risks and want a way to blow the whistle publicly without fear of retaliation.