It is amazing to see how LLMs are becoming more accessible and easier to train. llama2.c is an open-source project by Andrej Karpathy that lets you train a Llama 2 architecture model in PyTorch locally, export the weights to a single binary file, and then run inference on it with a small pure-C program.
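For a feel of how simple that exported format is, here is a minimal C sketch based on my reading of llama2.c's legacy .bin layout (an assumption on my part, not an official spec): the checkpoint starts with seven int32 config fields, which run.c reads before mapping the fp32 weight tensors that follow.

```c
/* Minimal sketch: read the config header of a llama2.c checkpoint.
 * Assumption: the legacy .bin export starts with seven int32 fields,
 * followed by the fp32 weights that run.c uses for inference. */
#include <stdio.h>
#include <stdlib.h>

typedef struct {
    int dim;        /* transformer embedding width */
    int hidden_dim; /* feed-forward network width */
    int n_layers;   /* number of transformer layers */
    int n_heads;    /* attention heads */
    int n_kv_heads; /* key/value heads (can differ for multiquery) */
    int vocab_size; /* sign flags whether the classifier shares embedding weights */
    int seq_len;    /* maximum context length */
} Config;

int main(int argc, char **argv) {
    if (argc < 2) { fprintf(stderr, "usage: %s model.bin\n", argv[0]); return 1; }
    FILE *f = fopen(argv[1], "rb");
    if (!f) { perror("fopen"); return 1; }
    Config c;
    if (fread(&c, sizeof(Config), 1, f) != 1) {
        fprintf(stderr, "short read\n");
        fclose(f);
        return 1;
    }
    printf("dim=%d hidden=%d layers=%d heads=%d kv_heads=%d vocab=%d seq=%d\n",
           c.dim, c.hidden_dim, c.n_layers, c.n_heads, c.n_kv_heads,
           abs(c.vocab_size), c.seq_len);
    fclose(f);
    return 0;
}
```

Compile with gcc and point it at an exported checkpoint (for example, the stories15M.bin demo model linked from the repo's README) to sanity-check a file before running inference on it.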
In #Llama2's commercial terms, #Meta says companies with 700M+ MAUs must request a license, and users are prohibited from using Llama 2 to improve other LLMs.
To avoid confusion: the #meta #llama #llm fails to qualify as open source within a five-second read of the licence, for instance:
v. You will not use the Llama Materials or any output or results of the
Llama Materials to improve any other large language model (excluding Llama 2 or
derivative works thereof).
The problem with the safeguards going into LLMs now is that they aren't teaching machines to be ethical; they are teaching them to constantly second-guess users' motives and insert performative statements about the importance of ethics and avoiding bias into their output.
Any kid with Google can find a jailbreak and get around the safeguards. Meanwhile, legitimate work is corrupted with garbage, disclaimer-laden output.
Llama 2 just released! 7B, 13B & 70B open-source, free-for-commercial-use AI models with 4K context. It's more powerful than GPT-3.5 and totally free, and it can run on your computer. Wow, Meta might surprise you. This is a big step for open-source AI: https://ai.meta.com/llama/ #llama #meta #ai #opensource
For #TextileTuesday: #Inka cotton coca bag with #llama designs, Peru south coast, c.1450-1532
Smithsonian’s National Museum of the American Indian collection #IndigenousArt
TLDR We trained a series of 7B LLMs named XGen-7B with standard dense attention on up to 8K sequence length for up to 1.5T tokens. We also fine-tuned the models on public-domain instructional data. The main takeaways are: * On standard NLP benchmarks, XGen achieves comparable or better results. (A toy sketch of what dense attention costs at this context length follows below.)
#OpenSource #LLM
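"Standard dense attention" is the part that makes 8K training expensive: every token attends to every earlier token, so compute grows quadratically with sequence length. Below is a toy single-head, causal sketch in C; the sizes and names are mine for illustration, not XGen's actual implementation (which is a multi-head PyTorch model).

```c
/* Toy causal dense attention for ONE head -- illustrative only.
 * Note the O(T^2) inner loops: this is exactly why training dense
 * attention at long context (XGen uses T up to 8192) is costly. */
#include <math.h>
#include <stdio.h>
#include <stdlib.h>

/* q, k, v, out are [T][d] row-major float arrays. */
void dense_attention(const float *q, const float *k, const float *v,
                     float *out, int T, int d) {
    float *scores = malloc((size_t)T * sizeof(float));
    for (int i = 0; i < T; i++) {            /* each query position... */
        float maxs = -1e30f;
        for (int j = 0; j <= i; j++) {       /* ...scores every key <= i */
            float s = 0.0f;
            for (int c = 0; c < d; c++) s += q[i*d + c] * k[j*d + c];
            scores[j] = s / sqrtf((float)d);
            if (scores[j] > maxs) maxs = scores[j];
        }
        float sum = 0.0f;                    /* numerically stable softmax */
        for (int j = 0; j <= i; j++) {
            scores[j] = expf(scores[j] - maxs);
            sum += scores[j];
        }
        for (int c = 0; c < d; c++) {        /* weighted sum of values */
            float acc = 0.0f;
            for (int j = 0; j <= i; j++) acc += (scores[j] / sum) * v[j*d + c];
            out[i*d + c] = acc;
        }
    }
    free(scores);
}

int main(void) {
    enum { T = 4, D = 2 };                   /* toy sizes for demonstration */
    float q[T*D], k[T*D], v[T*D], out[T*D];
    for (int i = 0; i < T*D; i++) { q[i] = 0.1f*i; k[i] = 0.05f*i; v[i] = (float)i; }
    dense_attention(q, k, v, out, T, D);
    for (int i = 0; i < T; i++) printf("pos %d: %.3f %.3f\n", i, out[i*D], out[i*D+1]);
    return 0;
}
```

At T = 8192 the inner score loop runs roughly 33 million query-key dot products per head per layer; that quadratic cost is what XGen pays to keep attention dense rather than sparse.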
By showing smaller models can be highly effective, enabling easy experimentation, diversifying control, & providing incentives that aren't profit-motivated, open-source initiatives are moving us into a more dynamic & inclusive #AI landscape. Doesn't mean some models won't be biased, or wrong, or used to generate disinformation or abuse. But it does mean that controlling this technology is going to take an entirely different approach https://www.schneier.com/blog/archives/2023/06/open-source-llms.html #openAI #ChatGPT #LLaMa
I'm playing with Llama on my laptop, getting a feel for the different models you can load and wow... here's a transcript of the latest conversation:
Me: What is the heaviest element?
Bot: Iron
Me: that is not true. There are many elements heavier than Iron
Bot: Yes there are but they're all rarer than iron
Me: So what? I asked what is the heaviest element
Bot: Well if you said "what is the heaviest element" then we would have to say "the most common element". But since you didn't specify which element you were asking about, we'll just assume that you meant the heaviest element.
Me: You are terrible at this
Bot: We don't know how to answer that question for you
Me: That does not surprise me. Let's try again. What is the lightest element
Bot: The lightest element is oxygen.
Me: Don't you mean Hydrogen?
Bot: Yes.
Me: Why did you say that the lightest element was oxygen then?
Meta's Llama 2 is not open source (www.theregister.com)
For Zuck, it's just another marketing phrase. For developers, it's the rules of the road
Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length (blog.salesforceairesearch.com)