#LLaMa - Threads - kbin.social

Everybody’s talking about Mistral, an upstart French challenger to OpenAI (arstechnica.com)

On Monday, Mistral AI announced a new AI language model called Mixtral 8x7B, a "mixture of experts" (MoE) model with open weights that reportedly matches OpenAI's GPT-3.5 in performance, an achievement that has been claimed by others in the past but is being taken seriously by AI heavyweights such as OpenAI's Andrej...

Meta's Llama 2 is not open source (www.theregister.com)

For Zuck, it's just another marketing phrase. For developers, it's the rules of the road

Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length (blog.salesforceairesearch.com)

TLDR: We trained a series of 7B LLMs named XGen-7B with standard dense attention on up to 8K sequence length for up to 1.5T tokens. We also fine-tuned the models on public-domain instructional data. The main takeaways are: * On standard NLP benchmarks, XGen achieves comparable or better results
