LocalLLaMa

Long Sequence Modeling with XGen: A 7B LLM Trained on 8K Input Sequence Length (blog.salesforceairesearch.com)

TLDR We trained a series of 7B LLMs named XGen-7B with standard dense attention on up to 8K sequence length for up to 1.5T tokens. We also fine tune the models on public-domain instructional data. The main take-aways are: * On standard NLP benchmarks, XGen achieves comparable or better results

Baichuan 7B reaches top of LLM leaderboard for it's size? (github.com)

baichuan-7B is an open-source large-scale pre-trained model developed by Baichuan Intelligent Technology. Based on the Transformer architecture, it is a model with 7 billion parameters trained on approximately 1.2 trillion tokens. It supports both Chinese and English, with a context window length of 4096. It achieves the best...

  • All
  • Subscribed
  • Moderated
  • Favorites
  • localllama
  • Durango
  • DreamBathrooms
  • thenastyranch
  • magazineikmin
  • khanakhh
  • InstantRegret
  • Youngstown
  • ngwrru68w68
  • slotface
  • rosin
  • tacticalgear
  • mdbf
  • kavyap
  • modclub
  • megavids
  • osvaldo12
  • ethstaker
  • cubers
  • normalnudes
  • everett
  • tester
  • GTA5RPClips
  • Leos
  • cisconetworking
  • provamag3
  • anitta
  • JUstTest
  • lostlight
  • All magazines