Blaed

@Blaed@lemmy.world


Blaed,

I was pleasantly surprised by many models in the DeepSeek family. Verbose, but in a good way? At least that was my experience. Love to see it mentioned here.

Blaed,

What sort of tokens per second are you seeing with your hardware? Mind sharing some notes on what you’re running there? Super curious!
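If you want to measure this on your own setup, here’s a rough sketch using llama-cpp-python - the model path and settings below are placeholders, not recommendations:

```python
# Rough tokens-per-second benchmark with llama-cpp-python.
# The GGUF path and parameters are assumptions - swap in whatever you run.
import time
from llama_cpp import Llama

llm = Llama(
    model_path="./models/mistral-7b-instruct.Q4_K_M.gguf",  # hypothetical local file
    n_gpu_layers=-1,  # offload all layers to the GPU if VRAM allows; 0 for CPU-only
    n_ctx=2048,
)

prompt = "Explain federated social networks in one paragraph."
start = time.perf_counter()
out = llm(prompt, max_tokens=256)
elapsed = time.perf_counter() - start

n_tokens = out["usage"]["completion_tokens"]
print(f"{n_tokens} tokens in {elapsed:.2f}s -> {n_tokens / elapsed:.1f} tok/s")
```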

Blaed,

I appreciate this comment more than you will know. Thanks for sharing your thoughts.

It’s been a challenge realizing this is more than a time capsule - it’s a grassroots community and open-source project bigger than me. Adjusting the content to reflect shared interests is something I have grappled with these last few weeks, especially as we move past some of the exciting innovations we saw earlier this year.

I think the type of content series you mention is the next step here - practical, pragmatic insights that illustrate and enable new workflows and applications.

That being said, this type of content creation will likely take more time than the journalistic reporting I’ve been doing - but I think it’s absolutely worth the effort, and it’s the next logical evolution of whatever this forum becomes.

Thanks again for your kind words. I work five- to six-day weeks in my tech job on top of this, so burnout is a real thing. I think I’ll go for a hike this week and reevaluate how to best proliferate and spread FOSAI.

If you’re reading this now and have ideas of your own - I’m all ears.

Blaed,

I am actively exploring this question.

So far, it’s been the best-performing 7B model I’ve been able to get my hands on. Anyone running consumer hardware could get a GGUF version running on almost any dedicated GPU/CPU combo.
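For anyone who wants to try it, here’s a minimal sketch of loading a quantized Mistral 7B GGUF with llama-cpp-python. The file name, quant level, and offload settings are assumptions you’d adjust to your own hardware:

```python
# Minimal sketch: chat with a quantized Mistral 7B GGUF on consumer hardware.
# The model file below is an assumption - point it at whichever GGUF you downloaded.
from llama_cpp import Llama

llm = Llama(
    model_path="./mistral-7b-instruct.Q4_K_M.gguf",  # hypothetical download
    n_gpu_layers=35,  # offload what fits in VRAM; set 0 to run fully on CPU
    n_ctx=4096,
)

resp = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give me three practical uses for a 7B model."}],
    max_tokens=200,
)
print(resp["choices"][0]["message"]["content"])
```

A Q4_K_M quant keeps the whole model around 4 GB on disk, which is a big part of why almost any dedicated GPU/CPU combo can handle it.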

I am a firm believer that there is more performance and better response quality to be found in smaller-parameter models. Not to mention the interesting use cases you could unlock by applying fine-tuning or an ensemble approach.

A lot of people sleep on 7B, but I think Mistral is a little different. There’s a lot of exploring to do to find these use cases, but I think they’re out there waiting to be discovered.

I’ll definitely report back on how my first attempt at fine-tuning it goes. Until then, I suppose it would be great for roleplay or basic chat interaction. Given its low hardware requirements, it’s much more lightweight to prototype with than the larger families and model sizes.
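For the curious, here’s one hedged sketch of what a LoRA fine-tune of Mistral 7B could look like with Hugging Face peft. The dataset file and hyperparameters are hypothetical placeholders, not the run I’m planning:

```python
# Sketch of a LoRA fine-tune of Mistral 7B using transformers + peft.
# Dataset path and hyperparameters are placeholders for illustration only.
from datasets import load_dataset
from peft import LoraConfig, get_peft_model
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    DataCollatorForLanguageModeling,
    Trainer,
    TrainingArguments,
)

model_name = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

# Low-rank adapters keep the trainable parameter count small enough for one GPU.
lora = LoraConfig(
    r=8, lora_alpha=16, target_modules=["q_proj", "v_proj"], task_type="CAUSAL_LM"
)
model = get_peft_model(model, lora)

data = load_dataset("text", data_files={"train": "chat_corpus.txt"})  # hypothetical file
tokenized = data["train"].map(
    lambda ex: tokenizer(ex["text"], truncation=True, max_length=512),
    remove_columns=["text"],
)

trainer = Trainer(
    model=model,
    args=TrainingArguments(
        output_dir="mistral-7b-lora",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
    ),
    train_dataset=tokenized,
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```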

If anyone else has a particular use case for 7B models - let us know here. Curious to know what others are doing with smaller params.

Blaed,

I respect your honesty.

Blaed,

What I find interesting is how useful these tools are, even with the imperfections you mention. Imagine a world where this level of intelligence has a consistently low error rate.

Semantic computation and agentic function calling with this level of accuracy will revolutionize the world. It’s only a matter of time, adoption, and availability.
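To make "agentic function calling" concrete, here’s a minimal sketch of the dispatch loop. run_model is a stand-in for whatever model you call, and get_weather is a stub tool - both are assumptions for illustration:

```python
# Minimal function-calling sketch: the model emits JSON naming a tool,
# and we validate and dispatch it. Everything here is a toy placeholder.
import json

def get_weather(city: str) -> str:
    return f"Sunny in {city}"  # stub tool for illustration

TOOLS = {"get_weather": get_weather}

SYSTEM = (
    "You can call tools. Reply ONLY with JSON like "
    '{"tool": "get_weather", "args": {"city": "..."}}'
)

def run_model(system: str, user: str) -> str:
    # Placeholder: wire this to your local or hosted model of choice.
    return '{"tool": "get_weather", "args": {"city": "Reykjavik"}}'

reply = run_model(SYSTEM, "What's the weather in Reykjavik?")
call = json.loads(reply)  # a real agent must handle malformed JSON here
if call.get("tool") in TOOLS:
    print(TOOLS[call["tool"]](**call["args"]))  # -> Sunny in Reykjavik
```

The hard part is exactly that error rate: the whole pipeline is only as reliable as the model’s ability to emit valid, correct JSON every single time.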
