@chikim@mastodon.social avatar

chikim

@chikim@mastodon.social

Love music, technology, accessibility! Faculty at Berklee College of Music 👨🏻‍💻🎹🐕‍🦺


bryansmart, to random

@chikim I've really been enjoying VOLlama. Nice work! It would be nice to be able to switch between OpenAI and local models without going into API prefs. More accelerator keys for menu options would be good, too. Could a blank line maybe be inserted in the log between each entry? Last, can you trap key-down on the Control key to stop the system voice? I know it's a hobby project, so no idea how much time you have for any of that, but just throwing them out there.

chikim,
@chikim@mastodon.social avatar

@bryansmart Thanks. What do you mean by a blank line between each entry? Like user: bla bla, blank line, llama: bla bla, blank line? That would be the easiest request to implement. Switching between platforms is a little tricky because I have to keep track of which model you used with which platform. I'm sure there's a way, but catching a modifier key by itself will also be tricky. Pause/resume will be significantly more work because of how each system implements the API calls for TTS. Also, I'm using threading to feed text.
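For anyone curious how that might look, here's a minimal sketch of catching a bare Control key-down to stop speech. The window, the stop_speech hook, and the assumption that the app uses wxPython are all illustrative, not how VOLlama is actually wired up, and it only works while the widget has focus, which is part of why it's tricky:

```python
import wx  # sketch only; assumes a wxPython UI

class ChatFrame(wx.Frame):
    def __init__(self):
        super().__init__(None, title="TTS interrupt demo")
        panel = wx.Panel(self)
        # EVT_KEY_DOWN does fire for bare modifier keys, but only while
        # this widget has focus -- catching Control app-wide (let alone
        # system-wide) is the tricky part.
        panel.Bind(wx.EVT_KEY_DOWN, self.on_key_down)
        panel.SetFocus()

    def on_key_down(self, event):
        if event.GetKeyCode() == wx.WXK_CONTROL:
            self.stop_speech()
        event.Skip()  # let other handlers see the key too

    def stop_speech(self):
        # Hypothetical hook: would signal the TTS feeder thread to stop.
        print("stop TTS")

if __name__ == "__main__":
    app = wx.App()
    ChatFrame().Show()
    app.MainLoop()
```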

chikim,
@chikim@mastodon.social avatar

@bryansmart The alternative is to press Alt or Option+Up to go into edit mode, and it'll paste only one message at a time into the prompt field.

chikim, to llm
@chikim@mastodon.social avatar

I created a multi-needle-in-a-haystack test where a randomly selected secret sentence was split into pieces that were scattered in random places throughout a 7.5k-token document. The task was to find these pieces and reconstruct the complete sentence with exact words, punctuation, capitalization, and sequence. After running 100 tests, llama3:8b-instruct-q8 achieved a 44% success rate, while llama3:70b-instruct-q8 achieved 100%! https://github.com/chigkim/haystack-test
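The actual script is in the repo above; just to illustrate the setup, here's a minimal sketch of how a haystack like this can be built and graded (the piece wording and the exact-match grading rule here are my own illustration):

```python
import random

def build_haystack(filler_paragraphs, secret_sentence, num_pieces=3):
    """Split the secret sentence into pieces and scatter them in
    random places among the filler text."""
    words = secret_sentence.split()
    step = -(-len(words) // num_pieces)  # ceiling division
    pieces = [" ".join(words[i:i + step]) for i in range(0, len(words), step)]
    doc = list(filler_paragraphs)
    for n, piece in enumerate(pieces, 1):
        doc.insert(random.randrange(len(doc) + 1),
                   f"Part {n} of the secret sentence: {piece}")
    return "\n\n".join(doc)

def grade(model_output, secret_sentence):
    # Pass only if the reconstruction matches exactly: words,
    # punctuation, capitalization, and order.
    return secret_sentence in model_output

# In the real test the filler would run to ~7.5k tokens.
doc = build_haystack(["Filler paragraph."] * 100,
                     "The quick brown fox jumps over the lazy dog.")
```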

chikim, to llm
@chikim@mastodon.social avatar

VOLlama v0.1.0, an open-source, accessible chat client for Ollama
Unfortunately, many user interfaces for open-source large language models are either inaccessible or annoying to use with screen readers, so I decided to make one for myself and others. Non-screen-reader users are welcome to use it as well.
I hope that ML UI libraries like Streamlit and Gradio will become more screen-reader friendly in the future, so that making apps like this won't be necessary!

https://chigkim.github.io/VOLlama/

chikim, to llm
@chikim@mastodon.social avatar

Wow, Private LLM runs Llama-3-8B locally on iOS. No idea how accessible it is with VoiceOver, though. #LLM #AI #ML https://privatellm.app/en

chikim, to random
@chikim@mastodon.social avatar

For those of you who have used VOLlama, thank you for testing my hobby project! I'm considering moving it out of pre-release and marking the latest build as the first public release. Any thoughts on its stability or remaining bugs? Of course, like all my other projects, it'll be free and open source! @vick21 @technowitch @FreakyFwoof @kaveinthran @pixelate @ppatel

chikim,
@chikim@mastodon.social avatar

@vick21 @technowitch @FreakyFwoof @kaveinthran @pixelate @ppatel I'm sure too. Also all the ridiculous complaints from entitled people. haha

chikim,
@chikim@mastodon.social avatar

@vick21 @technowitch @FreakyFwoof @kaveinthran @pixelate @ppatel Have you disabled smart quotes in System Settings > Keyboard?

chikim,
@chikim@mastodon.social avatar

I disabled the single-file package for VOLlama 0.1.2. The file size got bigger, but it loads much, much faster because it doesn't have to unpack itself every time you open it. (See the sketch below.)
https://chigkim.github.io/VOLlama/
@vick21 @technowitch @FreakyFwoof @kaveinthran @pixelate @ppatel
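A rough illustration of the difference, assuming a PyInstaller-style build (which the self-unpacking behavior suggests, but that's my assumption, and the entry-point name is hypothetical):

```python
# Hypothetical build script; VOLlama.py stands in for the real entry point.
import PyInstaller.__main__

# --onefile: one executable that extracts itself to a temp folder on
# every launch -- smaller download, slow startup.
# PyInstaller.__main__.run(["VOLlama.py", "--onefile", "--windowed"])

# --onedir: a folder of files -- bigger on disk, but no self-extraction
# step, so it starts much faster.
PyInstaller.__main__.run(["VOLlama.py", "--onedir", "--windowed"])
```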

chikim, to llm
@chikim@mastodon.social avatar

Tired of neutral responses from LLMs? Llama-3 seems great at following system prompts, so try this system prompt for an opinionated chatbot.
"You are a helpful, opinionated, decisive assistant. When asked a yes/no question, begin your respond with one word answer: yes or no. For open-ended or complex questions, adopt a firm stance. Justify your views with well-reasoned arguments, robust evidence, and succinct explanations, ensuring clarity and confidence in every response."

chikim, to llm
@chikim@mastodon.social avatar

Mark Zuckerberg on Llama 3: Apparently Meta stopped training Llama-3-70b before convergence and decided to move on to Llama-4, meaning they could have kept training and made it smarter! Also, multimodal and multilingual versions of Llama-3-70b, as well as a bigger context window, are coming. #LLM #AI #ML https://youtu.be/bc6uFV9CJGg

chikim,
@chikim@mastodon.social avatar

@kellogh They're in a race with other companies, so I guess it makes sense. You want to move on to the next thing quickly and get better.

chikim,
@chikim@mastodon.social avatar

@kellogh Giving the AI community free open-source models while having to answer to the board and investors, I can understand his decision, though. If you don't like his style, you can move on to other open-source models. :)

chikim, to llm
@chikim@mastodon.social avatar

Start saving money for that M4 Ultra with 500GB! Maybe this could be the first open-source model that could surpass GPT-4! AIatMeta: "Llama 3 8B & 70B models are just the beginning of what we’re working to release for Llama 3. Our largest models currently in the works are 400B+ parameters and while they’re still in active development, we’re excited about how this work is trending." #LLM #AI #ML https://twitter.com/AIatMeta/status/1780997414071181370

chikim, to ML
@chikim@mastodon.social avatar

Earlier today, Microsoft released the new WizardLM-2 7b, 8x22b, and 70b with great benchmark results (of course, they say as good as or almost the same as GPT-4), but then they removed the weights from Huggingface, the repo from GitHub, and their whitepaper. Someone on Reddit joked that maybe they released GPT-4 by mistake! lol Quantized weights from other people are still around on Huggingface! #ML #LLM #AI

chikim,
@chikim@mastodon.social avatar

@vick21 How do you use Copilot? iOS app? Edge browser? Windows 11 machine? VSCode plugin? I think I tried it on the web a while ago, and I didn't like the interface. I still pay $20 to OpenAI. lol

chikim, to macos
@chikim@mastodon.social avatar

Cool tip for running LLMs on Apple Silicon! By default, macOS allows the GPU to use up to 2/3 of RAM on machines with <=36GB and 3/4 on machines with >36GB. I used the command sudo sysctl iogpu.wired_limit_mb=57344 to override that and allocate 56GB of my 64GB to the GPU. This allowed me to load all layers of larger models for faster speed! #MacOS #LLM #AI #ML
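The arithmetic behind those numbers, as a quick Python sketch (the 2/3 and 3/4 split is as described above; 57344 is just 56 * 1024):

```python
# Rough arithmetic behind the tip (values in MB).
ram_gb = 64
ram_mb = ram_gb * 1024

# Default macOS GPU wired limit: 2/3 of RAM at <=36GB, 3/4 above that.
default_limit_mb = ram_mb * (2 / 3 if ram_gb <= 36 else 3 / 4)
print(f"default GPU limit: {default_limit_mb:.0f} MB")  # 49152 MB = 48 GB

# The override above: give the GPU 56 of the 64 GB.
override_mb = 56 * 1024
print(f"sudo sysctl iogpu.wired_limit_mb={override_mb}")  # 57344
```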

chikim, to llm
@chikim@mastodon.social avatar

Mixtral-8x22b keeps asking for feedback on how it can improve even though it has no memory. lol "I understand that our conversation will not be used directly to improve my model, but the feedback you provide can still help me understand your needs better and improve my responses in future interactions with you or other users. If there are any specific areas where you feel I could improve, please let me know so that I can address those concerns in our future conversations." #LLM #Mistral #AI

chikim,
@chikim@mastodon.social avatar

@kellogh Yes, within the context of that particular conversation, but it kept saying that if I provide feedback, it'll help other users as well. lol

chikim, to llm
@chikim@mastodon.social avatar

Thanks to all the recent large LLMs, "Apple is considering support for up to half a terabyte of RAM" for the highest-end M4 Mac configurations. I'm sure the price won't be cheap, but I bet it will be cheaper than getting 500GB of VRAM from Nvidia. lol https://9to5mac.com/2024/04/11/apple-first-m4-mac-release-ai/

chikim, to llm
@chikim@mastodon.social avatar

Apparently Meta is planning to release two small variants of Llama-3 next week "as a precursor to the launch of the biggest version of Llama 3, expected this summer." Command-r-plus, Mixtral 8x22b, Google CodeGemma... All of a sudden, companies are releasing LLMs like crazy! Where's Apple? Maybe at WWDC 2024? lol #LLM #AI #ML https://www.theinformation.com/articles/meta-platforms-to-launch-small-versions-of-llama-3-next-week

chikim,
@chikim@mastodon.social avatar

@ppatel Yeah, I don't think they'll go open source, but they'll probably make a way for developers to take advantage of it.

chikim, to random
@chikim@mastodon.social avatar

I've noticed that Ollama (which runs on llama.cpp) seems to fall back to the CPU instead of the GPU when the entire model and context can't fit into unified memory. In my case, prompt speed got 14x slower, and eval speed got 4.3x slower. The GPU usage for Ollama stayed at 0%, and the wired memory usage in Activity Monitor was significantly less than the model size. So an extreme decrease in speed with a large model may be because Ollama is using the CPU, not the GPU, and reducing the context size might bring the speed back up.
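One way to test that theory with the ollama Python client is to shrink num_ctx, which reduces the memory needed for the KV cache and may let everything fit back on the GPU (the model tag and context size here are just examples):

```python
import ollama  # assumes a local Ollama server is running

response = ollama.chat(
    model="llama3:70b-instruct-q8_0",  # example of a model that barely fits
    messages=[{"role": "user", "content": "Hello!"}],
    # A smaller context window means a smaller KV cache; if model + context
    # now fit in the wired limit, GPU usage should come back.
    options={"num_ctx": 2048},
)
print(response["message"]["content"])
```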

chikim, to ai
@chikim@mastodon.social avatar

Maybe we have an open-source competitor for ElevenLabs? Check out their demo, in which they switch between original and synthesized speech. I can't tell the difference. lol Apparently they're going to fully open source the codebase and model weights. #TTS #AI #ML https://jasonppy.github.io/VoiceCraft_web/

chikim,
@chikim@mastodon.social avatar

@ppatel Their demo may have intentionally featured speech with poor quality because it's easier to hear the artifacts in pristine audio than in audio with noise. I guess we'll find out when they release the weights at the end of March.
