@chikim@mastodon.social avatar

chikim

@chikim@mastodon.social

Love music, technology, accessibility! Faculty at Berklee College of Music 👨🏻‍💻🎹🐕‍🦺


bryansmart, to random

@chikim I've really been enjoying VOLlama. Nice work! It would be nice to be able to switch between OpenAI and local models without going into API prefs. More accelerator keys for menu options would be good, too. Could a blank line maybe be inserted in the log between each entry? Last, can you trap key-down on the Control key to stop the system voice? I know it's a hobby project, so no idea how much time you have for any of that, but just throwing them out there.

chikim,
@chikim@mastodon.social avatar

@bryansmart Thanks. What do you mean by a blank line between each entry? Like user: bla bla, blank line, llama: bla bla, blank line? That would be the easiest request to implement. Switching between platforms is a little tricky because I have to keep track of which model you used with which platform. I'm sure there's a way, but catching a modifier key by itself will also be tricky. Pause/resume will be significantly more work because of how each system implements the API calls for TTS. Also, I'm using threading to feed text.
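For anyone curious how that might look, here's a minimal sketch of catching a bare Control key-down to stop speech. The window, the stop_speech hook, and the assumption that the app uses wxPython are all illustrative, not how VOLlama is actually wired up, and it only works while the widget has focus, which is part of why it's tricky:

```python
import wx  # sketch only; assumes a wxPython UI

class ChatFrame(wx.Frame):
    def __init__(self):
        super().__init__(None, title="TTS interrupt demo")
        panel = wx.Panel(self)
        # EVT_KEY_DOWN does fire for bare modifier keys, but only while
        # this widget has focus -- catching Control app-wide (let alone
        # system-wide) is the tricky part.
        panel.Bind(wx.EVT_KEY_DOWN, self.on_key_down)
        panel.SetFocus()

    def on_key_down(self, event):
        if event.GetKeyCode() == wx.WXK_CONTROL:
            self.stop_speech()
        event.Skip()  # let other handlers see the key too

    def stop_speech(self):
        # Hypothetical hook: would signal the TTS feeder thread to stop.
        print("stop TTS")

if __name__ == "__main__":
    app = wx.App()
    ChatFrame().Show()
    app.MainLoop()
```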

chikim,
@chikim@mastodon.social avatar

@bryansmart The alternative is to press Alt or Option+Up to go into edit mode, and it'll paste only one message at a time into the prompt field.

chikim, to llm
@chikim@mastodon.social avatar

I created a multi-needle-in-a-haystack test where a randomly selected secret sentence was split into pieces that were scattered in random places throughout a 7.5k-token document. The task was to find these pieces and reconstruct the complete sentence with exact words, punctuation, capitalization, and sequence. After running 100 tests, llama3:8b-instruct-q8 achieved a 44% success rate, while llama3:70b-instruct-q8 achieved 100%! https://github.com/chigkim/haystack-test
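The actual script is in the repo above; just to illustrate the setup, here's a minimal sketch of how a haystack like this can be built and graded (the piece wording and the exact-match grading rule here are my own illustration):

```python
import random

def build_haystack(filler_paragraphs, secret_sentence, num_pieces=3):
    """Split the secret sentence into pieces and scatter them in
    random places among the filler text."""
    words = secret_sentence.split()
    step = -(-len(words) // num_pieces)  # ceiling division
    pieces = [" ".join(words[i:i + step]) for i in range(0, len(words), step)]
    doc = list(filler_paragraphs)
    for n, piece in enumerate(pieces, 1):
        doc.insert(random.randrange(len(doc) + 1),
                   f"Part {n} of the secret sentence: {piece}")
    return "\n\n".join(doc)

def grade(model_output, secret_sentence):
    # Pass only if the reconstruction matches exactly: words,
    # punctuation, capitalization, and order.
    return secret_sentence in model_output

# In the real test the filler would run to ~7.5k tokens.
doc = build_haystack(["Filler paragraph."] * 100,
                     "The quick brown fox jumps over the lazy dog.")
```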

chikim, to llm
@chikim@mastodon.social avatar

VOLlama v0.1.0, an open-source, accessible chat client for Ollama
Unfortunately, many user interfaces for open-source large language models are either inaccessible or annoying to use with screen readers, so I decided to make one for myself and others. Non-screen-reader users are welcome to use it as well.
I hope that ML UI libraries like Streamlit and Gradio will become more screen-reader friendly in the future, so that making apps like this won't be necessary!

https://chigkim.github.io/VOLlama/

chikim, to llm
@chikim@mastodon.social avatar

Wow, Private LLM runs Llama-3-8B locally on iOS. No idea how accessible it is with VoiceOver, though. #LLM #AI #ML https://privatellm.app/en

chikim, to random
@chikim@mastodon.social avatar

For those of you who have used VOLlama, thank you for testing my hobby project! I'm considering moving it out of pre-release and marking the latest build as the first public release. Any thoughts on its stability or remaining bugs? Of course, like all my other projects, it'll be free and open source! @vick21 @technowitch @FreakyFwoof @kaveinthran @pixelate @ppatel

chikim,
@chikim@mastodon.social avatar

@vick21 @technowitch @FreakyFwoof @kaveinthran @pixelate @ppatel I'm sure too. Also all the ridiculous complaints from entitled people. haha

chikim,
@chikim@mastodon.social avatar

@vick21 @technowitch @FreakyFwoof @kaveinthran @pixelate @ppatel Have you disabled smart quotes in System Settings > Keyboard?

chikim,
@chikim@mastodon.social avatar

I disabled the single-file package for VOLlama 0.1.2. The file size got bigger, but it loads much, much faster because it doesn't have to unpack itself every time you open it. (See the sketch below.)
https://chigkim.github.io/VOLlama/
@vick21 @technowitch @FreakyFwoof @kaveinthran @pixelate @ppatel
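A rough illustration of the difference, assuming a PyInstaller-style build (which the self-unpacking behavior suggests, but that's my assumption, and the entry-point name is hypothetical):

```python
# Hypothetical build script; VOLlama.py stands in for the real entry point.
import PyInstaller.__main__

# --onefile: one executable that extracts itself to a temp folder on
# every launch -- smaller download, slow startup.
# PyInstaller.__main__.run(["VOLlama.py", "--onefile", "--windowed"])

# --onedir: a folder of files -- bigger on disk, but no self-extraction
# step, so it starts much faster.
PyInstaller.__main__.run(["VOLlama.py", "--onedir", "--windowed"])
```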

chikim, to llm
@chikim@mastodon.social avatar

Tired of neutral responses from LLMs? Llama-3 seems great at following system prompts, so try this system prompt for an opinionated chatbot.
"You are a helpful, opinionated, decisive assistant. When asked a yes/no question, begin your respond with one word answer: yes or no. For open-ended or complex questions, adopt a firm stance. Justify your views with well-reasoned arguments, robust evidence, and succinct explanations, ensuring clarity and confidence in every response."

chikim, to llm
@chikim@mastodon.social avatar

Mark Zuckerberg on Llama 3: Apparently Meta stopped training Llama-3-70b before convergence and decided to move on to Llama-4, meaning they could have kept training and made it smarter! Also, multimodal and multilingual versions of Llama-3-70b, as well as a bigger context window, are coming. #LLM #AI #ML https://youtu.be/bc6uFV9CJGg

chikim,
@chikim@mastodon.social avatar

@kellogh They're in a race with other companies, so I guess it makes sense. You want to move on to the next thing quickly and get better.

chikim,
@chikim@mastodon.social avatar

@kellogh Giving the AI community free open-source models while having to answer to the board and investors, I can understand his decision, though. If you don't like his style, you can move on to other open-source models. :)

chikim, to llm
@chikim@mastodon.social avatar

Start saving money for that M4 Ultra with 500GB! Maybe this could be the first open-source model that could surpass GPT-4! AIatMeta: "Llama 3 8B & 70B models are just the beginning of what we’re working to release for Llama 3. Our largest models currently in the works are 400B+ parameters and while they’re still in active development, we’re excited about how this work is trending." #LLM #AI #ML https://twitter.com/AIatMeta/status/1780997414071181370

chikim, to ML
@chikim@mastodon.social avatar

Earlier today, Microsoft released the new WizardLM-2 7b, 8x22b, and 70b with great benchmark results (of course, they say as good as or almost the same as GPT-4), but then they removed the weights from Huggingface, the repo from GitHub, and their whitepaper. Someone on Reddit joked that maybe they released GPT-4 by mistake! lol Quantized weights from other people are still around on Huggingface! #ML #LLM #AI

chikim,
@chikim@mastodon.social avatar

@vick21 How do you use Copilot? iOS app? Edge browser? Windows 11 machine? VSCode plugin? I think I tried it on the web a while ago, and I didn't like the interface. I still pay $20 to OpenAI. lol

chikim, to macos
@chikim@mastodon.social avatar

Cool tip for running LLMs on Apple Silicon! By default, macOS allows the GPU to use up to 2/3 of RAM on machines with <=36GB and 3/4 on machines with >36GB. I used the command sudo sysctl iogpu.wired_limit_mb=57344 to override that and allocate 56GB of my 64GB to the GPU. This allowed me to load all layers of larger models for faster speed! #MacOS #LLM #AI #ML
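The arithmetic behind those numbers, as a quick Python sketch (the 2/3 and 3/4 split is as described above; 57344 is just 56 * 1024):

```python
# Rough arithmetic behind the tip (values in MB).
ram_gb = 64
ram_mb = ram_gb * 1024

# Default macOS GPU wired limit: 2/3 of RAM at <=36GB, 3/4 above that.
default_limit_mb = ram_mb * (2 / 3 if ram_gb <= 36 else 3 / 4)
print(f"default GPU limit: {default_limit_mb:.0f} MB")  # 49152 MB = 48 GB

# The override above: give the GPU 56 of the 64 GB.
override_mb = 56 * 1024
print(f"sudo sysctl iogpu.wired_limit_mb={override_mb}")  # 57344
```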

chikim, to llm
@chikim@mastodon.social avatar

Mixtral-8x22b keeps asking for feedback on how it can improve even though it has no memory. lol "I understand that our conversation will not be used directly to improve my model, but the feedback you provide can still help me understand your needs better and improve my responses in future interactions with you or other users. If there are any specific areas where you feel I could improve, please let me know so that I can address those concerns in our future conversations." #LLM #Mistral #AI

chikim,
@chikim@mastodon.social avatar

@kellogh Yes, within the context of that particular conversation, but it kept saying that if I provide feedback, it'll help other users as well. lol

chikim, to llm
@chikim@mastodon.social avatar

Thanks to all the recent large LLMs, "Apple is considering support for up to half a terabyte of RAM" for the highest-end M4 Mac configurations. I'm sure the price won't be cheap, but I bet it will be cheaper than getting 500GB of VRAM from Nvidia. lol https://9to5mac.com/2024/04/11/apple-first-m4-mac-release-ai/

chikim, to llm
@chikim@mastodon.social avatar

Apparently Meta is planning to release two small variants of Llama-3 next week "as a precursor to the launch of the biggest version of Llama 3, expected this summer." Command-r-plus, Mixtral 8x22b, Google CodeGemma... All of a sudden, companies are releasing LLMs like crazy! Where's Apple? Maybe at WWDC 2024? lol #LLM #AI #ML https://www.theinformation.com/articles/meta-platforms-to-launch-small-versions-of-llama-3-next-week

chikim,
@chikim@mastodon.social avatar

@ppatel Yeah, I don't think they'll go open source, but they'll probably make a way for developers to take advantage of it.

chikim, to random
@chikim@mastodon.social avatar

I've noticed that Ollama (which runs on llama.cpp) seems to fall back to the CPU instead of the GPU when the entire model and context can't fit into unified memory. In my case, prompt speed got 14x slower, and eval speed got 4.3x slower. The GPU usage for Ollama stayed at 0%, and the wired memory usage in Activity Monitor was significantly less than the model size. So an extreme decrease in speed with a large model may be because Ollama is using the CPU, not the GPU, and reducing the context size might bring the speed back up.
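One way to test that theory with the ollama Python client is to shrink num_ctx, which reduces the memory needed for the KV cache and may let everything fit back on the GPU (the model tag and context size here are just examples):

```python
import ollama  # assumes a local Ollama server is running

response = ollama.chat(
    model="llama3:70b-instruct-q8_0",  # example of a model that barely fits
    messages=[{"role": "user", "content": "Hello!"}],
    # A smaller context window means a smaller KV cache; if model + context
    # now fit in the wired limit, GPU usage should come back.
    options={"num_ctx": 2048},
)
print(response["message"]["content"])
```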

chikim, to ai
@chikim@mastodon.social avatar

Maybe we have an open-source competitor for ElevenLabs? Check out their demo, in which they switch between original and synthesized speech. I can't tell the difference. lol Apparently they're going to fully open source the codebase and model weights. #TTS #AI #ML https://jasonppy.github.io/VoiceCraft_web/

chikim,
@chikim@mastodon.social avatar

@ppatel Their demo may have intentionally featured speech with poor quality because it's easier to hear the artifacts in pristine audio than in audio with noise. I guess we'll find out when they release the weights at the end of March.
