chikim

@chikim@mastodon.social

Love music, technology, accessibility! Faculty at Berklee College of Music 👨🏻‍💻🎹🐕‍🦺

This profile is from a federated server and may be incomplete. Browse more on the original instance.

chikim, 13 days ago to accessibility

1/3 I tested some popular latest LLM UIs for accessibility with screen readers, including oobabooga text-generation-webui, Open WebUI (aka Ollama WebUI), GPT4All, LM Studio, Koboldcpp, and Llama.cpp server on Windows. The most accessible was Llama.cpp server, though it had the fewest features. Oobabooga was also good, except for the list box not announcing choices as you browse; however, you can check your selection afterward. #accessibility #LLM #AI

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

+ datajake1999

chikim, 11 days ago to ai

Raspberry Pi Goes All In on AI With $70 Hailo Kit" for Raspberry Pi 5. It can process 13 TOPS in comparison to Apple M3 NPU which can process 18 TOPS. #AI #ML https://www.pcmag.com/news/raspberry-pi-goes-all-in-on-ai-with-70-hailo-kit

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

+ datajake1999

chikim, 11 days ago to random

Hey, European musicians, what do you say about this nonsense? lol
Me: Could you explain what is H flat chord in harmony?
Gpt-4o: Certainly! In harmony, an H flat chord is typically known as a B flat chord in most musical contexts. The term "H" for B natural is used in some European countries, but "H flat" would be B flat in those regions as well.

reply

expand (2)

collapse (2)

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 11 days ago

Wow, maybe GPT-4O is correct? lol According to Wikipedia, "In Germany, Central and Eastern Europe, and Scandinavia...the note a semitone below C is called H." https://en.wikipedia.org/wiki/B_(musical_note)#:~:text=Variation%20of%20meaning%20by%20geographical%20region,-The%20referent%20of&text=However%2C%20in%20Germany%2C%20Central%20and,below%20C%20is%20called%20H.

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 23 days ago to random

VOLlama v0.1.4-beta.1: System Prompt manager; Import Awesome ChatGPT Prompts; Partial support for GPT-4O (Throws an error for token counter in some cases but just ignore for now); Able to attach entire document and feed for long context model. https://chigkim.github.io/VOLlama/

reply

expand (30)

collapse (30)

report

activity

copy /kbin url

copy original url

open original url

Loading...

+ simon

chikim, 22 days ago

@simon I don't have Anthropic api, so it'd be hard to test and implement unfortunately. Do they have like pay as you go plan for API instead of monthly sub?

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 22 days ago

@simon OH Cool, they give you $5 free trial credit for 14 days. I'll look into it!

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 22 days ago

@jscholes @simon Q: Can I use the Claude API for individual use? A: No. Access to the API is subject to our Commercial Terms of Service and is not intended for individual use.

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 22 days ago

@jscholes @simon Risk is low, but I'm just one guy doing it for free. I'm just being caucious.

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 22 days ago

@jscholes @simon Who knows. Ask lawyers. It's open source, so if anyone interested, Free feel to fork and implement Anthropic api.

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 22 days ago

@jscholes @simon I also tried my wife's number, so it seems like they're blocking individual numbers or something. I tried... Sorry, I'm out. lol

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 22 days ago

@simon @jscholes We use Google Fi.

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 22 days ago

@serrebi @simon @jscholes Lookss like OpenRouter also uses OpenAI api end point, so you could try setting OPENAI_BASE_URL and OPENAI_API_KEY environment variable before running VOLlama. I haven't tried it.

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 21 days ago

@simon @serrebi @jscholes Don't bother. I just tried, and it doesn't work. I use Llamaindex, and apparently it's very specifically written for OpenAI. LlamaIndex has OpenAILike module for generic OpenAI API that's not OpenAI.com, but it doesn't seem to work. I need to look into it more.

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 21 days ago

@simon @serrebi @jscholes It's problem with LlamaIndex, not OpenRouter.

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 23 days ago to llm

Llama.cpp now supports the distributed inference, meaning you can use multiple computers to speed up the response time! Network is the main bottleneck, so all machines need to be hard wired, not connected through wifi. ##LLm #AI #ML https://github.com/ggerganov/llama.cpp/tree/master/examples/rpc

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

+ datajake1999

jcsteh, 24 days ago to random

As I understand it, with all current LLMs, having a conversation involves feeding the model the entire conversation up to this point. That is, there is no memory: the prompt you feed it just gets longer and longer. So how does that work with something like GPT-4O which could be processing audio and/or video at a much faster rate? Surely the prompts must get very large very quickly with anything beyond a short interaction? Doesn't that mean the responses take longer and cost more as the conversation gets longer?

reply

expand (2)

collapse (2)

report

activity

copy /kbin url

copy original url

open original url

Loading...

+ jaybird110127

chikim, 24 days ago

@jcsteh I don't know how it works, so everything is my speculation. lol Anyways, ChatGPT has memory feature. Possibly Retrieval-Augmented Generation? Also when you are about to reach the context limit, maybe they ask model to summarize previous context, and discard the detail and keep the important ones. Also maybe multimodal has longer context? For example, Google Gemini 1.5 Pro has 2 millions context length!

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 24 days ago to random

BeMyEyes Privacy Policy 1/2: We record and store video streams and other images to enforce our Terms of Service, to promote and preserve safety, and to improve our Services and create new Services. We may provide recorded video streams or images to other organizations that are performing research or working to develop products and services that may assist blind and low-vision people or other members of the general public.

reply

expand (6)

collapse (6)

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 24 days ago

BeMyEyes Privacy Policy 2/2: If you use Be My AI, the images you submit will be processed by our third party artificial intelligence provider. If your video or image contains personal information - for example you say your name on the video or show mail with your home address - that information will be included in videos or images that we store and use as described above.

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 24 days ago

@JonathanMosen Sorry for specifically tagging you, but do you have any thought on the BeMyEyes privacy policy re images and videos you submit to the platform? It's too long to include in one post, but I pasted the relevant quotes in this thread. Should blind folks just say no such thing as free lunch and move on?

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 23 days ago

@twynn @JonathanMosen For paid customer, OpenAI has feature to opt out from getting your data used for training. Also if you use API, your data "do not become part of the training data unless you explicitly opt in." Your data is deleted within 30 days unless required for legal reasons, and is only accessible by authorized OpenAI employees, as well as specialized third-party contractors (that are subject to confidentiality and security obligations). https://www.maginative.com/article/openai-clarifies-its-data-privacy-practices-for-api-users/

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

chikim, 24 days ago

Do not feed images of your online meetings to BeMyAI unless you have consent from everyone involved to use their faces and names for AI training. #accessibility #privacy https://www.bemyeyes.com/privacy

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

+ datajake1999

chikim, 23 days ago

@twynn @JonathanMosen Actually, if you have a free OpenAI account, you can turn off Improve the model for everyone in Settings > Data controls on this webpage. How much can we trust them? It's different story. lol https://chatgpt.com/#settings/DataControls

reply

report

activity

copy /kbin url

copy original url

open original url

Loading...

+ Binder

chikim, 24 days ago to random

Finally release VOCR 2.0.0. So many new features since 1.0! You can download and checkout the demo here. https://chigkim.github.io/VOCR/

reply

expand (2)

collapse (2)

report

activity

copy /kbin url

copy original url

open original url

Loading...

+ simon, pitermach

chikim, 25 days ago to llm

Microsoft released Phi3 Small, Medium, and Vision! #LLM #AI #ML https://huggingface.co/collections/microsoft/phi-3-6626e15e9585a200d2d761e3

reply

expand (1)

collapse (1)

report

activity

copy /kbin url

copy original url

open original url

Loading...

+ miki