@chikim@mastodon.social avatar

chikim

@chikim@mastodon.social

Love music, technology, accessibility! Faculty at Berklee College of Music 👨🏻‍💻🎹🐕‍🦺

This profile is from a federated server and may be incomplete. Browse more on the original instance.

chikim, to ai
@chikim@mastodon.social avatar

Maybe we have an open source competitor for ElevenLabs? Check out their demo which they switch between original and synthesized. I can't tell. lol Apparently they're going to fully open source codebase and model weights. #TTS #AI #ML https://jasonppy.github.io/VoiceCraft_web/

chikim, to llm
@chikim@mastodon.social avatar

Grok is a LLM from Elon Musk's xAI, and it's 638GB in fp16! Running on a consumer hardware will be pretty impossible anytime soon even with quantized. Maybe Mac Studio with 192GB. #LLM #AI #ML https://huggingface.co/hpcai-tech/grok-1

chikim,
@chikim@mastodon.social avatar

@ppatel It's just a base model which is pretty useless for chat. We need to wait for a fine tuned model. It's going to take a lot of GPU power, so open source teams with small budget won't be able to fine tune it.

bryansmart, to random

RVC was cool for changing your speaking and singing voice in to someone else, but, that's so last year. Someone crossed it with a TTS engine and Melodyne. Sing in your own voice, switch it to a Pop star's voice, fix all your bad notes, draw in vibrato and portamento etc. Or just type in lyrics and draw all the expression. It's like old Vocal Writer, but with natural human voices. Forget using Auto-Tune to make a bad singer sound good. Now, you don't even need a person. https://youtu.be/PCYTqDSUbvU?si=1D7RKF_Y0yWaSRSu

chikim,
@chikim@mastodon.social avatar

@bryansmart Last year, there were a lot of activities in AI singing voice conversion, but it's been pretty quiet this year.

JonathanMosen, to random
@JonathanMosen@tweesecake.social avatar

I am having a little play with a little Mac at the moment, which I have requisitioned temporarily for testing purposes. Man, software updates don’t seem to be as accessible as they were when I had an Intel Mac nearly a decade ago. I had to use Be My AI to see how much time was remaining, because the only thing VO would say was “Installer has no windows”.

chikim,
@chikim@mastodon.social avatar

@JonathanMosen Is it just OS update or update for an app? If it's inaccessible updater for an app, can I self advertise and recommend VOCR? lol It has OCR recognition like Jaws or NVDA, AI image description through Ollama, OpenAI, and so on. The quality is much better than the ones in Windows. Here's the latest beta build with bunch of new features I implemented beginning of this year. The main menu shortcut is command+control+shift+s. https://github.com/chigkim/VOCR/releases/tag/v2.0.0-beta.2

chikim,
@chikim@mastodon.social avatar

@JonathanMosen I need to update the documentation to match new beta features, but installing process is same as the official release. It's little complicated to get it going because all the security permission you have to allow for MacOS like screen capture, voiceover access, etc but you won't regret it once you get it going! :) Here's the readme for the official release. https://github.com/chigkim/VOCR/

bryansmart, to random

Every time I get a bunch of files from some Linux radical, I rage a little bit at how they_always_insist_on_using_underscores_in_file_names. Nobody_appreciates_this_fucking_stupidity. It makes me want to send them back files with emoji and unprintable Unicode characters, then force them to deal with that.

chikim,
@chikim@mastodon.social avatar

@bryansmart You know, they rather use 20 underscores instead of two quotes. lol

chikim, to random
@chikim@mastodon.social avatar

Trying out perplexity.ai as default Chrome search engine for a week! Let's see how it goes!

chikim, to llm
@chikim@mastodon.social avatar

Wow, Microsoft researchers published a paper for 1.58bit quantization (ternary parameters 1,0,-1) , for LLMs showing performance and perplexity equivalent to full fp16 models of same parameter size. Models with 120B parameters can fit into 24GB! Unfortunately, You can't take fp16 and quantize to 1.5bit. You have to train in 1.5bit. #LLM #ML #AI https://arxiv.org/abs/2402.17764

chikim, to ML
@chikim@mastodon.social avatar

NVIDIA announced a New LLM: Nemotron-4 15B. Trained on 8T tokens. Training took 13 days with 3,072 H100s. Model is not available yet, but hhere's the paper. #ML #LLM #AI https://huggingface.co/papers/2402.16819

chikim, to random
@chikim@mastodon.social avatar

Apple is abandoning its years-old effort to build an electric car shifting employees to focus on the booming artificial intelligence space. The almost 2,000 Apple employees tasked with developing an electric car—a secretive project sometimes called Titan—found out the project was being discontinued in an internal memo from Chief Operating Officer Jeff Williams and vice president of technology Kevin Lynch on Tuesday, Bloomberg reported, citing unnamed sources. https://www.forbes.com/sites/mollybohannon/2024/02/27/apple-abandoning-electric-car-quest-and-focusing-on-ai-report-says/

chikim, to random
@chikim@mastodon.social avatar

So it looks like the latest Ollama Windows Preview fixed the slow speed with cpu only on Windows.

ToniBarth, to random German
@ToniBarth@troet.cafe avatar

@chikim You know what i'm thinking about would be a-f******ng-mazing? I'm using a DMS called Paperless-ngx for all my documents which I scanned, OCR'ed and put on there, added tags, correspondants and more. This DMS contains everything important to me. Paperless-ngx got a pretty useful API. It'd just be awesome to have OLLama (or VOLLama) connect to it and learn from my documents stored in there so that I can ask it questions about... lliterally anything.

chikim,
@chikim@mastodon.social avatar

@ToniBarth You can if you download all your papers into a folder and have vollama to index them.

chikim,
@chikim@mastodon.social avatar

@ToniBarth Yeah just write a script to retrieve the latest from paperless api and put it into the folder.

chikim,
@chikim@mastodon.social avatar

@ToniBarth Probably beyond VOLlama can do because it ingest everything as unstructured data. But if you know python you can write your own rag system with llama-index. You can create metadata. It's relatively easy to write your own.

chikim, to random
@chikim@mastodon.social avatar

VOLlama v0.1.0-alpha.9 now has dedicated RAG menu. You have more controls with RAG settings which include chunk_size, chunk_overlap, similarity_top_k, similarity_cutoff, ragResponseMode, show_context.
https://github.com/chigkim/VOLlama/releases/tag/v0.1.0-alpha.9
@kaveinthran @Bri @FreakyFwoof @pixelate

chikim, to accessibility
@chikim@mastodon.social avatar

I filed an issue on github about status bar from WXWidget is accessible on Windows, but not on MacOS. Also I commented that I'm not sure how they implemented, but I've seen apps with accessible QT status bar on both platforms. The response I got " , QT does not use native controls and it is backed up by big coropration. Thank you." Any suggestions for response? #accessibility lol

chikim, to random
@chikim@mastodon.social avatar

VOLlama v0.1.0-alpha.5 addresses a critical bug introduced in alpha.4, which prevented the app from processing your most recent message. Also, it now starts speaking during the generation process, instead of waiting the response to complete. Pressing the Esc key stops both the generation and speech, and redirects focus to the prompt input field.
https://github.com/chigkim/VOLlama/releases/tag/v0.1.0-alpha.5
@vick21 @FreakyFwoof @tristan @KyleBorah @Bri @kaveinthran

FreakyFwoof, to random

Anyone know how to get a hold of the creator behind the NVDA AI content describer? Would like to make it use Ollama as an option. Save some API calls.

chikim,
@chikim@mastodon.social avatar

@FreakyFwoof If I'm thinking the same thing you're talking about, it already uses Llava with llama.cpp that Ollama uses. It's one layer less.

FreakyFwoof, to random

Now I wish #MacWhisper could use local LLM's instead of many, many tokens to Chat GPT. Would save money.

chikim,
@chikim@mastodon.social avatar

@ppatel @FreakyFwoof VOLlama works On Windows with Ollama running through docker. Openhermes, solar, neural-chat, and zephyr models are not too bad for just fun chat on Windows without even GPU.

chikim, to random
@chikim@mastodon.social avatar

Let's try again! I haven't found any UI for local LLMs that isn't annoying to use with screen readers, so I just made one for myself for Ollama called VOLlama. lol Hope someone finds it useful.
Windows users: follow the instruction on the release page to install Ollama with Docker.
Mac user: Install Ollama using the instruction on ollama.ai. Also, the app is not signed.
https://github.com/chigkim/VOLlama/releases/tag/v0.1.0-alpha.1
@vick21 @freakyfwoof @tristan @KyleBorah @Bri

chikim,
@chikim@mastodon.social avatar

@FreakyFwoof Welcome to the darkside of local model! It's not nearly as big as chat gpt users, but it's a pretty huge community!

simon, to accessibility

I've lost all hope of HCaptcha being a company that cares about accessibility.
I had major trouble getting the accessibility cookie to work in Firefox yesterday, though I eventually solved it by disabling both Privacy Badger and the enhanced tracking protection built into Firefox.
So I e-mailed the company with an accessibility inquiry. I suggested that when requesting an accessibility cookie by e-mail, the user should also be given a code they can enter into the HCaptcha challenge. This would save users from having to deal with cookie problems, and would also allow them to solve a captcha in something like Discord, where the captcha is embedded in the app and there's no way to use the cookie at all.
Support responded and said that it was up to each app developer to implement a way to use the accessibility cookie in their app.
I responded with the following:
> Now it sounds like we're shifting the burden of accommodating HCaptcha onto developers instead of users. Developers want to implement a solution that is accessible already. If they have to design their own UI for accommodating the accessibility cookie, the solution is not accessible. Why is HCaptcha so opposed to solving this problem once so that developers do not have to solve it over and over again?

And they responded:
> The reason is because cookies are supposed to be used in a web browser. If you open Discord or Signal in a web browser, it will work. However since the apps aren't web browsers they won't be able to consume it. We have other clients that have implemented ways to consume the accessibility cookie in their apps, so it's up to the developer.

Am I crazy or does this go 0% of the way to addressing anything I said in the previous e-mail?

chikim,
@chikim@mastodon.social avatar

@FreakyFwoof @simon I believe this is the same thing that discord uses. As far as I know, I can't install hcaptcha cookie on desktop app, so I can't sign into discord with desktop app. Or, can you? I suppose I can install a camera on my desktop and wave around my phone and atempt to scan the code from mobile app. I just gave up. lol

chikim,
@chikim@mastodon.social avatar

@simon @FreakyFwoof Nope tried both suggestion, Discord still presents hcaptcha, and there's no text question based option. Just checkbox for accessibility cookie which I can't install on the desktop app. This is so BS!

chikim,
@chikim@mastodon.social avatar

@simon @FreakyFwoof Actually I just tried signing with another browser, and it asked me to verify with my email. When I did that, it let me sign in with desktop. They should just make desktop app to ask you to verify with email in first place instead of this crappy hcaptcha that doesn't work for screen reader users!

chikim,
@chikim@mastodon.social avatar

@simon @FreakyFwoof After I click the link I received via email, I logged in on a browser. It didn't ask for captcha. Then I logged in regularly with username and password on desktop app, and it didn't ask for captcha either.

  • All
  • Subscribed
  • Moderated
  • Favorites
  • JUstTest
  • InstantRegret
  • mdbf
  • ethstaker
  • magazineikmin
  • cubers
  • rosin
  • thenastyranch
  • Youngstown
  • osvaldo12
  • slotface
  • khanakhh
  • kavyap
  • DreamBathrooms
  • provamag3
  • Durango
  • everett
  • tacticalgear
  • modclub
  • anitta
  • cisconetworking
  • tester
  • ngwrru68w68
  • GTA5RPClips
  • normalnudes
  • megavids
  • Leos
  • lostlight
  • All magazines