this enables full-text search for posts you haven't interacted with, as well as full-text search for accounts, and includes several advanced filtering operators and parser fixes.
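As a rough illustration of the filtering operators mentioned above, here is a minimal sketch that builds a search URL for the `GET /api/v2/search` endpoint. The instance name `mastodon.example`, the helper `build_search_url`, and the chosen operator values are assumptions for illustration only; a real request would also typically need an access token for status search.

```python
from urllib.parse import urlencode


def build_search_url(instance: str, terms: str, **operators: str) -> str:
    """Build a search URL, appending operators as `key:value` tokens.

    `instance` is a hypothetical hostname; nothing is sent over the network.
    """
    tokens = [terms] + [f"{key}:{value}" for key, value in operators.items()]
    query = urlencode({"q": " ".join(tokens), "type": "statuses"})
    return f"https://{instance}/api/v2/search?{query}"


# Posts mentioning "elasticsearch" that carry media and predate 2023-10-01.
url = build_search_url(
    "mastodon.example", "elasticsearch", has="media", before="2023-10-01"
)
print(url)
```

Operators are passed as keyword arguments here, so `from:` (a reserved word in Python) would have to go in via `**{"from": "someuser"}`.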
I've noticed a lot of chatter over the last few days about setting up Elasticsearch for Mastodon 4.2's new full-text search, including what hardware is required, how difficult it is, etc.
So I thought I’d write down my experience, including the hardware I'm running Elasticsearch on for my single user instance:
#Gentoo mailing list archives have been broken for almost 3 months now. While marc.info is advocated as a stop-gap solution, it doesn't cover all our mailing lists (I don't think any of the third-party archives do). We really need someone to fix this.
Hmmm 🤔 #mastodon memory leak, or I just don't understand how things work. #Sidekiq seemingly can handle all the jobs quite easily, yet after a couple of days of my instance running without a restart, RAM usage grows quite high: around 80% with #elasticsearch enabled on a 16 GB machine. Is that normal? Or does it mean there's a memory leak somewhere? Maybe I should spend a week figuring out how to run another instance and load balance, but I feel this shouldn't be necessary for a single user instance... 🤷🏻‍♂️
I am thinking of breaking #Postgres and #ElasticSearch to a separate server instead of increasing my current VPS resources.
I'll have two servers. My main, which will run Mastodon, and another that will have Postgres and ElasticSearch on it.
Will I benefit from doing this, or should I just move ElasticSearch to its own server? What is the minimum CPU/RAM I can get away with for the second server, and what should I get?
After reading a massive tome about #ElasticSearch earlier this week I realised it was complete overkill and just used the full-text capabilities of #PostgreSQL instead.
Currently PieFed has 46,000 posts and results are fast. It'll be interesting to see how well it copes when there are more posts. Anyone want to make a guess when it'll bog down?
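For anyone curious what the PostgreSQL route can look like, here is a minimal sketch that generates the two SQL statements involved: a GIN expression index and a ranked search query. The table name `post`, the columns `id`, `title` and `body`, and the `%s` placeholders (driver-style parameters) are assumptions for illustration, not PieFed's actual schema or queries.

```python
# Hypothetical table and column names; adjust to the real schema.
TABLE = "post"
COLUMNS = ("title", "body")


def _document(columns: tuple) -> str:
    """Concatenate text columns into one searchable document expression."""
    return " || ' ' || ".join(f"coalesce({c}, '')" for c in columns)


def index_sql(table: str = TABLE, columns: tuple = COLUMNS,
              config: str = "english") -> str:
    """SQL for a GIN index over the tsvector of the combined columns."""
    return (
        f"CREATE INDEX {table}_fts_idx ON {table} "
        f"USING GIN (to_tsvector('{config}', {_document(columns)}));"
    )


def search_sql(table: str = TABLE, columns: tuple = COLUMNS,
               config: str = "english") -> str:
    """SQL for a ranked search; expects (search text, limit) as parameters."""
    doc = _document(columns)
    return (
        f"SELECT id, ts_rank(to_tsvector('{config}', {doc}), query) AS rank "
        f"FROM {table}, websearch_to_tsquery('{config}', %s) AS query "
        f"WHERE to_tsvector('{config}', {doc}) @@ query "
        f"ORDER BY rank DESC LIMIT %s;"
    )


print(index_sql())
print(search_sql())
```

The search query repeats the exact expression used in the index so that the planner can match it; `websearch_to_tsquery` needs PostgreSQL 11 or later.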
I'm planning to write an updated homelab guide on my blog this year but I think I'm about to rebuild some parts for a new purpose 😅
It might be time to try out OpenCTI given what I do in my lab should be representative of what I do during < dayjob >. That also means I need to tear down Wazuh and configure an ELK stack instead (resource constraint). #Homelab #ELK #CTI #ThreatIntel #elasticsearch
I'm currently running the Elasticsearch update after upgrading to Mastodon 4.2.0, and it was running really fast until it got to "PublicStatusesIndex" — now it's still importing documents, but it's really slow. It says 406 docs/s, w/ 54M to go, and the ETA keeps getting longer. Anyone else experience this?
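To put those numbers in perspective, here is a quick back-of-the-envelope calculation, assuming the rate stays constant at the reported 406 docs/s (which the growing ETA suggests it does not). The function name `eta_hours` is just an illustrative helper.

```python
def eta_hours(remaining_docs: int, docs_per_second: float) -> float:
    """Hours left to import `remaining_docs` at a constant rate."""
    if docs_per_second <= 0:
        raise ValueError("rate must be positive")
    return remaining_docs / docs_per_second / 3600


hours = eta_hours(54_000_000, 406)
print(f"{hours:.1f} hours, about {hours / 24:.1f} days")
```

At that rate the remaining 54M documents take roughly 37 hours, so an ETA measured in days is what the arithmetic predicts even before any further slowdown.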
Can confirm. The new full-text search works great, and is awesome! I just searched for "footiMac", which I know only I use, and it returned results very very quickly. I can go way back, including to its very first mention back in January!
This ability is so very important for the usability and attractiveness of Mastodon!
So I'm deploying #ElasticSearch on my #selfhost server right now. It's importing the "accountsIndex". It says ###/561466. Does that mean 561,466 accounts have interacted with my server in some way? If so, that's pretty wild. But also, if Mastodon ever got big... that number would likely go up exponentially and my little server would?? 🔥🤯🤪 #MastoAdmin
Q for people who have used Elastic Search, esp. for a Mastodon instance: how should I configure it to use less memory (while still having enough)? Right now it seems to eat as much as it wants (~4 GB)...
It's a single-user instance, so the total data size it reports is 40 MB now.
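One common answer is to pin the JVM heap, since recent Elasticsearch versions otherwise size it automatically from the machine's memory. Below is a minimal sketch that generates the contents of an options file for `config/jvm.options.d/`. The 512 MB figure, the file name `heap.options`, and the helper names are assumptions for illustration, not a tested recommendation for any particular instance.

```python
def suggest_heap_mb(total_ram_mb: int, cap_mb: int = 512) -> int:
    """Pick a heap size: at most half of RAM, and at most `cap_mb`.

    The 512 MB default cap is an illustrative guess for a tiny index.
    """
    if total_ram_mb <= 0 or cap_mb <= 0:
        raise ValueError("sizes must be positive")
    return min(total_ram_mb // 2, cap_mb)


def heap_options(heap_mb: int) -> str:
    """Render JVM options with equal minimum (-Xms) and maximum (-Xmx) heap."""
    if heap_mb <= 0:
        raise ValueError("heap size must be positive")
    return f"-Xms{heap_mb}m\n-Xmx{heap_mb}m\n"


# Contents for a hypothetical config/jvm.options.d/heap.options file.
options = heap_options(suggest_heap_mb(total_ram_mb=4096))
print(options, end="")
```

The heap is only part of the process footprint, so total memory use will sit somewhat above whatever limit is set here, and Elasticsearch needs a restart to pick up the change.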
Great to see more people catching on to ClickHouseDB. We’re using ClickHouse at @honeybadger to power our upcoming logging/observability tool (Honeybadger Insights).
We’re also benchmarking a replacement backend for #Elasticsearch. Looks like quite a performance gain so far!
Will hopefully have more to share soon, but in the meantime we discussed this on the latest episode of @FounderQuest. Give it a listen: