not anymore, my friends from #hackernews, not anymore. @midzer built thumbnails and WebP support for #flohmarkt today, and i helped him integrate it. it's amazing to see how fast stuff is loading now :)
in backend news, we made the communication with SMTP-servers more resilient, so your outbound mails just take a beer from the fridge and chill if your mailserver isn't available for a few moments.
PostgREST is of course written in #Haskell. But what I didn't know was that #Supabase is based on PostgREST, and they employ the lead developer of PostgREST to work on it full-time, which means Supabase is also based on Haskell!
Ask Microsoft: Are you using our personal data to train AI?
"We had four lawyers, three privacy experts, and two campaigners look at Microsoft's new Service Agreement, which will go into effect on 30 September, and none of our experts could tell if Microsoft plans on using your personal data – including audio, video, chat, and attachments from 130 products, including Office, Skype, Teams, and Xbox – to train its AI models..."
I've reached out to https://dm.hn and I think we're about to get a huge #OPML list of most of the blogs that were submitted to the original #HN thread!
It is really not so repulsive to see the poor asking for money as to see the rich asking for more money. And advertisement is the rich asking for more money. A man would be annoyed if he found himself in a mob of millionaires, all holding out their silk hats for a penny; or all shouting with one voice, “Give me money.” Yet advertisement does really assault the eye very much as such a shout would assault the ear. “Budge’s Boots are the Best” simply means “Give me money”; “Use Seraphic Soap” simply means “Give me money.” It is a complete mistake to suppose that common people make our towns commonplace, with unsightly things like advertisements. Most of those whose wares are thus placarded everywhere are very wealthy gentlemen with coronets and country seats, men who are probably very particular about the artistic adornment of their own homes. They disfigure their towns in order to decorate their houses.
Thinking about Hacker News but sprinkled with #activitypub
Imagine being able to reply to and participate in any #HN post from the #fediverse, and with #webmentions have fediverse comments mingled with native HN activity.
I could easily do a test with this idea using RSS to autofeed a #Discourse or #Nodebb forum and then consume the firehose from #Fediverse, just for fun!
My article on long-term perspectives of important information on the web gained additional momentum (and great reading rates) with the self-inflicted demise of #reddit:
A question about what states were most-frequently represented on the HN homepage had me do some quick querying via Hacker News's Algolia search ... which is NOT limited to the front page. Those results were ... surprising (Maine and Iowa outstrip the more probable results of California and, say, New York). Results are further confounded by other factors.
HN provides an interface to historical front-page stories (https://news.ycombinator.com/front), and that can be crawled by providing a list of corresponding date specifications, e.g.:
So I'm crawling that and compiling a local archive. Rate-limiting and other factors mean that's only about halfway complete, and a full pull will take another day or so.
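A minimal sketch of how such a list of date specifications could be generated. The `day=YYYY-MM-DD` query parameter is my assumption about the archive's URL format, and `front_page_urls` is a made-up helper name. Nothing is fetched here; a real crawl would need a polite delay between requests because of the rate-limiting mentioned above.

```python
from datetime import date, timedelta

# Assumption: the archive page accepts a `day=YYYY-MM-DD` query parameter.
FRONT_URL = "https://news.ycombinator.com/front"


def front_page_urls(start: date, end: date):
    """Yield one archive URL per calendar day from start to end, inclusive."""
    day = start
    while day <= end:
        yield f"{FRONT_URL}?day={day.isoformat()}"
        day += timedelta(days=1)


# First three days of the archive (it starts on 20 February 2007).
urls = list(front_page_urls(date(2007, 2, 20), date(2007, 2, 22)))
for url in urls:
    print(url)
```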
But I'll be able to look at story titles, sites, submitters, time-based patterns (day of week, day of month, month of year, yearly variations), and other patterns. There's also looking at mean points and comments by various dimensions.
Among the surprises: as of January 2015, The Guardian is among the most consistently highly-voted sites. I'd thought HN leaned consistently less liberal.
The full archive will probably be < 1 GB (raw HTML), currently 123 MB on disk.
Contents are the 30 top-voted stories for each day since 20 February 2007.
If anyone has suggestions for other questions to ask of this, fire away.
NY is highly overrepresented (NY Times, NY Post, NY City), likewise Washington (Post, Times, DC). Adding in "Silicon Valley" and a few other toponyms boosts California's score markedly. I've also got some city-based analytics.
I'm working on parsing. Playing with identifying countries most often mentioned in titles right now, on still-partial data (missing the past month or so's front pages).
Countries most likely to be confused with a major celebrity and/or IT/tech sector personality: Cuba & Jordan.
Country most likely to be confused with a device connection standard: US (USB).
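A rough sketch of why these confusions happen and which of them are cheap to avoid. This is not the parser I'm actually using; the titles are made up for illustration and `count_mentions` is a hypothetical helper. Case-sensitive whole-word matching keeps "US" from firing on "USB" and "Cuba" from firing on "Cuban", but it can't tell a surname from a country.

```python
import re
from collections import Counter

# Illustrative subset of names to look for.
NAMES = ["US", "Cuba", "Jordan"]


def count_mentions(titles, names=NAMES):
    """Count titles mentioning each name as a whole, case-sensitive word."""
    patterns = {n: re.compile(rf"\b{re.escape(n)}\b") for n in names}
    counts = Counter()
    for title in titles:
        for name, pattern in patterns.items():
            if pattern.search(title):
                counts[name] += 1
    return counts


# Made-up titles, not taken from the crawl.
titles = [
    "USB-C becomes mandatory",             # must not count as "US"
    "US considers new privacy law",        # genuine match
    "Mark Cuban on startups",              # must not count as "Cuba"
    "Michael Jordan on machine learning",  # still counts as "Jordan"
]
counts = count_mentions(titles)
print(counts)
```

The last title shows the limit of the approach: a person named Jordan is still counted as the country, so that case needs manual review or a list of known names to exclude.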
Raw stats, top-20, THERE ARE ISSUES WITH THESE DATA:
Hacker News Analytics: ~3% of submissions reach front page, with half of comments on FP articles
This is a finding based on maths and a previous study by Whaly in 2022 based on HN 2021 activity, rather than my own crawl, though it's informed by the latter.
The HN front page is a limited resource --- there are 365 * 30 == 10,950 front-page slots in a year, another 30, or 10,980, in a leap year, and regardless of site activity over a year, those slots are fixed. It's somewhat of a reminder that regardless of how much information we can access, our time to process that information is finite. Or as Herbert Simon observed: what information consumes is attention.
Whaly saw 386,663 total story submissions for 2021. I'm pretty sure that this is net of moderation (user flags, auto-kills, spam detection, voting-ring detection and the like). Either way, 10,950 / 386,663 ≈ 2.83%: a hair under 3% of the stories that clear all of those tripwires go on to land on the HN front page.
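The arithmetic, spelled out as a quick check. The variable names are mine; the figures are the ones quoted above.

```python
SLOTS_PER_DAY = 30          # stories kept per day in the front-page archive
submissions_2021 = 386_663  # Whaly's total story submissions for 2021

slots_2021 = 365 * SLOTS_PER_DAY  # 2021 is not a leap year
slots_leap = 366 * SLOTS_PER_DAY  # a leap year adds one more day of slots

# Fraction of submissions that can end up in an end-of-day front-page slot.
front_page_rate = slots_2021 / submissions_2021

print(f"{slots_2021:,} slots in 2021, {slots_leap:,} in a leap year")
print(f"front-page rate: {front_page_rate:.2%}")  # roughly 2.83%
```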
Mind that that's actually a somewhat low estimate, as a story may appear for part of the day on the front page but not be represented on the end-of-day front-page archive.
I'm now thinking of doing some spot checks to see what kinds of success rates individual submitters have in landing on the front page. From what I've seen, even well-known and popular members have at best a modest chance of success.
Whaly also gives a total number of comments: 3,769,520. That I can compare to my own front-page stats for 2021: 1,859,933, or 49.34% of all comments. That is, half of HN comments appear on the 3% of stories which reach the front page. That percentage is lower than I'd have expected, though it's still a very strong bias toward the front page.
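The same comparison as a short check, plus a rough comments-per-story figure to show how strong that bias is. The per-story numbers are my own back-of-the-envelope extension. They assume the 10,950 end-of-day archive slots stand in for "front-page stories", which undercounts stories that only appeared for part of a day.

```python
total_comments = 3_769_520       # Whaly: all HN comments, 2021
front_page_comments = 1_859_933  # front-page crawl, 2021
total_stories = 386_663          # Whaly: all story submissions, 2021
front_page_stories = 365 * 30    # end-of-day archive slots in 2021

share = front_page_comments / total_comments

# Rough mean comments per story, on and off the front page.
per_fp_story = front_page_comments / front_page_stories
per_other_story = (total_comments - front_page_comments) / (
    total_stories - front_page_stories
)

print(f"front-page share of comments: {share:.2%}")
print(f"comments per front-page story: {per_fp_story:.1f}")
print(f"comments per other story: {per_other_story:.1f}")
```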
(Now I want to complete another analysis I'd thought of: mean votes and comments by story position (1--30), by year. Hrm...)
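One way that by-position analysis might be sketched, assuming the parsed archive can be reduced to `(year, position, points, comments)` tuples. That record layout and the helper name are hypothetical, and the rows below are placeholders, not real HN data.

```python
from collections import defaultdict
from statistics import mean


def mean_by_position_and_year(rows):
    """Map (year, position) to (mean points, mean comments)."""
    groups = defaultdict(list)
    for year, position, points, comments in rows:
        groups[(year, position)].append((points, comments))
    return {
        key: (mean(p for p, _ in vals), mean(c for _, c in vals))
        for key, vals in sorted(groups.items())
    }


# Placeholder rows for illustration only.
rows = [
    (2021, 1, 900, 400),
    (2021, 1, 700, 200),
    (2021, 2, 500, 100),
]
result = mean_by_position_and_year(rows)
for (year, position), (points, comments) in result.items():
    print(year, position, points, comments)
```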