SenseException, to random German
@SenseException@phpc.social

It's 2024 and spam mails still don't get umlauts right.

Luke, to ascii
@Luke@typo.social

First heard it at Newsvine from Mark Budos and @mikeindustries more than a decade ago:

“Character encoding is (the worst)” (paraphrased)

…and it’s certainly true.

When working with all the various formats of text, and html entities, and UTF-8, etc., etc.… it can be very hard to get it right all of the time.

#ascii #utf8 #text #typography #markdown #html

xtaran, to keyboards
@xtaran@chaos.social
xtaran,
@xtaran@chaos.social

It arrived albeit fucked up all in my shipping address, even in two different ways:

Once they obviously took as and once they seem to have taken one charset as another 8-bit charset, probably as . 🙄

@Venty once described systems that have such issues as . 😉

But at least it arrived! I heard from a keyboard dealer that they stopped sending parcels to Europe with FedEx due to too much packet loss…

Excerpt from a shipping letter where instead of any u-umlaut there is a capital A with a tilde and a glyph for "one fourth" shown.
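
The substitution described in that excerpt (a u-umlaut shown as a capital A with a tilde plus the "one fourth" glyph) is what you get when UTF-8 bytes are decoded as an 8-bit charset. A minimal Python sketch, assuming the wrong charset was Latin-1 (the post does not say which one was involved); `garble` and `repair` are illustrative names, and the sample word is made up:

```python
def garble(text: str, wrong_charset: str = "latin-1") -> str:
    """Encode as UTF-8, then decode the bytes with the wrong 8-bit charset."""
    return text.encode("utf-8").decode(wrong_charset)


def repair(garbled: str, wrong_charset: str = "latin-1") -> str:
    """Undo garble(): recover the original bytes and decode them as UTF-8."""
    return garbled.encode(wrong_charset).decode("utf-8")


city = "Zürich"            # made-up sample, not taken from the letter
broken = garble(city)      # the "ü" turns into "Ã¼"
print(broken)
print(repair(broken))
```

The repair only works while the garbled text is still intact; once a system drops or replaces the unexpected characters, the original bytes are gone.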

gereon, to random German

Dear #Sozialversicherung (social insurance), how can anyone still botch encodings this badly in 2024? #Digitalisierung #UTF8 #Fail

zirias, to KDE

Remember these specific misrenderings using the wrong ? They actually made a good test case after I added codepage selection to my new "dos2ansi" tool 😜
https://github.com/Zirias/dos2ansi

Screenshot from () running on and using Microsoft's font.

zirias,

Tested it on now. Ok, it works on some Windows-10 machine, the terminal nowadays can do both output and interpret some sequences ...

But: You have to enable both explicitly in your code using some proprietary Console APIs 🤯
https://github.com/Zirias/dos2ansi/commit/5a85965d96d4456d1e739427122f66e89fb358b6

zirias,

New pre-release of dos2ansi: v0.2

Still a few things to add, e.g. use #termcap/#terminfo or Windows Console API for "color output" when applicable ... we will see 😎

https://github.com/Zirias/dos2ansi/releases/tag/v0.2

zirias,

Just released dos2ansi v0.4, with lots of #DOS #codepage s supported and a testmode to display them.

The next nice feature would be to use the actual terminal capabilities if output goes there. Very simple on *nix-like systems (#Linux, #FreeBSD, ...), just link #curses and use the termcap functions.

Thinking about #Windows again, either I keep relying on #UTF8 support (since #win7 IIRC? and still a bit buggy) and #ANSI sequences support (since #win10) .... OR I attempt to use the native #Console #API there (using special functions to write in #UTF16 and other special functions to set colors, which would require a major refactoring first 🙄)

https://github.com/Zirias/dos2ansi/releases/tag/v0.4

zirias,

Released v0.6.1 just fixing one stupid little regression from refactoring: The terminfo writer must never be used when output does not go to stdout 🙈

So far so good, I just thought there was yet another bug, testing redirected output on #Windows. I always got a file encoded as #UTF16LE, no matter which format I chose (and with #UTF8 chosen, the output was just broken). Meditated on the code for a while. Looks all perfectly good.

Finally, I tested in good old #CMD. And it worked perfectly well.

So, #Powershell is messing with the encoding of my stdout stream on redirects. Really? Really??? WTF, #Microsoft? How should that ever do any good? 🤯

https://github.com/Zirias/dos2ansi/releases/tag/v0.6.1
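
One way to confirm what a shell redirect actually wrote is to look at the first bytes of the resulting file. As far as I know, Windows PowerShell 5.1 writes redirected output as UTF-16LE with a byte order mark by default, which would match the behaviour described above; the post does not name the PowerShell version, so treat that as an assumption. A small Python sketch with a hypothetical helper `sniff_bom`:

```python
import codecs

# Longer BOMs first: the UTF-32LE BOM begins with the UTF-16LE BOM bytes.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding implied by a leading BOM, or None if there is none."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None


# Simulated file content as a UTF-16LE redirect would produce it.
sample = codecs.BOM_UTF16_LE + "März".encode("utf-16-le")
print(sniff_bom(sample))
```

A missing BOM proves nothing about the encoding, so `None` here only means "no marker found".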

LangerJan, to random German
@LangerJan@chaos.social

2023 will be the year of #utf8
#BaldursGate3

bentolor, to programming German

Excellent writeup on #Unicode and its various encodings like #UTF8 and others in a beautiful and precise style.

A must-read for any software developer in 2023! #programming #i18n

https://tonsky.me/blog/unicode/

markush, to random
@markush@chaos.social

This page has THE best explanation of how and works https://tonsky.me/blog/unicode/

vwbusguy, to Ansible
@vwbusguy@mastodon.online

My playbook failed because an otherwise legitimate variable input broke the lineinfile regex parser and I couldn't report about it in the PR review, since it broke some kind of validation for issue comments in . is hilarious.

vwbusguy,
@vwbusguy@mastodon.online

LOL! It was the 😂 emoji that broke @forgejo ! Multi-codepoint #utf8 strikes again!
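
For reference, a small Python check of how this emoji is represented. It is a single code point (U+1F602) that takes four bytes in UTF-8 and two code units in UTF-16; the thread does not say which of these properties tripped the validation, so this only shows the numbers:

```python
emoji = "\U0001F602"  # 😂 FACE WITH TEARS OF JOY

code_points = len(emoji)                           # Python counts code points
utf8_bytes = len(emoji.encode("utf-8"))            # bytes in UTF-8
utf16_units = len(emoji.encode("utf-16-le")) // 2  # 16-bit code units in UTF-16

print(code_points, utf8_bytes, utf16_units)
```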

unixtippse, to random German
@unixtippse@mastodon.online

The self-help group meets today in the Wolfsburg area.

echo -n • | iconv -f cp1252 -t utf8
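
A rough Python equivalent of that command, assuming the terminal sends the bullet as UTF-8 (which is what makes the iconv call produce garbage):

```python
bullet = "\u2022"                      # "•" as typed into a UTF-8 terminal
utf8_bytes = bullet.encode("utf-8")    # b"\xe2\x80\xa2"
misread = utf8_bytes.decode("cp1252")  # same misreading as iconv -f cp1252 -t utf8

print(misread)  # three characters: â, €, ¢
```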

unixtippse,
@unixtippse@mastodon.online

Folks, you've got this wrong. At VW they know perfectly well how to use and format an HTML unordered list. #UTF8 #Mojibake

ovid, to Software
@ovid@fosstodon.org

Years ago I was pushed out of a job because management was upset with me. I pointed out that what they were doing wasn't scalable.

A few years later, with new management, they retained me as a consultant for two years fixing some of the problems I told them about. I was paid considerably more money and they lost many developers in the interim.

Have you ever been doubted at work? What happened?

#consulting #scalability #software

dboehmer,

@ovid A few years ago I had a potential long-term #Perl client. We made a first small contract. In their #CGI hell I found they had #UTF8 in their database and told browsers to use it, but neither input nor output was configured for it. It seemed to work, but internally all their strings were garbage. Nobody understood. Some string operations were done in #SQL because there it "magically" worked.

I tried to explain, but they didn't understand or even believe me that there was any issue at all. I do not miss that client.
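
The "seemed to work but internally garbage" effect can be sketched in Python. This is an analogue under assumptions, not the client's Perl code: undecoded UTF-8 bytes pass through a program unchanged, so output looks fine, while every string operation in between works on bytes instead of characters. The sample name is made up.

```python
stored = "Müller".encode("utf-8")  # what the database hands over: raw bytes
text = stored.decode("utf-8")      # what the program should be working with

# Passing the bytes straight through looks fine in a UTF-8 browser ...
passthrough = stored

# ... but lengths and case mapping are wrong while the data stays undecoded.
byte_length = len(stored)          # counts bytes, "ü" is two of them
char_length = len(text)            # counts characters
byte_upper = stored.upper()        # only touches ASCII letters
char_upper = text.upper()

print(byte_length, char_length)
print(byte_upper.decode("utf-8"), char_upper)
```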

lorddimwit, to random
@lorddimwit@mastodon.social

I’ve always wondered this. I’ve researched it briefly off and on, but never found a good answer. I’m sure it’s answered on marc.info somewhere, but:

Why is luit, the tool used by #xterm et al to convert legacy encodings to #UTF8 on the fly, called luit?

In French, “luit” means “glowing” but I don’t know if that’s what was intended (and if so, I’ve always mispronounced the name of the program).

@cross this seems like something you’d know or at least be able to lie convincingly about.

#unix

slink, to random
@slink@fosstodon.org

An Exhaustive Test Program for the Legendary Höhrmann UTF-8 Decoder

“The Höhrmann Decoder, implemented as a deterministic finite state machine, needs only a handful lines of C code and 364 bytes for a combined character class and state transition table.

[...]

hoehrmann-utf8-test.c exhaustively tests all possible inputs to the decoder, that makes 269492416 different byte sequences[^1], out of which 1112063[^2] are accepted.”

https://git.sr.ht/~slink/hoehrmann-utf8
http://bjoern.hoehrmann.de/utf-8/decoder/dfa/
#unicode #utf8 #c
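
The accepted count can be sanity-checked with plain arithmetic over the code point ranges that each UTF-8 sequence length covers. The sketch below arrives at 1112064 well-formed sequences, one more than the quoted figure; the difference is presumably explained by footnote [^2], which is not reproduced here. `utf8_counts_by_length` is an illustrative name.

```python
SURROGATES = range(0xD800, 0xE000)  # not encodable in well-formed UTF-8

# Code point ranges covered by each UTF-8 sequence length.
RANGES = {
    1: range(0x0000, 0x0080),
    2: range(0x0080, 0x0800),
    3: range(0x0800, 0x10000),
    4: range(0x10000, 0x110000),
}


def utf8_counts_by_length() -> dict:
    """Number of well-formed UTF-8 sequences for each length in bytes."""
    counts = {}
    for length, code_points in RANGES.items():
        count = len(code_points)
        if length == 3:
            count -= len(SURROGATES)  # surrogates fall in the 3-byte range
        counts[length] = count
    return counts


counts = utf8_counts_by_length()
print(counts, sum(counts.values()))
```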

MichalBryxi, to random
@MichalBryxi@veganism.social

I just love how it's 2023 and we still can't do UTF-8 right 🫠

slink, to random
@slink@fosstodon.org

Serious question to #unicode people: is there a current consensus about how to decode #utf8 correctly in the various error scenarios?

vwbusguy, to random
@vwbusguy@mastodon.online

Have you ever stopped and pondered how insane #utf8 is and marveled at just how many different parsers and renderers there are for that madness?

koalie, to random
@koalie@mastodon.social

I’m taking you back in time today to February 2005, when we had a batch of hundreds of shirts printed for the @w3c annual conference, where all the people who develop standards for the web met face to face.

The printer couldn’t read the ®️ sign on the design files and we ended up with #wtf8 instead of #utf8 😳
#oops

The back of the T-shirt had the full text w3c logo, with the right sign for “registered”, the huge pale blue 5, and text “fifth annual technical plenary week 2005, Boston MA USA”

epilys, to email
@epilys@chaos.social

I just received some #email gore: a text/html email in Greek, auto-converted to a text/plain alternative, has almost every Greek character as an HTML escape code. Sample:

> ο εοό άά . Ευούε

fell,
@fell@ma.fellr.net

@epilys I don't get how people can still not use #UTF8...

EvanHahn, to programming

You might've heard of ASCII or UTF-8. These character encodings were built by very smart people.

I just built “UTF-21”, an impractical alternative that only a fool would use. Read about it (with a short Unicode crash course) here: https://evanhahn.com/utf-21/

#Unicode #UTF8 #ASCII #CharacterEncoding #programming

luap42, to random German
@luap42@chaos.social

I present: the updated university administration platform.

"Entering special characters is currently not possible."

#paulstudiert #utf8 #sonderzeichen

fell, to random
@fell@ma.fellr.net

Whoops! Looks like I'm gonna have to work on #utf8 or some other #unicode #encoding for my @sqfmi #watchy. (It's supposed to read März in #german) #epaper #esp32 #smartwatch
