#utf8 - kbin.social

SenseException, 9 days ago to random German

It's 2024 and spam mails still don't get umlauts right. #utf8


Luke, 2 months ago to ascii

First heard it at Newsvine from Mark Budos and @mikeindustries more than a decade ago:

“Character encoding is (the worst)” (paraphrased)

…and it’s certainly true.

When working with all the various formats of text, and html entities, and UTF-8, etc., etc.… it can be very hard to get it right all of the time.

#ascii #utf8 #text #typography #markdown #html


xtaran, 3 months ago to keyboards

#ShiftHappens happened! @mwichary's impressive book about #keyboards and #typewriters arrived here today, too. 🤩 #MechanicalKeyboard meets #Bookstodon

A book slipcase with a black-and-white photograph of some blank key caps.
Front view of the Shift Happens slipcase with two thick and one thinner orange book inside. The two thicker books have beige key caps with the legends 1 and 2 as labels for volumes 1 and 2. There's also a thin, stapled leaflet visible at the far right end of the slipcase.
Black-and-white cover of Shift Happens volume 1, showing a woman sitting in front of a typing device, probably from the 1950s.


xtaran, 3 months ago

It arrived, although #FedEx fucked up all the #umlauts in my shipping address, and in two different ways:

Once they obviously took #UTF8 as #ISOLatin and once they seem to have taken one #8bit charset as another 8-bit charset, probably #ISOLatin1 as #CP850. 🙄 #schei_encoding #mojibake

@Venty once described systems that have such issues as #UmlautlyChallenged. 😉

But at least it arrived! I heard from a keyboard dealer that they stopped sending parcels to Europe with FedEx due to too much packet loss…

Excerpt from a shipping letter where instead of any u-umlaut there is a capital A with a tilde and a glyph for "one fourth" shown.
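
Both failure modes described above are easy to reproduce. A minimal Python sketch, assuming the sample name `Müller` and a hypothetical helper `misdecode` (neither comes from the post):

```python
def misdecode(text: str, actual: str, assumed: str) -> str:
    """Encode text with its real charset, then decode it with the wrong one."""
    return text.encode(actual).decode(assumed)


# Case 1: UTF-8 bytes read as ISO Latin-1.
utf8_as_latin1 = misdecode("Müller", "utf-8", "latin-1")
print(utf8_as_latin1)  # MÃ¼ller

# Case 2: ISO Latin-1 bytes read as CP850 (one 8-bit charset taken for another).
latin1_as_cp850 = misdecode("Müller", "latin-1", "cp850")
print(latin1_as_cp850)
```

The first case matches the excerpt: every u-umlaut turns into a capital A with a tilde followed by the one-quarter glyph. In the second case the string keeps its length, because each byte still maps to exactly one (wrong) character.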


gereon, 4 months ago to random German

Dear #Sozialversicherung (social insurance), how can anyone still screw up encodings this badly in 2024? #Digitalisierung #UTF8 #Fail


zirias, 4 months ago to KDE

Remember these specific #ASCIIart misrenderings caused by using the wrong #codepage? They actually made a good test case after I added codepage selection to my new "dos2ansi" tool 😜
https://github.com/Zirias/dos2ansi

Screenshot from #konsole (#KDE) running on #FreeBSD and using Microsoft's #Consolas font.
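
To see why the code page matters, here is a small Python sketch. The three bytes are a made-up sample, not taken from the screenshot; they are decoded once with CP437 and once with CP850:

```python
# Hypothetical sample: three bytes from a DOS box-drawing frame.
frame = bytes([0xC9, 0xCD, 0xB5])

as_cp437 = frame.decode("cp437")  # the code page the art was drawn for
as_cp850 = frame.decode("cp850")  # a different DOS code page

print(as_cp437)  # ╔═╡
print(as_cp850)  # ╔═Á
```

CP850 keeps the pure double-line box characters but replaces the mixed single/double ones with accented letters, which is where the stray capital letters in such frames come from.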


zirias, 4 months ago

Tested it on #Windows now. OK, it works on a Windows 10 machine; the terminal nowadays can both output #utf8 and interpret some #ansi sequences ...

But: You have to enable both explicitly in your code using some proprietary Console APIs 🤯
https://github.com/Zirias/dos2ansi/commit/5a85965d96d4456d1e739427122f66e89fb358b6
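
What "enabling both explicitly" can look like, sketched in Python via `ctypes` rather than in the tool's own C. This is an assumption about the Win32 calls typically involved (`SetConsoleOutputCP`, `GetConsoleMode`, `SetConsoleMode` with `ENABLE_VIRTUAL_TERMINAL_PROCESSING`), not a transcription of the linked commit; `enable_utf8_and_ansi` is a hypothetical name.

```python
import sys

CP_UTF8 = 65001
STD_OUTPUT_HANDLE = 0xFFFFFFF5  # (DWORD)-11
ENABLE_VIRTUAL_TERMINAL_PROCESSING = 0x0004


def enable_utf8_and_ansi() -> bool:
    """Try to switch the Windows console to UTF-8 output and ANSI sequences.

    Returns False on non-Windows systems or when stdout is not a console.
    """
    if sys.platform != "win32":
        return False  # nothing to enable elsewhere

    import ctypes
    from ctypes import wintypes

    kernel32 = ctypes.WinDLL("kernel32", use_last_error=True)
    kernel32.GetStdHandle.argtypes = [wintypes.DWORD]
    kernel32.GetStdHandle.restype = wintypes.HANDLE
    kernel32.GetConsoleMode.argtypes = [wintypes.HANDLE, wintypes.LPDWORD]
    kernel32.GetConsoleMode.restype = wintypes.BOOL
    kernel32.SetConsoleMode.argtypes = [wintypes.HANDLE, wintypes.DWORD]
    kernel32.SetConsoleMode.restype = wintypes.BOOL
    kernel32.SetConsoleOutputCP.argtypes = [wintypes.UINT]
    kernel32.SetConsoleOutputCP.restype = wintypes.BOOL

    # Step 1: make the console interpret written bytes as UTF-8.
    if not kernel32.SetConsoleOutputCP(CP_UTF8):
        return False

    # Step 2: add the virtual-terminal flag to the current console mode.
    handle = kernel32.GetStdHandle(STD_OUTPUT_HANDLE)
    mode = wintypes.DWORD()
    if not kernel32.GetConsoleMode(handle, ctypes.byref(mode)):
        return False  # stdout is redirected or not a console
    new_mode = mode.value | ENABLE_VIRTUAL_TERMINAL_PROCESSING
    return bool(kernel32.SetConsoleMode(handle, new_mode))


print(enable_utf8_and_ansi())
```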


zirias, 4 months ago

New pre-release of dos2ansi: v0.2

Works on #Windows, win32 binary (cross-compiled on #FreeBSD) attached

Selectable input #codepage (so far only #cp437, #cp850 and #cp858)

Selectable output format, #utf8, #utf16 or #utf16le, with or without #BOM

Still a few things to add, e.g. use #termcap/#terminfo or Windows Console API for "color output" when applicable ... we will see 😎

https://github.com/Zirias/dos2ansi/releases/tag/v0.2
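
What the BOM option means at byte level, as a short Python sketch. The format names follow the post, but mapping `utf16` to big-endian is my assumption, and `encode_output` is a hypothetical helper, not the tool's code:

```python
BOM = "\ufeff"  # U+FEFF; each output format encodes it differently

# Assumption: "utf16" means big-endian here; the tool itself may differ.
CODECS = {"utf8": "utf-8", "utf16": "utf-16-be", "utf16le": "utf-16-le"}


def encode_output(text: str, fmt: str, with_bom: bool) -> bytes:
    """Encode text in the chosen format, optionally prefixed with a BOM."""
    payload = BOM + text if with_bom else text
    return payload.encode(CODECS[fmt])


for fmt in ("utf8", "utf16", "utf16le"):
    print(fmt, encode_output("A", fmt, with_bom=True).hex(" "))
# utf8 ef bb bf 41
# utf16 fe ff 00 41
# utf16le ff fe 41 00
```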


zirias, 4 months ago

Just released dos2ansi v0.4, with lots of #DOS #codepages supported and a test mode to display them.

The next nice feature would be to use the actual terminal capabilities if output goes there. Very simple on *nix-like systems (#Linux, #FreeBSD, ...), just link #curses and use the termcap functions.

Thinking about #Windows again, either I keep relying on #UTF8 support (since #win7 IIRC? and still a bit buggy) and #ANSI sequences support (since #win10) .... OR I attempt to use the native #Console #API there (using special functions to write in #UTF16 and other special functions to set colors, which would require a major refactoring first 🙄)

https://github.com/Zirias/dos2ansi/releases/tag/v0.4


zirias, 4 months ago

Released v0.6.1 just fixing one stupid little regression from refactoring: The terminfo writer must never be used when output does not go to stdout 🙈

So far so good, I just thought there was yet another bug, testing redirected output on #Windows. I always got a file encoded as #UTF16LE, no matter which format I chose (and with #UTF8 chosen, the output was just broken). Meditated on the code for a while. Looks all perfectly good.

Finally, I tested in good old #CMD. And it worked perfectly well.

So, #Powershell is messing with the encoding of my stdout stream on redirects. Really? Really??? WTF, #Microsoft? How should that ever do any good? 🤯

https://github.com/Zirias/dos2ansi/releases/tag/v0.6.1
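
A plausible explanation, stated as an assumption rather than something the post confirms: Windows PowerShell 5.1 decodes a native program's stdout using the console output encoding, and its `>` operator then writes the resulting text as UTF-16LE with a BOM. Under that assumption the effect can be simulated in Python; `simulate_redirect` and the `cp850` default are hypothetical:

```python
def simulate_redirect(program_output: bytes, console_encoding: str = "cp850") -> bytes:
    """Mimic the assumed pipeline: decode with the console code page,
    then re-encode as UTF-16LE with a BOM."""
    text = program_output.decode(console_encoding)
    return ("\ufeff" + text).encode("utf-16-le")


redirected = simulate_redirect("ü".encode("utf-8"))
print(redirected[:2].hex(" "))             # ff fe (the UTF-16LE BOM)
print(redirected[2:].decode("utf-16-le"))  # two wrong characters instead of ü
```

That would account for both symptoms: the file is always UTF-16LE, and UTF-8 output arrives broken because its bytes were decoded with the wrong code page first.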


LangerJan, 7 months ago to random German

2023 will be the year of #utf8
#BaldursGate3


bentolor, 7 months ago to programming German

Excellent writeup on #Unicode and its various encodings like #UTF8 and others in a beautiful and precise style.

A must-read for any software developer in 2023! #programming #i18n

https://tonsky.me/blog/unicode/


markush, 8 months ago to random

This page has THE best explanation of how #Unicode and #UTF8 #encoding work https://tonsky.me/blog/unicode/


vwbusguy, 8 months ago (edited 8 months ago) to Ansible

My #Ansible playbook failed because an otherwise legitimate variable input broke the lineinfile regex parser and I couldn't report about it in the PR review, since it broke some kind of validation for issue comments in #Forgejo. #regex is hilarious.


vwbusguy, 8 months ago

LOL! It was the 😂 emoji that broke @forgejo ! Multi-byte #utf8 strikes again!

$rendering \xF0\x9F\x98\x82$
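
For reference, 😂 is a single code point that takes four bytes in UTF-8, which is exactly the escape sequence shown above. A quick Python check:

```python
emoji = "😂"
encoded = emoji.encode("utf-8")

print(len(emoji))             # 1 (one code point)
print(f"U+{ord(emoji):04X}")  # U+1F602
print(len(encoded))           # 4 (four bytes)
print(encoded)                # b'\xf0\x9f\x98\x82'
```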


unixtippse, 8 months ago to random German

The #WTF8 self-help group is meeting today in the Wolfsburg area.

echo -n • | iconv -f cp1252 -t utf8

#utf8 #mojibake
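
What the one-liner does, spelled out in Python: on a UTF-8 terminal the bullet is sent as three bytes, and `iconv` is told to read each of them as a separate Windows-1252 character. The variable names are mine:

```python
bullet = "•"                      # U+2022 BULLET
raw = bullet.encode("utf-8")      # what a UTF-8 terminal pipes into iconv
garbled = raw.decode("cp1252")    # iconv -f cp1252: one character per byte

print(raw.hex(" "))  # e2 80 a2
print(garbled)       # â€¢
```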


unixtippse, 8 months ago

People, you've got this wrong. They do know at VW how to use and format an HTML unordered list. #UTF8 #Mojibake


ovid, 8 months ago to Software

Years ago I was pushed out of a job because management was upset with me. I pointed out that what they were doing wasn't scalable.

A few years later, with new management, they retained me as a consultant for two years fixing some of the problems I told them about. I was paid considerably more money and they lost many developers in the interim.

Have you ever been doubted at work? What happened?

#consulting #scalability #software


dboehmer, 8 months ago

@ovid A few years ago I had a potential long-term #Perl client. We made a first small contract. In their #CGI hell I found they had #UTF8 in their database and told browsers to expect it too, but neither the input nor the output layer was configured for it. It seemed to work, but internally all their strings were garbage. Nobody understood. Some string operations were done in #SQL because there it "magically" worked.

I tried to explain, but they didn't understand or even believe me that there was any issue at all. I do not miss that client.
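
Why such a setup can "seem to work": undecoded UTF-8 bytes pass through unchanged, so the page renders fine, while every character-level operation sees the wrong string. A minimal Python sketch of that effect (not the client's Perl code; the sample name is made up), modelling the missing decoding layer by treating each byte as one Latin-1 character:

```python
name = "Jürgen"
from_db = name.encode("utf-8")          # UTF-8 bytes as stored in the database
undecoded = from_db.decode("latin-1")   # no decoding layer: one "character" per byte

print(len(name), len(undecoded))        # 6 7

shouted = undecoded.upper()             # upper() sees Ã and ¼, never ü
to_browser = shouted.encode("latin-1")  # bytes go out as they came in
print(to_browser.decode("utf-8"))       # JüRGEN
```

Length is off by one and the umlaut escapes uppercasing, yet a plain read-and-print round trip would look perfectly fine.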


lorddimwit, 8 months ago to random

I’ve always wondered this. I’ve researched it briefly off and on, but never found a good answer. I’m sure it’s answered on marc.info somewhere, but:

Why is luit, the tool used by #xterm et al to convert legacy encodings to #UTF8 on the fly, called luit?

In French, “luit” means “glowing” but I don’t know if that’s what was intended (and if so, I’ve always mispronounced the name of the program).

@cross this seems like something you’d know or at least be able to lie convincingly about.

#unix


slink, 9 months ago (edited 9 months ago) to random

An Exhaustive Test Program for the Legendary Höhrmann UTF-8 Decoder

“The Höhrmann Decoder, implemented as a deterministic finite state machine, needs only a handful lines of C code and 364 bytes for a combined character class and state transition table.

[...]

hoehrmann-utf8-test.c exhaustively tests all possible inputs to the decoder, that makes 269492416 different byte sequences[^1], out of which 1112063[^2] are accepted.”

https://git.sr.ht/~slink/hoehrmann-utf8
http://bjoern.hoehrmann.de/utf-8/decoder/dfa/
#unicode #utf8 #c


MichalBryxi, 9 months ago to random

I just love it how it's 2023 and we still can't do UTF-8 right 🫠

#utf8 #encoding #iCanBreakItForYou


slink, 9 months ago to random

Serious question to #unicode people: is there a current consensus about how to decode #utf8 correctly in the various error scenarios?


vwbusguy, 9 months ago to random

Have you ever stopped and pondered how insane #utf8 is and marveled at just how many different parsers and renderers there are for that madness?


koalie, 11 months ago to random

I’m taking you back in time today to February 2005, when we had a batch of hundreds of shirts printed for the @w3c annual conference, where all the people who develop standards for the web met face to face.

The printer couldn’t read the ®️ sign on the design files and we ended up with #wtf8 instead of #utf8 😳
#oops

The back of the T-shirt had the full text w3c logo, with the right sign for “registered”, the huge pale blue 5, and text “fifth annual technical plenary week 2005, Boston MA USA”


epilys, 11 months ago to email

I just received some #email gore: a text/html email in Greek whose auto-converted text/plain alternative has almost every Greek character written as an HTML escape code. Sample:

> ο εοό άά . Ευούε
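
Turning such escape codes back into characters is a one-liner in most languages. A Python sketch using the standard library's `html.unescape`; the input string is an invented example, not the original email:

```python
import html

# Invented sample mixing a numeric reference with named entities.
escaped = "&#959; &Epsilon;&upsilon;"
restored = html.unescape(escaped)

print(restored)  # ο Ευ
```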


fell, 11 months ago

@epilys I don't get how people can still not use #UTF8...


EvanHahn, 1 year ago to programming

You might've heard of ASCII or UTF-8. These character encodings were built by very smart people.

I just built “UTF-21”, an impractical alternative that only a fool would use. Read about it (with a short Unicode crash course) here: https://evanhahn.com/utf-21/

#Unicode #UTF8 #ASCII #CharacterEncoding #programming


luap42, 1 year ago to random German

I present: the updated university administration platform.

"Entering special characters is currently not possible."

#paulstudiert #utf8 #sonderzeichen


fell, 1 year ago to random

Whoops! Looks like I'm gonna have to work on #utf8 or some other #unicode #encoding for my @sqfmi #watchy. (It's supposed to read März in #german) #epaper #esp32 #smartwatch
