Unicode is fucking stupid - the definitive article.

Tuesday, 29 November, Year 8 d.Tr. | Author: Mircea Popescu

This piece isn't going to focus on the technical reasons Unicode is stupid, because if it's not directly obvious to you that a data structure may not include an "encoding" ie its own interpretation rules, there's very little that can be said to help you[1]. You must be beaten, often and hard.

Instead, we'll discuss the social and cultural reasons that allowed Unicode to exist, and the motivations and projects of the hostis humani generis[2] that push it.

In general, computers are inadherent to language, while humans are inadherent to anything else[3]. To bridge this fundamental gap, it was observed that sufficiently lengthy mathematical structures can have their constituent parts labeled with words plucked out of natural language, and while it makes about as much sense to call the resultant concoction "a language" as it makes to call it "a bridge" or "an anthill", nevertheless it is the case that people don't, subjectively, so much mind all the ways[4] in which a computer language is not in fact a language.

This human tolerance for failure (that'd have raised so many exceptions, errors and failed assertions as to fill everyone's hard drives if they spoke computer) is readily misunderstood by illiterate people, from MIT and the farms alike, who to this day think computers actually use language, ever. These same mentally inadequate excrements of the human race also make up the bureaucracy - but I repeat myself. To the bureaucratic mind, there exists little better than Unicode, of course, but we'll get to this in a minute.

Entirely separate from the above discussion of what language computers use, there's also the discussion of computer graphics. From the perspective of the illiterate user, computers come in two kinds - the ones with a screen and the ones without[5]. The ones with a screen can display whatever you want them to, which (if you recall your early years) is universally a major selling point of computing altogether. The things they display may be numerous and varied, all of which variation strictly reduces to pornography.[6] Now the exact types of pornography the luser prefers to make its computer display vary - some are into "babes" with their panties on while others prefer 2girls1cup. Still others prefer to torture the poor machine into spitting out "War and Peace", or the garish colors and screeching sounds of Heroes II. But the machine doesn't care just as it doesn't understand. Like a mongoloid puppy with nothing left but an overwhelming desire to please, it will. It will do anything - display any idiocy, in any manner at any time. Epistola non erubescit is nothing, you should see computers!

But that's okay, and in all honesty we absolutely prefer it that way - which is how the "Internet Neutrality" got to be such an obviously correct cause to support for so large a section of derps bereft of the most minute understanding of the matters involved. To them, Net Neutrality was evidently right good and proper, and opponents necessarily evil[7] because computers don't understand language and we like it that way.

It was at some point observed by purveyors of truly exotic pornography, as found in the Far East, Russia and other such allegedly-to-exist places that computers don't actually display any porn. They shy away from truly scandalous corners and edges of the explicit world, they're inadherent to truly filthy stuff like தᛃ ܐܪܡܝܐ강 etc. See[8] ?

This then was offensive, and we agree - it is. If computers are going to be masturbation toys then god damn it let them display women of all skin shades as a camera may be ever pointed at, not print out purple and green blobs where truly gut-wrenchingly ugly tarts should be.

The respective barbarians & assorted alleged-people however, being just as illiterate as the MIT students & unemployed farmhands, misrepresented their displeasure with "mah computobox dun show my wife, what is she, a vamp[ire]?!?!?" as "hello sir this computer not speak language right sir." This is wrong, obviously, but also the best their moderate intellect could support, so we'll have to do with it.

The problems began when the "computer scientist community", which is to say its least valuable[9] and therefore most vocal component, the MIT students & unemployed farmhands, being just as illiterate as random barbarians & assorted alleged-people, carried on the misunderstanding, instead of re-encoding it as would have been sensible.

Out of nowhere an entirely imaginary problem of "computers don't write languages correctly" sprang into "being", which is funny considering the absolute and fundamental bars to its actual existence. This problem, as with all imaginary problems springing around decaying ivory towers[10], was immediately appropriated by the bureaucrats, and a "foundation" dedicated to the nonsensical task of "creating an infinite alphabet"[11] was established. You could be a member too, if you pay your dues!

The nonsense then spread, with various groups of mouthbreathers introducing "support" for this infinite-alphabet, the exact opposite of the notion of alphabet and in this fundamentally obscurantist and essentially anti-cultural. The mouthbreather's goal is quite evident : imagine a world in which you're given a codebase to maintain, but can't do any work not only because of the current reasons (dependency hell, poorly thought out reference schemes, data models, calling paradigms etc) but also for the entirely novel reason that... you first have to have it translated. Because half the functions are in Nazgul and the guy who wrote the networking interface was from Washitistan, so that whole module is written in poop. I don't mean the comments include a lot of scatological references, nor do I mean the function names look like someone put them through a blender. I mean literally, the guy's from Washitistan, they write things with their own excrement there, and the Unicode Foundation introduced actual excrement in the standard so now whenever someone asks for the networking code in your project they are delivered physical faeces on cardboard. About fifty eight acres of it. Where would you like this put, sir ?

So - not only do all the bureaucrats get to hire all their stupid girlfriends, wives and daughters to do translation work between insane, pointlessly complicated versions of what should have been the same code[12], but also you get too afraid to even ask to review anything.

That's the Unicode world, a world where impotence was legislated a little further, and as always under the guise of you know, "being nice to people". Apparently I'm supposed to be so disinclined to tell a Washitistani that he literally has shit for a language and should a) wash and b) learn a proper one, in that order, that I will tolerate the situation where a demand for code results in a latrine flood in my office.

Forget all this crap. No computer ever made spoke any language. The matter of how you display your pornography is entirely separate from the matter of programming computers. Support for shitlangs is not worth anyone's time.

Forget about Unicode. It's a waste of your best years, years that you will never ever get back. It's a stupid waste, a pretend-problem, created exactly by the sort of people who can't tackle actual problems, and with the express purpose of giving their pretense that actual problems can't even be approached as much solid ground to sit on as at all possible.

Forget about Unicode ; but shoot anyone who pushes it in the fucking head. Which, to be perfectly accurate, is most likely to be found so far up their ass they can't even hear you coming.xiii

———
  1. Chiefly because whatever may be said won't come in the "encoding" you use because you're broken in the head, and so you'll just ignore it as "unreadable". []
  2. To be perfectly clear - support for Unicode is not merely the wrong choice, it is actively evil. It is not unethical, it is anti-ethical. It is not immoral, it is anti-moral. It is not fucking one child that one time because you really liked her and she really wanted it - it is fucking children. Not in general. All of them. Supporting Unicode is fucking all children, in fact and in principle, forever. Including two year olds. And foetuses via deep aminocentesis. Meditate on that for a second. []
  3. This is no small matter. I didn't just come up with a throwaway dichotomy because unlike you, I'm not the intellectual product of cracked.com

    Stop to consider that human experience is always mediated, review the sad failure of the AI lab, read your 1800s Austrians and so on and so forth, there's millions of pages detailing the situation. []

  4. Which ways are those ? All of them. There is exactly no way whatsoever in which a computer language is actually a language. They stay and remain forever Zahlrings, or if you prefer the set theory approach... you would, wouldn't you ? Go read Wittgenstein already. []
  5. Ask your girlfriend or your mother what she thinks "server" means, and then find out if she perceives the need for more categories in the field than "computers with monitor" and "computers without monitor".

    Oh, I'm sorry, I shouldn't assume you spawned out of a dumbass and fuck a dumbass (when she feels like being fucked) ? Why not ? It's the case, isn't it ? []

  6. To understand each other on the topic of pornography : in the case a) where I fuck your girlfriend and in the case b) where she tells you all about how it was for me to fuck her we're not "both having sex with your girlfriend". I'm having sex with your girlfriend ; you're watching your girlfriend have sex. With me. These two aren't the same, and in fact they aren't even related. []
  7. Which still translates in their mind to "money grubbing" via "Jews", because hey, they're all antifa and shit. []
  8. Which of those don't you see properly ? The first's from Tamil, the 2nd's the rune for "good harvest", then there's some Aramaic bs and then some Hangul. []
  9. Least valuable in the sense that -1 is the least positive number in the set {1, 3, 5}. []
  10. See the "global warming" nonsense as a fine equivalent. []
  11. The very point of the alphabet is that in being finite, it forces complexity to manifest where it belongs*, rather than where it inconveniences.
    ------
    * To quote F.O.C.A., "la ma-ta-n uter, ca le place unde pute". []
  12. Imagine that delightful situation where you don't know whether the bug is a bug or a translation error.

    The Romanian word for this sort of utter, irredeemable mess is "varza", ie, cabbage. In a characteristic display of Romanian language wisdom (because yes the Romanians themselves are dumb as doornails - but their language inexplicably is not), the superlative of cabbage is called... "varza de Bruxelles". That's what you got here in the making : a fine Brussels mess. []

  13. Hillary 2016! []

14 Responses

  1. There's in the of this webpage.

  2. From the source code of this page:

    meta http-equiv="Content-Type" content="text/html; charset=UTF-8"

    :-)))))

  3. I might just be up for a good beating, so

    > a data structure may not include an "encoding" ie its own interpretation rules

    But what about ASCII? Or S-exprs for that matter. I mean yes, everybody can serialize their data structures however they want, but the whole point of "encoding" (or rather "format") is that it allows two agents A and B to agree on a common interpretation (or rather machine-understandable representation). On the contrary, I would say that Unicode is stupid from a technical point of view precisely because it's anything but "a common interpretation".

    So I'm probably misreading. How?

  4. Mircea Popescu, Thursday, 15 December 2016

    ASCII doesn't include its encoding in the datastructure ; nor do s-expr. The fact that ~you know~ this is ASCII, or Romanian, or whatever the hell is not the same as the text starting with "Si acum urmeaza un scurt text in limba Romana!".

  5. Oh! My bad, I was reading "data structure" as "data structure specification" instead of thinking about the actual object.

  6. Unicode Power Symbol

    Update! The power symbols are now part of Unicode 9.0!

    Unicode contains many useful and interesting characters – ☺, ☃, ❄, ☆, ♫, ⌨, ☏, ☂, ⏏ etc.

    We felt that it was missing a crucial character which has practical value in everyday life – The IEC Power Symbol.

    bla bla

    This table shows the HTML Entity Number for each symbol.

    Character       Escape Code     Symbol
    Power           &#9211;         ⏻
    Toggle Power    &#9212;         ⏼
    Power On        &#9213;         ⏽
    Power Off       &#11096;        ⭘
    Sleep Mode      &#9214;         ⏾

    We won, we won!

  7. Gypsy Kings, Tuesday, 3 December 2019

    muie html

    Character       Escape Code     Symbol
    Power           ⏻        ⏻
    Toggle Power    ⏼        ⏼
    Power On        ⏽        ⏽
    Power Off       ⭘        ⭘
    Sleep Mode      ⏾        ⏾

  1. [...] This "universality" snake oil is not new in the world of technology. See for example Popescu's article on Unicode.↩ [...]

  2. [...] This "universality" snake oil is not new in the world of technology. See for example Popescu's article on Unicode. ↩ [...]

  3. [...] things that it does NOT intend to support include Unicode and SSL. Notable missing features it ought to have, quoting the manual, [...]

  4. [...] measurement. It works not in the systematic manner you expect measurements to work, but in a very Unicode-like manner : it includes its own encoding scheme, resulting in knight's fees that were larger or [...]

  5. [...] What cares the cotton ginny whether you are white or black skinned ? And this is what all the Unicode wastage is : they, the lost souls of a dead world, are trying to make computers more like cotton ginnies. [...]

  6. [...] it's meanwhile grown into the entire web. What else is there ? [↩]Meanwhile it became Unicode, a genuine systems language in and of and by itself. [↩]The only pantsuit agenda for foreign [...]

  7. [...] perfectly thread-safe mechanism of an array of constant error strings? I expect it's to allow for internationalization, you know, so that subtle terms of art like "Text file busy" can be butchered into the vernacular [...]
