Post by Tak To
> I'll let Linux experts comment on supporting UTF-8 and/or
> all Unicode characters on the various Linux platforms.
Well, I'm far from a Linux expert, but the default position has been
for some time that really only UTF-8 (or proper ASCII) locales are
supported. However, most text-processing programs make no attempt to
transcode character sets and just assume that whatever text you're
looking at is either ASCII or whatever your locale setting's encoding
is. Of course, if you're old-school and using an ISO 8859 character
set, all UTF-8 text then becomes mojibake.
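To make the mojibake concrete, here is a minimal Python sketch of what happens when UTF-8 bytes are read as if they were ISO 8859-1. The sample string is just my own illustration, and nothing here is specific to any particular utility; it only shows the byte-level effect.

```python
# Sketch: UTF-8 bytes misread under an ISO 8859-1 assumption.
# The sample text is an arbitrary illustration.
text = "naïve café"

# What a UTF-8 producer writes to the file or pipe.
utf8_bytes = text.encode("utf-8")

# What a consumer sees if it assumes ISO 8859-1: each byte of a
# multi-byte UTF-8 sequence becomes its own Latin-1 character.
mojibake = utf8_bytes.decode("iso-8859-1")

# ISO 8859-1 maps every byte to a character, so the damage is
# reversible as long as nothing else has touched the text.
recovered = mojibake.encode("iso-8859-1").decode("utf-8")

print(mojibake)
print(recovered)
```

The round trip works only because ISO 8859-1 assigns a character to all 256 byte values; with other legacy encodings the misreading can be lossy.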
That's somewhat orthogonal to the question of "supporting ... all
Unicode characters", because many utilities, especially GNU utilities,
will refuse to process text if it contains "invalid characters".
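As a rough analogy for that "invalid characters" behavior (this is Python's own codec machinery, not what the GNU utilities do internally), a strict decoder rejects a byte sequence that is not valid in the assumed encoding, while a lenient one substitutes U+FFFD. The helper names below are my own.

```python
# Sketch: strict vs. lenient handling of bytes that are not valid UTF-8.
# decode_strict and decode_lenient are hypothetical helper names.
from typing import Optional


def decode_strict(data: bytes) -> Optional[str]:
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return data.decode("utf-8")
    except UnicodeDecodeError:
        return None


def decode_lenient(data: bytes) -> str:
    """Decode as UTF-8, replacing undecodable bytes with U+FFFD."""
    return data.decode("utf-8", errors="replace")


# "café" encoded as ISO 8859-1: the lone 0xE9 byte is not valid UTF-8.
latin1_bytes = "café".encode("iso-8859-1")

print(decode_strict(latin1_bytes))
print(decode_lenient(latin1_bytes))
```

Which behavior is preferable depends on the tool: refusing protects against silent corruption, while replacing lets the rest of the text through.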
Furthermore, the notion of a "font fallback chain" is rarely supported
by graphical utilities, so it's up to the user to choose a font whose
repertoire contains all of the characters they wish to render
usefully. (This is true of nearly all "classic" Xlib and Xt
applications, for example.) And the Unicode standard changes much
faster than most typefaces can keep up, so the missing characters may
be rendered as blanks, tofu, or some other "no glyph assigned" glyph.
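For what a "font fallback chain" means in practice, here is a toy Python sketch. The font names and coverage tables are invented for illustration; a real implementation would read coverage from the font files and would also have to deal with shaping, which this ignores entirely.

```python
# Toy sketch of a font fallback chain. The font names and coverage
# sets are hypothetical; real coverage comes from the fonts themselves.
from typing import Dict, List, Optional, Set, Tuple

COVERAGE: Dict[str, Set[int]] = {
    "PreferredLatin": set(range(0x20, 0x7F)),      # printable ASCII only
    "FallbackArabic": set(range(0x0600, 0x0700)),  # Arabic block
}


def pick_font(ch: str, chain: List[str]) -> Optional[str]:
    """Return the first font in the chain that covers ch, else None."""
    for name in chain:
        if ord(ch) in COVERAGE[name]:
            return name
    return None  # no font has a glyph: this is where tofu appears


def assign_runs(text: str, chain: List[str]) -> List[Tuple[Optional[str], str]]:
    """Split text into consecutive runs that share the same chosen font."""
    runs: List[Tuple[Optional[str], str]] = []
    for ch in text:
        font = pick_font(ch, chain)
        if runs and runs[-1][0] == font:
            runs[-1] = (font, runs[-1][1] + ch)
        else:
            runs.append((font, ch))
    return runs


chain = ["PreferredLatin", "FallbackArabic"]
print(assign_runs("ab\u0633\u0644", chain))
print(assign_runs("\u4e2d", chain))  # not covered by either font
```

An application without any such chain behaves as if the list had a single entry, so everything outside that one font's repertoire falls through to the `None` case.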
This situation is somewhat improved with more modern graphical
applications that support text shaping; typically these will use a
fallback mechanism so that Arabic or Hindi text, for example, can be
rendered in a font that has those characters, while still allowing
Latin characters in the same text to be rendered in the preferred font.
Garrett A. Wollman | "Act to avoid constraining the future; if you can,
***@bimajority.org| act to remove constraint from the future. This is
Opinions not shared by| a thing you can do, are able to do, to do together."
my employers. | - Graydon Saunders, _A Succession of Bad Days_ (2015)