Unicode greatly enhances the interoperability of information, removing the very problematic need to associate a character set with your data. You really should make an effort to use it when possible. Unicode in combination with standard presentation languages like XHTML will obviate the need for proprietary information formats like Microsoft Word, for example. Joel Spolsky has a nice timeline summary of character encodings, and how we got to Unicode and UTF-8.

I started using Unicode as of Red Hat 9, where it was enabled by default. Everything just worked™. For the record, my LANG environment variable is set to en_IE.UTF-8

I think it's worth mentioning how Windows, since it's so ubiquitous, handles Unicode. Windows in English speaking and western European countries uses the non-Unicode windows-1252 (ms-ansi) charset. This is iso-8859-1 plus some other NON standard stuff. It's nothing to do with ANSI. See the recode section here for examples to convert between this and UTF-8. How Notepad handles Unicode is useful info also.

There is a good description of commonly confused characters, which is more of a problem now that we have Unicode.

There is a very handy character picker applet that comes as standard with GNOME. You can populate it by pasting from this page or from the excellent gnome-character-map, for example. This is a screenshot of the characters I've populated it with. Once you click on your required character it's copied to the clipboard for pasting into the application of your choice.

Note the easiest way I found to extract the user defined character palettes from that applet is:
$ gconftool-2 -R /apps/panel/applets | grep chartable
chartable = [←↑↓→↔©®™°☺☹…,¤£¥¢$€,¹²³¼½¾,±×÷≈≠≡≤≥∴βπµ∞,➊➋➌➍➎➏➐➑➒➓,␀␛␈␡␉␊␍␠␣␝␞␟,─│┌┐└┘├┤┬┴┼]
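
To make the windows-1252 point above a bit more concrete, here is a minimal sketch of converting such data to UTF-8. It assumes Python 3 and its built-in cp1252 codec (Python's name for windows-1252). The sample bytes and the helper name cp1252_to_utf8 are made up for illustration and are not taken from the recode examples mentioned earlier.

```python
# Sample bytes as a Windows program might write them (hypothetical data).
# In windows-1252, 0x93/0x94 are curly quotes and 0x80 is the euro sign;
# iso-8859-1 maps those same byte values to invisible control characters.
raw = b"\x93caf\xe9\x94 costs \x80" + b"2"


def cp1252_to_utf8(data: bytes) -> bytes:
    """Reinterpret windows-1252 bytes as text and re-encode them as UTF-8."""
    return data.decode("cp1252").encode("utf-8")


# Decoding with the correct charset gives the intended text.
text = raw.decode("cp1252")

# Decoding the same bytes as iso-8859-1 does not fail, but the "extra"
# windows-1252 characters silently turn into C1 control characters.
as_latin1 = raw.decode("iso-8859-1")

utf8_bytes = cp1252_to_utf8(raw)

print(text)
print(repr(as_latin1))
print(utf8_bytes)
```

The bytes in the 0x80 to 0x9F range are where the two charsets disagree, which is why data labelled iso-8859-1 but written on Windows so often shows up with missing quotes or euro signs.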
© Feb 23 2006