This post originated from an RSS feed registered with Java Buzz
by Elliotte Rusty Harold.
Original Post: The Ten Commandments of Unicode
Feed Title: The Cafes
Feed URL: http://cafe.elharo.com/feed/atom/?
Feed Description: Longer than a blog; shorter than a book
1. I am Unicode, thy character set. Thou shalt have no other character sets before me.
2. Thou shalt carefully specify the character encoding and the character set whenever reading a text file.
3. Thou shalt not refer to any 8-bit character set as “ASCII”.
4. Thou shalt ensure that all string handling functions fully support characters from beyond the Basic Multilingual Plane. Thou shalt not refer to Unicode as a two-byte character set.
5. Thou shalt plan for additions of future characters to Unicode.
6. Thou shalt count and index Unicode characters, not UTF-16 code units.
7. Thou shalt use UTF-8 as the preferred encoding wherever possible.
8. Thou shalt generate all text in Normalization Form C whenever possible.
9. Thou shalt avoid deprecated characters.
10. Thou shalt steer clear of the private use area.
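
As a rough illustration of commandments 2, 4, 6, and 8 in Java, here is a minimal sketch using only the standard library. The class name `UnicodeSketch` and its helper method names are hypothetical, chosen for this example only. It assumes the input file is known to be UTF-8, and it is one possible way to follow the rules rather than a definitive recipe.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.text.Normalizer;

// Hypothetical helper class; the names are illustrative, not from the original post.
final class UnicodeSketch {

    // Commandment 2: name the encoding explicitly instead of
    // relying on the platform default. UTF-8 is an assumption here.
    static String readUtf8(Path file) throws IOException {
        return new String(Files.readAllBytes(file), StandardCharsets.UTF_8);
    }

    // Commandment 6: count Unicode code points. String.length() would
    // return the number of UTF-16 code units instead.
    static int countCharacters(String s) {
        return s.codePointCount(0, s.length());
    }

    // Commandments 4 and 6: index by code point, so that a character
    // outside the Basic Multilingual Plane is returned whole.
    static int characterAt(String s, int index) {
        return s.codePointAt(s.offsetByCodePoints(0, index));
    }

    // Commandment 8: produce text in Normalization Form C.
    static String toNfc(String s) {
        return Normalizer.normalize(s, Normalizer.Form.NFC);
    }

    public static void main(String[] args) throws IOException {
        // 'a', U+1D538 (a surrogate pair in UTF-16), 'b'
        String s = "a\uD835\uDD38b";
        System.out.println(s.length());           // 4 code units
        System.out.println(countCharacters(s));   // 3 characters
        System.out.println(Integer.toHexString(characterAt(s, 1))); // 1d538

        // 'e' followed by a combining acute accent composes to U+00E9.
        System.out.println(toNfc("e\u0301").equals("\u00E9")); // true

        // Round trip through a UTF-8 file with the encoding stated on both sides.
        Path tmp = Files.createTempFile("unicode-sketch", ".txt");
        Files.write(tmp, s.getBytes(StandardCharsets.UTF_8));
        System.out.println(readUtf8(tmp).equals(s)); // true
        Files.delete(tmp);
    }
}
```

Note that `countCharacters` and `characterAt` work in code points, which is still not the same as user-perceived characters: a base letter plus a combining mark counts as two unless the text is first normalized and a precomposed form exists.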