This post originated from an RSS feed registered with Agile Buzz
by James Robertson.
Original Post: Encoding Issues
Feed Title: Cincom Smalltalk Blog - Smalltalk with Rants
Feed URL: http://www.cincomsmalltalk.com/rssBlog/rssBlogView.xml
Feed Description: James Robertson comments on Cincom Smalltalk, the Smalltalk development community, and IT trends and issues in general.
Dare Obasanjo talks about the specs and reality, and the variances thereof, in the encoding of xml docs on the web:
All files are sent with a content type of text/xml and no encoding specified in the charset parameter of the Content-Type HTTP header. According to RFC 3023 which Mark Pilgrim quoted in his article that clients should treat them as us-ascii. With the above examples this behavior would be wrong in all four cases.
He then goes on to list the way a client application actually needs to deal with this conundrum - check for:
the encoding given in the charset parameter of the Content-Type HTTP header, or
the encoding given in the encoding attribute of the XML declaration within the document, or
utf-8.
Which is what I stumbled on for BottomFeeder a while back. I wish Dare had posted this back when I was stumbling in the dark :)
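
For anyone else stumbling through the same thing, here is a rough sketch of that fallback order in Python. This is not BottomFeeder's code (that is Smalltalk) and not Dare's; the function name detect_encoding and both regular expressions are made up for illustration. It also assumes the document is in an ASCII-compatible encoding, so the XML declaration can be read from the raw bytes; UTF-16 feeds and byte order marks are not handled.

```python
import re

# Hypothetical patterns, written for this sketch only.
# Pulls the charset parameter out of a Content-Type header value.
_CHARSET = re.compile(r'charset\s*=\s*"?([^\s;"]+)', re.IGNORECASE)
# Pulls the encoding pseudo-attribute out of an XML declaration
# that sits at the very start of the document.
_XML_DECL = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)


def detect_encoding(content_type, body):
    """Guess the encoding of an XML document fetched over HTTP.

    content_type: the Content-Type header value (str) or None.
    body: the raw response bytes.
    """
    # 1. The charset parameter of the Content-Type HTTP header.
    if content_type:
        match = _CHARSET.search(content_type)
        if match:
            return match.group(1).lower()

    # 2. The encoding attribute of the XML declaration.
    match = _XML_DECL.match(body[:200])
    if match:
        return match.group(1).decode("ascii").lower()

    # 3. Otherwise fall back to utf-8 (not us-ascii).
    return "utf-8"


if __name__ == "__main__":
    raw = b'<?xml version="1.0" encoding="windows-1252"?><rss/>'
    encoding = detect_encoding("text/xml", raw)
    print(encoding)
    print(raw.decode(encoding))
```

The name that comes back is only a guess taken from what the server and the document claim; the bytes can still fail to decode under it, so a real aggregator would want error handling around the decode step.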