This post originated from an RSS feed registered with .NET Buzz
by Scott Hanselman.
Original Post: The Myth of XML Purity?
Feed Title: Scott Hanselman's ComputerZen.com
Feed URL: http://radio-weblogs.com/0106747/rss.xml
Feed Description: Scott Hanselman's ComputerZen.com is a .NET/WebServices/XML Weblog. I offer details of obscurities (internals of ASP.NET, WebServices, XML, etc) and best practices from real world scenarios.
Here's a hypothetical. Say there is a client I'm working with that needs to
return Valid XML from their system. They've given me XML Schemas and said
they are representative of the XML returned. Since Valid implies Well-Formed,
sounds good.
Then someone mentions, "oh, well, we can't guarantee that there won't be some <
or > or & in the element content. But, that's no problem, right?"
I said, "Well, then technically you are not sending us XML. If you can't escape
(or CDATA-wrap) the stray <, > and & in the content, then you're not even returning less-than/greater-than
delimited files. What if I gave you content like this "123123324","2003-04-05","Scott
",Hans,"elman","Portland?" We have to agree on some fundamentals here.
The XML 1.0 spec (and all tools based on it) is very specific." (They won't even CDATA
the stuff)
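
To make the point concrete, here is a minimal sketch in Python (not anything from the client's system) showing how a conforming parser treats the three cases: raw content, escaped content, and CDATA-wrapped content. The element name `note` and the sample text are made up for illustration.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Hypothetical element content containing the troublesome characters.
raw_text = "Profit < cost & rising"


def is_well_formed(document: str) -> bool:
    """Return True if a conforming XML parser accepts the document."""
    try:
        ET.fromstring(document)
        return True
    except ET.ParseError:
        return False


# The same content, dropped in raw, escaped, and wrapped in CDATA.
unescaped = f"<note>{raw_text}</note>"
escaped = f"<note>{escape(raw_text)}</note>"
cdata = f"<note><![CDATA[{raw_text}]]></note>"

for label, doc in [("unescaped", unescaped), ("escaped", escaped), ("cdata", cdata)]:
    print(label, is_well_formed(doc))
```

Only the first document is rejected. The escaped and CDATA versions both parse, and the parser hands back the original text, so the producer loses nothing by doing it properly.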
The response? "Well, that's a purist's viewpoint."
I guess I got too mired in the Judeo-Christian Ethic of "Thou shalt not return
malformed XML."
QUESTION: What level of Dante's Inferno would I be relegated to if
I pre-process this XML-y (pronounced: 'smelly')
file to make it well-formed?
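
For what it's worth, such a pre-processing pass might look like the Python sketch below. It is a heuristic under stated assumptions, not a real fix: it assumes a stray & is one that does not start an entity or character reference, and a stray < is one not followed by something that could begin a tag, comment, or processing instruction. The function name `repair` and the sample data are hypothetical. A bare > is left alone, since XML allows it in content except as part of `]]>`.

```python
import re
import xml.etree.ElementTree as ET

# Assumption: an "&" is stray unless it starts a reference such as
# &amp; or &#38; or &#x26;.
STRAY_AMP = re.compile(r"&(?!(?:[A-Za-z_][\w.-]*|#[0-9]+|#x[0-9A-Fa-f]+);)")

# Assumption: a "<" is stray unless the next character could begin a tag,
# closing tag, comment/CDATA section, or processing instruction.
STRAY_LT = re.compile(r"<(?![A-Za-z_/!?])")


def repair(text: str) -> str:
    """Escape ampersands first, then less-than signs (hypothetical helper)."""
    text = STRAY_AMP.sub("&amp;", text)
    return STRAY_LT.sub("&lt;", text)


# Made-up sample of the kind of "XML-y" input described above.
broken = "<row><name>Smith & Sons</name><note>1 < 2 > 0</note></row>"
fixed = repair(broken)
root = ET.fromstring(fixed)
print(root.find("name").text)
print(root.find("note").text)
```

The catch is that content like `a<b` looks exactly like the start of a tag, so no regular expression can tell the two apart reliably. That ambiguity is the whole reason the escaping belongs on the producer's side.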