<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-37990127</id><updated>2012-01-10T14:04:29.497-08:00</updated><title type='text'>XML Sucks</title><subtitle type='html'></subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://xmlsucks.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default?max-results=100'/><link rel='alternate' type='text/html' href='http://xmlsucks.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Glenn Reid</name><uri>http://www.blogger.com/profile/11057576586723521829</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='17' height='32' src='http://3.bp.blogspot.com/_pwkDNNxqnhI/SR0bNMLjfJI/AAAAAAAAACk/4h8VT8hihqI/S220/glenn.jpg'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>8</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-37990127.post-8134246927409369617</id><published>2007-10-14T20:35:00.000-07:00</published><updated>2007-10-14T20:37:11.699-07:00</updated><title type='text'>Kindred Spirit</title><content type='html'>A gentleman named Doug Hoyte wrote to me to point out his in-depth page which details some of the many reasons that he thinks XML sucks.  You can visit his page &lt;a href="http://hcsw.org/XML.html"&gt;here&lt;/a&gt;.  Thanks for your email, Doug, and your link.  
May the farce be with you.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37990127-8134246927409369617?l=xmlsucks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://xmlsucks.blogspot.com/feeds/8134246927409369617/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37990127&amp;postID=8134246927409369617' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/8134246927409369617'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/8134246927409369617'/><link rel='alternate' type='text/html' href='http://xmlsucks.blogspot.com/2007/10/kindred-spirit.html' title='Kindred Spirit'/><author><name>Glenn Reid</name><uri>http://www.blogger.com/profile/11057576586723521829</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='17' height='32' src='http://3.bp.blogspot.com/_pwkDNNxqnhI/SR0bNMLjfJI/AAAAAAAAACk/4h8VT8hihqI/S220/glenn.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37990127.post-116591010548859115</id><published>2006-12-11T23:55:00.000-08:00</published><updated>2006-12-11T23:55:53.833-08:00</updated><title type='text'>File Formats</title><content type='html'>&lt;h1&gt;Scribe&lt;/h1&gt;&lt;br /&gt;&lt;DIV&gt;&lt;SPAN class="Apple-style-span"&gt;According to &lt;SPAN class="Apple-style-span" style="vertical-align: baseline; "&gt;&lt;A href="http://en.wikipedia.org/wiki/Markup_(computer_programming)" target="_blank"&gt;Wikipedia's entry on Markup languages&lt;/A&gt;&lt;/SPAN&gt;, Scribe is the first markup language to make the distinction between structure and presentation.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;BR 
class="khtml-block-placeholder"&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN class="Apple-style-span"&gt;Scribe was my brother Brian's thesis project at Carnegie Mellon way back before anybody had ever heard of any of this stuff (the project was begun in 1976). It was later productized by Unilogic (around 1985). Scribe used syntax like &lt;B&gt;@b(phrase)&lt;/B&gt; to mark bold text, for example, or more importantly, &lt;B&gt;@head(Heading)&lt;/B&gt; which decoupled the semantic concept of a "heading" from specific font/size details. There was also a concept of style sheets so you could define what "italic" meant in a separate place.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;BR class="khtml-block-placeholder"&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN class="Apple-style-span"&gt;Scribe had output drivers for troff, some plotters, and laser printers (one of the first PostScript drivers in the world was coupled with Scribe, and in fact the first two Adobe books, the "red book" and "blue book", were typeset with Scribe). 
There are few remnants of Scribe remaining, though I found an &lt;SPAN class="Apple-style-span" style="vertical-align: baseline; "&gt;&lt;A href="http://quimby.gnus.org/internet-drafts/2-scribe.template" target="_blank"&gt;Internet RFC&lt;/A&gt;&lt;/SPAN&gt; document type for Scribe documents (from 1991) and an old PostScript driver optimization &lt;SPAN class="Apple-style-span" style="vertical-align: baseline; "&gt;&lt;A href="http://partners.adobe.com/public/developer/en/ps/sdk/5042.Opt_Case_Study.pdf" target="_blank"&gt;case study&lt;/A&gt;&lt;/SPAN&gt; from 1992 (oddly, written by yours truly).&lt;/SPAN&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;BR class="khtml-block-placeholder"&gt;&lt;/DIV&gt;&lt;DIV&gt;Some more history on Scribe is &lt;SPAN class="Apple-style-span" style="vertical-align: baseline; "&gt;&lt;A href="http://www.sis.pitt.edu/~spring/courses/is2770syl031.html" target="_blank"&gt;here&lt;/A&gt;&amp;#xA0;and &lt;/SPAN&gt;&lt;SPAN class="Apple-style-span" style="vertical-align: baseline; "&gt;&lt;A href="http://www.prenhall.com/electronic_publishing/html/chapter5/05_1.html" target="_blank"&gt;here&lt;/A&gt;&lt;/SPAN&gt;&lt;SPAN class="Apple-style-span" style="vertical-align: baseline; "&gt;.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;TeX / LaTeX&lt;/h1&gt;&lt;br /&gt;As Steve Hirsch noted in a comment on this blog (before I moved it and lost the comments), I forgot TeX.&lt;DIV&gt;&lt;BR class="khtml-block-placeholder"&gt;&lt;/DIV&gt;&lt;DIV&gt;TeX I think came after Scribe (I should check my history here but I'm too lazy).&amp;#xA0; TeX was invented by Don Knuth at Stanford to help solve the problem of typesetting mathematics, which was (and still is) very hard to do. 
Coupled with Leslie Lamport's LaTeX macros (which were &lt;SPAN class="Apple-style-span" style="vertical-align: baseline; "&gt;&lt;A href="http://www.linuxgazette.com/issue74/spiel.html" target="_blank"&gt;modeled on Scribe&lt;/A&gt;&lt;/SPAN&gt;), it is a very powerful markup language, specific to typesetting, as many of the early markup languages were.&lt;DIV&gt;&lt;BR class="khtml-block-placeholder"&gt;&lt;/DIV&gt;&lt;DIV&gt;&lt;SPAN class="Apple-style-span"&gt;More on TeX at the &lt;SPAN class="Apple-style-span" style="vertical-align: baseline; "&gt;&lt;A href="http://www.tug.org/" target="_blank"&gt;User's Group&lt;/A&gt;&lt;/SPAN&gt; link.&lt;/SPAN&gt;&lt;/DIV&gt;&lt;/DIV&gt;&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;PDF (Portable Document Format)&lt;/h1&gt;&lt;br /&gt;A whole book could be written about PDF.  In fact, one has.  Several.&lt;p&gt;&lt;br /&gt;It's powerful, I guess, but it sure is complicated.  PDF would have been successful 10 years earlier if reading/writing the format had been easier.  Even the commercial libraries that purport to import/read PDF files don't work very well, for the most part.&lt;p&gt;&lt;br /&gt;Part of this is the richness of the imaging model supported by PDF.  But not all of it.  There are too many options, too many compression schemes, a binary form, a non-binary form...&lt;p&gt;&lt;br /&gt;Enough said.&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;EPSF (Encapsulated PostScript)&lt;/h1&gt;&lt;br /&gt;EPSF is a file format that I designed myself, back in about 1987, when I ran Adobe's Developer Program, yet I will take potshots at it, for the sake of argument.&lt;p&gt;&lt;br /&gt;PostScript is (was?) a programming language, and as such, didn't make for a great file format.  
But there was a strong need to include PostScript "clip art" in larger pages, composed by PageMaker and all the page layout apps that followed.&lt;p&gt;&lt;br /&gt;Since PageMaker and the rest could not be expected to interpret the PostScript, there was a separate set of metadata that accompanied the PostScript file that allowed it to be "placed".  The metadata included a bitmap preview of the graphic (so it could be placed in a relatively WYSIWYG way), plus bounding box information, font information, etc.&lt;p&gt;&lt;br /&gt;This extra metadata was embedded in the header of the file with special comment syntax, like this:&lt;p&gt;&lt;br /&gt;%%BoundingBox: 0 0 612 792&lt;p&gt;&lt;br /&gt;A line-oriented file format, easy to parse, easy to use, but somewhat error-prone.  It's been in continuous use for 18 years so it can't be completely broken, I suppose.&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;RTF (Rich Text)&lt;/h1&gt;&lt;br /&gt;Rich text file format has a structure to it with open/close { } braces to delineate sections. Suitable for whole files, streams poorly, syntax errors have wide side-effects.&lt;br /&gt;&lt;br /&gt;&lt;h1&gt;SGML / HTML&lt;/h1&gt;&lt;br /&gt;Embedded tags in a flow of text. 
The tags imply &lt;i&gt;mode changes&lt;/i&gt; that are sticky until the tags are closed.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37990127-116591010548859115?l=xmlsucks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://xmlsucks.blogspot.com/feeds/116591010548859115/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37990127&amp;postID=116591010548859115' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116591010548859115'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116591010548859115'/><link rel='alternate' type='text/html' href='http://xmlsucks.blogspot.com/2006/12/file-formats.html' title='File Formats'/><author><name>Glenn Reid</name><uri>http://www.blogger.com/profile/11057576586723521829</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='17' height='32' src='http://3.bp.blogspot.com/_pwkDNNxqnhI/SR0bNMLjfJI/AAAAAAAAACk/4h8VT8hihqI/S220/glenn.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37990127.post-116590995132321092</id><published>2006-12-11T23:52:00.001-08:00</published><updated>2006-12-11T23:52:31.323-08:00</updated><title type='text'>How Dare I?</title><content type='html'>I have designed, and parsed, quite a large number of file formats over my 20-year career, everything from line-oriented formats like troff, to Scribe, SGML, MML, etc.  I designed Adobe's PPD and EPS file formats 18 years ago or so.  iMovie's project format.  iPhoto's albumlist (which in fact is in XML format).  Dozens of little things in between.&lt;p&gt;&lt;br /&gt;Many of the file formats I've designed have been in use for over a decade.  
Most have been through multiple revision levels and are backward- and forward-compatible (you can read an iMovie 4 project into iMovie 1, though it obviously won't understand and preserve all of what's in there).&lt;p&gt;&lt;br /&gt;The whole reason I've established this site is to call attention to the Emperor's New File Format and spark a conversation about information design.  XML is not a good file format, yet it is widely used.  Let's come up with something better.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37990127-116590995132321092?l=xmlsucks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://xmlsucks.blogspot.com/feeds/116590995132321092/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37990127&amp;postID=116590995132321092' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590995132321092'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590995132321092'/><link rel='alternate' type='text/html' href='http://xmlsucks.blogspot.com/2006/12/how-dare-i.html' title='How Dare I?'/><author><name>Glenn Reid</name><uri>http://www.blogger.com/profile/11057576586723521829</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='17' height='32' src='http://3.bp.blogspot.com/_pwkDNNxqnhI/SR0bNMLjfJI/AAAAAAAAACk/4h8VT8hihqI/S220/glenn.jpg'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37990127.post-116590992704500694</id><published>2006-12-11T23:52:00.000-08:00</published><updated>2006-12-11T23:52:07.046-08:00</updated><title type='text'>XML is not extensible</title><content type='html'>XML does not deserve the "X" in its name.&lt;P&gt;&lt;br 
/&gt;Extensible means (to me) that it can be extended beyond its original design scope by adding new mechanisms.&lt;/P&gt;&lt;P&gt;&lt;br /&gt;I claim that this is not the case. XML has pre-defined syntax (begin/end tags with attributes that can be set within a tag). As such you can define any tags you want, and add any attributes you want, but that's not extensibility, it's in the original design.&lt;/P&gt;&lt;P&gt;&lt;br /&gt;There's no way I can see to extend the format without rewriting all the existing XML parsers.&lt;/P&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37990127-116590992704500694?l=xmlsucks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://xmlsucks.blogspot.com/feeds/116590992704500694/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37990127&amp;postID=116590992704500694' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590992704500694'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590992704500694'/><link rel='alternate' type='text/html' href='http://xmlsucks.blogspot.com/2006/12/xml-is-not-extensible.html' title='XML is not extensible'/><author><name>Glenn Reid</name><uri>http://www.blogger.com/profile/11057576586723521829</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='17' height='32' src='http://3.bp.blogspot.com/_pwkDNNxqnhI/SR0bNMLjfJI/AAAAAAAAACk/4h8VT8hihqI/S220/glenn.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37990127.post-116590990842550326</id><published>2006-12-11T23:51:00.002-08:00</published><updated>2006-12-11T23:51:48.426-08:00</updated><title type='text'>XML is not a markup 
language</title><content type='html'>XML does not deserve its "ML", or even its "X".  But first, the "ML" part.&lt;p/&gt;&lt;br /&gt;I am one of the world's leading experts on markup languages.  I'll start there.  I'm a 20-year veteran of desktop publishing, am personally related to the author of one of the very first markup languages in the world (Scribe), and have actually used SGML, MML, HTML, and most of the other markup languages that came along decades before XML.&lt;p/&gt;&lt;br /&gt;So I know what I'm talking about.  XML is not a markup language.&lt;p/&gt;&lt;br /&gt;A markup language is predicated on the idea that the markup is an &lt;i&gt;exception&lt;/i&gt; in a river of text. That is, the markup is a departure from the state that existed at the time the markup was encountered.&lt;p/&gt;&lt;br /&gt;One of the first instances of this was the TROFF mechanism in UNIX, used for formatting "man pages". A simple example was that a line that started with &lt;b&gt;.i&lt;/b&gt; was italic.  So you might format a sentence with an italic word in it like this:&lt;p/&gt;&lt;br /&gt;&lt;pre&gt;Here is an&lt;br /&gt;&lt;b&gt;.i&lt;/b&gt; emphasized phrase&lt;br /&gt;and back to normal text&lt;/pre&gt;&lt;br /&gt;The same basic approach is used in HTML, except that it's not line-oriented, so you need a "close delimiter" other than carriage return (which is actually a pretty handy closing delimiter, but I digress).  So the same thing in HTML is:&lt;p/&gt;&lt;br /&gt;&lt;pre&gt;Here is an &lt;b&gt;&amp;lt;i&amp;gt;&lt;/b&gt;emphasized phrase&lt;b&gt;&amp;lt;/i&amp;gt;&lt;/b&gt; and back to normal text.&lt;/pre&gt;&lt;br /&gt;The idea of &lt;i&gt;markup&lt;/i&gt; is that you literally mark up a text, "circling" things, if you will, giving instructions to the typesetter (or parser, or other) that this snippet of text is to be treated somehow differently.&lt;p/&gt;&lt;br /&gt;Another tenet of a markup language is that only the &lt;i&gt;syntax&lt;/i&gt; is specified. 
The semantics of what the markup means is implicit (HTML) or described earlier (Scribe) or some combination of the two (CSS).&lt;p/&gt;&lt;br /&gt;But here's the real kicker: a pure ASCII text file is a valid example of &lt;i&gt;any&lt;/i&gt; markup language. That underscores the notion that the markup is a departure from the river of text.  So a plain text file is technically a valid HTML file (though they ruined that purity with XHTML and CSS by requiring tags in it, but that's because they too didn't really know what a markup language was).&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37990127-116590990842550326?l=xmlsucks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://xmlsucks.blogspot.com/feeds/116590990842550326/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37990127&amp;postID=116590990842550326' title='4 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590990842550326'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590990842550326'/><link rel='alternate' type='text/html' href='http://xmlsucks.blogspot.com/2006/12/xml-is-not-markup-language.html' title='XML is not a markup language'/><author><name>Glenn Reid</name><uri>http://www.blogger.com/profile/11057576586723521829</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='17' height='32' src='http://3.bp.blogspot.com/_pwkDNNxqnhI/SR0bNMLjfJI/AAAAAAAAACk/4h8VT8hihqI/S220/glenn.jpg'/></author><thr:total>4</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37990127.post-116590988446853891</id><published>2006-12-11T23:51:00.001-08:00</published><updated>2006-12-11T23:51:24.470-08:00</updated><title type='text'>Heavyweight 
Parser</title><content type='html'>The contents of XML files vary a lot, of course.  And the need to parse them varies accordingly. But a fairly common scenario is to "need just one piece of data" that's contained in an XML file somewhere.  How do you get it?&lt;p&gt;&lt;br /&gt;Any data contained in a file needs to be "parsed" back out.  You open the file, you read it in, recognizing the file format attributes along the way, and look for what you need.&lt;p&gt;&lt;br /&gt;XML parsers are "fully general", in that they know how to recognize tags in general, and pull out the data in between, but they don't know what the data are all about.  They're fairly big beasts, consume memory, take time to initialize, and you can't just whip one up yourself in an hour or two.&lt;p&gt;&lt;br /&gt;Furthermore, you have to teach the parser how to extract the one piece of data you want, or to read the whole thing in (as in the MacOS X parser, which gives you an NSDictionary), pick out your data, and throw the whole thing away.  Very expensive and time-consuming operation, and it fails silently (and often) if there's anything amiss in the data itself.&lt;p&gt;&lt;br /&gt;By contrast, a line-oriented file format can be parsed with five lines of code, using "fgets" and "sscanf" to look for the data you need, and you can skip anything that's not interesting.  Very, very fast, zero memory use, and no overhead.&lt;p&gt;&lt;br /&gt;So think carefully about who will be reading the data, and why, and design a file format that suits their needs.  
My bet is that 8 times out of 10, XML is not the right format.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37990127-116590988446853891?l=xmlsucks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://xmlsucks.blogspot.com/feeds/116590988446853891/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37990127&amp;postID=116590988446853891' title='2 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590988446853891'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590988446853891'/><link rel='alternate' type='text/html' href='http://xmlsucks.blogspot.com/2006/12/heavyweight-parser.html' title='Heavyweight Parser'/><author><name>Glenn Reid</name><uri>http://www.blogger.com/profile/11057576586723521829</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='17' height='32' src='http://3.bp.blogspot.com/_pwkDNNxqnhI/SR0bNMLjfJI/AAAAAAAAACk/4h8VT8hihqI/S220/glenn.jpg'/></author><thr:total>2</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37990127.post-116590986697006344</id><published>2006-12-11T23:51:00.000-08:00</published><updated>2006-12-11T23:51:06.973-08:00</updated><title type='text'>XML as a "container"</title><content type='html'>XML is most often used as a kind of &lt;i&gt;container&lt;/i&gt; to hold structured data of some kind. The semantic nature of the data is not defined by XML itself, but typically is carried separately as a data definition or simply by being programmed into the model itself, which is the more common approach (e.g. 
"this XML file contains preference data" or "this XML file contains a Technorati Ping").&lt;p/&gt;&lt;br /&gt;There is one big problem with XML as a container. Its syntax, which is borrowed from HTML and SGML, involves angle brackets and a begin/end paradigm. The problem with this is that you can't embed similar data inside the XML file without &lt;i&gt;escaping&lt;/i&gt; all the angle brackets. That gets messy very fast. It also is impossible to &lt;i&gt;nest&lt;/i&gt; to arbitrary depth. That is, you can't have an XML file that contains an XML file that contains an HTML file without knowing &lt;i&gt;beforehand&lt;/i&gt; how many times to un-escape the data when parsing it.&lt;p/&gt;&lt;br /&gt;It also makes it essentially impossible to embed binary data in an XML file because you can't know whether or not to escape the XML sequences within the binary data (you should NOT, if the binary data is to be respected).&lt;p/&gt;&lt;br /&gt;This is a classic problem with file formats which require parsing of the data and in which the delimiters themselves might be embedded.  You have to recognize nested delimiters and/or escape them.&lt;p/&gt;&lt;br /&gt;There are many other approaches to file formats which might have been better choices. For example, instead of a begin/end paradigm, specifying &lt;i&gt;type&lt;/i&gt; and &lt;i&gt;length&lt;/i&gt; data allows unambiguous parsing. It is not, however, easy to compose by hand, which is probably why it's not used.&lt;p/&gt;&lt;br /&gt;Another approach is to simply have characters that are considered &lt;i&gt;illegal&lt;/i&gt; in a data stream, and use those as delimiters. This is how C strings are represented (the illegal character is a byte with value 0): they're called null-terminated strings.  
This approach has been used widely for decades and has its advantages.&lt;p/&gt;&lt;br /&gt;The bottom line is that syntactically XML is not a particularly good choice as a &lt;i&gt;container&lt;/i&gt; format, and yet that is how it is most often used.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37990127-116590986697006344?l=xmlsucks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://xmlsucks.blogspot.com/feeds/116590986697006344/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37990127&amp;postID=116590986697006344' title='6 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590986697006344'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590986697006344'/><link rel='alternate' type='text/html' href='http://xmlsucks.blogspot.com/2006/12/xml-as-container.html' title='XML as a &quot;container&quot;'/><author><name>Glenn Reid</name><uri>http://www.blogger.com/profile/11057576586723521829</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='17' height='32' src='http://3.bp.blogspot.com/_pwkDNNxqnhI/SR0bNMLjfJI/AAAAAAAAACk/4h8VT8hihqI/S220/glenn.jpg'/></author><thr:total>6</thr:total></entry><entry><id>tag:blogger.com,1999:blog-37990127.post-116590976629313301</id><published>2006-12-11T23:49:00.000-08:00</published><updated>2006-12-11T23:49:26.303-08:00</updated><title type='text'>What started it all...</title><content type='html'>I first blogged about XML a while ago and started to catch grief about this unpopular point of view.  I've been defending it more and more over the past few months.  
Yesterday I was on a panel with Steve Gillmor who has an initiative entitled "Attention.xml", so naturally he wanted to give me grief about it as well.&lt;p&gt;&lt;br /&gt;Thus was born the idea to create a blog devoted to what's wrong with XML.  I'm not sure how much growth there will be in it, but (not surprisingly) the URL "xmlsucks.com" was available, so I jumped on it, shall we say.&lt;p&gt;&lt;br /&gt;Welcome to my rant.&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/37990127-116590976629313301?l=xmlsucks.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://xmlsucks.blogspot.com/feeds/116590976629313301/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://www.blogger.com/comment.g?blogID=37990127&amp;postID=116590976629313301' title='3 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590976629313301'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/37990127/posts/default/116590976629313301'/><link rel='alternate' type='text/html' href='http://xmlsucks.blogspot.com/2006/12/what-started-it-all.html' title='What started it all...'/><author><name>Glenn Reid</name><uri>http://www.blogger.com/profile/11057576586723521829</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='17' height='32' src='http://3.bp.blogspot.com/_pwkDNNxqnhI/SR0bNMLjfJI/AAAAAAAAACk/4h8VT8hihqI/S220/glenn.jpg'/></author><thr:total>3</thr:total></entry></feed>
