XML Abstraction at the Wrong Level

 

Over the last month I've encountered two applications that use XML at the wrong level of abstraction. Instead of tailoring the schema to their needs, they use a very abstract schema, and encode their elements at a meta level within the XML data. This approach hinders the verification and manipulation of the corresponding XML files.

The two culprits I have identified are the iTunes digital jukebox and the Dia drawing program.

Here is an excerpt of an iTunes library:

<dict>
	<key>Track ID</key><integer>37</integer>
	<key>Name</key><string>Two ladies</string>
	<key>Artist</key><string>C.C. Productions</string>
	<key>Album</key><string>Highlights from Cabaret</string>
	<key>Genre</key><string>Musical</string>
	<key>Kind</key><string>MPEG audio file</string>
	<key>Size</key><integer>4694217</integer>
	<key>Total Time</key><integer>195526</integer>
	<key>Track Number</key><integer>3</integer>
	<key>Track Count</key><integer>11</integer>
	<key>Year</key><integer>2002</integer>
	<key>Date Modified</key><date>2004-02-28T07:39:45Z</date>
	<key>Date Added</key><date>2005-06-05T14:00:29Z</date>
	<key>Bit Rate</key><integer>192</integer>
	<key>Sample Rate</key><integer>44100</integer>
	<key>Track Type</key><string>File</string>
	<key>File Folder Count</key><integer>-1</integer>
	<key>Library Folder Count</key><integer>-1</integer>
</dict>
Notice that the properties of the track (name, artist, album, size, and so on) are not part of the XML schema, but are encoded as strings in key tags. As a result, a schema validator can only check that the file consists of keys and typed values; it cannot check that a track has a name or that its size is an integer. A proper XML schema would have an album tag with embedded track tags. Each track tag would encode the track ID as an attribute, and have underneath it tags for things such as its name, length, and size.
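
To make the difference concrete, here is a minimal Python sketch that reads the track size from both layouts. The tailored layout, including the album, track, name, and size element names and the title and id attributes, is my own assumption about what such a schema might look like; it is not something iTunes reads or writes. The helper generic_lookup is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

# Shortened excerpt in the generic key/value layout shown above.
GENERIC = """
<dict>
  <key>Track ID</key><integer>37</integer>
  <key>Name</key><string>Two ladies</string>
  <key>Album</key><string>Highlights from Cabaret</string>
  <key>Size</key><integer>4694217</integer>
</dict>
"""

# Hypothetical tailored layout; every element and attribute name here
# is an assumption made for illustration.
TAILORED = """
<album title="Highlights from Cabaret">
  <track id="37">
    <name>Two ladies</name>
    <size>4694217</size>
  </track>
</album>
"""


def generic_lookup(dict_elem, wanted):
    """Find a value by pairing each <key> with the sibling that follows it."""
    children = list(dict_elem)
    for key, value in zip(children[0::2], children[1::2]):
        if key.tag == "key" and key.text == wanted:
            return value.text
    return None


# Generic layout: the field name is data, so we must walk sibling pairs.
generic_size = generic_lookup(ET.fromstring(GENERIC), "Size")

# Tailored layout: the field name is an element, so a path expression suffices.
tailored_size = ET.fromstring(TAILORED).findtext("track[@id='37']/size")
```

In the generic layout the pairing of a key with its value is a convention that lives only in the reading code. In the tailored layout the same relationship is part of the document structure, so a schema or a path expression can address it directly.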

The same is also true with the Dia storage format.

    <dia:object type="UML - Class" version="0" id="O1">
      <dia:attribute name="obj_pos">
        <dia:point val="14,6"/>
      </dia:attribute>
      <dia:attribute name="obj_bb">
        <dia:rectangle val="13.95,5.95;18.7,7.45"/>
      </dia:attribute>
      <dia:attribute name="elem_corner">
        <dia:point val="14,6"/>
      </dia:attribute>
      <dia:attribute name="elem_width">
        <dia:real val="4.6499999999999995"/>
      </dia:attribute>
      <dia:attribute name="elem_height">
        <dia:real val="1.3999999999999999"/>
      </dia:attribute>
      <dia:attribute name="name">
        <dia:string>#Suitability#</dia:string>
      </dia:attribute>
      <dia:attribute name="stereotype">
        <dia:string>##</dia:string>
      </dia:attribute>
      <dia:attribute name="comment">
        <dia:string>##</dia:string>
      </dia:attribute>
      <dia:attribute name="abstract">
        <dia:boolean val="false"/>
      </dia:attribute>
    </dia:object>
Notice that here, again, all the properties of a drawing element are encoded through dia:attribute tags, with the property's name stored as an attribute and its value wrapped in a child tag named after its type (dia:point, dia:real, dia:string, and so on). A proper XML schema would have separate elements with names such as bounding_box, corner, width, name, stereotype, and comment.
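
The same comparison can be sketched for the Dia excerpt. In the Python code below, the namespace URI is a placeholder I added so that the shortened excerpt parses on its own. The tailored layout (uml_class, width, name) and the helper generic_value are assumptions for illustration, not part of Dia.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URI so the excerpt parses on its own. This is an
# assumption, not necessarily the URI that a real Dia file declares.
NS = {"dia": "urn:example:dia"}

# Shortened excerpt in the generic attribute layout shown above.
GENERIC = """
<dia:object xmlns:dia="urn:example:dia" type="UML - Class" version="0" id="O1">
  <dia:attribute name="elem_width">
    <dia:real val="4.6499999999999995"/>
  </dia:attribute>
  <dia:attribute name="name">
    <dia:string>#Suitability#</dia:string>
  </dia:attribute>
</dia:object>
"""

# Hypothetical tailored layout; the element names are assumptions.
TAILORED = """
<uml_class id="O1">
  <width>4.6499999999999995</width>
  <name>Suitability</name>
</uml_class>
"""


def generic_value(obj, name):
    """Locate a dia:attribute by its name, then read its first typed child."""
    attr = obj.find(f"dia:attribute[@name='{name}']", NS)
    if attr is None or len(attr) == 0:
        return None
    child = attr[0]
    # Depending on the type, the value is a 'val' attribute or element text.
    return child.get("val") if "val" in child.attrib else child.text


generic_obj = ET.fromstring(GENERIC)
generic_width = generic_value(generic_obj, "elem_width")
generic_name = generic_value(generic_obj, "name")

tailored_obj = ET.fromstring(TAILORED)
tailored_width = tailored_obj.findtext("width")
tailored_name = tailored_obj.findtext("name")
```

Reading the generic layout requires knowing, for each type, whether the value sits in a val attribute or in the element text. With dedicated elements that knowledge could instead be stated once in the schema.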

It would be unfair to blame the designers of these very good applications for the format they adopted. My impression is that the language and library support for extensible XML schemas is currently lacking; the two examples simply demonstrate a larger problem.



Last modified: Thursday, June 23, 2005 11:52 am


Unless otherwise expressly stated, all original material on this page created by Diomidis Spinellis is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.