COIN78 - XML Lesson 3: Creating an XML Data Model


Putting it Together

This week we are going to add more complexity to your data structures and use them to represent more complicated data models. We'll develop models that go beyond the address book, including recipe books and ontologies that represent families and other more complicated models.

  1. Empty Elements - If you have had any HTML experience (of course you have) you have seen cases in older versions in which a closing tag is not really needed. One of the best cases is the image tag, <img src="some url">. The break tag <br> is another example, and the closing tag of the paragraph tag <p> was routinely left out. Because XML requires every element to be closed, a special syntax was devised to handle "empty tags". An empty tag is written with a slash before the closing bracket, as in <img src="some url" /> or <br />. When I send email, I often use <snip> to divide sections for meaning. Technically I should use <snip />. Sorry, oops, <sorry />.

    <?xml version="1.0" ?>  
    <quiz>
    <question type="multiple" number="1">
    Which form of life lives under harsh conditions?
    <answers>
    <answer choice="a" text="bacteria"/>
    <answer choice="b" text="fungi"/>
    <answer choice="c" correct="true" text="archaea"/>
    <answer choice="d" text="animal"/>
    </answers>
    </question>
    </quiz>

  2. To Nest or Not to Nest - One of the main questions everyone faces when designing an XML document is whether to use nesting or attributes. My rule has been: do not nest unless you have to. If you need the hierarchical structure, you should nest; otherwise attributes are a bit more efficient to parse. Some types of information lend themselves better to a hierarchical organization than to a sequential one. However, you should note that the W3C has said that performance should not be a criterion in the design of XML - you be the judge.

    <?xml version="1.0" ?>  
    <album>
    <image>
    <year_taken>083079</year_taken>
    <location>London, England</location>
    <src>http://www.familyimage.com/23456/21</src>
    </image>
    <image year_taken="082879" location="Eastborn, England" src="http://www.familyimage.com/23456/21"/>
    </album>

  3. Mixed elements - Think of empty and nested as hot and cold water. Most of us use 'warm' water, which we produce by mixing some hot and some cold. Plumbing only comes in hot and cold, so that's where we start. The *majority* of human-designed XML code is in a nested format, and the *majority* of machine-generated code is empty. The code I use (in biological applications) is mostly empty with some nested, and a fair amount of 'mixed' character. Take a close look at the six example files below; they will help you understand each style.
  4. Ordinal Counting - Now this is where it gets a little tricky. Before you read any further, make sure that you have looked at the six files (below). Some data models have an element of 'counting' to them. For instance, in a recipe, you might have ingredients and steps. What you want to avoid is having numbers or any notion of counting in an element name. Elements like <ingredient_1>, <ingredient_2>, <ingredient_3>, and <step_1>, <step_2>, and <step_3> are such examples. If you find yourself 'counting' in your element names, in either the nested or the empty model, you probably should be using attributes, or 'mixed' character in your nested and empty models. Take a look at recipe_counting.xml. That's the *wrong* way to do it.

    <?xml version="1.0" encoding="UTF-8"?>  
    <recipe_counting>
    <ingredients>
    <ingredient_1>ingredient one</ingredient_1>
    <ingredient_2>ingredient two</ingredient_2>
    <ingredient_3>ingredient three</ingredient_3>
    <ingredient_4>ingredient four</ingredient_4>
    <ingredient_5>ingredient five</ingredient_5>
    </ingredients>
    <steps>
    <step_1>this is step one</step_1>
    <step_2>this is step two</step_2>
    <step_3>this is step three</step_3>
    <step_4>this is step four</step_4>
    <step_5>this is step five</step_5>
    </steps>
    </recipe_counting>

    Now take a look at recipe_attributes.xml - that's the *right* way to do it.

    <?xml version="1.0" encoding="UTF-8"?>  
    <recipe_attributes>
    <ingredients>
    <ingredient item="1">ingredient one</ingredient>
    <ingredient item="2">ingredient two</ingredient>
    <ingredient item="3">ingredient three</ingredient>
    <ingredient item="4">ingredient four</ingredient>
    <ingredient item="5">ingredient five</ingredient>
    </ingredients>
    <steps>
    <step number="1">this is step one</step>
    <step number="2">this is step two</step>
    <step number="3">this is step three</step>
    <step number="4">this is step four</step>
    <step number="5">this is step five</step>
    </steps>
    </recipe_attributes>

    If you have a model where there is a notion of order, try looking at these files: recipe_attributes_nested.xml and recipe_attributes_empty.xml. Lastly, if you *really* want to explore a complicated model, take a look at the knitting_patterns folder.
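One practical reason the attribute version is easier to work with shows up as soon as a program reads the file. The following is a minimal sketch, assuming Python and its standard-library xml.etree.ElementTree module (neither is part of this lesson). The sample is a trimmed copy of recipe_attributes.xml with the steps deliberately shuffled, and the function name ordered_steps is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of recipe_attributes.xml, with the steps deliberately out of order.
RECIPE = """<recipe_attributes>
<steps>
<step number="3">this is step three</step>
<step number="1">this is step one</step>
<step number="2">this is step two</step>
</steps>
</recipe_attributes>"""


def ordered_steps(xml_text):
    """Return the step texts sorted by their numeric 'number' attribute."""
    root = ET.fromstring(xml_text)
    # One shared element name matches every step, however many there are.
    steps = root.findall("./steps/step")
    steps.sort(key=lambda step: int(step.get("number")))
    return [step.text for step in steps]


print(ordered_steps(RECIPE))
# ['this is step one', 'this is step two', 'this is step three']
```

With the counting model, the same program would have to build the names step_1, step_2, and so on by hand and guess where to stop.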

Comparing XML Structures

There are no hard rules about when to use empty elements and when to use nested elements. My experience is that empty elements are handy in HTML, but in XML you should try to avoid them. Use nested elements if the information feels like data.

The following is an example of a nested model.

<?xml version="1.0" encoding="UTF-8"?>
<Family>
<Parent>
<Type>Father</Type>
<Name>Dad</Name>
</Parent>
<Parent>
<Type>Mother</Type>
<Name>Mom</Name>
</Parent>
<Child>
<Type>Brother</Type>
<Name>John</Name>
</Child>
<Child>
<Type>Sister</Type>
<Name>Sue</Name>
</Child>
</Family>

The following is an example of an empty model.

<?xml version="1.0" encoding="UTF-8"?>
<Family>
<Parent type="father" name="Dad"/>
<Parent type="mother" name="Mom"/>
<Child type="brother" name="John"/>
<Child type="sister" name="Sue"/>
</Family>
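The empty-element syntax used above is only shorthand for an opening tag followed immediately by a closing tag. As a rough check, assuming Python's standard-library xml.etree.ElementTree parser (not something this lesson requires), the sketch below parses both spellings and then tries an HTML-style unclosed tag. The helper names describe and is_well_formed are made up for illustration.

```python
import xml.etree.ElementTree as ET

short_form = ET.fromstring('<Parent type="father" name="Dad"/>')
long_form = ET.fromstring('<Parent type="father" name="Dad"></Parent>')


def describe(element):
    """Summarize an element as (tag, attributes, text, number of children)."""
    return (element.tag, dict(element.attrib), element.text, len(element))


# Both spellings produce the same element: no text and no children.
same = describe(short_form) == describe(long_form)
print(same)  # True


def is_well_formed(xml_text):
    """Return True if the parser accepts the text, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# An HTML-style unclosed <Parent> tag is rejected by the parser.
print(is_well_formed('<Family><Parent type="father" name="Dad"></Family>'))  # False
```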

The following is an example of a mixed model, which combines the nested style with the empty (attribute) style.

<?xml version="1.0" encoding="UTF-8"?>
<Family>
<Parent type="father">
<Name>Dad</Name>
</Parent>
<Parent type="mother">
<Name>Mom</Name>
</Parent>
<Child type="brother">
<Name>John</Name>
</Child>
<Child type="sister">
<Name>Sue</Name>
</Child>
</Family>
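All three Family models carry the same information; only the place where a program has to look for it changes. The sketch below is one way to see that, assuming Python's standard-library xml.etree.ElementTree. The samples are shortened to two family members each, and the helper names field and members are made up for illustration.

```python
import xml.etree.ElementTree as ET

NESTED = """<Family>
<Parent><Type>Father</Type><Name>Dad</Name></Parent>
<Child><Type>Sister</Type><Name>Sue</Name></Child>
</Family>"""

EMPTY = """<Family>
<Parent type="father" name="Dad"/>
<Child type="sister" name="Sue"/>
</Family>"""

MIXED = """<Family>
<Parent type="father"><Name>Dad</Name></Parent>
<Child type="sister"><Name>Sue</Name></Child>
</Family>"""


def field(member, attribute, child_tag):
    """Read a value from an attribute if present, else from a child element."""
    value = member.get(attribute)
    if value is None:
        value = member.findtext(child_tag)
    return value


def members(xml_text):
    """Return (role, type, name) tuples for every family member."""
    root = ET.fromstring(xml_text)
    return [
        # lower() evens out "Father" (nested model) and "father" (attributes).
        (m.tag, field(m, "type", "Type").lower(), field(m, "name", "Name"))
        for m in root
    ]


print(members(NESTED) == members(EMPTY) == members(MIXED))  # True
```

The attribute lookup (get) and the child lookup (findtext) are the only parts that differ between the models, which is why the choice is mostly about readability and maintenance.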

In summary, here are some of the problems with using the empty elements (attributes) model.

If you use attributes as containers for data, you end up with documents that are difficult to read and maintain. Try to use elements to describe data. Use attributes only to provide information that is not relevant to the data. One exception is an attribute used as an id that is just a counter. In the address book, it would number the records, one for each address book entry. In this case the id serves as metadata, providing information about the data, which is a great way to use attributes.

<?xml version="1.0" encoding="UTF-8"?>
<!--address book using nested elements-->
<address_book>
<record ID="1">
<name>
<first_name>first name</first_name>
<middle_name>middle name</middle_name>
<last_name>last name</last_name>
<nick_name>nick name</nick_name>
</name>
<address>
<street_address>street address goes here</street_address>
<street_address_detail>apartment number goes here</street_address_detail>
<city>city goes here</city>
<state>state goes here</state>
<zipcode>zipcode goes here</zipcode>
</address>
<contact>
<home_phone>home phone goes here</home_phone>
<work_phone>work phone goes here</work_phone>
<cell_phone>cell phone goes here</cell_phone>
<fax_number>fax number goes here</fax_number>
<email_address>email address goes here</email_address>
</contact>
<comments>
<misc_comments>comments go here</misc_comments>
</comments>
</record>
<record ID="2">
<name>
<first_name>first name</first_name>
<middle_name>middle name</middle_name>
<last_name>last name</last_name>
<nick_name>nick name</nick_name>
</name>
<address>
<street_address>street address goes here</street_address>
<street_address_detail>apartment number goes here</street_address_detail>
<city>city goes here</city>
<state>state goes here</state>
<zipcode>zipcode goes here</zipcode>
</address>
<contact>
<home_phone>home phone goes here</home_phone>
<work_phone>work phone goes here</work_phone>
<cell_phone>cell phone goes here</cell_phone>
<fax_number>fax number goes here</fax_number>
<email_address>email address goes here</email_address>
</contact>
<comments>
<misc_comments>comments go here</misc_comments>
</comments>
</record>
</address_book>
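Using the ID attribute as metadata pays off when a program needs one particular record. The following is a minimal sketch, assuming Python's standard-library xml.etree.ElementTree and its limited XPath support. The address book is cut down to two records, the names in it are invented placeholders, and find_record is a made-up helper.

```python
import xml.etree.ElementTree as ET

# Cut-down address book; the names are invented placeholders.
ADDRESS_BOOK = """<address_book>
<record ID="1">
<name><first_name>Ada</first_name><last_name>Example</last_name></name>
</record>
<record ID="2">
<name><first_name>Ben</first_name><last_name>Sample</last_name></name>
</record>
</address_book>"""


def find_record(xml_text, record_id):
    """Return the <record> whose ID attribute matches, or None if absent."""
    root = ET.fromstring(xml_text)
    # The [@ID='...'] predicate selects on the attribute (the metadata).
    return root.find(f"./record[@ID='{record_id}']")


record = find_record(ADDRESS_BOOK, "2")
# The data itself is read from the nested elements.
print(record.findtext("./name/first_name"))  # Ben
```

The attribute picks out the record, and the nested elements hold the data, which matches the split between metadata and data described above.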

 


Example Files

Links to XML Related Sites

  1. XML.COM
  2. WDVL XML tutorial
  3. Sun Java XML Introduction
  4. IBM's XML Website
  5. Google Directory on XML
