Sam Pickard's picture

Creating XML Documents

I've recently been working on a project to integrate SyncML into a MoSync application. SyncML is fantastically simple - there is only one service end point - if you can provide the SyncML server with an XML description of the data you've got and the data you want.

MoSync's XML API (MTXml) is a SAX-like parser for reading XML documents. It doesn't have a traversable XML document model, and it can't help in writing XML. I put six hours aside, and created the objects to fill this space. This way, I can easily create my XML requests for the SyncML server.

Firstly, a quick warning. I've not destruction tested this model. I've not even read the requirements for creating XML processors. It certainly doesn't validate XML, and it probably won't stop you using it to create incorrect XML. If you use it sensibly, then it should work for 99% of the XML you want to use.

XMLDocument, XMLNode, XMLAttribute & XMLNamespace

I've created a traversable XML model. At the heart of this is the class XMLDocument.

class XMLDocument
{
public:
    XMLDocument();
    ~XMLDocument();

    XMLNode* rootNode;
    XMLDocument* addNamespace(String prefix, String URI);
    Map<String, XMLNamespace*> namespaces;
};

This contains a pointer to the root XML node (the root element), and a map of all the namespaces supported in the document.

XMLNode is the class which performs most of the work:

class XMLNode
{
public:
    XMLNode();
    ~XMLNode();

    String name;
    String value;
    Vector<XMLNode*> nodes;
    Vector<XMLAttribute*> attributes;
    XMLNamespace* xmlNamespace;
    XMLNode* parent;
    XMLNode* root;

    //Fluent methods for creating child nodes and attributes
    XMLNode* addNode(String name, String value = "", XMLNamespace* xmlNamespace = NULL);
    XMLNode* addNode(XMLNode* node);
    XMLNode* addAttribute(String name, String value = "", XMLNamespace* xmlNamespace = NULL);
    XMLNode* addAttribute(XMLAttribute* attribute);

    //Just like addNode, except that the pointer you get back is to the new node
    XMLNode* addChild(String name, String value = "", XMLNamespace* xmlNamespace = NULL);
    XMLNode* addChild(XMLNode* node);

    Vector<XMLNode*> find(const String& key);
};

The XMLNode object holds the name and the value of the current node as strings, and has a collection of XMLNode* as children. It also has a collection of XMLAttributes:

struct XMLAttribute
{
public:
    String name;
    String value;
    XMLNamespace* xmlNamespace;
};

XMLDocument, XMLNode and XMLAttribute can all refer to an XMLNamespace. The model does not validate the namespace or apply any rules from it, but it is aware of namespaces and carries them through when reading and writing.

struct XMLNamespace 
{

public:
String prefix;
String URI;
};
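
To make the namespace handling a little more concrete, here is a minimal sketch of how a prefix might be applied to a name when it is written out. It uses std::string in place of MoSync's String so that it compiles anywhere, and the two helper functions (qualifiedName and namespaceDeclaration) are names I've made up for illustration. They are not part of the classes in the attachment.

```cpp
#include <string>

// Stand-in for the tutorial's XMLNamespace, using std::string instead of
// MoSync's String so the sketch is portable.
struct XMLNamespace {
    std::string prefix;
    std::string URI;
};

// Hypothetical helper: build the name as it would appear inside a tag.
// A null namespace or an empty prefix leaves the local name untouched.
std::string qualifiedName(const std::string& localName, const XMLNamespace* ns) {
    if (ns == nullptr || ns->prefix.empty()) {
        return localName;
    }
    return ns->prefix + ":" + localName;
}

// Hypothetical helper: the xmlns declaration a writer could emit on the root tag.
std::string namespaceDeclaration(const XMLNamespace& ns) {
    return "xmlns:" + ns.prefix + "=\"" + ns.URI + "\"";
}
```

With a namespace whose prefix is "geo", qualifiedName("lat", &ns) gives "geo:lat", which is the form used in the Yahoo! Weather feed later in this tutorial.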

Creating a new XMLDocument

XMLDocuments are created with a default, empty root node. Don't create a new one and set the pointer.

XMLDocument doc;
doc.rootNode->name = "beatles";

You can add nodes to existing nodes by passing the pointer to the node, or by calling one of the methods to add nodes.

There are two methods for adding a new node - addNode and addChild.

Essentially they both do the same thing - add a new node as a child node. The difference between them is the return value: addNode returns a pointer to the node you called it on, and addChild returns a pointer to the new node.

This is important, as it allows you to create your XML using a fluent programming style. Your source code can reflect the data you're storing, making it much easier to understand the XML you've got.

doc.rootNode->
    addChild("member")->
        addNode("name", "john")->
        addNode("instrument", "guitar")->
    parent->
    addChild("member")->
        addNode("name", "paul")->
        addChild("instrument", "bass guitar")->
            addAttribute("manufacturer", "hofner")->
    root->
    addChild("member")->
        addNode("name", "george")->
    parent->
    addChild("member")->
        addNode("name", "ringo")->
    parent->
    addChild("non-member")->
        addNode("name", "yoko");

You can read the source code as easily as you can read XML. parent is a pointer to the node's parent, and root is a pointer to the root node.
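
If it helps to see why the chain works, this is a stripped-down sketch of the two return conventions. It is not the code from the attachment: Node is a simplified stand-in with no attributes or namespaces, it uses std:: containers rather than MoSync's String and Vector, and it owns its children through std::unique_ptr, which is my assumption rather than something the real classes necessarily do.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for XMLNode (assumption: children are owned by the parent).
struct Node {
    std::string name;
    std::string value;
    Node* parent = nullptr;
    Node* root = nullptr;   // left null on the root node itself in this sketch
    std::vector<std::unique_ptr<Node>> nodes;

    // Adds a child and returns a pointer to the NEW node,
    // so the chain carries on one level deeper.
    Node* addChild(const std::string& childName, const std::string& childValue = "") {
        std::unique_ptr<Node> child = std::make_unique<Node>();
        child->name = childName;
        child->value = childValue;
        child->parent = this;
        child->root = (root != nullptr) ? root : this;
        nodes.push_back(std::move(child));
        return nodes.back().get();
    }

    // Adds a child and returns a pointer to THIS node,
    // so the chain stays at the same level and siblings can follow.
    Node* addNode(const std::string& childName, const std::string& childValue = "") {
        addChild(childName, childValue);
        return this;
    }
};
```

Because addNode hands back the node it was called on, two addNode calls in a row create siblings, while parent and root let the chain climb back up without breaking the expression.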

The addAttribute method returns a pointer to the XMLNode as well, so you can add several attributes in one chain:

addChild("instrument", "bass guitar")->
    addAttribute("manufacturer", "hofner")->
    addAttribute("model", "hofner 500/1")->

Reading XML

Alongside this model, I've created an XMLReader class which will read in XML and create the XMLDocument for you. You can use this in two ways. Firstly, it is an XmlListener, so you can use it with the existing MTXml objects and create an XMLDocument from a connection. Secondly, you can pass it a complete XML string.

Warning - You will run out of memory really quickly if you start loading large XML documents. Using the MTXml library is still the best way of handling XML in MoSync.
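
The general idea of turning SAX-style events into a tree can be sketched like this. The callback names (onTagStart, onTagData, onTagEnd) are hypothetical and deliberately not the MTXml listener interface, and Element is a cut-down stand-in for XMLNode. The sketch assumes well-formed input with a single root element.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

struct Element {
    std::string name;
    std::string value;
    Element* parent = nullptr;
    std::vector<std::unique_ptr<Element>> children;
};

// Hypothetical event sink: a SAX-style parser would call these methods in
// document order. "current_" tracks the element that is open right now.
class TreeBuilder {
public:
    TreeBuilder() : root_(std::make_unique<Element>()), current_(nullptr) {}

    void onTagStart(const std::string& name) {
        if (current_ == nullptr) {
            // First tag seen: it names the pre-made root element.
            root_->name = name;
            current_ = root_.get();
            return;
        }
        std::unique_ptr<Element> child = std::make_unique<Element>();
        child->name = name;
        child->parent = current_;
        Element* raw = child.get();
        current_->children.push_back(std::move(child));
        current_ = raw;  // descend into the new element
    }

    void onTagData(const std::string& text) {
        if (current_ != nullptr) {
            current_->value += text;
        }
    }

    void onTagEnd() {
        // Closing a tag moves back up to the parent.
        if (current_ != nullptr) {
            current_ = current_->parent;
        }
    }

    const Element& root() const { return *root_; }

private:
    std::unique_ptr<Element> root_;
    Element* current_;
};
```

Every element in the document ends up in memory at once, which is exactly why the warning above applies to large documents.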

This is some XML from Yahoo! that I've loaded:

<?xml version="1.0" encoding="UTF-8"  standalone="yes" ?>
<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
<channel>
<title>Yahoo! Weather - London, GB</title>
<link>http://us.rd.yahoo.com/dailynews/rss/weather/London__GB/*http://weather.yahoo.com/forecast/UKXX0865_c.html</link>
<description>Yahoo! Weather for London, GB</description>
<language>en-us</language>
<lastBuildDate>Wed, 10 Mar 2010 4:50 pm GMT</lastBuildDate>
<ttl>60</ttl>
<yweather:location city="London" region="" country="United Kingdom"/>
<yweather:units temperature="C" distance="km" pressure="mb" speed="km/h"/>
<yweather:wind chill="-1" direction="30" speed="22.53" />
<yweather:atmosphere humidity="70" visibility="9.99" pressure="1015.92" rising="0" />
<yweather:astronomy sunrise="6:26 am" sunset="5:55 pm"/>
<image>
<title>Yahoo! Weather</title>
<width>142</width>
<height>18</height>
<link>http://weather.yahoo.com</link>
<url>http://l.yimg.com/a/i/us/nws/th/main_142b.gif</url>
</image>
<item>
<title>Conditions for London, GB at 4:50 pm GMT</title>
<geo:lat>51.51</geo:lat>
<geo:long>0.07</geo:long>
<link>http://us.rd.yahoo.com/dailynews/rss/weather/London__GB/*http://weather.yahoo.com/forecast/UKXX0865_c.html</link>
<pubDate>Wed, 10 Mar 2010 4:50 pm GMT</pubDate>
<yweather:condition text="Mostly Cloudy" code="28" temp="4" date="Wed, 10 Mar 2010 4:50 pm GMT" />
<description>
<![CDATA[<img src="http://l.yimg.com/a/i/us/we/52/28.gif"/><br />
<b>Current Conditions:</b><br />
Mostly Cloudy, 4 C<BR />
<BR /><b>Forecast:</b><BR />
Wed - Fog Late. High: 6 Low: 2<br />
Thu - AM Fog/PM Sun. High: 6 Low: 2<br /><br />
<a href="http://us.rd.yahoo.com/dailynews/rss/weather/London__GB/*http://weather.yahoo.com/forecast/UKXX0865_c.html">
Full Forecast at Yahoo! Weather</a><BR/><BR/>
(provided by <a href="http://www.weather.com" >The Weather Channel</a>)<br/>
]]>
</description>
<yweather:forecast day="Wed" date="10 Mar 2010" low="2" high="6" text="Fog Late" code="20" />
<yweather:forecast day="Thu" date="11 Mar 2010" low="2" high="6" text="AM Fog/PM Sun" code="20" />
<guid isPermaLink="false">UKXX0865_2010_03_10_16_50_GMT</guid>
</item>
</channel>
</rss><!-- api2.weather.ch1.yahoo.com compressed/chunked Wed Mar 10 09:22:06 PST 2010 -->

I can load it into the XMLReader like this:

XMLReader reader;
XMLDocument& yahooWeather = reader.parseXML(<XML removed for clarity>);

I can test my XML in several ways, and the easiest is to write it out again and compare it.

Writing XML

If I've got a populated XMLDocument, then I can serialise it to a String.

XMLWriter xmlw;
String weather = xmlw.toString(yahooWeather);

When I do so, the XMLWriter produces this XML (indentations and line breaks have been added to make it more readable):

<?xml version="1.0"?>
<rss version="2.0" xmlns:yweather="http://xml.weather.yahoo.com/ns/rss/1.0" xmlns:geo="http://www.w3.org/2003/01/geo/wgs84_pos#">
<channel>
<title>Yahoo! Weather - London, GB</title>
<link>http://us.rd.yahoo.com/dailynews/rss/weather/London__GB/*http://weather.yahoo.com/forecast/UKXX0865_c.html</link>
<description>Yahoo! Weather for London, GB</description>
<language>en-us</language>
<lastBuildDate>Wed, 10 Mar 2010 4:50 pm GMT</lastBuildDate>
<ttl>60</ttl>
<yweather:location yweather:city="London" yweather:region="" yweather:country="United Kingdom" />
<yweather:units yweather:temperature="C" yweather:distance="km" yweather:pressure="mb" yweather:speed="km/h" />
<yweather:wind yweather:chill="-1" yweather:direction="30" yweather:speed="22.53" />
<yweather:atmosphere yweather:humidity="70" yweather:visibility="9.99" yweather:pressure="1015.92" yweather:rising="0" />
<yweather:astronomy yweather:sunrise="6:26 am" yweather:sunset="5:55 pm" />
<image>
<title>Yahoo! Weather</title>
<width>142</width>
<height>18</height>
<link>http://weather.yahoo.com</link>
<url>http://l.yimg.com/a/i/us/nws/th/main_142b.gif</url>
</image>
<item>
<title>Conditions for London, GB at 4:50 pm GMT</title>
<geo:lat>51.51</geo:lat>
<geo:long>0.07</geo:long>
<link>http://us.rd.yahoo.com/dailynews/rss/weather/London__GB/*http://weather.yahoo.com/forecast/UKXX0865_c.html</link>
<pubDate>Wed, 10 Mar 2010 4:50 pm GMT</pubDate>
<yweather:condition yweather:text="Mostly Cloudy" yweather:code="28" yweather:temp="4" yweather:date="Wed, 10 Mar 2010 4:50 pm GMT" />
<description>
<img src="http://l.yimg.com/a/i/us/we/52/28.gif"/>
<br />
<b>Current Conditions:</b>
<br />
Mostly Cloudy, 4 C<BR /><BR />
<b>Forecast:</b>
<BR />
Wed - Fog Late. High: 6 Low: 2<br />
Thu - AM Fog/PM Sun. High: 6 Low: 2<br />
<br />
<a href= "http://us.rd.yahoo.com/dailynews/rss/weather/London__GB/*http://weather.yahoo.com/forecast/UKXX0865_c.html">Full Forecast at Yahoo! Weather</a>
<BR/><BR/>
(provided by <a href="http://www.weather.com" >The Weather Channel</a>)
<br/>
</description>
<yweather:forecast yweather:day="Wed" yweather:date="10 Mar 2010" yweather:low="2" yweather:high="6" yweather:text="Fog Late" yweather:code="20" />
<yweather:forecast yweather:day="Thu" yweather:date="11 Mar 2010" yweather:low="2" yweather:high="6" yweather:text="AM Fog/PM Sun" yweather:code="20" />
<guid isPermaLink="false">UKXX0865_2010_03_10_16_50_GMT</guid>
</item>
</channel>
</rss>

The main things to note are the way the writer applies the namespace prefix explicitly to every attribute in a tag that has a namespace, and the way the CDATA wrapper has been removed. The prefixed output is still well-formed and will parse with another XML processor, although a strictly namespace-aware processor treats a prefixed attribute as a different name from an unprefixed one, so take care if the consumer is sensitive to that. The CDATA marker is not passed on by the MTXml library, so I never see it to load it into my model. This output XML has been successfully validated at http://validator.w3.org/check

Reading and writing XML like this is not the main use for these classes. MTXml is still the best way to process incoming XML data. Where these classes are useful is in creating XML when you need it to integrate a new service. To go back to The Beatles example at the top, the XML writer produces this output. You can see how the source code above is reflected in the XML. Again, the line breaks and indentations have been added for clarity.

<?xml version="1.0"?>
<beatles>
<member>
<name>john</name>
<instrument>guitar</instrument>
</member>
<member>
<name>paul</name>
<instrument manufacturer="hofner" model="hofner 500/1">bass guitar</instrument>
</member>
<member>
<name>george</name>
</member>
<member>
<name>ringo</name>
</member>
<non-member>
<name>yoko</name>
</non-member>
</beatles>
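
For the curious, serialising a tree like this comes down to a short recursive walk. The sketch below is an illustration of that idea, not the XMLWriter from the attachment: Elem and toXml are names I've made up, it uses std:: types instead of MoSync's, it ignores namespaces, and it makes no attempt to escape characters such as & or <.

```cpp
#include <string>
#include <utility>
#include <vector>

// Cut-down element: attributes are (name, value) pairs kept in insertion order.
struct Elem {
    std::string name;
    std::string value;
    std::vector<std::pair<std::string, std::string>> attributes;
    std::vector<Elem> children;
};

// Recursively serialise one element and everything beneath it.
// No indentation is added and no escaping is attempted in this sketch.
std::string toXml(const Elem& e) {
    std::string out = "<" + e.name;
    for (const auto& attr : e.attributes) {
        out += " " + attr.first + "=\"" + attr.second + "\"";
    }
    if (e.value.empty() && e.children.empty()) {
        return out + " />";  // nothing inside, so use a self-closing tag
    }
    out += ">" + e.value;
    for (const Elem& child : e.children) {
        out += toXml(child);
    }
    return out + "</" + e.name + ">";
}
```

An element with a value and no children comes out as a simple open tag, text, close tag, and an element with neither comes out self-closed, which matches the style of the output shown above.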

Extracting XML from the document

I've not built an XPath parser. It is enormously more complex than any of these classes. There is a way to search the document though, and that is through the find method on XMLNode.

Vector<XMLNode*> find(const String& key);

If you call the find method, it returns a Vector of XMLNode* where the tag exactly matches the search criteria in the children of the XMLNode you are searching. It doesn't walk the entire model looking for results, and it doesn't search attributes or values.
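
The behaviour described above amounts to a single loop over the direct children. Here is a rough sketch of it, using a simplified Item struct and a free function called findChildren, both of which are hypothetical names for illustration rather than the code in the attachment.

```cpp
#include <string>
#include <vector>

struct Item {
    std::string name;
    std::string value;
    std::vector<Item> children;
};

// Returns pointers to the DIRECT children whose name matches the key exactly.
// Grandchildren, attributes and values are not examined.
std::vector<const Item*> findChildren(const Item& node, const std::string& key) {
    std::vector<const Item*> matches;
    for (const Item& child : node.children) {
        if (child.name == key) {
            matches.push_back(&child);
        }
    }
    return matches;
}
```

Keeping the search to one level is what makes the chained style further down work: each call narrows the result by one step, and you index into the result to pick the node to search next.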

For example, if I were to count the number of children my root node has, I'd think that there were five members of The Beatles.

printf("There are %d members of The Beatles?\n", doc.rootNode->nodes.size());

However, I can filter these by matching against the child tag 'member'.

Vector<XMLNode*> vbeatles = doc.rootNode->find("member");
printf("There are actually %d members of The Beatles\n", vbeatles.size());

I can also use it to find deep results.

XMLNode* paulsInstrument = doc.rootNode->find("member")[1]->find("instrument")[0];
printf("Paul plays the %s", paulsInstrument->value.c_str());

And in the console I see:

Paul plays the bass guitar

Attachment: Tutorial - Creating XML.zip (8.71 KB)

SyncML Project

Did you succeed in building the SyncML functionality? I am looking at building a mobile client for our enterprise app and using SyncML to keep on device data and the enterprise synchronized. Can SyncML be used for this purpose?


