Utils: Parsing XML
“The major difference between a thing that might go wrong and a thing that cannot possibly go wrong is that when a thing that cannot possibly go wrong goes wrong it usually turns out to be impossible to get at or repair.” – Douglas Adams, Mostly Harmless
What a waste of time!
I keep seeing developers write a lot of XML: one file to define this, another to define that. That in itself is not too bad. What’s bad is that these XML files are then loaded in different parts of the application, where the data is carefully (read: custom) pulled out. I know, E4X has made dealing with XML much easier, but why bother custom-parsing it at all? It’s boring, time-consuming and, most importantly, it makes the code dependent on the data format.
What if the data format changes?
It could be sent as JSON or even binary, but you’ve never considered that. Oh no! The whole data layer has to be restructured for the new format. I can hear the developer scream: “But I’ve already done it for XML!”
Why care about where data comes from?
The solution is to separate your application’s data structure from the data source. This can be done using the DTO and DAO design patterns:
- DataAccessObject (DAO) abstracts data access implementation to enable transparent access to the data source [1]
- DataTransferObject (DTO) carries data between application’s subsystems [2]
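As a minimal sketch of that separation (every name below, ProductDTO, IProductDAO, XMLProductDAO and the example packages, is hypothetical and not taken from our code base): business code only ever sees the interface and the DTO, while the XML-specific work stays inside a single DAO. Each package block would live in its own .as file.

```actionscript
// ProductDTO.as (hypothetical): a plain data carrier, no parsing logic
package example.dto {
    public class ProductDTO {
        public var id:int;
        public var title:String;
    }
}

// IProductDAO.as (hypothetical): the only thing business code depends on
package example.dao {
    import example.dto.ProductDTO;

    public interface IProductDAO {
        function read(raw:String):ProductDTO;
    }
}

// XMLProductDAO.as (hypothetical): the only class that knows the data is XML
package example.dao {
    import example.dto.ProductDTO;

    public class XMLProductDAO implements IProductDAO {
        public function read(raw:String):ProductDTO {
            var xml:XML = new XML(raw);
            var dto:ProductDTO = new ProductDTO();
            // E4X lookups return XMLLists, so convert them explicitly
            dto.id = int(xml.id.toString());
            dto.title = xml.title.toString();
            return dto;
        }
    }
}
```

Supporting another format would then mean writing a second class that implements IProductDAO; code that calls read() stays untouched.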
What you get
- All data access and parsing are centralized in one place
- Reduced code complexity in business objects
- Implementing a new data format becomes very simple: just create a new DAO that reads data in that format and converts it into DTOs. No business logic has to change.
- Easier unit testing and bug fixing
- Being very generic, DAOs can be reused in other projects
- The price is an extra layer of objects (DAOs and DTOs), which some developers see as a waste of time. I see it rather as a contract between the data source and the data client, which leads to cleaner, more flexible code that is easier to maintain. It will save you a lot of time later in the project.
To the point
We have been using this approach for quite a while now and are happy with the results. We have created a number of DAOs for different connection types and a few parsing strategies to format the data we receive into DTOs.
Here is our generic XML parser:
XMLParser.as
package ch.forea.parsing {
import ch.forea.dto.AbstractDTO;
import flash.utils.Dictionary;
import flash.utils.describeType;
import flash.utils.getDefinitionByName;
public class XMLParser {
public function parse(xml:XMLList, responseType:String):* {
//1. instantiate response class
var dto : *;
try{
var responseClass:Class = getDefinitionByName(responseType) as Class;
}catch(e:Error){
throw new Error("XMLParser.parse: couldn't find class "+responseType);
}
dto = new responseClass();
//2. fill out public vars and accessors of the new instance with data from xml
//2.1. create a dictionary (dict[name]=type) based on the response class's public variables and accessors
var dtoDescription:XML = describeType(dto);
var variables:XMLList = dtoDescription..variable;
var accessors:XMLList = dtoDescription..accessor;
var dtoprops:Dictionary = new Dictionary();
var child:XML;
for each (child in variables){
dtoprops[child.@name.toString()] = child.@type.toString();
}
for each (child in accessors){
if(child.@access.toString() == "readwrite"){
dtoprops[child.@name.toString()] = child.@type.toString();
}
}
//2.2. for each of the public vars/accessors find an XML tag with the same name,
// parse tag's value and assign it to the the response dto's variable/accessor
var value:XMLList;
var type:String;
for(var propertyName:String in dtoprops){
value = xml[propertyName];
delete xml[propertyName];
if(!value.length()) {
if(xml.attribute(propertyName).toString() != ""){
value = xml.attribute(propertyName);
delete xml.@[propertyName];
}
}
type = dtoprops[propertyName];
if(value && value.toXMLString() != "") {
try{
dto[propertyName] = parseSingleObject(XMLList(value), type);
}catch(e:Error){
throw new Error("XMLParser.parse failed on property '"+propertyName+"', e: "+e);
}
}
}
//3. if response class is dynamic, add extra properties from xml as dynamic properties
//NOTE: parses ONLY through attributes and interprets them as Strings!!!
if(dtoDescription.@isDynamic.toString() == "true") {
var attName:String;
for each(var node:XML in xml.attributes()){
if(node.toString() != ""){
attName = node.name().toString().replace("@","");
dto[attName] = node.toString();
}
}
}
return dto;
}
private function parseSingleObject(value:XMLList, type:String) : * {
var primitiveTypes : RegExp = /^Boolean$|^int$|^Number$|^String$|^uint$/;
var array:RegExp = /^Array$/;
var vector:RegExp = /^__AS3__.vec::Vector.</;
var intCheck:RegExp = /^-{0,1}\d+$/;
var uintCheck:RegExp = /^\d+$/;
var numberCheck:RegExp = /^-{0,1}\d*\.{0,1}\d+$/;
//group the alternatives so that ^ and $ apply to all of them, not just the first and last
var booleanCheck:RegExp = /^(1|0|true|false)$/i;
var vectorToClassName:RegExp = /(__AS3__.vec::Vector.<)([A-Za-z0-9.]*)(?:::)*([A-Za-z0-9]*)(>)/g;
var elements:XMLList;
var child:*;
//PRIMITIVES
if(primitiveTypes.test(type)) {
//just fill out
switch(type){
case "int":
if(!intCheck.test(value)) throw new Error("XMLParser.parseSingleObject property type value:" + value + " should be of type '" + type + "'. Correct xml");
return new int(parseInt(value.toString()));
break;
case "uint":
if(!uintCheck.test(value)) throw new Error("XMLParser.parseSingleObject property type value:"+value+" should be of type '"+type+"'. Correct xml");
return new uint(parseInt(value.toString()));
break;
case "Number":
if(!numberCheck.test(value)) throw new Error("XMLParser.parseSingleObject property type value:"+value+" should be of type '"+type+"'. Correct xml");
return Number(value.toString());
break;
case "String":
return value.toString();
break;
case "Boolean":
if(!booleanCheck.test(value)) throw new Error("XMLParser.parseSingleObject property type value:"+value+" should be of type '"+type+"'. Correct xml");
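//booleanCheck is case-insensitive, so compare in lower case ("TRUE" must not become false)
return (value.toString().toLowerCase() == "true" || value.toString() == "1");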
break;
}
//ARRAYS
}else if(array.test(type)){
var a:Array = [];
if((value as XMLList).hasComplexContent()){
//parse through elements and fill them with xml values
elements = value.children();
for each(child in elements){
a.push(child);
}
}else{
//assume value is a comma delimited string - split it and fill out array
a = value.toString().split(",");
}
return a;
//VECTORS
}else if(vector.test(type)){
//instantiate vector and parse each of its members
var c:Class;
var v:*;
try{
c = getDefinitionByName(type) as Class;
}catch(e:Error){
throw new Error("XMLParser.parseSingleObject, class could not be instantiated: "+type+", error: " + e.message);
}
v = new c();
var valueType:String = type.replace(/(__AS3__.vec::Vector.<)([A-Za-z0-9.:]*)(>)/, "$2");
if(value.hasComplexContent()){
elements = value.children();
for each(child in elements){
v.push(parseSingleObject(XMLList(child), valueType));
}
return v.length ? v : null;
} else{
//assume value is a comma delimited string - split it and fill out array
var values:Array;
if(value.toString() != ""){
values = value.toString().split(",");
}
for each(var val:String in values) {
v.push(parseSingleObject(XMLList("<primitive>"+val+"</primitive>"), valueType));
}
return v.length ? v : null;
}
//CUSTOM OBJECTS
}else{
return parse(XMLList(value), type);
}
}
}
}
The XML parser has a parse method, which takes two arguments:
- an XMLList object to parse,
- the qualified class name of the DTO the XML should be parsed into.
Briefly, the parser instantiates the DTO class, runs through its public variables and read/write accessors, and fills them with values from identically named XML tags or attributes.
Note that:
- the XML parser deals only with custom DTOs whose properties are all declared; anonymous objects will not be parsed (a dynamic DTO additionally receives any leftover attributes as String properties)
- properties of primitive type can be written as either child nodes or attributes:
<testCustomObject><greeting>hello</greeting></testCustomObject>
and
<testCustomObject greeting="hello"/>
are interchangeable and will both be parsed as testCustomObject.greeting = “hello”
- using Vectors instead of Arrays (in your DTOs) makes it possible to determine the type of the data contained in the Vector and parse it correctly. An Array property receives either the raw XML child nodes or the pieces of a comma-separated string, unconverted.
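To see why Vectors help, here is a small sketch of what describeType reports for each kind of property. SampleDTO and DescribeTypeDemo are hypothetical names made up for this illustration. The Array property should be reported with the plain type "Array", while the Vector property carries its element type in the type string (the `__AS3__.vec::Vector.<…>` form that the parser's regular expressions look for).

```actionscript
package {
    import flash.display.Sprite;
    import flash.utils.describeType;

    // Hypothetical demo class: traces name and declared type of each public variable
    public class DescribeTypeDemo extends Sprite {
        public function DescribeTypeDemo() {
            var description:XML = describeType(new SampleDTO());
            for each (var v:XML in description.variable) {
                trace(v.@name.toString() + " : " + v.@type.toString());
            }
        }
    }
}

// Hypothetical DTO, kept file-internal to make the example self-contained
class SampleDTO {
    public var untyped:Array;          // element type is not recorded anywhere
    public var typed:Vector.<String>;  // element type is part of the type string
}
```

The order in which describeType lists the variables is not guaranteed, so don't rely on it.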
What you get
Here is an example of parsing xml to objects of different types:
ParserTest.as
package ch.forea.test {
import ch.forea.parsing.XMLParser;
import flash.display.Sprite;
public class ParserTest extends Sprite {
public function ParserTest() {
var xml:XML = <xml>
<testString>hello world</testString>
<testInt>-10</testInt>
<testUint>10</testUint>
<testNumber>10.5</testNumber>
<testBoolean>true</testBoolean>
<testCustomObject>
<greeting>hello world!</greeting>
</testCustomObject>
<testVector1>a,b,c,d</testVector1>
<testVector2>
<elem>a</elem>
<elem>b</elem>
<elem>c</elem>
<elem>d</elem>
<elem>e</elem>
<elem>f</elem>
</testVector2>
<testVectorVector1>
<vector>g,h,i,j,k,l</vector>
<vector>m,n,o,p</vector>
</testVectorVector1>
<testVectorVector2>
<vector>
<testCustomObject>
<greeting>hey</greeting>
</testCustomObject>
<testCustomObject>
<greeting>hi</greeting>
</testCustomObject>
<testCustomObject>
<greeting>good morning</greeting>
</testCustomObject>
</vector>
<vector>
<testCustomObject>
<greeting>hello</greeting>
</testCustomObject>
<testCustomObject>
<greeting>nice weather!</greeting>
</testCustomObject>
</vector>
</testVectorVector2>
</xml>;
var xmlParser:XMLParser = new XMLParser();
var dto:TestDTO = xmlParser.parse(XMLList(xml), "ch.forea.test.TestDTO") as TestDTO;
trace(dto);
}
}
}
TestDTO.as
package ch.forea.test {
import ch.forea.dto.AbstractDTO;
public class TestDTO extends AbstractDTO {
public var testString:String;
public var testInt:int;
public var testUint:uint;
public var testNumber:Number;
public var testBoolean:Boolean;
public var testCustomObject:CustomObjectDTO;
public var testVector1:Vector.<String>;
public var testVector2:Vector.<String>;
public var testVectorVector1:Vector.<Vector.<String>>;
public var testVectorVector2:Vector.<Vector.<CustomObjectDTO>>;
}
}
CustomObjectDTO.as
package ch.forea.test {
import ch.forea.dto.AbstractDTO;
public class CustomObjectDTO extends AbstractDTO {
public var greeting:String;
}
}
Result trace
ch.forea.test::TestDTO(testBoolean = true,
testUint = 10,
testCustomObject = ch.forea.test::CustomObjectDTO(greeting = hello world!),
testVectorVector1 = g,h,i,j,k,l,m,n,o,p,
testVector2 = a,b,c,d,e,f,
testNumber = 10.5,
testVectorVector2 = ch.forea.test::CustomObjectDTO(greeting = hey),
ch.forea.test::CustomObjectDTO(greeting = hi),
ch.forea.test::CustomObjectDTO(greeting = good morning),
ch.forea.test::CustomObjectDTO(greeting = hello),
ch.forea.test::CustomObjectDTO(greeting = nice weather!),
testString = hello world,
testInt = -10,
testVector1 = a,b,c,d)
Quite handy, isn’t it?
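To show where the parser sits in the DAO picture, here is a rough sketch of a DAO that loads an XML file and hands back a ready DTO. XMLFileDAO, its package and the callback signature are assumptions made for this example, not our actual DAO classes; only XMLParser.parse is taken from the listing above.

```actionscript
package example.dao {
    import ch.forea.parsing.XMLParser;
    import flash.events.Event;
    import flash.net.URLLoader;
    import flash.net.URLRequest;

    // Hypothetical DAO: the caller never touches XML, it only receives a DTO
    public class XMLFileDAO {
        private var parser:XMLParser = new XMLParser();

        // callback is invoked with the parsed DTO once the file has loaded
        public function load(url:String, responseType:String, callback:Function):void {
            var loader:URLLoader = new URLLoader();
            loader.addEventListener(Event.COMPLETE, function(e:Event):void {
                var xml:XML = new XML(loader.data);
                callback(parser.parse(XMLList(xml), responseType));
            });
            loader.load(new URLRequest(url));
        }
    }
}
```

A real DAO would also listen for IOErrorEvent and SecurityErrorEvent; they are left out here to keep the sketch short.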
XML creator
Now that we don’t want to deal with XML anymore, we’ll stop writing it by hand and start generating it automatically from our DTOs. It’s the reverse of parsing XML:
XMLCreator.as
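package ch.forea.parsing { //package name assumed (same as XMLParser); the declaration is missing from this listing
import flash.utils.Dictionary;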
import flash.utils.describeType;
import flash.utils.getQualifiedClassName;
public class XMLCreator {
private var primitiveTypes : RegExp = /^Boolean$|^int$|^Number$|^String$|^uint$/;
public function parse(dto:*, name:String = null):XML {
var dtoName:String = name ? name : getQualifiedClassName(dto).replace(/(.*)\:/,"");
var xml:XML = new XML("<"+dtoName+"/>");
//1. list dto's public variables and accessors in a dictionary (dict[name]=type)
var dtoDescription:XML = describeType(dto);
var variables:XMLList = dtoDescription..variable;
var accessors:XMLList = dtoDescription..accessor;
var dtoprops:Dictionary = new Dictionary();
var child:XML;
for each (child in variables){
dtoprops[child.@name.toString()] = child.@type.toString();
}
for each (child in accessors){
if(child.@access.toString() == "readwrite"){
dtoprops[child.@name.toString()] = child.@type.toString();
}
}
//2. for each of the public vars/accessors create an XML node with the same name and parse the value
var type:String;
for(var propertyName:String in dtoprops){
type = dtoprops[propertyName];
//if value is a primitive type, assign it to an attribute to save space
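if(primitiveTypes.test(type)){
//skip unset Strings: calling toString() on null would throw a TypeError
if(dto[propertyName] != null) xml.@[propertyName] = dto[propertyName].toString();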
} else{
xml.appendChild(parseSingleObject(dto[propertyName], propertyName, type));
}
}
return xml;
}
private function parseSingleObject(value:*, name:String, type:String) : XML {
var xml:XML = new XML("<"+name+"/>");
var array:RegExp = /^Array$/;
var vector:RegExp = /^__AS3__.vec::Vector.</;
//PRIMITIVES
if(primitiveTypes.test(type)) {
xml.appendChild(value.toString());
//ARRAYS
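}else if(array.test(type)){
//Arrays are untyped, so look up each element's runtime type instead of passing a fixed one
for each(var arrayItem:* in value){
xml.appendChild(parseSingleObject(arrayItem,"element",getQualifiedClassName(arrayItem)));
}
//VECTORS
}else if(vector.test(type)){
var valueType:String = type.replace(/(__AS3__.vec::Vector.<)([A-Za-z0-9.:]*)(>)/, "$2");
//separate loop variable names avoid a duplicate variable definition warning
for each(var vectorItem:* in value){
xml.appendChild(parseSingleObject(vectorItem,"element",valueType));
}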
//CUSTOM OBJECTS
}else{
return parse(value, name);
}
return xml;
}
}
}
Now, let’s add a few lines to ParserTest.as and compare the original handwritten XML to the XMLCreator result:
Test
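//assumes XMLCreator has been imported alongside XMLParser
var xmlCreator:XMLCreator = new XMLCreator();
var newXml:XML = xmlCreator.parse(dto);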
trace(newXml);
Test result
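<TestDTO testBoolean="true" testUint="10" testNumber="10.5" testString="hello world" testInt="-10">
<testCustomObject greeting="hello world!"/>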
<testVector1>
<element>a</element>
<element>b</element>
<element>c</element>
<element>d</element>
</testVector1>
<testVectorVector1>
<element>
<element>g</element>
<element>h</element>
<element>i</element>
<element>j</element>
<element>k</element>
<element>l</element>
</element>
<element>
<element>m</element>
<element>n</element>
<element>o</element>
<element>p</element>
</element>
</testVectorVector1>
<testVectorVector2>
<element>
<element greeting="hey"/>
<element greeting="hi"/>
<element greeting="good morning"/>
</element>
<element>
<element greeting="hello"/>
<element greeting="nice weather!"/>
</element>
</testVectorVector2>
<testVector2>
<element>a</element>
<element>b</element>
<element>c</element>
<element>d</element>
<element>e</element>
<element>f</element>
</testVector2>
</TestDTO>
XMLCreator can easily be compiled into an AIR app, which will convert your DTOs to XML files and save them on your hard drive.
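A minimal sketch of the saving part, assuming an AIR project (flash.filesystem is only available in AIR). XMLExporter, the save helper, the target folder and the file name are all made up for this illustration, and the inline XML stands in for the result of xmlCreator.parse(dto).

```actionscript
package {
    import flash.display.Sprite;
    import flash.filesystem.File;
    import flash.filesystem.FileMode;
    import flash.filesystem.FileStream;

    // Hypothetical AIR document class: writes generated XML to the desktop
    public class XMLExporter extends Sprite {
        public function XMLExporter() {
            // stand-in for: var xml:XML = new XMLCreator().parse(dto);
            var xml:XML = <TestDTO testString="hello world"/>;
            save(xml, "TestDTO.xml");
        }

        private function save(xml:XML, fileName:String):void {
            var file:File = File.desktopDirectory.resolvePath(fileName);
            var stream:FileStream = new FileStream();
            stream.open(file, FileMode.WRITE); // synchronous open, overwrites an existing file
            try {
                stream.writeUTFBytes(xml.toXMLString());
            } finally {
                stream.close(); // always release the file handle
            }
        }
    }
}
```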
I hope I will never see handwritten, custom-parsed XML again!
You are an ActionScript (not an XML!) developer after all!
Enjoy!
References:
[1] http://java.sun.com/blueprints/corej2eepatterns/Patterns/DataAccessObject.html
[2] http://en.wikipedia.org/wiki/Data_Transfer_Object
Adam, 11:26 pm on Monday, May 3rd, 2010
Grooovy!