Typed XML programmer -- Where do you want to go tomorrow?


This post starts a series (of blog posts) on what I would like to call “Typed XML programming”. The overall goal of the series is to engage in a discussion on requirements, scenarios and priorities around typed XML programming. The first post sets up some real basics, poses some questions, and hopefully triggers appetite in getting back to this thread.

 

What is typed XML programming anyway?

 

In elevator-pitch terms, I mean “XML programming in mainstream OO languages like C#, Java and VB while leveraging XML schemas as part of the XML programming model”. I am leaving XSLT, XQuery and other DSLs out of scope for the present series, if you don’t mind. Otherwise, I would like to go for a broad definition of XML programming including scenarios such as (i) consuming XML as input for an application; (ii) producing XML as output of an application; (iii) operating on in-memory representations of XML; (iv) streaming over XML; (v) accessing XML in the database, and what have you.

 

Let’s start with ‘untyped’ XML programming. Here is an archetypal C# function that takes an (in-memory) XML tree with purchase orders and calculates the total over all order items of the orders with a given zip code (i.e., it sums up price times quantity for all such items):

 

 

// Use your favorite XML API (such as DOM or … XLinq in my case)

public static double GetTotalByZip(XElement os, int zip)
{
   double total = 0.0;
   foreach (XElement o in os.Elements("order"))
     if ((int)o.Attribute("zip") == zip)
       foreach (XElement i in o.Elements("item"))
         total += (double)i.Element("price")
                * (int)i.Element("quantity");
   return total;
}
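To make the input shape concrete, here is a minimal sketch that runs the function on a small purchase-order document. The sample XML (its zip codes, prices and quantities) and the `UntypedDemo` class name are my own assumptions for illustration; only the element and attribute names are taken from the function above.

```csharp
using System;
using System.Xml.Linq;

public static class UntypedDemo
{
    // Hypothetical sample document; only the names "orders", "order",
    // "zip", "item", "price" and "quantity" come from the function above.
    public const string SampleXml = @"<orders>
  <order zip=""98052"">
    <item><price>2.5</price><quantity>2</quantity></item>
    <item><price>10</price><quantity>1</quantity></item>
  </order>
  <order zip=""10001"">
    <item><price>4</price><quantity>3</quantity></item>
  </order>
</orders>";

    // Same logic as above: sum price * quantity over the items
    // of all orders with the given zip code.
    public static double GetTotalByZip(XElement os, int zip)
    {
        double total = 0.0;
        foreach (XElement o in os.Elements("order"))
            if ((int)o.Attribute("zip") == zip)
                foreach (XElement i in o.Elements("item"))
                    total += (double)i.Element("price")
                           * (int)i.Element("quantity");
        return total;
    }

    public static void Main()
    {
        XElement os = XElement.Parse(SampleXml);
        // 2.5 * 2 + 10 * 1 = 15 for the first order
        Console.WriteLine(GetTotalByZip(os, 98052));
    }
}
```

Note that nothing in this code states what the document is supposed to look like; that knowledge lives only in the string literals and casts.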

 

 

It is somewhat discriminatory to label the above code as ‘untyped’ since the mere use of the XML API is still subject to static type checking; also, the look-up of elements and attributes is sort of dynamically checked. Likewise, I would like to avoid restricting ‘typed’ XML programming to a narrow notion of static typing. Instead, XML types (aka XML schemas) may contribute to the XML programming model in various ways such as validation protocols, precondition checking, exception handling, intellisense, tool tips and others. For now, let me just do the most obvious thing -- assume a C# object model for the kinds of elements in the purchase-order example. (The object model may have been derived from an XML schema by a code generator like xsd.exe.) Based on such an object model, the above ‘untyped’ XLinq code is transcribed to a ‘typed’ C# function as follows:

 

 

// We presume object types for order collections, orders and order items.

public static double GetTotalByZip(orders os, int zip)
{
   double total = 0.0;
   foreach (order o in os.order)
     if (o.zip == zip)
       foreach (item i in o.item)
         total += i.price * i.quantity;
   return total;
}
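The presumed object model is not spelled out here, so the following is only a hand-written sketch of what it might look like. The class and member names (`orders`, `order`, `item`, `zip`, `price`, `quantity`) are those used by the typed function; the use of public fields, arrays and `XmlSerializer` attributes, the `TypedDemo` helper and the sample document are my assumptions, and actual xsd.exe output would differ in detail.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hand-written approximation of a generated object model (an assumption).
[XmlRoot("orders")]
public class orders
{
    [XmlElement("order")]
    public order[] order;
}

public class order
{
    [XmlAttribute("zip")]
    public int zip;

    [XmlElement("item")]
    public item[] item;
}

public class item
{
    [XmlElement("price")]
    public double price;

    [XmlElement("quantity")]
    public int quantity;
}

public static class TypedDemo
{
    // Hypothetical sample document for illustration.
    public const string SampleXml = @"<orders>
  <order zip=""98052"">
    <item><price>2.5</price><quantity>2</quantity></item>
    <item><price>10</price><quantity>1</quantity></item>
  </order>
  <order zip=""10001"">
    <item><price>4</price><quantity>3</quantity></item>
  </order>
</orders>";

    // Deserialize XML text into the object model.
    public static orders Load(string xml)
    {
        var serializer = new XmlSerializer(typeof(orders));
        using (var reader = new StringReader(xml))
            return (orders)serializer.Deserialize(reader);
    }

    // The typed function from above, unchanged in its logic.
    public static double GetTotalByZip(orders os, int zip)
    {
        double total = 0.0;
        foreach (order o in os.order)
            if (o.zip == zip)
                foreach (item i in o.item)
                    total += i.price * i.quantity;
        return total;
    }

    public static void Main()
    {
        // 2.5 * 2 + 10 * 1 = 15 for the first order
        Console.WriteLine(GetTotalByZip(Load(SampleXml), 98052));
    }
}
```

In this sketch an array field stays `null` when the corresponding elements are absent, so a more careful version would guard the loops against empty collections.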

 

 

For clarity, let’s show the diff between the untyped and typed versions.

I strike through ‘untyped slack’ (shown here as [-…-]):

public static double GetTotalByZip([-XElement-] orders os, int zip)
{
   double total = 0.0;
   foreach ([-XElement-] order o in os.[-Elements("-]order[-")-])
     if ([-(int)-]o.[-Attribute("-]zip[-")-] == zip)
       foreach ([-XElement-] item i in o.[-Elements("-]item[-")-])
         total += [-(double)-]i.[-Element("-]price[-")-]
                * [-(int)-]i.[-Element("-]quantity[-")-];
   return total;
}

 

 

So in this instance of typed XML programming, we managed to get rid of all casts and all string-encoded element and attribute names, and we might have enjoyed intellisense and tool tips as we typed in the code. Furthermore, type checking protected us from several kinds of typos, and we had to type in considerably less code anyhow. Finally, we also enjoy the object types at run-time, which help us in debugging and dispatching efforts. It sounds like typed XML programming is a good idea, but I am of course aware of contrary opinions (and I promise to get back to them later in the series). Typed XML programming certainly gets a lot of attention. For instance, check out the sheer number of technologies for XML data binding and of research efforts on programming languages for typed XML programming (cf. Comega, XJ, Xtatic, etc.).
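As a small illustration of the typo argument, here is a sketch of what happens when an element name is misspelled in the untyped style. The `TypoDemo` class, the sample element and the misspelling "prise" are made up for this example. The code compiles without complaint and fails only when it runs, whereas the corresponding typed access (say, `i.prise` on an `item` object) would be rejected by the compiler.

```csharp
using System;
using System.Xml.Linq;

public static class TypoDemo
{
    // Returns true if the misspelled look-up fails at run time.
    public static bool MisspelledLookupFails()
    {
        // Hypothetical order item for illustration.
        XElement i = XElement.Parse(
            "<item><price>2.5</price><quantity>2</quantity></item>");
        try
        {
            // Typo: "prise" instead of "price". The compiler cannot see it;
            // Element(...) returns null and the cast to double throws.
            double price = (double)i.Element("prise");
            return false;
        }
        catch (ArgumentNullException)
        {
            return true;
        }
    }

    public static void Main()
    {
        Console.WriteLine(MisspelledLookupFails());
    }
}
```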

 

 

Requirements? Scenarios? Priorities?

 

I haven’t provided much context yet for a deep discussion, but let’s assume that readers of this blog have a certain understanding of “Typed XML programming -- today”. So what I would like to do now is pose some questions, which can be summarized as follows: “Typed XML programmer -- Where do you want to go tomorrow?”

 

  1. Do we expect OO developers to understand XML types?
  2. Is XML Schema the right basis for typed XML programming?
  3. What are the MoSCoW requirements for typed XML programming?
  4. What are the key weaknesses of current XML data-binding technologies?
  5. What are the expectations or reservations regarding XML/OO ‘language cocktails’?
  6. How much do we care about X/O mapping when compared to O/R mapping?
  7. How do we (programmatically or otherwise) mediate between given XML and OO types?
  8. What other questions should have been posed here?

 

In a few days, I am getting back to you.

My plan is to mumble a bit about “Typed XML programming -- today”.

 

Ralf Lämmel

  • Looking forward to your future posts on the subject.  While I'm mostly an o/r guy I'm also very interested in what the typed programming story will be for XML in the future.
  • You've seen Ralf Lämmel's post starting a series about our research and prototyping efforts...
  • Ralf,

    I am also interested in this topic.  I've been building
    adaptable applications based solely on XML using
    a native XML db.   I often take XML structures and
    wrap them in an object to get the "typed" effect.
    Using a native XML db for storage has saved me
    lots of "XML/DB mapping" time and might be another
    topic of discussion.

    Dan


  • Hi,

    We have been developing our product, Liquid XML since 2001, and as such I'd like to provide some (hopefully unbiased) initial answers to your questions...

    1. Do we expect OO developers to understand XML types?

    XML Data Binding abstracts this requirement from the OO developer so they can concentrate on writing business logic to achieve their specific project goals more quickly and efficiently.


    2. Is XML Schema the right basis for typed XML programming?

    No, it is the right basis for defining XML data. In the same way that Relational to Object mappings need to be applied to database schema, XML to Object mappings need to be applied to XML Schema.


    3. What are the MoSCoW requirements for typed XML programming?

    M - Multi Language Support. Many of our customers use our technology to bridge the gap between heterogeneous data sources, e.g. C++ to Java. Being able to use a single tool to generate the same interface in multiple languages is very important.

    S - Object to Schema mapping.

    C - Generate HTML Help to describe the generated code library.

    W - See our next release ;-)


    4. What are the key weaknesses of current XML data-binding technologies?

    Many of the products available can't cope with the more complex aspects of the XSD standard (and it is complex). Luckily, Liquid XML copes with pretty much the lot.


    5. What are the expectations or reservations regarding XML/OO 'language cocktails'?

    When a schema is in its early stages, and is still very volatile, the need to re-generate the objects can be annoying. However, these issues get highlighted at compile time not runtime as would be the case when using DOM or SAX - saving hours in the debugger.


    6. How much do we care about X/O mapping when compared to O/R mapping?

    Ultimately, XML is typically created or consumed by an application written in a high-level language. The application typically has to deal with the data in some way before it is consumed. So if the XML data is in a more consistent ‘object’ form, then the application is simplified. X/O mapping does just that.

    Mapping from object to a relational database is a completely different problem, and one that potentially complements the use of X/O mappings, i.e. XML -> X/O -> Logic -> O/R -> DB


    7. How do we (programmatically or otherwise) mediate between given XML and OO types?

    There is a fairly logical mapping between simple types (see http://www.liquid-technologies.com/Products_XMLDataBinding_Spec.aspx).

    Complex types and their usage (e.g. choice, sequence) may by their nature require multiple classes within an OO environment (see http://www.liquid-technologies.com/Products_XMLDataBinding_ClsLkCd.aspx).

    The problem we faced in the early days is that people designed schemas in a rather verbose way, leading to many redundant layers. If these schema entities are translated directly to objects, then you typically end up with about 50% of the classes adding no value. Because of this, we created an optimisation step that removes these redundant layers, producing a simpler class library while remaining true to the source schema.


    8. What other questions should have been posed here?

    What are the performance implications of using X/O binding?


    Andrew
  • This post continues the series on “Typed XML programmer -- Where do you want to go tomorrow?”. This time,...
  • As I am working on the 3 rd post, I thought I should offer an index as follows: Typed XML programmer
