XML Schema Design: Part 1 April 7
Introduction
This post and the ones that follow lay out some of my guidelines and best practices for creating and using an enterprise-wide XML Schema. I will start with some background in this post, then move on to the guidelines and best practices in future posts. Jack Van Hoof has a great article on his blog about Canonical Data Models (CDMs) and what they are good for. This enterprise-wide XML Schema is an implementation of a CDM and will hereafter be referred to as a Canonical XML Schema.
Background
The primary requirement of a Canonical XML Schema and its related data model is to provide a standard format in which all content is distributed, thereby requiring applications to adhere to this common format. If a new application is added to the platform, only a transformation between that application's own format and the Canonical XML Schema is needed to allow it to produce or consume the required content.
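To make the idea of a single transformation per application concrete, here is a minimal Python sketch. The application format (`cust`, `nm`, `ph`), the canonical element names (`Customer`, `Name`, `Phone`), and the `to_canonical` function are all hypothetical, invented for illustration; a real enterprise might well use XSLT for this step instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical application-specific record (not taken from any real system).
APP_XML = "<cust><nm>Ada Lovelace</nm><ph>555-0100</ph></cust>"

# Hypothetical mapping from the application's element names to canonical ones.
FIELD_MAP = {"nm": "Name", "ph": "Phone"}


def to_canonical(app_xml: str) -> str:
    """Translate one application record into the assumed canonical form."""
    source = ET.fromstring(app_xml)
    target = ET.Element("Customer")
    for child in source:
        canonical_name = FIELD_MAP.get(child.tag)
        if canonical_name is None:
            # Fail loudly rather than silently dropping unmapped content.
            raise ValueError(f"no canonical mapping for <{child.tag}>")
        ET.SubElement(target, canonical_name).text = child.text
    return ET.tostring(target, encoding="unicode")


canonical_xml = to_canonical(APP_XML)
print(canonical_xml)
```

The point of the sketch is that this one mapping is the only piece the new application has to supply; every other application on the platform only ever sees the canonical form.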
In addition, 5 criteria should be considered:
- Completeness - Every element in the source schemas should be present in the new schema.
- Minimalism - Each element should be defined only once.
- Expandability - The schema should be able to anticipate data that may not have originally been found in any of the source schemas; that is, it should allow its use to grow rather than hinder future use.
- Comprehension - The schema should be formulated in a way that allows for easy browsing and querying.
- Performance - Understanding how the content in the XML documents supported by the schema will be used can help in determining some of the structure within the schema. For instance, if one intended use of the produced XML is to provide rapid searching, then the schema should be structured to support fast searches.
Keep in mind that these criteria are often at odds with one another. For example, designs that emphasize expandability do so at the risk of deemphasizing performance and comprehension.
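Some of these criteria can be checked mechanically. As a rough illustration of the Minimalism criterion, the Python sketch below counts how often each element name is declared in a schema. The sample schema and the `duplicate_element_names` helper are hypothetical, and the check is only a heuristic: it also flags local elements that legitimately share a name in different contexts.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Namespace of the XML Schema language itself, in ElementTree's {uri} notation.
XS = "{http://www.w3.org/2001/XMLSchema}"

# Hypothetical schema in which "Address" is declared twice.
SCHEMA = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Address" type="xs:string"/>
  <xs:element name="Customer">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Name" type="xs:string"/>
        <xs:element name="Address" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""


def duplicate_element_names(xsd_text: str) -> list[str]:
    """Return element names that are declared more than once in the schema."""
    root = ET.fromstring(xsd_text)
    counts = Counter(
        el.get("name")
        for el in root.iter(XS + "element")
        if el.get("name") is not None  # skip ref="..." usages
    )
    return sorted(name for name, count in counts.items() if count > 1)


duplicates = duplicate_element_names(SCHEMA)
print(duplicates)
```

In this sample, declaring `Address` once globally and referencing it with `ref="Address"` inside `Customer` would satisfy the criterion.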
Why Guidelines?
A common problem with the XML content being produced and consumed by applications within many enterprises is a lack of standards and guidelines for creating it. A Canonical XML Schema enforces adherence to a single structure, and thereby to the guidelines and best practices embodied in the schema itself. In turn, the Canonical XML Schema must itself be built following guidelines and best practices. These need to be documented so that producers and consumers of XML content understand why the model is designed the way it is and how to expand upon that design when necessary.
Think about a group of systems that have grown over the years and are communicating with each other via XML (or even without XML). Once more than 2 systems are talking to each other, it makes sense to make the communication pipeline as generic as possible, and a Canonical XML Schema will help you do that.
Communication without a Canonical XML Schema
As the first diagram shows, the enterprise described performs 9 translations of data, one for each pairing of applications. As applications are added, the number of translations grows quadratically rather than linearly, because each new application may need a separate translation for every application it exchanges data with.
Communication using a Canonical XML Schema
In the second diagram, only 6 translations are performed, and the number of translations grows linearly as new systems are put online. Each new application needs only one translation of data, either from the new application to the Canonical XML Schema (if it’s a producer) or from the Canonical XML Schema to the new application (if it’s a consumer).
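The arithmetic behind the two diagrams can be sketched in a few lines of Python. This assumes that each application is either a producer or a consumer and that every producer feeds every consumer; 3 producers and 3 consumers is one configuration consistent with the counts of 9 and 6, though the diagrams may have been laid out differently. The function names are made up for this example.

```python
def point_to_point(producers: int, consumers: int) -> int:
    """Translations needed when every producer maps directly to every consumer."""
    return producers * consumers


def via_canonical(producers: int, consumers: int) -> int:
    """Translations needed when each application maps only to or from the canonical schema."""
    return producers + consumers


# Assumed configuration: 3 producers and 3 consumers.
print(point_to_point(3, 3), via_canonical(3, 3))

# Adding a fourth producer costs 3 more direct translations, but only 1 canonical one.
print(point_to_point(4, 3), via_canonical(4, 3))
```

Under these assumptions, the cost of adding an application to a point-to-point setup depends on how many applications are already there, while with a canonical schema it is always one translation.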
Next…
Part 2 will describe some of the best practices and guidelines and Part 3 will go into more depth around abstraction of elements and walking the thin line between expandable and understandable.


