Layered Programming


This article presents Layered Programming as a method for solving a variety of problems that people involved in software development are faced with on a day-to-day basis.

In a nutshell, layered programming means adding useful information to source code in such a way that it interferes neither with the code nor with other layers. The code itself is the information layer that provides the basic structure.

In most programming languages, comments make up an information layer that fits around the structure of the code. Specialised tools such as Javadoc and Doxygen make use of code comments. Other tools, such as the compiler, ignore the comments and operate solely on the code. This already illustrates the usefulness of a clean separation of information layers.
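As a rough illustration of these two layers, the sketch below uses Python's standard tokenize and ast modules. The sample source and the helper names comment_layer and code_layer are invented for this example. A documentation tool would read the comment tokens, while the syntax tree (the compiler's view) is identical with or without them.

```python
import ast
import io
import tokenize

SOURCE = '''\
# Returns the sum of two numbers.
def add(a, b):
    return a + b  # plain addition
'''

def comment_layer(source):
    """Collect the comment tokens, i.e. the layer a documentation tool reads."""
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    return [tok.string for tok in tokens if tok.type == tokenize.COMMENT]

def code_layer(source):
    """Parse the source into a syntax tree, which carries no comments."""
    return ast.dump(ast.parse(source))

comments = comment_layer(SOURCE)
tree_with_comments = code_layer(SOURCE)
tree_without_comments = code_layer("def add(a, b):\n    return a + b\n")
```

Comparing tree_with_comments and tree_without_comments shows that the comment layer leaves no trace in the code layer.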

Taking the concept of information layers a step or two further, we'll look at how XML Namespaces can provide effective relief from inflexible source code syntax. The article is written for users of o:XML, people with an interest in the theory of software engineering, and curious programmers in general.

Developmental Problems

Increasingly, software developers are required to manage additional information related to their code. This typically involves deployment and configuration data, persistence details, cross-platform details and suchlike. This information often ends up in separate configuration files, which can be a nightmare to maintain.

One alternative is to put the data in code comments, which is often more convenient than maintaining a superabundance of configuration files. Examples include conventions of several J2EE environments such as WebSphere and WebLogic, and certain types of Java metadata, such as the "@deprecated" tag.

However, the lack of clear separation between documentation and important metadata causes several problems. Making changes becomes error-prone - updating the documentation can lead to an unexpected failure later. Existing documentation tools must be instructed to ignore the extra information, or they can be expected to produce nonsensical output. It also forces the deployment data to be expressed in an awkward way to fit into the commenting style. Despite the many drawbacks, the alternative of managing numerous overlapping, cross-referencing configuration and source files is even more daunting.

Another common approach is to extend the code format to hold bespoke or specialised data. There are countless examples, from Oracle's Pro*C and Qt's Meta-Objects to modern Aspect Oriented Development systems. Compilers tend to reject such extensions unless they have been extended as well, so the code must be pre-processed before compilation. This is usually done through some flavour of macro-expansion, substituting our bespoke directives with actual code. Not only is this a fragile process in itself, but there also remains the problem of content 'bleed'. The two information layers are co-dependent and inseparable - changes in one are likely to invalidate the other. To make matters worse, the extended code format can only be parsed by a bespoke pre-processor (or bespoke compiler), which complicates the development process and rules out independently developed tools.

XML To The Rescue

So what can XML do to improve the situation? Firstly it is worth noting that XML is a general-purpose structured data meta-format, meaning that we can find or devise an XML vocabulary for almost any purpose. Secondly, what is important for us here is that multiple vocabularies can be combined, without content bleed, using XML namespaces. This simple and elegant device keeps multiple layers of information nicely separated without compromising on expressivity.
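To make the point about namespaces concrete, here is a minimal sketch using Python's standard xml.etree.ElementTree module. The two vocabularies, their urn:example URIs and the helper names_in are invented for illustration. Both vocabularies define an element called class, yet each layer only ever sees its own.

```python
import xml.etree.ElementTree as ET

# Placeholder namespace URIs, invented for this sketch.
CODE_NS = "urn:example:code"
DEPLOY_NS = "urn:example:deploy"

DOCUMENT = f"""
<code:class xmlns:code="{CODE_NS}" xmlns:deploy="{DEPLOY_NS}" name="Account">
  <deploy:class name="account-bean"/>
  <code:method name="balance"/>
</code:class>
"""

root = ET.fromstring(DOCUMENT)

def names_in(namespace):
    """Return the name attribute of every <class> element in one namespace."""
    return [el.get("name") for el in root.iter(f"{{{namespace}}}class")]

code_classes = names_in(CODE_NS)      # the code layer's view
deploy_classes = names_in(DEPLOY_NS)  # the deployment layer's view
```

Although the local names collide, the qualified names do not, so a tool written for one layer needs no knowledge of the other.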

There are more benefits of XML. Using markup means that the information is transparent, easily harvested with standardised tools. Writing code analysis and processing tools no longer requires interfacing with complex parsers, but can be done using DOM, SAX, XSLT and similar commonplace, generic technologies. Whatever your programming language of choice, XML processing libraries are readily available.
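As one possible example of such generic tooling, the following sketch uses Python's standard xml.sax module to stream through a source file and count the elements belonging to each layer. The sample document, the urn:example URIs and the names LayerCounter and count_layers are assumptions made for this example, not part of o:XML.

```python
import io
import xml.sax
from collections import Counter
from xml.sax.handler import ContentHandler, feature_namespaces

class LayerCounter(ContentHandler):
    """Count elements per namespace URI as the parser streams through."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def startElementNS(self, name, qname, attrs):
        uri, _local_name = name  # name is a (namespace URI, local name) pair
        self.counts[uri] += 1

def count_layers(xml_text):
    """Parse xml_text with namespace processing on and return the counts."""
    parser = xml.sax.make_parser()
    parser.setFeature(feature_namespaces, True)
    handler = LayerCounter()
    parser.setContentHandler(handler)
    parser.parse(io.BytesIO(xml_text.encode("utf-8")))
    return dict(handler.counts)

SAMPLE = """
<o:function xmlns:o="urn:example:o" xmlns:doc="urn:example:doc" name="foo">
  <doc:p>First paragraph.</doc:p>
  <doc:p>Second paragraph.</doc:p>
  <o:do/>
</o:function>
"""

layer_sizes = count_layers(SAMPLE)
```

No language-specific parser is involved here. The same handler would work on any well-formed XML source, whatever vocabularies it combines.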

The first challenge we face is to express the code itself in XML. If you're using o:XML, this is clearly not an issue. If not, then something like MLML can be used to capture the outline of the code - the classes, function definitions etc.

Layers of Information

Other articles on this website explain how to integrate documentation and unit tests with o:XML source code, using the XML namespace feature. Both of these orthogonal extensions form a layer of information on top of the existing code structure.

Example 1: o:XML code with documentation and unit tests

<o:function name="foo">
  <doc:p>The foo() function does very little.</doc:p>
  <o:do><!-- very little --></o:do>
  <ut:test>
    ...test definition...
  </ut:test>
</o:function>

The same example in SML would look like this:

o:function name="foo"{
  doc:p{"The foo() function does very little."}
  o:do{
    // very little
  }
  ut:test{"...test definition..."}
}

The function definition in the example is overlaid with documentation and unit-test information, both existing in their own, well-defined namespaces. Neither extension interferes with the execution of the program - in fact no layer interferes with any other. The documentation is directly accessible to standard XML tools - producing nicely formatted HTML or PDF documentation requires little more than an XSL stylesheet. Unit tests are also extracted using XSLT, both to generate executable test suites and statistics about test coverage.
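The article's own tool chain relies on XSLT for this. As a hedged alternative, the sketch below shows the same idea with Python's standard xml.etree.ElementTree module. The namespace URIs are placeholders rather than the real o:XML ones, the ut prefix for the test layer is an assumption, and extract_docs and strip_layers are hypothetical helpers. The first harvests the documentation layer, and the second removes the extra layers to leave only the code.

```python
import copy
import xml.etree.ElementTree as ET

# Placeholder URIs for this sketch, not the real o:XML namespaces.
NS = {"o": "urn:example:o", "doc": "urn:example:doc", "ut": "urn:example:ut"}

SOURCE = """
<o:function xmlns:o="urn:example:o" xmlns:doc="urn:example:doc"
            xmlns:ut="urn:example:ut" name="foo">
  <doc:p>The foo() function does very little.</doc:p>
  <o:do/>
  <ut:test>...test definition...</ut:test>
</o:function>
"""

def extract_docs(root):
    """Map each function name to the text of its doc:p paragraphs."""
    docs = {}
    for func in root.iter(f"{{{NS['o']}}}function"):
        docs[func.get("name")] = [p.text for p in func.findall("doc:p", NS)]
    return docs

def strip_layers(root, prefixes=("doc", "ut")):
    """Return a copy of the tree with the given layers removed."""
    stripped = copy.deepcopy(root)
    unwanted = tuple(f"{{{NS[p]}}}" for p in prefixes)
    # Materialise the traversal first so removals cannot disturb it.
    for parent in list(stripped.iter()):
        for child in list(parent):
            if child.tag.startswith(unwanted):
                parent.remove(child)
    return stripped

root = ET.fromstring(SOURCE)
docs = extract_docs(root)
code_only = strip_layers(root)
remaining = [child.tag for child in code_only]
```

Because strip_layers works on a copy, the original tree keeps all three layers, while code_only contains nothing but elements from the code namespace.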

Thanks to the versatility of XML, it is possible to create information extensions that capture essential information without affecting other data - hence the term orthogonal.

This opens up endless possibilities for software engineering, where a variety of data surrounding the planning, design, implementation, testing and deployment of an application has to be carefully managed.

XML has been adopted as the data representation for wide-ranging types of information, from UML diagrams to digital signatures, even animated graphics and sounds. Using namespaces, different vocabularies can be combined freely.

Another important point is that layering information with XML allows specific vocabularies to be designed that capture the necessary information in the format most suitable for that domain. No compromises are needed to avoid similarities with programming constructs or keywords. The same method could not easily be used with keyword-based programming languages, unless an XML (or XML-like) structure is put in place (using for example MLML).

In the above examples the code provides the structure for all extensions, but it is equally possible to embed code in an external superstructure. An example of this is literate programming, where code is embedded in the wider narrative of documentation.
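A minimal sketch of this inverted arrangement, again in Python with the standard xml.etree.ElementTree module: the documentation provides the structure, and the code fragments are collected from it in document order (a step that literate programming tools call tangling). The doc and code vocabularies, their URIs and the tangle function are invented for this example.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabularies: "doc" is the narrative, "code" marks embedded code.
NARRATIVE = """
<doc:article xmlns:doc="urn:example:doc" xmlns:code="urn:example:code">
  <doc:p>We start by defining a greeting.</doc:p>
  <code:fragment>greeting = "hello"</code:fragment>
  <doc:p>Then we derive a louder variant from it.</doc:p>
  <code:fragment>shout = greeting.upper() + "!"</code:fragment>
</doc:article>
"""

def tangle(document):
    """Join the embedded code fragments, in document order, into one program."""
    root = ET.fromstring(document)
    fragments = root.iter("{urn:example:code}fragment")
    return "\n".join(frag.text.strip() for frag in fragments)

program = tangle(NARRATIVE)

# Run the extracted program in an isolated namespace to show it is real code.
scope = {}
exec(program, scope)
```

The narrative layer can be rendered as documentation without touching the fragments, just as the fragments can be extracted without reading the narrative.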

For a more detailed discussion of the pros and cons of XML code representations, with examples from Extreme Programming, Design By Contract and Aspect Oriented Programming, see the Extreme Markup 2003 article XML and the Art of Code Maintenance.