| date: | 2007-01-02 15:55:27 |
|---|---|
| category: | The Lure of XML |
While there are lame reasons for using XML, there are also good reasons. The good reasons, however, aren’t delightfully clear. They seem to be clouded by the lame issues. I’m trying to sort out the best and most logical reasons for using XML.
Here are 8 typically lame reasons for using XML. The “Stamp on the ants” posting noted this exchange in The Server Side, “Raven 1.1: Build Java with Ruby” thread. Note that this is confined to build tools, where scripted actions are an integral part of the problem. A lot of this discussion doesn’t apply to other problem domains.
I found these to be typical of many technology decisions: Incumbency, False Dichotomies, Inflated Opportunity Cost, and a flat-out Misrepresentation are the essence of these arguments for XML.
There is, however, a good value proposition for XML, even in the build-tool domain where scripting is an essential part of the problem and any solution. In “A Good Reason for XML”, I tried to state it as “Semantic Richness”, that is, “how well it describes the problem.” This isn’t really very good, and another comment challenged this glib simplification.
Declarative vs. Imperative Knowledge Representation.
The big issue in software-world is knowledge representation. Ant or SCons have some sophistication because they have two levels of knowledge representation: a collection of algorithms plus a control file that is used by those algorithms. Generally, we capture knowledge about building a program and represent it in the control file. We don’t tinker much with the knowledge embodied in the algorithms.
We have two strategies for knowledge representation: declarative and imperative. This is sometimes called the “What vs. How” distinction. We can declare what we want – the desired end-state – and leave it to our collection of algorithms to reason out how it gets done; the tool derives the imperative steps to get there. Or, we can just write down how to do the job; we manually develop the imperative steps.
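As a minimal sketch of the “What vs. How” distinction, here is a toy build written both ways in Python. The file names, the `DEPENDS_ON` table, and both functions are hypothetical illustrations, not taken from Ant, SCons, or any other tool.

```python
# Hypothetical miniature build; the names and file layout are illustrative only.

# Imperative: we write down the steps ourselves, in the order we chose.
def build_imperatively(log):
    log.append("compile main.c")
    log.append("compile util.c")
    log.append("link app")


# Declarative: we only state what each target depends on ...
DEPENDS_ON = {
    "app": ["main.o", "util.o"],
    "main.o": ["main.c"],
    "util.o": ["util.c"],
}


# ... and a generic algorithm derives the steps from those declarations.
def derive_steps(target, depends_on, done=None, steps=None):
    done = set() if done is None else done
    steps = [] if steps is None else steps
    if target in done:
        return steps
    done.add(target)
    # Visit prerequisites first so they are built before the target.
    for dep in depends_on.get(target, []):
        derive_steps(dep, depends_on, done, steps)
    # Plain source files have no entry in the table, so they need no action.
    if target in depends_on:
        steps.append(f"build {target}")
    return steps
```

Calling `derive_steps("app", DEPENDS_ON)` produces the build order without anyone having written that order down; changing the table changes the steps.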
The first build systems were purely imperative: we recompiled the world. This was often an inefficient use of precious computing resources, so we invented more sophisticated tools. I implemented one back in the early ’80s that ran in Univac’s Exec-8, using really ancient and obscure software tools. The purpose then, as now, was to recompile just enough and no more.
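The “just enough and no more” rule can be sketched as a timestamp comparison. This is an illustration only: the function name `is_stale` is made up here, and real build tools apply their own (often more elaborate) rules for deciding what is out of date.

```python
import os


def is_stale(target, sources):
    """Return True when target is missing or older than any of its sources."""
    if not os.path.exists(target):
        return True
    target_time = os.path.getmtime(target)
    # Rebuild if at least one source was modified after the target was made.
    return any(os.path.getmtime(src) > target_time for src in sources)
```

A build loop would call something like this for each target and skip the compile step whenever it returns `False`.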
GNU Make, Ant, SCons, Raven, Maven, etc., have two explicit purposes: to minimize recompilation, and to automate the myriad packaging steps required by our deployment architectures. They have grown, however, to embrace an additional requirement: represent metadata about the software being built. This additional requirement is captured in “XML is first class, scripting languages are second class”. How do we report or analyze the information captured in our build tools?
The Preference for Declarative Knowledge.
Declarative knowledge representation is, clearly, preferred. “Imperative tools (whether based on XML ala Ant or a “real” scripting language ala Raven) are inevitably going to be less productive than a declarative tool...” This is also implied by the “XML is first class” response.
The declarative style has a number of advantages. Primarily, it allows us to analyze the declarations to create reports about our application software. Since the job is information capture, we have every reason to demand full value from that knowledge. Also, a declarative style can allow swapping the toolset without breaking the declared relationships in the build configuration file.
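To illustrate the reporting advantage, here is a small sketch that reads an Ant-style build file and lists what each target declares it depends on, without running anything. The `BUILD_XML` content and the `target_report` function are invented for this example; only the `target` element and its `name` and `depends` attributes follow Ant's conventions.

```python
import xml.etree.ElementTree as ET

# Hypothetical Ant-style build file, made up for this illustration.
BUILD_XML = """
<project name="demo" default="dist">
  <target name="init"/>
  <target name="compile" depends="init"/>
  <target name="dist" depends="compile,init"/>
</project>
"""


def target_report(xml_text):
    """Map each target name to the list of targets it declares it depends on."""
    root = ET.fromstring(xml_text.strip())
    report = {}
    for target in root.findall("target"):
        # The depends attribute is a comma-separated list of target names.
        depends = target.get("depends", "")
        report[target.get("name")] = [
            name.strip() for name in depends.split(",") if name.strip()
        ]
    return report
```

Because the relationships are data rather than steps buried in a script, the same file could feed a dependency diagram or an audit report just as easily.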
This declarative ideal can be met a number of ways:
Murky Ant vs. Maven Issue.
While declarative knowledge has all the advantages, here’s an interesting quote: “Imperative tools (whether based on XML ala Ant or a “real” scripting language ala Raven) are inevitably going to be less productive than a declarative tool, and this was a large part of the reason I switched from Ant to Maven some years ago.” This is a bit confusing.
This quote sounds like Ant is more imperative than Maven, and therefore less desirable. However, I’m confused because of the following:
If anything, I’d think that Maven would be less declarative. Clearly, I’m missing something. Likely, I don’t understand enough of the DTD (or Schema) for Maven to see how it is more declarative in spite of the inclusion of Jelly.
Bottom Line.
It’s clear that purely declarative knowledge is ideal. It’s also just as clear that imperative knowledge is essential to success.
Interestingly, build systems seem to typify applications that can’t be done as purely declarative knowledge. We’re always adding imperative hooks to control files. Purely imperative knowledge (e.g., a shell script) is undesirable.
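A rough sketch of what such a mixture can look like: a mostly declarative control table in which one entry carries an imperative hook. Everything here (`TARGETS`, `stamp_version`, `run`) is hypothetical and not modeled on any particular tool's format.

```python
def stamp_version(log):
    # Imperative hook: arbitrary steps the declarations have no way to express.
    log.append("write version.txt")


# Hypothetical control data: mostly declarative, with one imperative hook.
TARGETS = [
    {"name": "compile", "action": "javac"},
    {"name": "stamp", "hook": stamp_version},
    {"name": "package", "action": "jar"},
]


def run(targets):
    """Walk the control data, calling hooks where present."""
    log = []
    for target in targets:
        if "hook" in target:
            target["hook"](log)
        else:
            log.append(f"{target['action']} ({target['name']})")
    return log
```

The declarative entries stay easy to analyze, while the hook is opaque to any reporting tool; that trade-off is the tension described above.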
Further, the more we look, the more we see different mixtures of declarative and imperative knowledge representation techniques. XML is imperative light (even with Jelly), Python/SCons is declarative with easy addition of imperative scripts, and the GNU Make DSL is imperative heavy.
The Ant vs. Maven distinction still needs some clarification. However, the preference for a declarative knowledge representation makes compelling sense. XML is best suited to representing declarative knowledge.