
XSLT

NB. These comments refer to XSLT 1.0 and may not be applicable to the recently ratified XSLT 2.0 languages.

XSL (the Extensible Stylesheet Language) is a W3C recommendation for converting XML documents from one format to another. This is useful in a variety of software applications, but particularly on the web. (Strictly speaking, XSL refers to everything, including the process, and XSLT refers to the transformation language itself.)

By separating the underlying content from the presentational code, you can arrange for much less data to be transferred for each page (as an XML document), because the XSLT files can be cached by compatible user agents. You also gain maintainability: by changing the XSL you change the rendering of every page on your site, and you can change the content without bothering about its rendering. You can replace the XSL to change the style of a site, or you can replace the content to create a new site with the same style.
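As a rough illustration of the idea above, a page served as XML might look something like this. The stylesheet name site.xsl is hypothetical, and type="text/xsl" is assumed to be the value the target browsers accept; check it against the user agents you care about.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- "site.xsl" is a hypothetical stylesheet shared by every page,
     so a browser can fetch it once and reuse it from cache. -->
<?xml-stylesheet type="text/xsl" href="site.xsl"?>
<Product ID="5">
  <Name>ACME Widget</Name>
</Product>
```

Only the small XML document changes from page to page; the presentation lives entirely in the shared stylesheet.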

XSL specifications can be found at http://www.w3.org/Style/XSL/ . They aren’t bad specifications to read and refer to, so print them out. You’ll need XSLT and XPath specifications (I haven’t looked into the other one, XSL-FO, because it sounds superfluous if you know and like CSS).

There are many implementational difficulties with XSL, and it’s these I’ll discuss so that you may follow this route more effectively.

There are two options when dealing with browser support. One way is to choose the minimum set of capabilities which you know to be supported. This way would see you always do the XSL transformation server-side and serve tag soup HTML 4. The better but more difficult way is to serve the best technology you know to be supported – XSLT and XML to those browsers which support it and likewise transform to XHTML instead of HTML 4 where you know it can be read.

XSL Support

Not all browsers support XSL. So, right off the bat, you’ll need to find some way of doing the XSL transformation server-side for UAs which don’t support it.

Unfortunately there is currently no way of determining from the HTTP headers whether XSLT is supported. The ideal way would be to parse the HTTP Accept: header. But XSLT doesn’t have a MIME type (it might be application/xslt+xml someday), and Accept: application/xml doesn’t indicate that an XSL processor is available.

So you’ll either have to look up compatibility with the HTTP User-Agent: header or do the transformation regardless of browser support. MSIE6 is known to have good XSL support. Firefox 0.9 has a bug in its form handling when using XSLT: you can’t always focus an input field or change the value of a select field.

I’m using the XSLT extension in PHP 4. Unfortunately, this has been removed in PHP 5 in favour of a PEAR package, so I’ll have to change my code if I upgrade. But there are many other ways of doing it, such as Apache Cocoon.

XHTML Support

XHTML support in browsers is a mess. Generally all the problems are due to MIME handling. Needless to say, Mozilla support is flawless, and in principle everyone should be migrating to XHTML. But MSIE is completely b0rked. Firstly, it has a feature called ‘Moniker’, which is a trendy way of saying its MIME type support is broken. Secondly, even if you do get it to render XHTML instead of presenting you with the option to download it, it will go into Quirks mode if you include the XML declaration (<?xml ...?>, which is optional). You don’t want Quirks mode (in any browser). Quirks mode implies various bits of non-conformance to the specifications, which differ between browsers and are designed to make your life harder (or make your life easier if you’ve done all the web design you’re ever going to do).

What I currently do is use XHTML with the application/xhtml+xml MIME type for browsers that support it (which I currently assume to be everything other than IE) and just use Tagsoup HTML with the text/html MIME type for IE.

This is harder than it sounds. The doctype and output mode for the XSL processor are configured from the <xsl:output> element within the XSLT. So if you want to change them, you have to redirect the browser to an alternative XSLT file with a different <xsl:output> element specifying HTML output and doctypes. An Apache rewrite rule can help, but I use a more maintainable method of generating that XSLT file with PHP (make sure it’s cacheable if you do it like that). (Actually, I modify the XSLT file with PHP as I serve it, but that’s just the way that fits best with my platform for serving the XSL.)
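To make the <xsl:output> point concrete, here is a minimal sketch of what the XHTML variant of such a stylesheet could look like, with the HTML alternative shown in a comment. The choice of the Strict doctypes and the page skeleton are my assumptions, not something the setup above prescribes.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- XHTML variant: XML serialisation with an XHTML doctype and
       no XML declaration. -->
  <xsl:output method="xml"
      omit-xml-declaration="yes"
      media-type="application/xhtml+xml"
      doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
      doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/>

  <!-- The generated HTML variant would swap in an element such as:
       <xsl:output method="html"
           media-type="text/html"
           doctype-public="-//W3C//DTD HTML 4.01//EN"
           doctype-system="http://www.w3.org/TR/html4/strict.dtd"/> -->

  <!-- Hypothetical page skeleton, just so the stylesheet is complete. -->
  <xsl:template match="/">
    <html>
      <head><title>Example</title></head>
      <body><xsl:apply-templates/></body>
    </html>
  </xsl:template>
</xsl:stylesheet>
```

One thing to watch: under the XSLT 1.0 rules for the html output method, elements in a non-null namespace are serialised as XML, so the HTML variant probably also needs the default xmlns declaration removed from the stylesheet element.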

XSLT/XPath 1.0 Language

XSLT 1.0 is a limited and verbose language. But many things are possible; you just have to work out how to achieve them. XSLT 2.0 and XPath 2.0 do address these concerns, but they are still working drafts at the moment.

For example, a common task would be to group block-level elements into cells of a table, say to show a page of thumbnails with captions. It’s bad to do it in a table because tables are not for layout. In theory you can do this in CSS 2.1 with display: inline-block, which not only groups things into cells as required but fills the width of the container with cells before wrapping to a new line (so you don’t have to choose how many columns to display). But there’s almost no browser support for it, so in the meantime we have to work around it with tables.

It’s hard to work out how to break the items into rows in XSLT, but after much experimentation, this is a method which works:

    <xsl:template ...>
        <xsl:param name="columns">4</xsl:param>
        <table>
            <!-- Outer loop: one pass per row, selecting the first item of each row -->
            <xsl:for-each select="item[position() mod $columns = 1]">
                <tr>
                    <!-- Inner loop: this item plus up to ($columns - 1) following siblings -->
                    <xsl:for-each select=".|following-sibling::item[position() &lt; $columns]">
                        <td>
                            <xsl:apply-templates select="."/>
                        </td>
                    </xsl:for-each>
                    <!-- Pad a short final row with a single spanning empty cell -->
                    <xsl:if test="count(following-sibling::item) &lt; ($columns - 1)">
                        <td colspan="{$columns - count(following-sibling::item) - 1}"></td>
                    </xsl:if>
                </tr>
            </xsl:for-each>
        </table>
    </xsl:template>

Notice that the outer for-each repeats once for every row that must be rendered and the inner loop iterates over the elements that should be on that row. XSL makes this hard because the only loop construct, <xsl:for-each>, operates on a node-set, not on a variable or a fixed number of times. You have to construct an XPath query that selects exactly as many nodes as the number of repetitions you need.

A simple issue that might confuse you is that there is an <xsl:if> but no corresponding <xsl:else>. In fact, the way to do it if you need if/else logic is to use <xsl:choose> with <xsl:when> for the if clause and <xsl:otherwise> for the else clause. <xsl:choose> looks disconcertingly like a switch statement from C-like languages, but in fact is equivalent to a chain of if () {} else if () {} else {} statements.
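A minimal sketch of the if / else if / else pattern, assuming a hypothetical Product element with a Stock child (neither name comes from any real schema):

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- "Product" and "Stock" are hypothetical element names. -->
  <xsl:template match="Product">
    <xsl:choose>
      <!-- if -->
      <xsl:when test="Stock &gt; 10">In stock</xsl:when>
      <!-- else if -->
      <xsl:when test="Stock &gt; 0">Low stock</xsl:when>
      <!-- else -->
      <xsl:otherwise>Sold out</xsl:otherwise>
    </xsl:choose>
  </xsl:template>
</xsl:stylesheet>
```

Only the first <xsl:when> whose test is true is used, and there is no fall-through to later branches, which is why it behaves like an if/else chain rather than a C switch.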

There is no construct in XSL for looping a fixed number of times, but you can use tail recursion. This is a good method for small iterations, but my guess is that XSL is not really aimed at doing this, and at least one XSL implementation doesn’t optimise tail recursion (which effectively screws it up for everybody). That means the usable stack depth is limited and differs between XSL processors. So I wouldn’t recommend using this technique for very deep iteration.

For example, to display the numbers 1 to 10:

    <xsl:template name="number">
        <xsl:param name="n">1</xsl:param>
        <xsl:value-of select="$n"/>
        <!-- Recurse with n + 1 until n reaches 10 -->
        <xsl:if test="$n &lt; 10">
            <xsl:call-template name="number">
                <xsl:with-param name="n" select="$n + 1"/>
            </xsl:call-template>
        </xsl:if>
    </xsl:template>
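If you do need a longer count, one commonly used alternative (not the method used above) is to split the range in half at each step, so the recursion depth grows with the logarithm of the count rather than the count itself. A sketch, with the template name range and its from and to parameters being my own hypothetical choices:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Hypothetical template: writes each integer from $from to $to,
       one per line, recursing on each half of the range. -->
  <xsl:template name="range">
    <xsl:param name="from" select="1"/>
    <xsl:param name="to" select="10"/>
    <xsl:choose>
      <!-- A single number left: output it. -->
      <xsl:when test="$from = $to">
        <xsl:value-of select="$from"/>
        <xsl:text>&#10;</xsl:text>
      </xsl:when>
      <!-- Otherwise split the range at its midpoint. -->
      <xsl:when test="$from &lt; $to">
        <xsl:variable name="mid" select="floor(($from + $to) div 2)"/>
        <xsl:call-template name="range">
          <xsl:with-param name="from" select="$from"/>
          <xsl:with-param name="to" select="$mid"/>
        </xsl:call-template>
        <xsl:call-template name="range">
          <xsl:with-param name="from" select="$mid + 1"/>
          <xsl:with-param name="to" select="$to"/>
        </xsl:call-template>
      </xsl:when>
    </xsl:choose>
  </xsl:template>

  <xsl:template match="/">
    <xsl:call-template name="range">
      <xsl:with-param name="to" select="1000"/>
    </xsl:call-template>
  </xsl:template>
</xsl:stylesheet>
```

Counting to 1000 this way nests only about ten calls deep, so it does not depend on the processor optimising tail calls.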

Finally, you ought to try to learn all of XSLT at once. There is often not more than one way to do it, which means you’ll struggle unless you know all the techniques available to you from the outset.

`<xsl:copy/>` and namespaces

<xsl:copy> copies the current node in the source tree to the result tree.

I wanted to do something a bit like this:

    <Product ID="5">
        <Name>ACME Widget</Name>
        <Description>
            <!-- raw html in here -->
            <html>
                <ul>
                    <li>...</li>
                </ul>
            </html>
        </Description>
    </Product>

and use templates like

    <xsl:template match="html">
        <xsl:apply-templates select="@*|node()" mode="raw"/>
    </xsl:template>

    <xsl:template match="@*|node()" mode="raw">
        <xsl:copy>
            <xsl:apply-templates select="@*|node()" mode="raw"/>
        </xsl:copy>
    </xsl:template>

to copy the contents of the HTML node from source to result XML trees.

Unfortunately the issue of XML namespaces complicates the whole process. Try it and you’ll find that the resulting XML document explicitly places the copied XHTML elements in no namespace, which makes XHTML-compliant web browsers fail to recognise them. Specifically, the default HTML stylesheets don’t apply and replaced elements like <img/> and <br/> are omitted.

This is because the elements in the source document are in the empty default namespace while the result document has a non-empty default namespace, so the copied elements are explicitly marked as belonging to the empty namespace (xmlns="") within the result document.

The solution to this problem is fairly easy but non-obvious.

Step 1: Recognise the fact that your namespace is changing in the source document. Effectively the first document was wrong, because its namespaces were broken: it was trying to invoke XHTML semantics on XML outside of the XHTML namespace.

    <Product ID="5">
        <Name>ACME Widget</Name>
        <Description>
            <!-- raw html in here -->
            <html xmlns="http://www.w3.org/1999/xhtml">
                <ul>
                    <li>...</li>
                </ul>
            </html>
        </Description>
    </Product>

This means that the namespace for the <html> element and all its descendants is http://www.w3.org/1999/xhtml (the XHTML namespace).

Step 2: At this point you’ll find that you can’t get an XSL template to match the <html> element at all. That’s because XPath 1.0 matches by namespace. You’ve moved the <html> element out of the empty namespace into the XHTML namespace. To fix this you must bind a prefix to the namespace so that you can match with it.

First add a prefix binding like xmlns:xhtml="http://www.w3.org/1999/xhtml" to the document element of the XSL:

    <xsl:stylesheet version="1.0"
        xmlns:xhtml="http://www.w3.org/1999/xhtml"
        xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
        xmlns="http://www.w3.org/1999/xhtml">
    ...
    </xsl:stylesheet>

This binds the prefix xhtml to XHTML.

Then replace the match rule with

    <xsl:template match="xhtml:html">
        <xsl:apply-templates select="@*|node()" mode="raw"/>
    </xsl:template>

This allows the XPath to match on the XHTML namespace.
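Putting the pieces together, a complete stylesheet along these lines might look like the sketch below. The Product template and its div wrappers are hypothetical, added only so the example stands on its own, and exclude-result-prefixes is an optional extra that keeps the unused xhtml prefix declaration out of the output.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xhtml="http://www.w3.org/1999/xhtml"
    xmlns="http://www.w3.org/1999/xhtml"
    exclude-result-prefixes="xhtml">

  <!-- Hypothetical wrapper template. Product, Name and Description are in
       no namespace in the source, so they are matched without a prefix. -->
  <xsl:template match="Product">
    <div class="product">
      <h2><xsl:value-of select="Name"/></h2>
      <div class="description">
        <xsl:apply-templates select="Description/xhtml:html"/>
      </div>
    </div>
  </xsl:template>

  <!-- The embedded markup is in the XHTML namespace, so it needs the prefix. -->
  <xsl:template match="xhtml:html">
    <xsl:apply-templates select="@*|node()" mode="raw"/>
  </xsl:template>

  <!-- Copy everything beneath it to the result tree unchanged. -->
  <xsl:template match="@*|node()" mode="raw">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()" mode="raw"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
```

Because the copied elements are now in the same namespace as the literal result elements around them, the serialiser has no reason to emit xmlns="" on them.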
