Step 4: Children annotations

In the previous step, the rule that applied the XSLT transformation performed an implicit step. The XSLT processing step needs XML data as its input, so the source file first had to be parsed into XML before the XSLT could be applied to it. The explicit processing step that parses raw data into XML is called xml-load, so the same rule can be written explicitly as follows:

<rule source-suffix=".xml" target-suffix=".html">
  <xml-load annotate-with-source="no"/>
  <xslt stylesheet="scripts/tr04.xsl"/>
</rule>

The raw data from the source file goes into the xml-load processing step and comes out as XML data. The XML data goes into the xslt processing step and (in this case) comes out as XHTML. Finally, this XML data is serialized and written into the target file.

As a bit of trivia, a rule containing just a single xml-load processing step behaves much like a straight file copy. The difference is that the target file might not be an exact syntactic copy of the source, because of changes introduced by XML parsing and serialization (e.g. processing entity references or changing character encodings). And, of course, it will fail on non-XML source files.

Note that a rule can have many processing steps. For example, you could load the XML, transform it with XSLT, transform that result with a different XSLT script, apply yet another XSLT script, and then save the result to the target file. See the reference section for further details. However, one XSLT script is usually sufficient.
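
As a sketch of such a chain, the rule below runs two xslt processing steps one after the other. The two stylesheet file names are hypothetical, chosen only for illustration; the rule, xml-load and xslt elements are the ones already used in this step.

<rule source-suffix=".xml" target-suffix=".html">
  <xml-load annotate-with-source="no"/>
  <!-- first pass (hypothetical stylesheet name) -->
  <xslt stylesheet="scripts/first-pass.xsl"/>
  <!-- second pass (hypothetical stylesheet name): its input is the output of the first pass -->
  <xslt stylesheet="scripts/second-pass.xsl"/>
</rule>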

The xml-load processing step can be used to annotate the parsed XML. The XSLT script can then use those annotations to generate a richer Web page. There are a number of different annotations that can be applied. We'll start by examining the children annotation. The annotations are placed inside the xml-load element, changing the rule to:

<rule source-suffix=".xml" target-suffix=".html">
  <xml-load>
    <children file-name="index.xml"/>
  </xml-load>
  <xslt stylesheet="scripts/tr04.xsl"/>
</rule>

The children annotation is useful for creating navigation links to lower subsections of the Web site. This is where the directory structure of the source tree becomes important. The children annotation starts with the directory containing the source file currently being processed. It looks in all the subdirectories under that directory and finds the files whose names exactly match the file-name attribute. For example, if it is processing the ~source~/hardware/index.xml file, the children files are ~source~/hardware/mp100/index.xml, ~source~/hardware/mp110/index.xml, ~source~/hardware/mp120/index.xml, ~source~/hardware/mp130/index.xml, and ~source~/hardware/omp/index.xml.

The children annotation appends an element to the contents of the root element of the parsed source file. This element is named children and is in the Transbuild annotation namespace, http://hoylen.com/ns/xmlns/2002/transbuild/annotation. Inside that element, it places a copy of the root element of each parsed children file. Those root elements are further annotated with a source attribute (from the annotation namespace) indicating which source file each came from. For example, with the ~source~/hardware/index.xml file, the output from the xml-load processing step will be XML data containing something like:

<article
 xmlns:TBA="http://hoylen.com/ns/xmlns/2002/transbuild/annotation"
 TBA:source="transbuild://hardware/index.xml">

  <title>Hardware</title>
  <para>Our products incorporate leading edge technology with award
  winning design which is both beautiful and functional.</para>

  <TBA:children>
    <article TBA:source="transbuild://hardware/mp100/index.xml">
      <title>MP100</title>
      <para>The MP100 improved on the original design.</para>
    </article>

    <article TBA:source="transbuild://hardware/mp110/index.xml">
      <title>MP110</title>
      <para>An upgrade of MP100.</para>
    </article>

    <article TBA:source="transbuild://hardware/mp120/index.xml">
      <title>MP120</title>
      <para>An upgrade of MP110.</para>
    </article>

    <article TBA:source="transbuild://hardware/mp130/index.xml">
      <title>MP130</title>
      <para>The MP130 is a more advanced model.</para>
    </article>

    <article TBA:source="transbuild://hardware/omp/index.xml">
      <title>OMP</title>
      <para>The OMP is the first in our product range. A world first
            when it was released.</para>
    </article>
  </TBA:children>
</article>

The values of the TBA:source attributes are URIs using the "transbuild" scheme. Such a URI is equivalent to the name of a file in the source tree, and can be used in the same way (e.g. passed to the TBF:href function). If you ever see a URI of this form, treat it as a reminder that it refers to something in the source tree file space.

Notice that the document's root element also has a TBA:source attribute added to it. Its value refers to the file currently being processed. If you want to suppress it, add an annotate-with-source attribute to the xml-load element and set its value to no (as was done in the first example in this step). If the attribute is not present, its value defaults to yes. You can also suppress the other TBA:source attributes by putting the same attribute on the children element. However, you'll probably never do this, because the source attribute is very useful.
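
For illustration, the rule below sets the attribute on the children element, as the paragraph above describes. It is a sketch that assumes the attribute is written there exactly as it is on xml-load. With it, the copied child root elements would carry no TBA:source attribute, so a stylesheet would have nothing to pass to TBF:href for them.

<rule source-suffix=".xml" target-suffix=".html">
  <xml-load>
    <!-- suppress TBA:source on the copied child root elements only -->
    <children file-name="index.xml" annotate-with-source="no"/>
  </xml-load>
  <xslt stylesheet="scripts/tr04.xsl"/>
</rule>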

All TBA:source values generated by annotations are in a canonical form. This means you can test if two files are the same by using a string equality test on the TBA:source attributes.
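
A minimal sketch of such a test in XSLT is shown below. It compares each child's TBA:source with that of the document's root element and lists only the children that refer to a different file. With the children annotation alone the two values should never be equal, so the fragment only illustrates the comparison itself; it assumes the TBA prefix is bound to the annotation namespace in the stylesheet.

<xsl:for-each select="/article/TBA:children/article">
  <!-- plain string comparison of two canonical TBA:source values -->
  <xsl:if test="not(@TBA:source = /article/@TBA:source)">
    <li><xsl:value-of select="title"/></li>
  </xsl:if>
</xsl:for-each>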

If you want to see what annotations are added, there are two ways to find out. The first is to remove the xslt processing step from the rule, causing the annotated XML to be written straight into the target file. It won't be pretty-printed like the above example, so you may need an XML viewer or editor to make sense of it all (a browser like Mozilla 1.2 will display XML files if they have a .xml extension). The second is to turn on the debugging trace (see the reference section for details).
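
The first method might look like the rule below, which is the rule from this step with the xslt processing step removed. The .xml target suffix is an assumption made here so that a browser treats the result as XML; it is not taken from the tutorial's build script.

<rule source-suffix=".xml" target-suffix=".xml">
  <xml-load>
    <children file-name="index.xml"/>
  </xml-load>
  <!-- no xslt step: the annotated XML is serialized as-is -->
</rule>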

In the XSLT stylesheet, the "article" template is modified to use the children annotations to create links to the children. The TBF:href XPath function is applied to the value of each TBA:source annotation attribute to create the hyperlink.

<xsl:template match="article">
  <body>
    <div class="nav-top">
      <ul>
        <li><a href="{TBF:href('/index.xml')}">Home</a></li>
      </ul>
    </div>

    <h1><xsl:value-of select="title"/></h1>

    <div class="nav-sub">
      <xsl:if test="/article/TBA:children/article">
        <ul>
          <xsl:for-each select="/article/TBA:children/article">
            <li>
              <a href="{TBF:href(@TBA:source)}">
                <xsl:value-of select="title"/>
              </a>
            </li>
          </xsl:for-each>
        </ul>
      </xsl:if>
      <xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text>
    </div>

    <div class="main">
      <xsl:apply-templates/>
    </div>
  </body>
</xsl:template>

The xsl:text containing a single non-breaking space is a work-around so that the page will render properly when there are no children. (Is this a browser bug? Is there a better solution?)
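
One possible alternative, offered as a sketch rather than a tested fix for this page, is to write the non-breaking space as the numeric character reference for U+00A0 and drop disable-output-escaping, which XSLT 1.0 processors are not required to support. The serializer then writes the character itself, or an equivalent reference, into the output.

<!-- &#160; is the non-breaking space character; no output escaping tricks needed -->
<xsl:text>&#160;</xsl:text>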

Since the parsed XML data now has the extra annotations, we don't want them to be processed by xsl:apply-templates. An extra template is added to make sure that the XSLT processor will ignore them.

<xsl:template match="TBA:*">
  <!-- ignore annotation elements -->
</xsl:template>

This step can be tested by running Transbuild on the build script file tb-step04.xml.