In the previous step, the rule that applied the XSLT
transformation performed an implicit step. The XSLT processing
step needs XML data as its input, so the source file first had to be
parsed into XML before the XSLT could be applied to it. The explicit
processing step that parses raw data into XML is called
xml-load, so the same rule could be written
explicitly as follows:
<rule source-suffix=".xml" target-suffix=".html">
  <xml-load annotate-with-source="no"/>
  <xslt stylesheet="scripts/tr04.xsl"/>
</rule>
The raw data from the source file goes into the
xml-load processing step and comes out as XML data.
The XML data goes into the xslt processing step and
(in this case) comes out as XHTML XML. Finally, this XML data is
serialized and written into the target file.
As a bit of trivia, a rule containing just a single
xml-load processing step behaves much like a straight file copy.
The difference is that the target file might not be an exact
syntactic copy of the source, because of changes caused by
the XML parsing and serialization process (e.g. processing entity
references, changing character encodings, etc.). And, of course, it
will fail on non-XML source files.
A rule can contain many processing steps. For example, you could load the XML, transform it with XSLT, transform that result with a different XSLT script, apply yet another XSLT script, and then save the result to the target file. See the reference section for further details. In practice, however, one XSLT script is usually sufficient.
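As a rough sketch of such a chain, the rule below loads the XML and then applies two XSLT steps in sequence. The stylesheet names scripts/first.xsl and scripts/second.xsl are hypothetical placeholders, and the empty xml-load element assumes the default annotation settings are acceptable.

```xml
<rule source-suffix=".xml" target-suffix=".html">
  <!-- parse the raw source file into XML -->
  <xml-load/>
  <!-- hypothetical first transformation -->
  <xslt stylesheet="scripts/first.xsl"/>
  <!-- hypothetical second transformation, applied to the result of the first -->
  <xslt stylesheet="scripts/second.xsl"/>
</rule>
```

Each step receives the XML data produced by the step before it, and only the output of the last step is serialized into the target file.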
The xml-load processing step can be used to
annotate the parsed XML. The XSLT script can then use those
annotations to generate a richer Web page. There are a number of
different annotations that can be applied. We'll start by examining
the children annotation. The annotations are
placed inside the xml-load element, changing the
rule to:
<rule source-suffix=".xml" target-suffix=".html">
  <xml-load>
    <children file-name="index.xml"/>
  </xml-load>
  <xslt stylesheet="scripts/tr04.xsl"/>
</rule>
The children annotation is useful for creating navigation links
to lower subsections of the Web site. This is where the directory
structure of the source tree becomes important. The children
annotation starts with the directory containing the source file
currently being processed. It looks in all the subdirectories under that
directory and collects the files whose names exactly match the
file-name attribute. For example, when
processing the ~source~/hardware/index.xml file,
the children files are
~source~/hardware/mp100/index.xml,
~source~/hardware/mp110/index.xml,
~source~/hardware/mp120/index.xml,
~source~/hardware/mp130/index.xml, and
~source~/hardware/omp/index.xml.
The children annotation appends an
element to the contents of the root element of the parsed source
file. This element has the name children and
belongs to the Transbuild annotation namespace,
http://hoylen.com/ns/xmlns/2002/transbuild/annotation
Inside that element, it places a copy of the root element of each
parsed children file. Each of those root elements is further
annotated with a source attribute (from the
annotation namespace) indicating which source file it came from. For
example, with the ~source~/hardware/index.xml
file, the output from the xml-load processing step
will be XML data containing something like:
<article
    xmlns:TBA="http://hoylen.com/ns/xmlns/2002/transbuild/annotation"
    TBA:source="transbuild://hardware/index.xml">
  <title>Hardware</title>
  <para>Our products incorporate leading edge technology with award
  winning design which is both beautiful and functional.</para>
  <TBA:children>
    <article TBA:source="transbuild://hardware/mp100/index.xml">
      <title>MP100</title>
      <para>The MP100 improved on the original design.</para>
    </article>
    <article TBA:source="transbuild://hardware/mp110/index.xml">
      <title>MP110</title>
      <para>An upgrade of MP100.</para>
    </article>
    <article TBA:source="transbuild://hardware/mp120/index.xml">
      <title>MP120</title>
      <para>An upgrade of MP110.</para>
    </article>
    <article TBA:source="transbuild://hardware/mp130/index.xml">
      <title>MP130</title>
      <para>The MP130 is a more advanced model.</para>
    </article>
    <article TBA:source="transbuild://hardware/omp/index.xml">
      <title>OMP</title>
      <para>The OMP is the first in our product range. A world first
      when it was released.</para>
    </article>
  </TBA:children>
</article>
The values of the TBA:source attributes appear
in the form of URIs using the "transbuild"
scheme. Such a URI is equivalent to the filename of a file in the source tree,
and can be used in the same way (e.g. passed to the
TBF:href function). If you ever see a URI of this
form, treat it as a reminder that it refers to something in the source tree
file space.
Notice that the document's root element also has a
TBA:source attribute added to it. Its value
refers to the currently processed file. If you want to suppress it,
add an annotate-with-source attribute to the
xml-load element, and set its value to
no (as was done in the first example in this
step). If the attribute is not present, the value defaults to yes. You can
also suppress the other TBA:source attributes by
putting the same attribute in the children element.
However, you'll probably never do this because the source attribute is
very useful.
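For illustration only, a rule that suppresses both kinds of TBA:source attribute might look like the sketch below. It assumes the attribute takes the same yes/no values on the children element as on xml-load, as described above. Note that the stylesheet used in this step reads @TBA:source to build its links, so with this rule it would have no source values to work from.

```xml
<rule source-suffix=".xml" target-suffix=".html">
  <!-- no TBA:source on the document's root element -->
  <xml-load annotate-with-source="no">
    <!-- no TBA:source on the copied children root elements either -->
    <children file-name="index.xml" annotate-with-source="no"/>
  </xml-load>
  <xslt stylesheet="scripts/tr04.xsl"/>
</rule>
```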
All TBA:source values generated by
annotations are in a canonical form. This means you can test if two
files are the same by using a string equality test on the
TBA:source attributes.
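The string comparison can be sketched in XSLT as follows. This is a minimal, hypothetical stylesheet (not one of the tutorial's scripts) that assumes input shaped like the annotated example earlier in this step; the class name "original" and the choice of the OMP page are made up for the example.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:TBA="http://hoylen.com/ns/xmlns/2002/transbuild/annotation"
    exclude-result-prefixes="TBA">

  <xsl:template match="/article">
    <ul>
      <xsl:for-each select="TBA:children/article">
        <li>
          <!-- canonical form means a plain string comparison identifies the file -->
          <xsl:if test="@TBA:source = 'transbuild://hardware/omp/index.xml'">
            <!-- hypothetical class name, used only to mark the matching entry -->
            <xsl:attribute name="class">original</xsl:attribute>
          </xsl:if>
          <xsl:value-of select="title"/>
        </li>
      </xsl:for-each>
    </ul>
  </xsl:template>

</xsl:stylesheet>
```

The same kind of test could compare two attribute values with each other, for example a child's @TBA:source against the root element's @TBA:source, rather than against a literal string.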
If you want to see what annotations are added, there are two ways
to find out. The first is to remove the xslt
processing step from the rule, causing the annotated XML to be written
straight into the target file. It won't be pretty printed like the
above example, so you may need an XML viewer or editor to make sense
of it all (a browser like Mozilla 1.2 will view XML files if they have
a .xml extension). The second method is to turn
on the debugging trace (see the reference section for details).
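The first method amounts to a rule like the sketch below, which is simply the rule from this step with its xslt step taken out. The suffixes are left as they were, so the annotated XML lands in the .html target files; whether a different target suffix would suit your viewer better is not covered here.

```xml
<rule source-suffix=".xml" target-suffix=".html">
  <xml-load>
    <children file-name="index.xml"/>
  </xml-load>
  <!-- xslt step removed: the annotated XML is serialized directly to the target file -->
</rule>
```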
In the XSLT stylesheet, the "article" template is modified to use the
children annotations to create links to the child pages. The TBF:href XPath
function is applied to the value of each TBA:source annotation
attribute to create the hyperlink.
<xsl:template match="article">
  <body>
    <div class="nav-top">
      <ul>
        <li><a href="{TBF:href('/index.xml')}">Home</a></li>
      </ul>
    </div>
    <h1><xsl:value-of select="title"/></h1>
    <div class="nav-sub">
      <!-- list the child pages only when the children annotation found some -->
      <xsl:if test="/article/TBA:children/article">
        <ul>
          <xsl:for-each select="/article/TBA:children/article">
            <li>
              <a href="{TBF:href(@TBA:source)}">
                <xsl:value-of select="title"/>
              </a>
            </li>
          </xsl:for-each>
        </ul>
      </xsl:if>
      <!-- non-breaking space work-around so the div is never empty -->
      <xsl:text disable-output-escaping="yes">&amp;nbsp;</xsl:text>
    </div>
    <div class="main">
      <xsl:apply-templates/>
    </div>
  </body>
</xsl:template>
The xsl:text containing a single non-breaking
space is a work-around so that the page will render properly when
there are no children. (Is this a browser bug? Is there a better
solution?)
Since the parsed XML data now has the extra annotations, we
don't want them to be processed by
xsl:apply-templates. An extra
template is added to make sure that the XSLT processor will ignore
them.
<xsl:template match="TBA:*">
  <!-- ignore annotation elements -->
</xsl:template>
This step can be tested by running Transbuild on the build
script file tb-step04.xml.