


Step 9: Processing different files differently

So far we have been transforming just one type of file. In this step we will add other types of files into the source tree.

The new source tree is in the directory source-b. It extends the previous source tree with a few more XML files and subdirectories, data-sheet XML files for the hardware products, and a timeline XML data file. The data-sheet and timeline files use XML vocabularies different from the one we have been using so far. Some JPEG and PNG images have also been added.

These different XML files need to be processed using different XSLT scripts. We have two main options:

In most cases, you'll choose the second option. With the second option, there must be a way to ensure that the correct rule is used according to the source file's XML vocabulary. Rules are picked according to the filename suffix and the order in which they appear in the build script. Since we have already decided that all XML files will have a .xml extension, we will need longer suffixes that end in .xml. This is possible because the source-suffix value is matched against the end of the filename and is not tied to file-system extensions (the . has no special significance in it).

The filename convention this tutorial will use is the suffix -hdat.xml for hardware data-sheets, and the suffix timeline.xml for files containing the timeline XML vocabulary. All other files (i.e. the index.xml files) will remain unchanged. It just happens that there is only one timeline vocabulary file in the whole site, and its entire name is timeline.xml. (Transbuild does not care if the suffix matches the entire filename or just part of it.)

The new build script now looks like this:

<?xml version="1.0"?>

<build-script
 xmlns="http://hoylen.com/ns/xmlns/2002/transbuild/buildscript"
 version="1.0"
 source="source-b"
 target="target">

<rule source-suffix="-hdat.xml" target-suffix=".html">
  <xml-load>
    <dir path="transbuild://">
      <children file-name="index.xml"/>
    </dir>
    <ancestors file-name="index.xml"/>
    <children file-name="index.xml"/>
  </xml-load>
  <xslt stylesheet="scripts/tr09-h.xsl"/>
</rule>

<rule source-suffix="timeline.xml" target-suffix=".svg">
  <xslt stylesheet="scripts/tr09-t.xsl"/>
</rule>

<rule source-suffix=".xml" target-suffix=".html">
  <xml-load>
    <dir path="transbuild://">
      <children file-name="index.xml"/>
    </dir>
    <ancestors file-name="index.xml"/>
    <children file-name="index.xml"/>
  </xml-load>
  <xslt stylesheet="scripts/tr09-a.xsl"/>
</rule>

<rule source-suffix=".png"><file-copy/></rule>
<rule source-suffix=".jpg"><file-copy/></rule>
<rule source-suffix=".css"><file-copy/></rule>

<rule source-suffix="~"/>

</build-script>

Notice that we are still using the old rule that matches .xml suffixes. It is placed after the other rules with longer suffixes so they will match first if possible. File copy rules have been added for the two new image file types.
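
To illustrate why the order matters, here is a sketch of the opposite ordering. It assumes, as the paragraph above implies, that Transbuild applies the first rule whose source-suffix matches. The file name widget-hdat.xml is hypothetical, and the xml-load elements are left out for brevity.

```xml
<?xml version="1.0"?>

<!-- Sketch only: the rules in the WRONG order. -->
<build-script
 xmlns="http://hoylen.com/ns/xmlns/2002/transbuild/buildscript"
 version="1.0"
 source="source-b"
 target="target">

<!-- Every file ending in .xml matches this rule first, including
     timeline.xml and a hypothetical widget-hdat.xml. -->
<rule source-suffix=".xml" target-suffix=".html">
  <xslt stylesheet="scripts/tr09-a.xsl"/>
</rule>

<!-- Under the first-match assumption, these more specific rules
     would never be reached. -->
<rule source-suffix="-hdat.xml" target-suffix=".html">
  <xslt stylesheet="scripts/tr09-h.xsl"/>
</rule>

<rule source-suffix="timeline.xml" target-suffix=".svg">
  <xslt stylesheet="scripts/tr09-t.xsl"/>
</rule>

</build-script>
```

With this ordering the data-sheets and the timeline would be run through the index.xml stylesheet, which is why the general .xml rule belongs after the longer suffixes.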

The XSLT stylesheet to process the hardware data-sheets is similar to the one we have already created to process the index.xml files. The difference is in the input XML vocabulary it processes.

For something different, timeline.xml will be processed by an XSLT stylesheet to generate an SVG file. The Scalable Vector Graphics (SVG) format is an XML vocabulary for vector graphics. Don't worry if you don't know how to write SVG. However, you will need a browser plug-in to display it (such as the one from Adobe).
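
To give a feel for what such a stylesheet involves, here is a minimal sketch of an XSLT 1.0 stylesheet that emits SVG. It is not the contents of scripts/tr09-t.xsl. The input vocabulary it assumes (a timeline element containing event elements with a year attribute) is invented purely for illustration, as are the sizes and layout.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/2000/svg">

  <xsl:output method="xml" indent="yes"/>

  <!-- Assumed (hypothetical) input:
       <timeline><event year="2001">Label</event> ... </timeline> -->
  <xsl:template match="/timeline">
    <!-- Make the drawing tall enough for one 30-unit row per event -->
    <svg width="400" height="{count(event) * 30 + 20}">
      <xsl:for-each select="event">
        <!-- Vertical position of this event's row -->
        <xsl:variable name="y" select="position() * 30"/>
        <!-- A dot marking the event -->
        <circle cx="20" cy="{$y - 5}" r="4"/>
        <!-- The year and label beside the dot -->
        <text x="40" y="{$y}">
          <xsl:value-of select="@year"/>
          <xsl:text>: </xsl:text>
          <xsl:value-of select="."/>
        </text>
      </xsl:for-each>
    </svg>
  </xsl:template>

</xsl:stylesheet>
```

The default namespace declaration on the stylesheet element puts the literal result elements (svg, circle, text) in the SVG namespace, so the output is recognized as SVG.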

The XSLT processor can transform XML data into different formats. We have been generating XHTML, and now SVG; both are XML vocabularies. Remember, XSLT can also generate normal HTML and arbitrary text files. And if you need some other form of processing, there are other processing steps available besides XSLT (see the reference section for details).

We want to embed the SVG file and other graphic images into the XHTML pages. So, we'll extend the source XML vocabulary for the index.xml files to:

<!ELEMENT article (title, (para|image)*)>
<!ATTLIST article status CDATA #IMPLIED>
<!ELEMENT title (#PCDATA)>
<!ELEMENT para (#PCDATA | ulink)*>
<!ELEMENT ulink (#PCDATA)>
<!ATTLIST ulink url CDATA #REQUIRED>
<!ELEMENT image (#PCDATA)>
<!ATTLIST image
          fileref CDATA #REQUIRED
          format (PNG|JPEG|SVG) #REQUIRED
          width CDATA #IMPLIED
          height CDATA #IMPLIED>

The XSLT stylesheet has been modified to handle the image element. Some of the XML source files use the image element to reference image files.
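
As a rough idea of what handling the image element could look like, here is a minimal sketch. It is not the template from the tutorial's stylesheet. It assumes that the text content of image serves as alternative text, and that fileref can be used directly as the URL (the real stylesheet may translate it first, as is done for internal ulink targets).

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns="http://www.w3.org/1999/xhtml">

  <!-- Hypothetical handling of the image element -->
  <xsl:template match="image">
    <xsl:choose>
      <!-- SVG is embedded as an object; the element's text
           is used as fallback content (an assumption) -->
      <xsl:when test="@format = 'SVG'">
        <object type="image/svg+xml" data="{@fileref}">
          <xsl:copy-of select="@width | @height"/>
          <xsl:value-of select="."/>
        </object>
      </xsl:when>
      <!-- PNG and JPEG become ordinary img elements -->
      <xsl:otherwise>
        <img src="{@fileref}" alt="{.}">
          <xsl:copy-of select="@width | @height"/>
        </img>
      </xsl:otherwise>
    </xsl:choose>
  </xsl:template>

</xsl:stylesheet>
```

The optional width and height attributes are copied through only when they are present in the source, which matches their #IMPLIED declaration in the DTD.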

The stylesheet has also been modified so that arbitrary external hyperlinks (using HTTP and FTP protocols) can be created as well as links to internal pages.

<xsl:template match="ulink">
  <xsl:choose>
    <xsl:when test="starts-with(@url, 'http:')">
      <a href="{@url}"><xsl:apply-templates/></a>
    </xsl:when>
    <xsl:when test="starts-with(@url, 'ftp:')">
      <a href="{@url}"><xsl:apply-templates/></a>
    </xsl:when>
    <xsl:otherwise>
      <a href="{TBF:href(@url)}"><xsl:apply-templates/></a>
    </xsl:otherwise>
  </xsl:choose>
</xsl:template>

This step has shown how different types of files can be distinguished by careful use of filename suffixes. It can be tested with the tb-step09.xml build script.