<?xml version="1.0" encoding="UTF-8"?>
<book version="5.0" xmlns="http://docbook.org/ns/docbook"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:svg="http://www.w3.org/2000/svg"
      xmlns:m="http://www.w3.org/1998/Math/MathML"
      xmlns:html="http://www.w3.org/1999/xhtml"
      xmlns:db="http://docbook.org/ns/docbook">
  <info>
    <title>Lecture notes of Martin Goik</title>

    <author>
      <personname><firstname>Martin</firstname>
      <surname>Goik</surname></personname>

      <affiliation>
        <orgname>http://medieninformatik.hdm-stuttgart.de</orgname>
      </affiliation>
    </author>

    <legalnotice>
      <para>Source code available at <uri
      xlink:href="https://version.mi.hdm-stuttgart.de/git/GoikLectures">https://version.mi.hdm-stuttgart.de/git/GoikLectures</uri></para>
    </legalnotice>

    <annotation>
      <para>ToDo: Figures from old lecture slides.</para>

      <para>Images and streams, Stored procedures, Transactions</para>
    </annotation>
  </info>

  <glossary xml:id="glossary">
    <glossentry xml:id="gloss_Java">
      <glossterm><trademark
      xlink:href="http://www.oracle.com/us/legal/third-party-trademarks/index.html">Java</trademark></glossterm>

      <glossdef>
        <para>General-purpose programming language with support for
        object-oriented concepts.</para>
      </glossdef>
    </glossentry>

    <glossentry xml:id="gloss_Javadoc">
      <glossterm><trademark
      xlink:href="http://docs.oracle.com/javase/1.5.0/docs/guide/javadoc">Javadoc</trademark></glossterm>

      <glossdef>
        <para>Tool for extracting documentation embedded in <link
        linkend="gloss_Java"><trademark>Java</trademark></link> source
        code.</para>
      </glossdef>
    </glossentry>

    <glossentry xml:id="gloss_JPA">
      <glossterm><abbrev>JPA</abbrev></glossterm>

      <glossdef>
        <para><link
        xlink:href="http://www.javaworld.com/javaworld/jw-01-2008/jw-01-jpa1.html">Java
        Persistence API</link></para>
      </glossdef>
    </glossentry>

    <glossentry xml:id="gloss_ORM">
      <glossterm><abbrev>ORM</abbrev></glossterm>

      <glossdef>
        <para>Object-relational mapping.</para>
      </glossdef>
    </glossentry>

    <glossentry xml:id="gloss_XML">
      <glossterm><abbrev>XML</abbrev></glossterm>

      <glossdef>
        <para>The <link xlink:href="http://www.w3.org/XML">Extensible Markup
        Language</link>.</para>
      </glossdef>
    </glossentry>
  </glossary>

  <part xml:id="sda1">
    <title>Structured Data and Applications 1</title>

    <chapter xml:id="prerequisites">
      <title>Prerequisites</title>

      <section xml:id="resources">
        <title>Lecture resources</title>

        <glosslist>
          <glossentry>
            <glossterm>Lecture notes as PDF</glossterm>

            <glossdef>
              <para><uri
              xlink:href="http://www.mi.hdm-stuttgart.de/freedocs/topic/de.hdm_stuttgart.mi.sda1/printversion.pdf">http://www.mi.hdm-stuttgart.de/freedocs/topic/de.hdm_stuttgart.mi.sda1/printversion.pdf</uri></para>

              <caution>
                <para>Some figures, including videos, are left blank in the
                PDF version.</para>
              </caution>
            </glossdef>
          </glossentry>

          <glossentry>
            <glossterm>List of exercises</glossterm>

            <glossdef>
              <para>The lecture notes contain exercises to be solved by you! A
              complete list is available at <uri
              xlink:href="http://www.mi.hdm-stuttgart.de/freedocs/topic/de.hdm_stuttgart.mi.sda1/apb.html">http://www.mi.hdm-stuttgart.de/freedocs/topic/de.hdm_stuttgart.mi.sda1/apb.html</uri>.
              You may also use the corresponding PDF version within <filename
              xlink:href="http://www.mi.hdm-stuttgart.de/freedocs/topic/de.hdm_stuttgart.mi.sda1/printversion.pdf">printversion.pdf</filename>
              to keep track of your personal progress by filling in your
              completion status.</para>
            </glossdef>
          </glossentry>

          <glossentry>
            <glossterm><link
            linkend="gloss_Javadoc"><trademark>Javadoc</trademark></link>
            references and source code</glossterm>

            <glossdef>
              <para>The lecture notes contain a lot of <link
              linkend="gloss_Javadoc"><trademark>Javadoc</trademark></link>
              references. Most classes appearing within these lecture notes
              have <link
              linkend="gloss_Javadoc"><trademark>Javadoc</trademark></link>
              generated links to the source code as well. For example,
              clicking on the class name
              <classname>sda.jdbc.intro.v1.SimpleInsert</classname> shows the
              complete implementation.</para>
            </glossdef>
          </glossentry>

          <glossentry>
            <glossterm>Links to animated figures</glossterm>

            <glossdef>
              <para>The lecture notes' online version contains links to <uri
              xlink:href="http://www.mi.hdm-stuttgart.de/freedocs/topic/de.hdm_stuttgart.mi.sda1/jdbcWrite.html">PDF
              images</uri>. Clicking on <quote>Animated PDF Version</quote>
              takes you to a PDF which, in the full screen mode of Acrobat
              Reader or <trademark>google-chrome</trademark>, provides a
              slide-like animation.</para>
            </glossdef>
          </glossentry>

          <glossentry>
            <glossterm><trademark>Virtualbox</trademark> image</glossterm>

            <glossdef>
              <para>A <productname
              xlink:href="https://www.virtualbox.org">Virtualbox</productname>
              image is available in the following formats: <glosslist>
                  <glossentry>
                    <glossterm>Split <command>rar</command> archive (100 MB
                    chunks):</glossterm>

                    <glossdef>
                      <para>You may want to use <productname
                      xlink:href="http://jdownloader.org">Jdownloader</productname>
                      or similar tools to download the chunked archive at <uri
                      xlink:href="ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/Rarformat">ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/Rarformat</uri>.</para>

                      <para>If you are using <productname
                      xlink:href="http://jdownloader.org">Jdownloader</productname>
                      a container file is <link
                      xlink:href="ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/Meta/lubuntu.dlc">available
                      here</link> for your convenience. If you have configured
                      the flashgot extension in a running <productname
                      xlink:href="http://jdownloader.org">Jdownloader</productname>
                      process you may trigger a download by clicking in <uri
                      xlink:href="ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/Meta/lubuntu.html">ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/Meta/lubuntu.html</uri>.</para>
                    </glossdef>
                  </glossentry>

                  <glossentry>
                    <glossterm>Uncompressed raw image:</glossterm>

                    <glossdef>
                      <para><uri
                      xlink:href="ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/lubuntu.vdi">ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/lubuntu.vdi</uri></para>
                    </glossdef>
                  </glossentry>
                </glosslist> It contains (hopefully) all related tools from
              the <link xlink:href="http://www.mi.hdm-stuttgart.de">CSM</link>
              department's lecture room Linux installation:</para>

              <itemizedlist>
                <listitem>
                  <para>Eclipse J2EE version with <productname
                  xlink:href="http://www.eclipse.org/datatools">Database
                  developer tools</productname>, <productname
                  xlink:href="http://git-scm.com">git</productname>,
                  <trademark
                  xlink:href="http://oxygenxml.com">Oxygenxml</trademark>,
                  <productname
                  xlink:href="http://testng.org/doc/eclipse.html">TestNG</productname>
                  and <productname
                  xlink:href="http://subversion.apache.org/">svn</productname>
                  plugins installed.</para>
                </listitem>

                <listitem>
                  <para>A running <productname
                  xlink:href="http://www.mysql.com/">Mysql</productname>
                  server preconfigured with user
                  <quote><code>hdmuser</code></quote>, password
                  <quote><code>XYZ</code></quote> and database
                  <quote><code>hdm</code></quote>.</para>
                </listitem>

                <listitem>
                  <para><productname
                  xlink:href="http://www.xmlmind.com/xmleditor">Xmlmind XML
                  editor</productname> for visually editing technical
                  documents based on <productname
                  xlink:href="http://docbook.org/tdg5/index.html">docbook</productname>
                  or <productname
                  xlink:href="http://en.wikipedia.org/wiki/Darwin_Information_Typing_Architecture">DITA</productname>.</para>
                </listitem>
              </itemizedlist>

              <caution>
                <para>This VM is only accessible from within the <orgname
                xlink:href="http://www.hdm-stuttgart.de">HdM</orgname>
                network. External downloads require <productname
                xlink:href="https://wiki.mi.hdm-stuttgart.de/wiki/VPN">OpenVPN</productname>.</para>
              </caution>

              <para>To save resources the virtual machine is based on
              <productname
              xlink:href="http://lubuntu.net">Lubuntu</productname>, a
              lightweight variant of the <productname
              xlink:href="http://www.ubuntu.com">Ubuntu</productname> Linux
              distribution.</para>
            </glossdef>
          </glossentry>

          <glossentry xml:id="oxygenLicenseKey">
            <glossterm><uri>Oxygen Xml Editor</uri> license key</glossterm>

            <glossdef>
              <para>This is the only software component in this lecture
              requiring a license. Your <orgname>HdM</orgname> affiliation
              entitles you to use the <productname
              xlink:href="http://oxygenxml.com/">Oxygenxml</productname>
              software for educational (non-commercial) purposes. The
              corresponding key is available from <uri
              xlink:href="ftp://mirror.mi.hdm-stuttgart.de/Firmen/Eclipse/Plugins/Oxygen/Keys/Version14/Student/licensekey.txt">ftp://mirror.mi.hdm-stuttgart.de/Firmen/Eclipse/Plugins/Oxygen/Keys/Version14/Student/licensekey.txt</uri>.</para>

              <para>This license key works with both the standalone and the
              Eclipse plugin version of the product.</para>

              <caution>
                <para>The license key's <abbrev
                xlink:href="http://en.wikipedia.org/wiki/File_Transfer_Protocol">ftp</abbrev>
                URL is only accessible from within the <orgname
                xlink:href="http://www.hdm-stuttgart.de">HdM</orgname>
                network. External access requires <link
                xlink:href="https://wiki.mi.hdm-stuttgart.de/wiki/VPN">VPN
                activation</link>.</para>
              </caution>
            </glossdef>
          </glossentry>

          <glossentry>
            <glossterm>Source code of lecture resources</glossterm>

            <glossdef>
              <para>The complete lecture sources are available from <link
              xlink:href="https://version.mi.hdm-stuttgart.de/git/GoikLectures">https://version.mi.hdm-stuttgart.de/git/GoikLectures</link>.</para>

              <para>You may simply execute <quote><command
              xlink:href="http://git-scm.com/">git</command>
              <option>clone</option>
              <option>https://version.mi.hdm-stuttgart.de/git/GoikLectures</option>
              <option>.</option></quote> to check out the master tree.</para>
            </glossdef>
          </glossentry>

          <glossentry>
            <glossterm>Source code of exercises and examples</glossterm>

            <glossdef>
              <para>These sources contain a subdirectory
              <filename>ws/eclipse/Jdbc</filename> which can be imported as an
              Eclipse project. This allows for browsing solutions to the
              exercises and executing sample applications. To import it into
              Eclipse:</para>

              <itemizedlist>
                <listitem>
                  <para>When starting Eclipse, choose
                  <filename>.../ws/eclipse</filename> as the workspace.</para>
                </listitem>

                <listitem>
                  <para>In eclipse click <quote>File --&gt; Import --&gt;
                  General --&gt; Existing Projects into Workspace</quote>.
                  After re-selecting the current workspace
                  <filename>.../ws/eclipse</filename> the folder
                  <filename>Jdbc</filename> should be on the list of
                  importable projects.</para>

                  <para>Depending on your eclipse installation you may have to
                  adjust the <link
                  linkend="gloss_Java"><trademark>Java</trademark></link>
                  system libraries. Right-click on your project root in the
                  package explorer and choose <quote>Build Path --&gt;
                  Configure Build Path</quote>. The <quote>JRE System
                  Library</quote> entry in the <quote>Libraries</quote> tab
                  may have to be changed to suit your eclipse's installation
                  needs. You may want to create a dummy <link
                  linkend="gloss_Java"><trademark>Java</trademark></link>
                  project to find the correct setting.</para>
                </listitem>
              </itemizedlist>
            </glossdef>
          </glossentry>
        </glosslist>
      </section>

      <section xml:id="tools">
        <title>Tools</title>

        <para>The following sections describe tools that help you carry out
        the exercises. The descriptions are written for current Linux/Ubuntu
        systems. However, these tools are available for
        <trademark>Windows</trademark> and <trademark>Apple</trademark>
        systems as well. On those systems some command line hints may have to
        be replaced by GUI based tools.</para>

        <para>You may want to use the <link
        xlink:href="ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/lubuntu.vdi">corresponding</link>
        <link xlink:href="https://www.virtualbox.org">Virtualbox image</link>
        containing a complete system, which avoids installation hassles. This
        should work well on reasonably current hardware.</para>

        <section xml:id="eclipse">
          <title><productname
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdk-7-readme-429198.html">JDK</productname>
          and Eclipse</title>

          <para>So you prefer to take the hard way rather than using <link
          xlink:href="ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/lubuntu.vdi">the
          virtualbox image</link>? Good! Real programmers tend to complicate
          things!</para>

          <para>The Eclipse IDE will be used as the primary coding tool,
          especially for <link
          linkend="gloss_Java"><trademark>Java</trademark></link> and XML.
          You may use other tools such as <productname
          xlink:href="http://netbeans.org">Netbeans</productname> or
          <productname
          xlink:href="http://www.altova.com/de/xmlspy.html">XML-Spy</productname>.
          There are, however, some caveats:</para>

          <itemizedlist>
            <listitem>
              <para>Certain features may not be available.</para>
            </listitem>

            <listitem>
              <para><orgname>HdM</orgname> staff support in case of trouble
              will be limited to coding and exclude tool support. In other
              words: you are on your own regarding tool related issues.</para>
            </listitem>
          </itemizedlist>

          <para>Installation of eclipse requires a suitable <link
          linkend="gloss_Java"><trademark>Java</trademark></link> Development
          Kit.</para>

          <caution>
            <para>Your <productname
            xlink:href="http://www.oracle.com/technetwork/java/javase/jdk-7-readme-429198.html">JDK</productname>
            selection may be affected by your system's hardware. On a 64 bit
            system you may install either a 32 bit or a 64 bit <productname
            xlink:href="http://www.oracle.com/technetwork/java/javase/jdk-7-readme-429198.html">JDK</productname>.
            If you subsequently install Eclipse you must select the 32 or 64
            bit version matching your <productname
            xlink:href="http://www.oracle.com/technetwork/java/javase/jdk-7-readme-429198.html">JDK</productname>
            choice.</para>
          </caution>

          <para>Due to Oracle's (end-user unfriendly) licensing policy you may
          have to install this component manually. For <productname
          xlink:href="http://www.ubuntu.com">Ubuntu</productname> and
          <productname xlink:href="http://www.debian.org">Debian</productname>
          systems a standard (package manager compatible) procedure is
          described at <uri
          xlink:href="http://www.webupd8.org/2012/01/install-oracle-java-jdk-7-in-ubuntu-via.html">http://www.webupd8.org/2012/01/install-oracle-java-jdk-7-in-ubuntu-via.html</uri>.
          This boils down to the following commands, executed as user root or
          each preceded by <command>sudo</command> <option>...</option>:</para>

          <programlisting>add-apt-repository ppa:webupd8team/java
apt-get update
apt-get install oracle-jdk7-installer</programlisting>

          <para>During the installation process you will have to accept
          Oracle's license terms. Your acceptance is cached, so you will not
          be asked again when updating via <command>aptitude</command>
          <option>update</option>; <command>aptitude</command>
          <option>safe-upgrade</option>. After successful installation,
          executing <command
          xlink:href="http://www.oracle.com/us/technologies/java">java</command>
          <option>-version</option> in a shell should show something similar
          to:</para>

          <programlisting>goik@goiki:~$ <emphasis role="bold">java -version</emphasis>
java version "1.7.0_07"
Java(TM) SE Runtime Environment (build 1.7.0_07-b10)
Java HotSpot(TM) Server VM (build 23.3-b01, mixed mode)</programlisting>

          <para>The Eclipse IDE comes <link
          xlink:href="http://www.eclipse.org/downloads">in various
          flavours</link>, depending on which plugins are bundled. For our
          purposes the <quote><productname>Eclipse
          Classic</productname></quote> <link
          linkend="gloss_Java"><trademark>Java</trademark></link> edition is
          sufficient. You may however want to install other flavours like
          <quote><productname>Eclipse IDE for Java EE
          Developers</productname></quote> if you require features beyond this
          course's needs. Remember to download the 32 or 64 bit version
          matching your <productname
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdk-7-readme-429198.html">JDK</productname>.</para>

          <para>Follow <uri
          xlink:href="http://askubuntu.com/questions/26632/how-to-install-eclipse#answer-145018">http://askubuntu.com/questions/26632/how-to-install-eclipse#answer-145018</uri>
          to install eclipse on your system.</para>
        </section>

        <section xml:id="oxygenxmlInstall">
          <title><productname
          xlink:href="http://oxygenxml.com">Oxygenxml</productname>
          plugin</title>

          <para>Go to <uri
          xlink:href="http://www.oxygenxml.com/download_oxygenxml_developer.html?os=Eclipse">http://www.oxygenxml.com/download_oxygenxml_developer.html?os=Eclipse</uri>.
          You may choose between the <quote>Plugin Update site</quote> and
          <quote>Plugin zip distribution</quote> installation methods. The
          latter allows for better long term Eclipse plugin management.</para>

          <para>These correspond to the two general ways of installing Eclipse
          plugins:</para>

          <itemizedlist>
            <listitem>
              <para>Use Eclipse's built in Update manager by <link
              xlink:href="http://www.oxygenxml.com/download_oxygenxml_developer.html?os=Eclipse#eclipse_install_instructions">defining
              a corresponding update site</link>.</para>
            </listitem>

            <listitem>
              <para>Unzip
              <filename>com.oxygenxml.developer_14.0.0.v2012082911.zip</filename>
              in a subfolder of <filename>.../eclipse/dropins</filename> and
              restart eclipse (as root).</para>
            </listitem>
          </itemizedlist>

          <para>See <xref linkend="oxygenLicenseKey"/> for obtaining a license
          key. Alternatively you may install the standalone version of the
          Oxygen XML Editor.</para>
        </section>

        <section xml:id="testngInstall">
          <title><foreignphrase>TestNG</foreignphrase> plugin</title>

          <para>Some exercises require the TestNG plugin to be installed in
          the Eclipse IDE. You may proceed in a similar way as described in
          <xref linkend="oxygenxmlInstall"/>. According to <uri
          xlink:href="http://testng.org/doc/eclipse.html#eclipse-installation">http://testng.org/doc/eclipse.html#eclipse-installation</uri>
          the required Eclipse update site URL is
          <quote>http://beust.com/eclipse</quote>.</para>
        </section>

        <section xml:id="mysql">
          <title><productname
          xlink:href="http://www.mysql.com">Mysql</productname> Database
          components</title>

          <para>We start by installing the <productname
          xlink:href="http://www.mysql.com">Mysql</productname> server:</para>

          <programlisting>root@goiki:~# aptitude install mysql-server
The following NEW packages will be installed:
  libdbd-mysql-perl{a} libdbi-perl{a} libnet-daemon-perl{a} libplrpc-perl{a} 
  mysql-client-5.5{a} mysql-server-5.5 
0 packages upgraded, 6 newly installed, 0 to remove and 0 not upgraded.
Need to get 0 B/17.8 MB of archives. After unpacking 63.2 MB will be used.
Do you want to continue? [Y/n/?]</programlisting>

          <para>Press <keycap>Y</keycap> followed by <keycap>Return</keycap>
          to start. During the installation you will be asked for the
          <productname xlink:href="http://www.mysql.com">Mysql</productname>
          server's <quote>root</quote> (administrator) password:</para>

          <programlisting>Package configuration                                                                              
                                                                                                   
                                                                                                   
     ┌───────────────────────────┤ Configuring mysql-server-5.5 ├────────────────────────────┐     
     │ While not mandatory, it is highly recommended that you set a password for the MySQL   │     
     │ administrative "root" user.                                                           │     
     │                                                                                       │     
     │ If this field is left blank, the password will not be changed.                        │     
     │                                                                                       │     
     │ New password for the MySQL "root" user:                                               │     
     │                                                                                       │     
     │ ********_____________________________________________________________________________ │     
     │                                                                                       │     
     │                                        &lt;Ok&gt;                                           │     
     │                                                                                       │     
     └───────────────────────────────────────────────────────────────────────────────────────┘     
                                                                                                   
                                                                                                   
                                                                                                   </programlisting>

          <para>The password has to be entered twice. Keep a <emphasis
          role="bold">permanent</emphasis> record of it. Alternatively,
          bookmark <uri
          xlink:href="https://help.ubuntu.com/community/MysqlPasswordReset">https://help.ubuntu.com/community/MysqlPasswordReset</uri>
          for later reference, and don't blame me!</para>

          <para>At this point we should be able to connect to our newly
          installed server. We create a database <quote>hdm</quote> to be used
          for our exercises:</para>

          <programlisting>goik@goiki:~$ mysql -u root -p
Enter password: 
Welcome to the MySQL monitor.  Commands end with ; or \g.
Your MySQL connection id is 42
Server version: 5.5.24-0ubuntu0.12.04.1 (Ubuntu)

Copyright (c) 2000, 2011, Oracle and/or its affiliates. All rights reserved.

Oracle is a registered trademark of Oracle Corporation and/or its
affiliates. Other names may be trademarks of their respective
owners.

Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.

mysql&gt; <emphasis role="bold">create database hdm;</emphasis>
Query OK, 1 row affected (0.00 sec)</programlisting>

          <para>Following <uri
          xlink:href="https://dev.mysql.com/doc/refman/5.5/en/adding-users.html">https://dev.mysql.com/doc/refman/5.5/en/adding-users.html</uri>
          we add a new user and grant full access to the newly created
          database:</para>

          <programlisting>goik@goiki:~$ mysql -u root -p
Enter password:
 ...
mysql&gt; CREATE USER 'hdmuser'@'localhost' IDENTIFIED BY 'XYZ';
mysql&gt; use hdm;
mysql&gt; GRANT ALL PRIVILEGES ON hdm.* TO 'hdmuser'@'localhost' WITH GRANT OPTION;
mysql&gt; FLUSH PRIVILEGES;</programlisting>
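
          <para>To check the new account you may log in as
          <code>hdmuser</code> and inspect its privileges. The following
          session is only a minimal sketch: it assumes the user name, password
          and database chosen above, and the server's responses are omitted
          since they depend on your installation.</para>

          <programlisting>goik@goiki:~$ mysql -u hdmuser -p hdm
Enter password:
 ...
mysql&gt; SHOW GRANTS FOR CURRENT_USER();
mysql&gt; SELECT DATABASE();</programlisting>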

          <para>The next step is optional. The <productname
          xlink:href="http://www.ubuntu.com">Ubuntu</productname> <productname
          xlink:href="http://www.mysql.com">Mysql</productname> server default
          configuration allows connections only via the
          <varname>loopback</varname> interface, i.e.
          <varname>localhost</varname>. If you want your <productname
          xlink:href="http://www.mysql.com">Mysql</productname> server to
          listen on external network interfaces as well, comment out the
          <varname>bind-address</varname> parameter in
          <filename>/etc/mysql/my.cnf</filename>:</para>

          <programlisting># Instead of skip-networking the default is now to listen only on
# localhost which is more compatible and is not less secure.
# <emphasis role="bold">bind-address            = 127.0.0.1</emphasis></programlisting>
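
          <para>A changed <filename>/etc/mysql/my.cnf</filename> only takes
          effect after the server has been restarted. The following is a
          minimal sketch, assuming the <command>service</command> and
          <command>netstat</command> commands of a standard Ubuntu
          installation and the Mysql default port 3306:</para>

          <programlisting>root@goiki:~# service mysql restart
root@goiki:~# netstat -tln | grep 3306</programlisting>

          <para>With <varname>bind-address</varname> commented out the
          listening address should no longer be restricted to
          <code>127.0.0.1</code>. Note that an account defined as
          <code>'hdmuser'@'localhost'</code> still only permits logins from
          the local machine.</para>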

          <para>Since we are dealing with <link
          linkend="gloss_Java"><trademark>Java</trademark></link>, a <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          driver is needed to connect applications to our <productname
          xlink:href="http://www.mysql.com">Mysql</productname> server:</para>

          <programlisting>root@goiki:~# aptitude install libmysql-java</programlisting>

          <para>This provides the file
          <filename>/usr/share/java/mysql-connector-java-5.1.16.jar</filename>
          and two symbolic links:</para>

          <programlisting>goik@goiki:~$ cd /usr/share/java
goik@goiki:/usr/share/java$ ls -al mysql*
-rw-r--r-- 1 ... 2011 <emphasis role="bold">mysql-connector-java-5.1.16.jar</emphasis>
lrwxrwxrwx 1 ... 2011 <emphasis role="bold">mysql-connector-java.jar -&gt; mysql-connector-java-5.1.16.jar</emphasis>
lrwxrwxrwx 1 ... 2011 <emphasis role="bold">mysql.jar -&gt; mysql-connector-java.jar</emphasis></programlisting>
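
          <para>Whether driver, server and account work together can be
          checked with a small stand-alone program. The following is a minimal
          sketch and not part of the lecture sources: the class name
          <classname>ConnectionCheck</classname> and the helper
          <methodname>jdbcUrl</methodname> are hypothetical. It assumes the
          database <code>hdm</code>, the account <code>hdmuser</code> with
          password <code>XYZ</code>, the Mysql default port 3306 and a JDBC 4
          driver that registers itself when found on the class path.</para>

          <programlisting language="java">import java.sql.Connection;
import java.sql.DatabaseMetaData;
import java.sql.DriverManager;
import java.sql.SQLException;

/** Hypothetical helper class, not part of the lecture sources. */
public class ConnectionCheck {

  /** Builds a Mysql JDBC URL of the form jdbc:mysql://host:port/database. */
  public static String jdbcUrl(final String host, final int port, final String database) {
    return "jdbc:mysql://" + host + ":" + port + "/" + database;
  }

  public static void main(final String[] args) {
    // Assumed values: local server, default port, database and account
    // as configured in this section.
    final String url = jdbcUrl("localhost", 3306, "hdm");

    // try-with-resources (Java 7) closes the connection automatically.
    try (Connection conn = DriverManager.getConnection(url, "hdmuser", "XYZ")) {
      final DatabaseMetaData meta = conn.getMetaData();
      System.out.println("Connected to " + meta.getDatabaseProductName()
          + " " + meta.getDatabaseProductVersion());
    } catch (SQLException e) {
      // Typical causes: driver jar missing on the class path, server not
      // running or wrong credentials.
      System.err.println("Connection failed: " + e.getMessage());
    }
  }
}</programlisting>

          <para>The driver's symbolic link has to be on the class path when
          running the program, for example:</para>

          <programlisting>goik@goiki:~$ javac ConnectionCheck.java
goik@goiki:~$ java -cp /usr/share/java/mysql-connector-java.jar:. ConnectionCheck</programlisting>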
        </section>
      </section>

      <section xml:id="lectureNotes">
        <title>Lecture related resources</title>

        <para>The sources for lecture notes and exercises are available from
        the <orgname xlink:href="http://www.mi.hdm-stuttgart.de">MIB</orgname>
        <productname xlink:href="http://git-scm.com">git</productname>
        repository:</para>

        <para><uri
        xlink:href="https://version.mi.hdm-stuttgart.de/git/GoikLectures">https://version.mi.hdm-stuttgart.de/git/GoikLectures</uri></para>

        <para>Check-out is straightforward:</para>

        <programlisting>goik@goiki:~$ mkdir StructuredData;cd StructuredData

goik@goiki:~/StructuredData$ git clone https://version.mi.hdm-stuttgart.de/git/GoikLectures .
Cloning into '.'...
remote: Counting objects: 694, done
...
Resolving deltas: 100% (296/296), done.</programlisting>

        <para>After checkout an eclipse workspace holding the complete example
        source code becomes visible:</para>

        <programlisting>goik@goiki:~/StructuredData$ cd ws/eclipse
goik@goiki:~/StructuredData/ws/eclipse$ ls -al
total 16
drwxr-xr-x 3 goik fb1prof 4096 Nov  8 22:04 .
drwxr-xr-x 4 goik fb1prof 4096 Nov  8 22:04 ..
-rw-r--r-- 1 goik fb1prof   11 Nov  8 22:04 .gitignore
<emphasis role="bold">drwxr-xr-x 6 goik fb1prof 4096 Nov  8 22:04 Jdbc</emphasis></programlisting>

        <para>The subdirectory <filename>Jdbc</filename> can be imported as an
        Eclipse project via <quote>File --&gt; Import --&gt; General --&gt;
        Existing Projects into Workspace</quote>. This should enable each
        participant to browse and execute the examples provided in the lecture
        notes. It also contains a <productname
        xlink:href="http://www.mysql.com">Mysql</productname> driver in
        <filename>Jdbc/lib/mysql-connector-java-5.1.16.jar</filename>, which is
        required to set up a <trademark
        xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
        connection.</para>
      </section>

      <section xml:id="toolingConfigJdbc">
        <titleabbrev>Tooling</titleabbrev>

        <title>Tooling: Configuring and using the <link
        xlink:href="http://www.eclipse.org/datatools">Eclipse database
        development</link> plugin</title>

        <para>For basic SQL communication the Eclipse environment offers a
        standard plugin (Database development). Establishing connections to
        a specific database server generally requires prior installation of a
        <trademark
        xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
        driver on the client side, as shown in the following
        video:</para>

        <figure xml:id="figureConfigJdbcDriver">
          <title>Adding a <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          Driver for <productname
          xlink:href="http://www.mysql.com">Mysql</productname> to the
          database plugin.</title>

          <mediaobject>
            <videoobject>
              <videodata fileref="Ref/Video/jdbcDriverConfig.mp4"/>
            </videoobject>
          </mediaobject>
        </figure>

        <para>During the exercises the eclipse database development
        perspective may be used to browse and structure SQL tables and data.
        The following video demonstrates the configuration of a <trademark
        xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
        connection to a local (<varname>localhost</varname> network interface)
        database server. With respect to the introduction given in <xref
        linkend="mysql"/> we assume the existence of a database
        <code>hdm</code> and a corresponding account (hdmuser/Password
        <code>XYZ</code>) on our database server.</para>

        <figure xml:id="figureConfigJdbcConnection">
          <title>Configuring a <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          connection to a (local) <productname
          xlink:href="http://www.mysql.com">Mysql</productname> database
          server.</title>

          <mediaobject>
            <videoobject>
              <videodata fileref="Ref/Video/jdbcConnection.mp4"/>
            </videoobject>
          </mediaobject>
        </figure>

        <para>We are now ready to communicate with our database server. The
        last video in this section shows some basic SQL tasks:</para>

        <figure xml:id="figureEclipseBasicSql">
          <title>Executing SQL statements, browsing schema and retrieving
          data</title>

          <mediaobject>
            <videoobject>
              <videodata fileref="Ref/Video/eclipseBasicSql.mp4"/>
            </videoobject>
          </mediaobject>
        </figure>
      </section>
    </chapter>

    <chapter xml:id="xmlIntro">
      <title>Introduction to XML</title>

      <section xml:id="xmlBasic">
        <title>The XML industry standard</title>

        <para>A short question might be: <quote>What is XML?</quote> An answer
        might be: The acronym XML stands for
        <quote>E<emphasis>x</emphasis>tensible <emphasis>M</emphasis>arkup
        <emphasis>L</emphasis><foreignphrase>anguage</foreignphrase></quote>,
        an industry standard published by the W3C standardization
        organization. As with other industry software standards, talking about
        XML means talking about XML based software: Applications and
        frameworks which add value for software implementors and ease data
        exchange between applications.</para>

        <para>Many readers are already familiar with XML without explicitly
        referring to the standard itself: The world wide web's
        <foreignphrase>lingua franca</foreignphrase> HTML has been ported to
        an XML dialect forming the <link
        xlink:href="http://www.w3.org/MarkUp">XHTML</link> Standard. The idea
        behind this standard is to distinguish between an abstract markup
        language and rendered results being generated from so called document
        instances by a browser:</para>

        <figure xml:id="renderXhtmlMarkup">
          <title>Rendering XHTML markup</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/xhtml.fig"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>XHTML is actually a good example to illustrate the tree like,
        hierarchical structure of XML documents:</para>

        <figure xml:id="xhtmlTree">
          <title>XHTML tree structure</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/xhtmlexample.fig" scale="65"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>We may extend this example by representing a mathematical
        formula via a standard called <link
        xlink:href="http://www.w3.org/Math">MathML</link>:</para>

        <figure xml:id="mathmlExample">
          <title>A formula in <link
          xlink:href="http://www.w3.org/Math">MathML</link>
          representation.</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/sqrtrender.fig"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>Again we observe a similar situation: A database like
        <emphasis>representation</emphasis> of a formula on the left and a
        <emphasis>rendered</emphasis> version on the right. Regarding XML we
        have:</para>

        <itemizedlist>
          <listitem>
            <para>The <link xlink:href="http://www.w3.org/Math">MathML</link>
            standard intended to describe mathematical formulas. The standard
            defines a set of <emphasis>tags</emphasis> such as <tag
            class="starttag">math:msqrt</tag> with well-defined semantics
            regarding permitted attribute values and nesting rules.</para>
          </listitem>

          <listitem>
            <para>Informal descriptions of formatting expectations.</para>
          </listitem>

          <listitem>
            <para>Software transforming an XML formula representation into
            visible or printable output. In other words: A rendering
            engine.</para>
          </listitem>
        </itemizedlist>

        <para>XML documents may also be regarded as a persistence mechanism to
        represent and store data. Similarities to relational database systems
        exist. An RDBMS
        (<emphasis>R</emphasis><foreignphrase>elational</foreignphrase>
        <emphasis>D</emphasis><foreignphrase>atabase</foreignphrase>
        <emphasis>M</emphasis><foreignphrase>anagement</foreignphrase>
        <emphasis>S</emphasis><foreignphrase>ystem</foreignphrase>) is
        typically capable of holding terabytes of data organized in
        tables. The arrangement of data may be subject to various constraints
        like candidate key or foreign key rules. For both end users
        and software developers an RDBMS itself is just a building block of a
        complete solution: We need an application on top of it acting as a
        user interface to the data it contains.</para>

        <para>In contrast to an RDBMS, XML allows data to be organized
        hierarchically. The <link
        xlink:href="http://www.w3.org/Math">MathML</link> representation given
        in <xref linkend="mathmlExample"/> may be visualized as a
        tree:</para>

        <figure xml:id="mathmltree">
          <title>A tree graph representation of the <link
          xlink:href="http://www.w3.org/Math">MathML</link> example given
          before.</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/sqrtree.fig"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>CAD applications may use XML documents as a representation of
        graphical primitives:</para>

        <informalfigure>
          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/attributes.fig" scale="65"/>
            </imageobject>
          </mediaobject>
        </informalfigure>

        <para>Of course RDBMS also allow the representation of tree like
        structures or arbitrary graphs. But these have to be modelled by using
        foreign key constraints since relational tables themselves have a
        <quote>flat</quote> structure. Some RDBMS vendors provide extensions
        to the SQL standard which allow <quote>native</quote> representations
        of <link linkend="gloss_XML"><abbrev>XML</abbrev></link>
        documents.</para>
      </section>

      <section xml:id="xmlHtml">
        <title>Well formed XML documents</title>

        <para>The general structure of an <link
        linkend="gloss_XML"><abbrev>XML</abbrev></link> document is as
        follows:</para>

        <figure xml:id="xmlbase">
          <title><link linkend="gloss_XML"><abbrev>XML</abbrev></link> basic
          structure</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/xmlbase.fig" scale="65"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>We explore a simple XML document representing messages like
        E-mails:</para>

        <figure xml:id="memoWellFormed">
          <title>The representation of a short message.</title>

          <programlisting>&lt;?xml<co xml:id="first_xml_code_magic"/> version="1.0"<co
              xml:id="first_xml_code_version"/> encoding="UTF-8"<co
              xml:id="first_xml_code_encoding"/>?&gt;
&lt;memo&gt;<co xml:id="first_xml_code_topelement"/>
 &lt;from&gt;M. Goik&lt;/from&gt;<co xml:id="first_xml_code_from"/>
 &lt;to&gt;B. King&lt;/to&gt;
 &lt;to&gt;A. June&lt;/to&gt;
 &lt;subject&gt;Best wishes&lt;/subject&gt;
 &lt;content&gt;Hi all, congratulations on your splendid party&lt;/content&gt;
&lt;/memo&gt;</programlisting>
        </figure>

        <calloutlist>
          <callout arearefs="first_xml_code_magic">
            <para>The very first characters <code>&lt;?xml</code> may be
            regarded as a <link
            xlink:href="http://en.wikipedia.org/wiki/Magic_number_(programming)">magic
            number string</link> used as a format indicator. It allows
            software to distinguish between different file types, e.g. GIF,
            JPEG, HTML and so on.</para>
          </callout>

          <callout arearefs="first_xml_code_version">
            <para>The <code>version="1.0"</code> attribute tells us that all
            subsequent lines will conform to the <link
            xlink:href="http://www.w3.org/TR/xml">XML</link> standard of
            version 1.0. This way a document can express its conformance to
            the version 1.0 standard even if in the future this standard
            evolves to a higher version e.g.
            <code>version="2.1"</code>.</para>
          </callout>

          <callout arearefs="first_xml_code_encoding">
            <para>The attribute <code>encoding="UTF-8"</code> tells us that
            all text in the current document is encoded in UTF-8, an encoding
            of the <link xlink:href="http://unicode.org">Unicode</link>
            character set. <link
            xlink:href="http://unicode.org">Unicode</link> is a widely
            accepted industry standard for character encoding. Thus European,
            Cyrillic and most Asian characters may be used in
            documents <emphasis>simultaneously</emphasis>. Other encodings may
            limit the set of allowed characters, e.g.
            <code>encoding="ISO-8859-1"</code> will only allow characters
            belonging to western European languages. However a system also
            needs to have the corresponding fonts (e.g. TrueType)
            installed in order to render the document appropriately. A
            document containing Chinese characters is of no use if the
            underlying rendering system lacks e.g. a set of Chinese TrueType
            fonts.</para>
          </callout>

          <callout arearefs="first_xml_code_topelement">
            <para>An XML document has exactly one top level
            <emphasis>node</emphasis>. In contrast to the HTML standard these
            nodes are commonly called elements rather than tags. In this
            example the top level (root) element is <tag
            class="starttag">memo</tag>.</para>
          </callout>

          <callout arearefs="first_xml_code_from">
            <para>Each XML element like <tag class="starttag">from</tag> has a
            corresponding counterpart <tag class="endtag">from</tag>. In terms
            of XML we say each element being opened has to be closed. In
            conjunction with the preceding point this means that each XML
            document represents a tree structure as shown in the <link
            linkend="mathmltree">tree graph</link> representation.</para>
          </callout>
        </calloutlist>
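
        <para>As a sketch of a different encoding declaration the following
        memo declares <code>ISO-8859-1</code>, assuming the file is actually
        stored in that encoding. The recipient's name and the content are made
        up for illustration. Characters outside the declared encoding may
        still be written as numeric character references, which always denote
        Unicode code points; here <code>&amp;#x4E2D;</code> stands for a
        Chinese character:</para>

        <programlisting>&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;
&lt;memo&gt;
 &lt;from&gt;M. Goik&lt;/from&gt;
 &lt;to&gt;B. König&lt;/to&gt; &lt;!-- ö is part of ISO-8859-1 --&gt;
 &lt;subject&gt;Best wishes&lt;/subject&gt;
 &lt;content&gt;The character &amp;#x4E2D; is given by its code point&lt;/content&gt;
&lt;/memo&gt;</programlisting>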

        <para>As with the introductory formula example this representation
        itself is of limited usefulness: In an office environment we need a
        rendered version being given either as print or as some online format
        like E-Mail or HTML.</para>

        <para>From a software developer's point of view we may use a piece of
        software called a <emphasis>parser</emphasis> to test the document's
        standard conformance. At the MI department we may simply invoke
        <userinput><command>xmlparse</command> wellformed.xml</userinput> to
        start a check:</para>

        <programlisting><errortext>goik&gt;xmlparse wellformed.xml
Parsing was successful</errortext></programlisting>

        <para>Various XML related plugins are available for the <productname
        xlink:href="http://eclipse.org">eclipse platform</productname>, for
        example the <productname xlink:href="http://oxygenxml.com">Oxygen
        software</productname> supplying <quote>live</quote> conformance
        checking while editing XML documents. Now we test our assumptions by
        violating some of the rules stated before. We deliberately omit the
        closing tag <tag class="endtag">from</tag>:</para>

        <figure xml:id="omitFrom">
          <title>A document which is not well formed due to the omission of <tag
          class="endtag">from</tag>.</title>

          <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;memo&gt;
 &lt;from&gt;M. Goik <co xml:id="omitFromMissingElement"/>
 &lt;to&gt;B. King&lt;/to&gt;
 &lt;to&gt;A. June&lt;/to&gt;
 &lt;subject&gt;Best wishes&lt;/subject&gt;
 &lt;content&gt;Hi all, congratulations on your splendid party&lt;/content&gt;
&lt;/memo&gt;</programlisting>

          <calloutlist>
            <callout arearefs="omitFromMissingElement">
              <para>The opening element <tag class="starttag">from</tag> is
              not terminated by <tag class="endtag">from</tag>.</para>
            </callout>
          </calloutlist>
        </figure>

        <para>Consequently the parser's output reads:</para>

        <programlisting><errortext>goik&gt;xmlparse omitfrom.xml
file:///ma/goik/workspace/Vorlesungen/Input/Memo/omitfrom.xml:8:3: 
fatal error org.xml.sax.SAXParseException: The element type "from"
must be terminated by the matching end-tag "&lt;/from&gt;". parsing error</errortext></programlisting>

        <para>Experienced HTML authors may be confused: In fact HTML is not an
        XML standard. Instead HTML belongs to the set of SGML applications.
        SGML is a much older standard namely the <emphasis>Standard
        Generalized Markup Language</emphasis>.</para>

        <para>Even if every XML element has a closing counterpart the
        resulting document may still not be well formed:</para>

        <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;memo&gt;
 &lt;from&gt;M. Goik&lt;to&gt;B. King&lt;/from&gt;&lt;/to&gt;
 &lt;to&gt;A. June&lt;/to&gt;
 &lt;subject&gt;Best wishes&lt;/subject&gt;
 &lt;content&gt;Hi all, congratulations on your splendid party&lt;/content&gt;
&lt;/memo&gt;</programlisting>

        <para>The parser echoes:</para>

        <programlisting><computeroutput>file:///ma/goik/workspace/Vorlesungen/Input/Memo/nonest.xml:3:29:
fatal error org.xml.sax.SAXParseException: The element type "to" must be
terminated by the matching end-tag "&lt;/to&gt;". parsing error</computeroutput></programlisting>

        <para>This type of error is caused by so called improper nesting of
        elements: The element <tag class="starttag">from</tag> is closed before
        the <quote>inner</quote> element <tag class="starttag">to</tag> has
        been closed. This makes it impossible to express the document as a
        tree like structure. The situation may be resolved by
        choosing:</para>

        <programlisting>...&lt;from&gt;M. Goik&lt;to&gt;B. King&lt;/to&gt;&lt;/from&gt;...</programlisting>

        <para>We provide two examples illustrating proper and improper nesting
        of XML elements:</para>

        <figure xml:id="fig_nestingProper">
          <title>Proper nesting of XML elements</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/propernest.fig" scale="65"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>The following example violates the proper nesting constraint and
        thus does not constitute an XML document:</para>

        <figure xml:id="fig_improperNest">
          <title>Improperly nested elements</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/impropernest.fig"/>
            </imageobject>
          </mediaobject>
        </figure>
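
        <para>The requirement of exactly one top level element can be
        violated in a similar way. As a minimal sketch (the second <tag
        class="starttag">memo</tag> and its content are made up for
        illustration) the following text contains two top level elements and
        is therefore not a well formed XML document. A parser is expected to
        reject it:</para>

        <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;memo&gt;
 &lt;from&gt;M. Goik&lt;/from&gt;
 &lt;to&gt;B. King&lt;/to&gt;
&lt;/memo&gt;
&lt;memo&gt; &lt;!-- second top level element: not allowed --&gt;
 &lt;from&gt;A. June&lt;/from&gt;
 &lt;to&gt;M. Goik&lt;/to&gt;
&lt;/memo&gt;</programlisting>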

        <!-- goik:later
      <para>An animation showing the usage of the Oxygen plug in for the
      examples given above can be found <uri
      xlink:href="src/viewlet/wellformed/wellformed_viewlet_swf.html">here</uri>.</para>
-->

        <para>XML elements may have so called attributes like <tag
        class="attribute">date</tag> in the following example:</para>

        <figure xml:id="memoWellAttrib">
          <title>An XML document with attributes.</title>

          <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;memo date="10.02.2006" priority="high"&gt;
  &lt;from&gt;M. Goik&lt;/from&gt;
  &lt;to&gt;B. King&lt;/to&gt;
  &lt;to&gt;A. June&lt;/to&gt;
  &lt;subject&gt;Best wishes&lt;/subject&gt;
  &lt;content&gt;Hi all, congratulations on your splendid party&lt;/content&gt;
&lt;/memo&gt;</programlisting>
        </figure>

        <para>The conformance of an XML document with the following rules may
        be verified by invoking a parser:</para>

        <itemizedlist>
          <listitem>
            <para>Within the <emphasis>scope</emphasis> of a given element an
            attribute name must be unique. In the example above one may not
            define a second attribute <varname>date="..."</varname> within the
            same element &lt;memo ... &gt;. This reflects the usual
            programming language semantics of attributes: In a <link
            linkend="gloss_Java"><trademark>Java</trademark></link> class an
            attribute is represented by a unique identifier and thus cannot
            appear twice.</para>
          </listitem>

          <listitem>
            <para>An attribute value must be enclosed either in single (') or
            double (") quotes. This is different from the HTML standard which
            allows attribute values without quotes provided the given
            attribute value does not give rise to ambiguities. For example
            &lt;TD align=left&gt; is allowed since the attribute value <tag
            class="attvalue">left</tag> does not contain any spaces thus
            allowing a parser to recognize the end of the value's
            definition.</para>
          </listitem>
        </itemizedlist>
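
        <para>Both rules can be illustrated by a sketch that deliberately
        breaks them. The second <varname>date</varname> value and the
        attribute <varname>lang</varname> are made up for illustration. Each
        of the two marked lines alone suffices to keep the document from
        being well formed:</para>

        <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;memo date="10.02.2006" date="11.02.2006"&gt; &lt;!-- date appears twice --&gt;
  &lt;from&gt;M. Goik&lt;/from&gt;
  &lt;to&gt;B. King&lt;/to&gt;
  &lt;subject lang=en&gt;Best wishes&lt;/subject&gt; &lt;!-- value en lacks quotes --&gt;
  &lt;content&gt;Hi all&lt;/content&gt;
&lt;/memo&gt;</programlisting>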

        <qandaset role="exercise">
          <title>A graphical representation of a memo.</title>

          <qandadiv>
            <qandaentry xml:id="example_memoAttribTree">
              <question>
                <para>Draw a graphical representation similar to <xref
                linkend="mathmltree"/> of the memo document given in
                <xref linkend="memoWellAttrib"/>.</para>
              </question>

              <answer>
                <para>The <link linkend="memoWellAttrib">memo
                document's</link> structure may be visualized as:</para>

                <informalfigure xml:id="memotreeFigure">
                  <para>A graphical representation of <xref
                  linkend="memoWellAttrib"/>:</para>

                  <informalfigure xml:id="memotreeFigureFalse">
                    <mediaobject>
                      <imageobject>
                        <imagedata fileref="Ref/Fig/memotree.fig"/>
                      </imageobject>
                    </mediaobject>
                  </informalfigure>

                  <para>The sequence of <emphasis>element</emphasis> child
                  nodes is important in XML and has to be preserved. Only the
                  order of the two attributes <tag
                  class="attribute">date</tag> and <tag
                  class="attribute">priority</tag> is undefined: They actually
                  belong to the <tag class="starttag">memo</tag> node serving
                  as a dictionary with the attribute names being the keys and
                  the attribute values being the values of the
                  dictionary.</para>
                </informalfigure>
              </answer>
            </qandaentry>

            <qandaentry xml:id="example_attribInQuotes">
              <question>
                <label>Attributes and quotes</label>

                <para>As stated before XML attribute values have to be
                enclosed in single or double quotes. Construct an XML document
                with mixed quotes like <code>&lt;date
                day="monday'&gt;</code>. How does the parser react? Find the
                corresponding syntax definition of legal attribute values in
                the <link xlink:href="http://www.w3.org/TR/xml">XML standard
                W3C Recommendation</link>.</para>
              </question>

              <answer>
                <para>The parser flags a mixture of single and double quotes
                for a given attribute as an error. The XML standard <link
                xlink:href="http://www.w3.org/TR/xml#NT-AttValue">defines</link>
                the syntax of attribute values: An attribute value has to be
                enclosed <emphasis>either</emphasis> in two single
                <emphasis>or</emphasis> in two double quotes, see <uri
                xlink:href="http://www.w3.org/TR/xml/#NT-AttValue">http://www.w3.org/TR/xml/#NT-AttValue</uri>.</para>
              </answer>
            </qandaentry>

            <qandaentry xml:id="quoteInAttributValue">
              <question>
                <label>Quotes as part of an attribute's value?</label>

                <para>Single and double quotes are used to delimit an
                attribute value. May quotes themselves appear as part of an
                attribute's value, e.g. like in a person's name <code>Gary
                "King" Mandelson</code>?</para>
              </question>

              <answer>
                <para>Attribute values may contain double quotes if the
                attribute's value is enclosed in single quotes and vice versa.
                As a limitation the value of an attribute may not contain
                literal single quotes and double quotes at the same
                time:</para>

                <informalfigure xml:id="exampleSingleDoubleQuotes">
                  <para>Quotes as part of attribute values.</para>

                  <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;test&gt;
  &lt;person name='Gary "King" Mandelson'/&gt; &lt;!-- o.k. --&gt;
  &lt;person name="Gary 'King' Mandelson"/&gt; &lt;!-- o.k. --&gt;
  &lt;person name="Gary 'King 'S.' "Mandelson"'/&gt; &lt;!-- oops! --&gt;
&lt;/test&gt;</programlisting>
                </informalfigure>
              </answer>
            </qandaentry>
          </qandadiv>
        </qandaset>

        <para>Some constraints being imposed on XML documents by the standard
        defined so far may be summarized as:</para>

        <itemizedlist>
          <listitem>
            <para>An XML document must have exactly one top level
            element.</para>
          </listitem>

          <listitem>
            <para>Elements have to be properly nested. An element must not be
            closed if an <quote>inner</quote> element is still open.</para>
          </listitem>

          <listitem>
            <para>Attribute names within a given element must be
            unique.</para>
          </listitem>

          <listitem>
            <para>Attribute values <emphasis>must</emphasis> be quoted
            correctly.</para>
          </listitem>
        </itemizedlist>

        <para>The requirement that every opened element has to be closed
        marks one of several differences to the HTML standard: In HTML a lot
        of elements don't have to be closed. For example paragraphs (<tag
        class="starttag">p</tag>) or images (<tag class="starttag">img
        src='foo.gif'</tag>) don't have to be closed explicitly. This is due
        to the fact that HTML used to be defined in accordance with the older
        <emphasis><emphasis role="bold">S</emphasis>tandard <emphasis
        role="bold">G</emphasis>eneralized <emphasis
        role="bold">M</emphasis>arkup <emphasis
        role="bold">L</emphasis>anguage</emphasis> (SGML) standard.</para>
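
        <para>As a sketch of the XML counterpart: Written as XHTML both
        elements have to be closed. An element without content may be written
        in the abbreviated empty-element form ending in <code>/&gt;</code>,
        which is equivalent to an opening tag immediately followed by its
        closing tag. The fragment below only considers well-formedness and the
        paragraph text is made up for illustration:</para>

        <programlisting>&lt;p&gt;A paragraph of text.&lt;/p&gt;
&lt;img src='foo.gif'/&gt;       &lt;!-- abbreviated form of the next line --&gt;
&lt;img src='foo.gif'&gt;&lt;/img&gt;</programlisting>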

        <para>These constraints are part of the definition of a <link
        xlink:href="http://www.w3.org/TR/xml#sec-well-formed">well formed
        document</link>. The specification imposes additional constraints for
        a document to be well-formed. Some of these constraints require an
        understanding of so called entities being described in <xref
        linkend="chapter_entities"/>.</para>
      </section>
    </chapter>

    <chapter xml:id="dtd">
      <title>Beyond well-formedness</title>

      <section xml:id="motivationDdt">
        <title>Motivation</title>

        <para>So far we are able to create XML documents containing
        hierarchically structured data. We may nest elements and thus create
        tree structures of arbitrary depth. The only restrictions imposed by
        the XML standard are the constraints of well-formedness. For many
        purposes in software development this is not sufficient.</para>

        <para>A company named <productname>Softmail</productname> might
        implement an email system which uses <link
        linkend="memoWellAttrib">memo</link> document files as low level data
        representation serving as a persistence layer. Now a second company
        named <productname>Hardmail</productname> wants to integrate mails
        generated by <productname>Softmail</productname>'s system into its own
        business product. The <productname>Hardmail</productname> software
        developers might <emphasis>infer</emphasis> the logical structure of
        <productname>Softmail</productname>'s email representation but the
        following problems arise:</para>

        <itemizedlist>
          <listitem>
            <para>The logical structure will in practice become more complex:
            E-mails may contain attachments leading to multi part messages.
            Additional header information is required for standard Internet
            mail compliance. This adds additional complexity to the XML
            structure being mandatory for data representation. Relying only on
            well-formedness the specification of an internal E-mail format can
            only be achieved <emphasis>informally</emphasis>. Thus a rule like
            <quote>Each E-mail must have a subject</quote> may be written down
            in the specification. A software developer will code these rules
            but probably make mistakes as the set of rules grows.</para>

            <para>In contrast a RDBMS based solution offers to solve such
            problems in a declarative manner: A developer may use a <code>NOT
            NULL</code> constraint on a subject attribute of type
            <code>VARCHAR</code> thus inhibiting empty subjects.</para>
          </listitem>

          <listitem>
            <para>As <productname>Softmail</productname>'s product evolves its
            internal E-mail XML format is subject to change due to functional
            extensions and possibly bug fixes both giving rise to
            interoperability problems.</para>
          </listitem>
        </itemizedlist>
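
        <para>The first problem can be made concrete by a sketch: The
        following memo (its content is made up for illustration) lacks a <tag
        class="starttag">subject</tag>. It is well formed nevertheless, so a
        parser checking only well-formedness will accept it and the violation
        of the informal rule goes unnoticed:</para>

        <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;memo date="10.02.2006"&gt;
  &lt;from&gt;M. Goik&lt;/from&gt;
  &lt;to&gt;B. King&lt;/to&gt;
  &lt;!-- no subject element: well formed, but against the informal rule --&gt;
  &lt;content&gt;Hi all&lt;/content&gt;
&lt;/memo&gt;</programlisting>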

        <para>Generally speaking well formed XML documents lack grammar
        constraints of the kind available for programming languages. In case
        of RDBMS developers can impose primary key, foreign key and
        <code>CHECK</code> constraints in a <emphasis>declarative</emphasis>
        manner rather than hard coding them into their applications (A
        solution bad programmers are in favour of though...). Various XML
        standards exist for declarative constraint definitions namely:</para>

        <itemizedlist>
          <listitem>
            <para>Document Type Definitions being discussed in <xref
            linkend="dtdBasic"/>.</para>
          </listitem>

          <listitem>
            <para><link xlink:href="http://www.w3.org/XML/Schema">XML
            Schema</link></para>
          </listitem>

          <listitem>
            <para><link
            xlink:href="http://www.relaxng.org">RelaxNG</link></para>
          </listitem>
        </itemizedlist>
      </section>

      <section xml:id="dtdBasic">
        <title>Document type definitions (DTD)</title>

        <section xml:id="dtdFirstExample">
          <title>Structural descriptions for documents</title>

          <para>As an example we choose documents of type
          <emphasis>memo</emphasis> as a starting point. Documents like the
          example from <xref linkend="memoWellAttrib"/> may be
          <emphasis>informally</emphasis> described to be a sequence of the
          following mandatory items:</para>

          <figure xml:id="figure_memo_informalconstraints">
            <title>Informal constraints on <tag class="element">memo</tag>
            document instances</title>

            <itemizedlist>
              <listitem>
                <para><emphasis>Exactly one</emphasis> sender.</para>
              </listitem>

              <listitem>
                <para><emphasis>One or more</emphasis> recipients.</para>
              </listitem>

              <listitem>
                <para>Subject</para>
              </listitem>

              <listitem>
                <para>Content</para>
              </listitem>
            </itemizedlist>

            <para>In addition we have:</para>

            <itemizedlist>
              <listitem>
                <para>A date string <emphasis>must</emphasis> be
                supplied</para>
              </listitem>

              <listitem>
                <para>A priority <emphasis>may</emphasis> be supplied with
                allowed values to be chosen from the set of values <tag
                class="attvalue">low</tag>, <tag class="attvalue">medium</tag>
                or <tag class="attvalue">high</tag>.</para>
              </listitem>
            </itemizedlist>
          </figure>

          <para>All these fields contain ordinary text to be filled in by a
          user and shall appear exactly in the defined order. For simplicity
          we ignore the email address syntax rules described in <link
          xlink:href="http://www.w3.org/Protocols/rfc822">RFC based address
          schemes</link>. We will see how the
          <emphasis>constraints</emphasis> mentioned above can be modelled in
          XML by extending the concept of well-formed documents.</para>
        </section>

        <section xml:id="section_memo_machinereadable">
          <title>A machine readable description</title>

          <para>We now introduce an example of a <link
          xlink:href="http://www.w3.org/TR/xml#dt-doctype">Document Type
          Definition (DTD)</link>, a concept which is part of the XML 1.0
          standard. A DTD allows the specification of additional constraints
          on both element nodes and their attributes. Our set of <link
          linkend="figure_memo_informalconstraints" revision="">informal
          constraints</link> on memo documents may now be expressed as:</para>

          <figure xml:id="figure_memo_dtd">
            <title>A DTD to describe memo documents.</title>

            <programlisting>&lt;!ELEMENT memo     (from, to+, subject, content)&gt; <co
                xml:id="memodtd_memodef"/>
            
&lt;!ATTLIST memo  <co xml:id="memodtd_memo_attribs"/>
  date     CDATA             #REQUIRED
  priority (low|medium|high) #IMPLIED&gt;
            
&lt;!ELEMENT from     (#PCDATA)&gt; <co xml:id="memodtd_elem_from"/>
&lt;!ELEMENT to       (#PCDATA)&gt;
&lt;!ELEMENT subject  (#PCDATA)&gt;
&lt;!ELEMENT content  (#PCDATA)&gt;</programlisting>

            <calloutlist>
              <callout arearefs="memodtd_memodef">
                <para>A <tag class="element">memo</tag> consists of a sender,
                at least one recipient, a subject and content.</para>
              </callout>

              <callout arearefs="memodtd_memo_attribs">
                <para>A <tag class="element">memo</tag> has one required
                attribute <varname>date</varname> and an optional attribute
                <varname>priority</varname> being restricted to the three
                allowed values <tag class="attvalue">low</tag>, <tag
                class="attvalue">medium</tag> and <tag
                class="attvalue">high</tag>.</para>
              </callout>

              <callout arearefs="memodtd_elem_from">
                <para>A <tag class="starttag">from</tag> element consists of
                ordinary text. This disallows XML markup. For example
                <code>&lt;from&gt;Smith &amp; partner&lt;/from&gt;</code> is
                disallowed since XML uses the ampersand (&amp;) to denote the
                beginning of an entity like <tag class="genentity">auml</tag>
                for the German a-umlaut (ä). The correct form is
                <code>&lt;from&gt;Smith &amp;amp; partner&lt;/from&gt;</code>
                using the predefined entity <tag class="genentity">amp</tag>
                as an escape sequence for the ampersand.</para>

                <para>The term <code>#PCDATA</code> is an acronym for
                <emphasis>P</emphasis><foreignphrase>arsed</foreignphrase>
                <emphasis>C</emphasis><foreignphrase>haracter</foreignphrase>
                <emphasis>Data</emphasis> and denotes a restricted version of
                ordinary strings. Without digging into details, a
                <code>#PCDATA</code> string must not contain any markup like
                e.g. <tag class="starttag">msqrt</tag>. This ensures that a
                string does not interfere with the document's XML markup.
                Parsed Character Data also means that from the viewpoint of
                XML the element's content is <emphasis>atomic</emphasis>: It
                cannot be divided into substructures by an XML parser.</para>
              </callout>
            </calloutlist>
          </figure>
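
          <para>The escaping rule for <code>#PCDATA</code> content can be
          observed with any XML parser. The following minimal sketch is not
          part of the memo example itself: It assumes the JAXP API shipped
          with the JDK, and the class name <classname>AmpersandDemo</classname>
          and its helper <code>rootText</code> are hypothetical names chosen
          for illustration.</para>

          <programlisting language="java"><![CDATA[import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

public class AmpersandDemo {

  /** Returns the root element's text or null if xml is not well-formed. */
  static String rootText(String xml) throws Exception {
    DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();
    // Replace the default handler, which may print errors to the console.
    // Fatal (well-formedness) errors are still thrown as exceptions.
    builder.setErrorHandler(new DefaultHandler());
    try {
      Document doc = builder.parse(new InputSource(new StringReader(xml)));
      return doc.getDocumentElement().getTextContent();
    } catch (SAXException e) {
      return null;
    }
  }

  public static void main(String[] args) throws Exception {
    // Escaped ampersand: prints "Smith & partner"
    System.out.println(rootText("<from>Smith &amp; partner</from>"));
    // Raw ampersand: not well-formed, prints "null"
    System.out.println(rootText("<from>Smith & partner</from>"));
  }
}]]></programlisting>

          <para>The parser hands the <emphasis>unescaped</emphasis> string to
          the application, so the escape sequence is only visible in the
          document's textual representation.</para>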

          <para>We notice the non-XML syntax of a DTD. It looks similar to an
          XML document (&lt;!ELEMENT ...&gt;) but in fact it is not even
          well-formed XML due to e.g. the exclamation mark in front of the
          <code>ELEMENT</code> keyword. <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
          use a different syntax which has been specified in order to describe
          an XML document's grammar.</para>

          <para>From the viewpoint of software modeling a DTD is a
          <emphasis>schema</emphasis>. In the context of XML technologies,
          however, the term <emphasis>schema</emphasis> usually refers to
          <link xlink:href="http://www.w3.org/XML/Schema">XML Schema</link>,
          an alternative language to describe the structure of XML
          documents.</para>

          <para>Readers familiar with <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Backus-Naur_form">BNF</abbrev>
          or <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Extended_Backus_Naur_form">EBNF</abbrev>
          will recognize the grammar rules expressed by the DTD. Leaving
          attributes aside, they correspond to the following
          productions:</para>

          <productionset>
            <title>A message of type <tag class="starttag">memo</tag></title>

            <production xml:id="memo.ebnf.memo">
              <lhs>Memo Message</lhs>

              <rhs>'&lt;memo&gt;' <nonterminal
              def="#memo.ebnf.sender">Sender</nonterminal> (<nonterminal
              def="#memo.ebnf.recipient">Recipient</nonterminal>)+
              <nonterminal def="#memo.ebnf.subject">Subject</nonterminal>
              <nonterminal def="#memo.ebnf.content">Content</nonterminal>
              '&lt;/memo&gt;'</rhs>
            </production>

            <production xml:id="memo.ebnf.sender">
              <lhs>Sender</lhs>

              <rhs>'&lt;from&gt;' <nonterminal def="#memo.ebnf.text"> Text
              </nonterminal> '&lt;/from&gt;'</rhs>
            </production>

            <production xml:id="memo.ebnf.recipient">
              <lhs>Recipient</lhs>

              <rhs>'&lt;to&gt;' <nonterminal def="#memo.ebnf.text"> Text
              </nonterminal> '&lt;/to&gt;'</rhs>
            </production>

            <production xml:id="memo.ebnf.subject">
              <lhs>Subject</lhs>

              <rhs>'&lt;subject&gt;' <nonterminal def="#memo.ebnf.text"> Text
              </nonterminal> '&lt;/subject&gt;'</rhs>
            </production>

            <production xml:id="memo.ebnf.content">
              <lhs>Content</lhs>

              <rhs>'&lt;content&gt;' <nonterminal def="#memo.ebnf.text"> Text
              </nonterminal> '&lt;/content&gt;'</rhs>
            </production>

            <production xml:id="memo.ebnf.text">
              <lhs>Text</lhs>

              <rhs>[a-zA-Z0-9]* <lineannotation>In real documents this is too
              restrictive!</lineannotation></rhs>
            </production>
          </productionset>

          <para>In comparison to our informal description of memo documents a
          DTD offers an added value: The grammar is machine readable and may
          thus be used by a parser to check whether an XML document obeys the
          constraints being imposed. To do so the parser must be instructed to
          use a DTD in addition to the XML document in question. For this
          purpose an XML document may contain a reference to a DTD:</para>

          <figure xml:id="memo_external_dtd">
            <title>A memo document instance holding a reference to a document
            external DTD.</title>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE memo<co xml:id="memo_external_dtd_top_element"/> SYSTEM<co
                xml:id="memo_external_dtd_system_decl"/> "memo.dtd"<co
                xml:id="memo_external_dtd_url"/> &gt;
&lt;memo date="10.02.2006" priority="high"&gt;
  &lt;from&gt;M. Goik&lt;/from&gt;
  &lt;to&gt;B. King&lt;/to&gt;
  &lt;to&gt;A. June&lt;/to&gt;
  &lt;subject&gt;Best wishes&lt;/subject&gt;
  &lt;content&gt;Hi all, congratulations on your splendid party&lt;/content&gt;
&lt;/memo&gt;</programlisting>

            <calloutlist>
              <callout arearefs="memo_external_dtd_top_element">
                <para>The element <tag class="element">memo</tag> is chosen to
                be the top (root) element of the document's tree. It must be
                defined in the file <filename>memo.dtd</filename>. This is
                really a choice since a DTD defines a <emphasis>set</emphasis>
                of elements in <emphasis>arbitrary</emphasis> order. There is
                no such rule as <quote>define before use</quote>. So a DTD
                does not tell us which element has to appear on top of a
                document.</para>

                <para>Suppose a given DTD offers both <tag
                class="starttag">book</tag> and <tag
                class="starttag">report</tag> elements. An XML author writing
                a complex document will choose <tag
                class="starttag">book</tag> as top level element, whereas
                <tag class="starttag">report</tag> is more appropriate for
                a small piece of documentation. Consequently it is the XML
                author's <emphasis>choice</emphasis> which of the elements
                defined in a DTD shall appear as
                <emphasis>the</emphasis> top level element.</para>
              </callout>

              <callout arearefs="memo_external_dtd_system_decl">
                <para>The <code>SYSTEM</code> keyword states that the DTD
                rules reside outside the XML document as a separate entity.
                Though this situation is the most common the grammar rules may
                also be <link linkend="dtd_and_document">defined inside</link>
                the XML document itself. For professional use this is not
                particularly useful but during DTD development it may be an
                option.</para>
              </callout>

              <callout arearefs="memo_external_dtd_url">
                <para>The address of the DTD rule set. In the given example it
                is just a filename but it may as well be an <link
                xlink:href="http://www.w3.org/Addressing">URL</link> of type
                <abbrev
                xlink:href="http://en.wikipedia.org/wiki/File_Transfer_Protocol">ftp</abbrev>,
                <abbrev xlink:href="http://www.w3.org/Protocols">http</abbrev>
                and so on, see <xref linkend="memoDtdOnFtp"/>.</para>
              </callout>
            </calloutlist>
          </figure>

          <para>In the presence of a DTD, parsing a document is actually a
          two-step process: First the parser checks the document for
          well-formedness. Then it reads the referenced DTD
          (<filename>memo.dtd</filename>) and checks the document against the
          additional constraints defined there.</para>

          <para>In the current example both the DTD and the XML memo document
          reside as text files in a common file system folder. For general use
          a DTD is usually kept at a centralized location. The string
          following the <code>SYSTEM</code> keyword is actually a
          <emphasis>U</emphasis><foreignphrase>niform</foreignphrase>
          <emphasis>R</emphasis><foreignphrase>esource</foreignphrase>
          <emphasis>L</emphasis><foreignphrase>ocator</foreignphrase> <link
          xlink:href="http://www.w3.org/Addressing">(URL)</link>. Thus our
          <filename>memo.dtd</filename> may also be supplied as an <abbrev
          xlink:href="http://www.w3.org/Protocols">http</abbrev> or <abbrev
          xlink:href="http://en.wikipedia.org/wiki/File_Transfer_Protocol">ftp</abbrev>
          <link xlink:href="http://www.w3.org/Addressing">URL</link>:</para>

          <figure xml:id="memoDtdOnFtp">
            <title>A DTD reference to an FTP server.</title>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE memo SYSTEM "ftp://www.hdm-stuttgart.de/memo.dtd"&gt;
&lt;memo date="10.02.2006" priority="high"&gt;
  &lt;from&gt;M. Goik&lt;/from&gt;
  ...
&lt;/memo&gt;</programlisting>
          </figure>

          <para>For development purposes we may combine a DTD and a conforming
          document into a single unit. This is achieved by replacing the
          <code>SYSTEM "memo.dtd"</code> clause with the DTD itself, enclosed
          in square brackets:</para>

          <figure xml:id="dtd_and_document">
            <title>DTD and document within the same file</title>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE memo [<co xml:id="memo_inline_dtd_start"/>
&lt;!ELEMENT memo     (from, to+, subject, content)&gt;

&lt;!ATTLIST memo date     CDATA             #REQUIRED
               priority (low|medium|high) #IMPLIED&gt;

&lt;!ELEMENT from     (#PCDATA)&gt;
&lt;!ELEMENT to       (#PCDATA)&gt;
&lt;!ELEMENT subject  (#PCDATA)&gt;
&lt;!ELEMENT content  (#PCDATA)&gt;
]<co xml:id="memo_inline_dtd_end"/>&gt; <co xml:id="memo_inline_doc_start"/>
&lt;memo date="10.02.2006" priority="high"&gt; 
    &lt;from&gt;M. Goik&lt;/from&gt;
    &lt;to&gt;B. King&lt;/to&gt;
    &lt;to&gt;A. June&lt;/to&gt;
    &lt;subject&gt;Best wishes&lt;/subject&gt;
    &lt;content&gt;Hi all, congratulations on your splendid party&lt;/content&gt;
&lt;/memo&gt;</programlisting>

            <calloutlist>
              <callout arearefs="memo_inline_dtd_start">
                <para>The DTD definitions start right after the left bracket
                <quote>[</quote> thus replacing the <code>SYSTEM
                "memo.dtd"</code> declaration.</para>
              </callout>

              <callout arearefs="memo_inline_dtd_end">
                <para>The right bracket <quote>]</quote> terminates the DTD
                declarations. After finishing the <code>&lt;!DOCTYPE ...
                &gt;</code> declaration the document's content starts.</para>
              </callout>

              <callout arearefs="memo_inline_doc_start">
                <para>Start of document content.</para>
              </callout>
            </calloutlist>
          </figure>

          <para>Some terms are helpful in the context of <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s:</para>

          <variablelist>
            <varlistentry>
              <term>Validating / non-validating:</term>

              <listitem>
                <para>A non-validating parser only checks a document for
                well-formedness. If it also checks XML documents for
                conformance to a DTD it is a <emphasis>validating</emphasis>
                parser. Caution: Even a non-validating parser has to process
                DTD declarations contained in the document itself and may
                read an external DTD as well, since it might have to expand
                references to <link
                linkend="section_generalentities">general entities</link>
                declared there.</para>
              </listitem>
            </varlistentry>

            <varlistentry>
              <term>Valid / invalid documents:</term>

              <listitem>
                <para>An XML document referencing a DTD may either be valid or
                invalid depending on its conformance to the DTD in
                question.</para>
              </listitem>
            </varlistentry>

            <varlistentry>
              <term>Document instance:</term>

              <listitem>
                <para>An XML memo document may conform to the <link
                linkend="figure_memo_dtd">memo DTD</link>. In this case we
                call it a <emphasis>document instance</emphasis> of the memo
                DTD.</para>

                <para>This situation is quite similar to typed programming
                languages: A <link
                linkend="gloss_Java"><trademark>Java</trademark></link>
                <code>class</code> declaration is a blueprint for the <link
                linkend="gloss_Java"><trademark>Java</trademark></link>
                runtime system to construct <link
                linkend="gloss_Java"><trademark>Java</trademark></link>
                objects in memory. This is done by e.g. a statement
                <code>String name = new String();</code>. The identifier
                <code>name</code> will hold a reference to an
                <emphasis>instance of class String</emphasis>. So in a <link
                linkend="gloss_Java"><trademark>Java</trademark></link>
                runtime environment a class declaration plays the same role as
                a DTD in XML. See also <xref
                linkend="example_memoJavaClass"/>.</para>
              </listitem>
            </varlistentry>
          </variablelist>
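
          <para>The difference between validating and non-validating parsers
          may be tried out in a few lines of <link
          linkend="gloss_Java"><trademark>Java</trademark></link>. The
          following is a minimal sketch under stated assumptions: It uses the
          JAXP API shipped with the JDK, it places the memo DTD inside the
          document so that no external file is needed, and the class name
          <classname>MemoValidation</classname> and its helper
          <code>parses</code> are hypothetical names chosen for
          illustration.</para>

          <programlisting language="java"><![CDATA[import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class MemoValidation {

  // The memo DTD, supplied inside the document (assumption for this sketch)
  static final String DTD =
      "<!DOCTYPE memo [\n"
    + "<!ELEMENT memo (from, to+, subject, content)>\n"
    + "<!ATTLIST memo date CDATA #REQUIRED\n"
    + "               priority (low|medium|high) #IMPLIED>\n"
    + "<!ELEMENT from (#PCDATA)>\n"
    + "<!ELEMENT to (#PCDATA)>\n"
    + "<!ELEMENT subject (#PCDATA)>\n"
    + "<!ELEMENT content (#PCDATA)>\n"
    + "]>\n";

  /** Parses DTD + memo and returns true if the parser reports no error. */
  static boolean parses(String memo, boolean validating) throws Exception {
    DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
    factory.setValidating(validating); // true: check against the DTD as well
    DocumentBuilder builder = factory.newDocumentBuilder();
    builder.setErrorHandler(new ErrorHandler() {
      public void warning(SAXParseException e) { /* ignored */ }
      // Validity violations are reported as (recoverable) errors
      public void error(SAXParseException e) throws SAXException { throw e; }
      // Well-formedness violations are reported as fatal errors
      public void fatalError(SAXParseException e) throws SAXException { throw e; }
    });
    try {
      builder.parse(new InputSource(new StringReader(DTD + memo)));
      return true;
    } catch (SAXException e) {
      return false;
    }
  }

  public static void main(String[] args) throws Exception {
    String valid = "<memo date='10.02.2006'><from>M. Goik</from>"
        + "<to>B. King</to><subject>Hi</subject><content>...</content></memo>";
    String noSender = "<memo date='10.02.2006'>"
        + "<to>B. King</to><subject>Hi</subject><content>...</content></memo>";
    System.out.println(parses(valid, true));     // true: valid
    System.out.println(parses(noSender, false)); // true: well-formed
    System.out.println(parses(noSender, true));  // false: invalid
  }
}]]></programlisting>

          <para>The document lacking its <tag class="starttag">from</tag>
          element is accepted by the non-validating parser since it is
          well-formed. Only the validating parser rejects it.</para>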

          <para>For further discussions it is very useful to clearly
          distinguish element definitions in a DTD from their
          <emphasis>realizations</emphasis> in a corresponding document
          instance: Our memo DTD defines an element <tag
          class="starttag">from</tag> to be of content <code>#PCDATA</code>.
          According to the DTD exactly one <tag
          class="starttag">from</tag> element must appear in a document
          instance. If we were talking
          about HTML document instances we would prefer to talk about a <tag
          class="starttag">from</tag> <emphasis>tag</emphasis> rather than a
          <tag class="starttag">from</tag>
          <emphasis>element</emphasis>.</para>

          <para>In this document we will use the term <emphasis>element
          type</emphasis> to denote an <code>&lt;!ELEMENT ...</code>
          definition in a DTD. Thus we will talk about an element type <tag
          class="element">subject</tag> being defined in
          <filename>memo.dtd</filename>.</para>

          <para>An element type defined in a <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>
          may have realizations in document instances. For example the
          document instance shown in <xref linkend="memo_external_dtd"/> has
          two <emphasis>nodes</emphasis> of element type <tag
          class="element">to</tag>. Thus we say that the document instance
          contains two <emphasis>element nodes</emphasis> of type <tag
          class="element">to</tag>. We will frequently abbreviate this by
          saying the instance contains two <tag class="starttag">to</tag>
          element nodes. And we may even omit the term
          <emphasis>nodes</emphasis> and simply talk about two <tag
          class="starttag">to</tag> elements. But the careful reader should
          always distinguish between a single type <code>foo</code> being
          defined in a <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>
          and the possibly empty set of <tag class="starttag">foo</tag> nodes
          appearing in valid document instances.</para>
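
          <para>The relation between one element type and its element nodes
          can be made concrete by counting nodes in a parsed document. This is
          a minimal sketch, assuming the JAXP and DOM APIs shipped with the
          JDK. The class name <classname>ElementNodeCount</classname>, the
          constant <code>MEMO</code> and the helper <code>count</code> are
          hypothetical, and the memo is parsed without its DTD since only
          well-formedness matters here.</para>

          <programlisting language="java"><![CDATA[import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class ElementNodeCount {

  // A memo instance with one sender and two recipients
  static final String MEMO =
      "<memo date='10.02.2006' priority='high'>"
    + "<from>M. Goik</from><to>B. King</to><to>A. June</to>"
    + "<subject>Best wishes</subject><content>Hi all</content></memo>";

  /** Number of element nodes of the given element type within xml. */
  static int count(String xml, String elementType) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
        .parse(new InputSource(new StringReader(xml)));
    return doc.getElementsByTagName(elementType).getLength();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(count(MEMO, "to"));   // 2 element nodes of type to
    System.out.println(count(MEMO, "from")); // 1 element node of type from
  }
}]]></programlisting>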

          <para><abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
          add constraints on top of well-formedness. Every valid document is
          well-formed, but not every well-formed document is valid:</para>

          <figure xml:id="wellformedandvalid">
            <title>Well-formed and valid documents</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/wellformedandvalid.fig" scale="65"/>
              </imageobject>
            </mediaobject>
          </figure>

          <qandaset role="exercise">
            <title>Validation of memo document instances.</title>

            <qandadiv>
              <qandaentry xml:id="example_memoTestValid">
                <question>
                  <para>Copy the two files <link
                  xlink:href="Ref/src/Memo.1/message.xml">message.xml</link>
                  and <link
                  xlink:href="Ref/src/Memo.1/memo.dtd">memo.dtd</link> into
                  your Eclipse project. Use the Oxygen XML plug-in to check
                  whether the document is valid. Then apply the following
                  changes one at a time, checking the document for validity
                  and undoing each change before trying the next:</para>

                  <itemizedlist>
                    <listitem>
                      <para>Omit the <tag class="starttag">from</tag>
                      element.</para>
                    </listitem>

                    <listitem>
                      <para>Change the order of the two sub elements <tag
                      class="starttag">subject</tag> and <tag
                      class="starttag">content</tag>.</para>
                    </listitem>

                    <listitem>
                      <para>Erase the <varname>date</varname> attribute and
                      its value.</para>
                    </listitem>

                    <listitem>
                      <para>Erase the <varname>priority</varname> attribute
                      and its value.</para>
                    </listitem>
                  </itemizedlist>

                  <para>What do you observe?</para>
                </question>

                <answer>
                  <para>The <tag class="attribute">priority</tag> attribute is
                  declared as <code>#IMPLIED</code> so it may be omitted.
                  Erasing the <tag class="attribute">priority</tag> attribute
                  thus leaves the document in a valid state. The remaining
                  three edit actions yield an invalid document
                  instance.</para>
                </answer>
              </qandaentry>

              <qandaentry xml:id="example_memoJavaClass">
                <question>
                  <label>A memo implementation sketch in Java</label>

                  <para>The aim of this exercise is to clarify the (abstract)
                  relation between XML <abbrev
                  xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
                  and sets of <link
                  linkend="gloss_Java"><trademark>Java</trademark></link>
                  classes rather than building a running application. We want
                  to model the <link xlink:href="Ref/src/Memo.1/memo.dtd">memo
                  DTD</link> as a set of <link
                  linkend="gloss_Java"><trademark>Java</trademark></link>
                  classes.</para>
                </question>

                <answer>
                  <para>The XML attributes <tag class="attribute">date</tag>
                  and <tag class="attribute">priority</tag> can be mapped to
                  <link
                  linkend="gloss_Java"><trademark>Java</trademark></link>
                  fields. The same applies to the memo elements <tag
                  class="element">from</tag>, <tag
                  class="element">subject</tag> and <tag
                  class="element">content</tag> which may be implemented as
                  simple strings or alternatively as separate classes wrapping
                  the string content. The latter approach should be preferred
                  if the memo DTD is expected to grow in complexity. A simple
                  sketch reads:</para>

                  <programlisting language="java">import java.util.Date;
import java.util.LinkedHashSet;
import java.util.Set;

public class Memo {
  private Date date;
  // Application level default: the DTD itself declares none (#IMPLIED)
  private Priority priority = Priority.medium;
  private String from, subject, content;
  // No duplicates, iteration in insertion (document) order
  private Set&lt;String&gt; to = new LinkedHashSet&lt;String&gt;();
  // Accessors not yet implemented
}</programlisting>

                  <para>The only thing to note here is the implementation of
                  the <tag class="element">to</tag> element: We want to be
                  able to address a <emphasis>set</emphasis> of recipients.
                  Thus we have to disallow duplicates. Note that this is an
                  <emphasis>informal</emphasis> constraint not being handled
                  by our DTD: A memo document instance
                  <emphasis>may</emphasis> have duplicate content in <tag
                  class="starttag">to</tag> nodes. This is a weakness of
                  <abbrev
                  xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s:
                  We are unable to impose uniqueness constraints on the
                  content of partial sets of document nodes.</para>

                  <para>On the other hand our set of recipients has to be
                  ordered: In an XML document instance the order of <tag
                  class="starttag">to</tag> nodes is important and has to be
                  preserved in a <link
                  linkend="gloss_Java"><trademark>Java</trademark></link>
                  representation. A
                  <classname>java.util.SortedSet</classname> is not suitable
                  since it orders its elements by their natural ordering or
                  by a comparator rather than by document order. Thus we
                  choose a <classname>java.util.LinkedHashSet</classname>
                  parametrized with String type: It rejects duplicates and
                  iterates in insertion order, thereby fulfilling both
                  requirements.</para>

                  <para>Our DTD defines:</para>

                  <programlisting>&lt;!ATTLIST memo ...  priority (low|medium|high) #IMPLIED&gt;</programlisting>

                  <para>Starting from <link
                  linkend="gloss_Java"><trademark>Java</trademark></link> 1.5
                  we may implement this constraint by a type safe enumeration
                  in a file <filename>Priority.java</filename>:</para>

                  <programlisting language="java">public enum Priority { low, medium, high }</programlisting>
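
                  <para>How a given set implementation treats the order of
                  recipients can be checked with a small experiment. The
                  following minimal sketch compares two standard
                  <classname>java.util.Set</classname> implementations. The
                  class name <classname>RecipientOrder</classname> and its
                  helper <code>iterationOrder</code> are hypothetical names
                  chosen for illustration.</para>

                  <programlisting language="java"><![CDATA[import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class RecipientOrder {

  /** Adds all recipients to the set and returns its iteration order. */
  static List<String> iterationOrder(Set<String> set, String... recipients) {
    for (String recipient : recipients) {
      set.add(recipient); // duplicates are silently ignored by any Set
    }
    return new ArrayList<String>(set);
  }

  public static void main(String[] args) {
    // Recipients in document order, including one duplicate
    String[] to = {"B. King", "A. June", "B. King"};

    // Insertion order is kept: [B. King, A. June]
    System.out.println(iterationOrder(new LinkedHashSet<String>(), to));

    // TreeSet is a SortedSet, elements get sorted: [A. June, B. King]
    System.out.println(iterationOrder(new TreeSet<String>(), to));
  }
}]]></programlisting>

                  <para>Both sets drop the duplicate recipient, but only the
                  <classname>java.util.LinkedHashSet</classname> keeps the
                  order in which the <tag class="starttag">to</tag> nodes
                  appeared.</para>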
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <para>In the following chapters we will extend the memo document
          type (<code>&lt;!DOCTYPE memo ... &gt;</code>) to demonstrate
          various concepts of <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
          and other XML related standards. In parallel a series of exercises
          deals with building a DTD usable to edit books. This DTD gets
          extended as our knowledge about XML advances. We start with an
          initial exercise:</para>

          <qandaset role="exercise">
            <title>A DTD for editing books</title>

            <qandadiv>
              <qandaentry xml:id="example_bookDtd">
                <question>
                  <para>Write a DTD describing book document instances with
                  the following features:</para>

                  <itemizedlist>
                    <listitem>
                      <para>A book shall have a title to describe the book
                      itself.</para>
                    </listitem>

                    <listitem>
                      <para>A book shall have at least one but possibly a
                      sequence of chapters.</para>
                    </listitem>

                    <listitem>
                      <para>Each chapter shall have a title and at least one
                      paragraph.</para>
                    </listitem>

                    <listitem>
                      <para>The titles and paragraphs shall consist of
                      ordinary text.</para>
                    </listitem>
                  </itemizedlist>
                </question>

                <answer>
                  <para>A possible DTD looks like:</para>

                  <figure xml:id="figure_book.dtd_v1">
                    <title>A first DTD version for book documents</title>

                    <programlisting>&lt;!ELEMENT book     (title, chapter+)&gt;
&lt;!ELEMENT chapter  (title, para+)&gt;
&lt;!ELEMENT title    (#PCDATA)&gt;
&lt;!ELEMENT para     (#PCDATA)&gt;</programlisting>
                  </figure>

                  <para>We supply a valid document instance:</para>

                  <informalfigure xml:id="bookInitialInstance">
                    <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE book SYSTEM "book.dtd"&gt;
&lt;book&gt;
  &lt;title&gt;Introduction to Java&lt;/title&gt;
  &lt;chapter&gt;
    &lt;title&gt;Introduction&lt;/title&gt;
    &lt;para&gt;Java is a programming language&lt;/para&gt;
  &lt;/chapter&gt;
  &lt;chapter&gt;
     &lt;title&gt;The virtual machine&lt;/title&gt;
     &lt;para&gt;We also call it the runtime system.&lt;/para&gt;
  &lt;/chapter&gt;
  &lt;chapter&gt;
    &lt;title&gt;Annotations&lt;/title&gt;
    &lt;para&gt;Annotations provide a means to add meta information.&lt;/para&gt;
    &lt;para&gt;This is especially useful for framework authors.&lt;/para&gt;
  &lt;/chapter&gt;
&lt;/book&gt;</programlisting>
                  </informalfigure>

                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="dtdVsSqlDdl">
          <title>Relating <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
          and <acronym
          xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym> -
          <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Data_definition_language">DDL</abbrev></title>

          <para>XML <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
          and <acronym
          xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym> -
          <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Data_definition_language">DDL</abbrev>
          are related: They both describe data models and thus integrity
          constraints. We consider a simple invoice example:</para>

          <figure xml:id="invoiceIntegrity">
            <title>Invoice integrity constraints</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/invoicedata.fig" scale="65"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>A relational implementation may look like:</para>

          <figure xml:id="invoiceSqlDdl">
            <title>Relational implementation</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/invoicedataimplement.fig"
                           scale="65"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>This data model can be expressed in XML as well:</para>

          <figure xml:id="invoiceXml">
            <title>XML implementation</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/invoicewellformed.fig" scale="65"/>
              </imageobject>
            </mediaobject>
          </figure>

          <qandaset role="exercise">
            <title><abbrev
            xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
            and <acronym
            xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym>-DDL</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para><xref linkend="invoiceXml"/> is a complete
                  implementation of the invoice data model including all
                  integrity constraints of <xref linkend="invoiceSqlDdl"/>.
                  Can this be achieved for arbitrary <acronym
                  xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym>
                  schemas?</para>
                </question>

                <answer>
                  <para>XML <abbrev
                  xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
                  cannot express multiple foreign keys. Adding a second
                  foreign key <coref linkend="invoiceSecondFK"/> in a
                  referencing table <code>Order</code> already breaks <abbrev
                  xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>
                  expressibility:</para>

                  <programlisting>CREATE TABLE Order (
  orderNo BIGINT NOT NULL PRIMARY KEY,
  customer NUMERIC(5) NOT NULL <emphasis role="bold">REFERENCES Customer</emphasis> <co
                      xml:id="invoiceSecondFK"/>...</programlisting>

                  <remark>This is actually a deficiency of DTDs rather than
                  of XML itself. The XML Schema standard allows not only
                  multiple foreign key definitions but polymorphic references
                  as well.</remark>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="xmlAndJava">
          <title>Relating <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
          and <link linkend="gloss_Java"><trademark>Java</trademark></link>
          class descriptions.</title>

          <para>We may also compare XML data constraints to <link
          linkend="gloss_Java"><trademark>Java</trademark></link>. A <link
          linkend="gloss_Java"><trademark>Java</trademark></link> class
          declaration is actually a blueprint for a <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase">JRE</trademark>
          to instantiate compatible objects. Likewise an XML DTD restricts
          well-formed documents:</para>

          <figure xml:id="fig_XmlAndJava">
            <title>XML <abbrev
            xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
            and <link linkend="gloss_Java"><trademark>Java</trademark></link>
            class declarations.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/xmlattribandjava.fig" scale="65"/>
              </imageobject>
            </mediaobject>
          </figure>
        </section>

        <section xml:id="section_dtdDetail">
          <title><abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
          in detail</title>

          <para>We have already seen that elements are the building blocks of
          XML documents. We now look at the formal rules that govern
          <code>&lt;!ELEMENT ...&gt;</code> declarations in a DTD. This leads
          to the notion of a <emphasis>content model</emphasis>.</para>

          <para>Then we will shed some light on <code>&lt;!ATTLIST
          ...&gt;</code> declarations. We will learn about possible attribute
          types and default values.</para>

          <para>Next we explore the <emphasis>physical</emphasis> structure of
          XML documents. We will see that <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
          and document instances may be physically subdivided into
          <emphasis>entities</emphasis> without touching their logical
          structure.</para>

          <para>Since we want to illustrate DTD grammars by <userinput
          xlink:href="http://en.wikipedia.org/wiki/Ebnf">EBNF</userinput>
          productions we first show some helpful non-terminals, starting with
          the definition of white space. Unsurprisingly it is the same as in
          most programming languages:</para>

          <productionset>
            <title>White Space</title>

            <production xml:id="w3RecXml_NT-S">
              <lhs>S</lhs>

              <rhs>(#x20 | #x9 | #xD | #xA)+ <lineannotation>space, tabulator,
              carriage return and line feed</lineannotation></rhs>
            </production>
          </productionset>

          <para>The production rule for <code>Name</code> defines legal
          identifiers for element names like <tag
          class="element">memo</tag>. We learn that such an identifier must
          not begin with a digit, a hyphen or a period. The rule thus
          resembles the grammar constraint on legal identifiers in the <link
          linkend="gloss_Java"><trademark>Java</trademark></link> programming
          language. The <code>Nmtoken</code> production will be needed later
          when defining attributes of type <code>NMTOKEN</code>.</para>

          <productionset>
            <title>Names and Tokens</title>

            <production xml:id="w3RecXml_NT-NameChar">
              <lhs>NameChar</lhs>

              <rhs><nonterminal def="#w3RecXml_NT-Letter">Letter</nonterminal>
              | <nonterminal def="#w3RecXml_NT-Digit">Digit</nonterminal> |
              '.' | '-' | '_' | ':' | <nonterminal
              def="#w3RecXml_NT-CombiningChar"
              xlink:href="#w3RecXml_NT-CombiningChar">CombiningChar</nonterminal>
              | <nonterminal
              def="#w3RecXml_NT-Extender">Extender</nonterminal></rhs>
            </production>

            <production xml:id="w3RecXml_NT-Name">
              <lhs>Name</lhs>

              <rhs>(<nonterminal
              def="#w3RecXml_NT-Letter">Letter</nonterminal> | '_' | ':')
              (<nonterminal
              def="#w3RecXml_NT-NameChar">NameChar</nonterminal>)*</rhs>
            </production>

            <production xml:id="w3RecXml_NT-Names">
              <lhs>Names</lhs>

              <rhs><nonterminal def="#w3RecXml_NT-Name">Name</nonterminal>
              (#x20 <nonterminal
              def="#w3RecXml_NT-Name">Name</nonterminal>)*</rhs>
            </production>

            <production xml:id="w3RecXml_NT-Nmtoken">
              <lhs>Nmtoken</lhs>

              <rhs>(<nonterminal
              def="#w3RecXml_NT-NameChar">NameChar</nonterminal>)+</rhs>
            </production>

            <production xml:id="w3RecXml_NT-Nmtokens">
              <lhs>Nmtokens</lhs>

              <rhs><nonterminal
              def="#w3RecXml_NT-Nmtoken">Nmtoken</nonterminal> (#x20
              <nonterminal
              def="#w3RecXml_NT-Nmtoken">Nmtoken</nonterminal>)*</rhs>
            </production>
          </productionset>
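
          <para>The following sketch illustrates these productions. All
          element names are made up for this illustration. Because the last
          two declarations are deliberately wrong, the fragment as a whole is
          not a legal DTD:</para>

          <programlisting>&lt;!ELEMENT memo      EMPTY&gt;  &lt;!-- legal: starts with a letter --&gt;
&lt;!ELEMENT _memo     EMPTY&gt;  &lt;!-- legal: starts with an underscore --&gt;
&lt;!ELEMENT memo-2.1  EMPTY&gt;  &lt;!-- legal: hyphen, period and digits may follow --&gt;
&lt;!ELEMENT 2memo     EMPTY&gt;  &lt;!-- illegal: starts with a digit --&gt;
&lt;!ELEMENT -memo     EMPTY&gt;  &lt;!-- illegal: starts with a hyphen --&gt;</programlisting>

          <para>Both <code>2memo</code> and <code>-memo</code> still match the
          <code>Nmtoken</code> production since it imposes no restriction on
          the first character.</para>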

          <section xml:id="section_contentmodel">
            <title>The content model</title>

            <para>We already saw examples of XML elements being composed of
            other elements in our <link
            linkend="figure_memo_dtd">memo.dtd</link>:</para>

            <programlisting>&lt;!ELEMENT memo     (from, to+, subject, content)&gt;</programlisting>

            <para>We call the right-hand side the <emphasis>content
            model</emphasis> of the <tag class="element">memo</tag> element.
            The XML 1.0 specification defines <link
            xlink:href="http://www.w3.org/TR/xml#dt-eldecl">four</link>
            different kinds of <link
            xlink:href="http://www.w3.org/TR/2006/REC-xml-20060816/#elemdecls">element
            type declarations</link>:</para>

            <productionset xml:id="productionset_element_decl">
              <title>Element Type Declaration</title>

              <production xml:id="w3RecXml_NT-elementdecl">
                <lhs>elementdecl</lhs>

                <rhs>'&lt;!ELEMENT' <nonterminal
                def="#w3RecXml_NT-S">S</nonterminal> <nonterminal
                def="#w3RecXml_NT-Name">Name</nonterminal> <nonterminal
                def="#w3RecXml_NT-S">S</nonterminal> <nonterminal
                def="#w3RecXml_NT-contentspec">contentspec</nonterminal>
                <nonterminal def="#w3RecXml_NT-S">S</nonterminal>?
                '&gt;'</rhs>
              </production>

              <production xml:id="w3RecXml_NT-contentspec">
                <lhs>contentspec</lhs>

                <rhs>'EMPTY' | 'ANY' | <nonterminal
                def="#w3RecXml_NT-Mixed">Mixed</nonterminal> | <nonterminal
                def="#w3RecXml_NT-children">children</nonterminal></rhs>
              </production>
            </productionset>

            <glosslist>
              <glossentry>
                <glossterm><link
                linkend="section_empty">EMPTY</link></glossterm>

                <glossdef>
                  <para>The element doesn't have any content at all. This
                  makes sense for elements that carry their information in
                  attributes, as in <tag class="emptytag">img src="foo.gif"</tag>.</para>
                </glossdef>
              </glossentry>

              <glossentry>
                <glossterm><link linkend="section_any">ANY</link></glossterm>

                <glossdef>
                  <para>The element in question may contain a sequence of
                  arbitrary elements and ordinary text
                  (<code>#PCDATA</code>).</para>
                </glossdef>
              </glossentry>

              <glossentry>
                <glossterm><nonterminal
                def="#w3RecXml_NT-Mixed">Mixed</nonterminal></glossterm>

                <glossdef>
                  <para>The element may contain an arbitrary sequence from a
                  set of child elements possibly interspersed with ordinary
                  text.</para>
                </glossdef>
              </glossentry>

              <glossentry>
                <glossterm><nonterminal
                def="#w3RecXml_NT-children">children</nonterminal></glossterm>

                <glossdef>
                  <para>An element contains <emphasis>only</emphasis> other
                  elements. A node of the element type in question may appear
                  as child of itself giving rise to recursion:</para>

                  <programlisting>...
&lt;chapter&gt;
  &lt;chapter&gt; ...&lt;/chapter&gt;
&lt;/chapter&gt;</programlisting>
                </glossdef>
              </glossentry>
            </glosslist>
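
            <para>As a minimal sketch, the following fragment contains one
            declaration for each of the four alternatives. The element names
            are hypothetical and chosen for illustration only. They are not
            part of the memo example:</para>

            <programlisting>&lt;!ELEMENT pagebreak EMPTY&gt;                       &lt;!-- EMPTY --&gt;
&lt;!ELEMENT appendix  ANY&gt;                         &lt;!-- ANY --&gt;
&lt;!ELEMENT para      (#PCDATA | emphasis)*&gt;       &lt;!-- Mixed --&gt;
&lt;!ELEMENT emphasis  (#PCDATA)&gt;                   &lt;!-- Mixed, text only --&gt;
&lt;!ELEMENT title     (#PCDATA)&gt;                   &lt;!-- Mixed, text only --&gt;
&lt;!ELEMENT chapter   (title, (para | chapter)+)&gt;  &lt;!-- children, recursive --&gt;</programlisting>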

            <para>All elements being declared are subject to the following
            validity constraint:</para>

            <constraintdef>
              <para>An element type MUST NOT be declared more than
              once.</para>
            </constraintdef>

            <para>Programmers will not be surprised: The above constraint is
            common to most programming languages. In <link
            linkend="gloss_Java"><trademark>Java</trademark></link> for
            example a given local variable may not be redefined:</para>

            <programlisting language="java">int count = 3;
double pi = 3.1415926;
int count = 2;       // Compile-time error: a variable must not be
                     // redeclared within the same scope</programlisting>

            <para>However, there is no rule like <quote>define before
            use</quote>: element <emphasis>and</emphasis> attribute
            declarations may refer to elements that are declared
            <quote>later</quote>:</para>

            <programlisting>&lt;!ATTLIST memo<co
                xml:id="programlisting_elemattorder_memoatt"/> date     CDATA             #REQUIRED
               priority (low|medium|high) #IMPLIED&gt;

&lt;!ELEMENT memo<co xml:id="programlisting_elemattorder_memodecl"/>     (from, to+, subject, content)&gt;

&lt;!ELEMENT from     (#PCDATA)&gt;
&lt;!ELEMENT to       (#PCDATA)&gt;
&lt;!ELEMENT subject  (#PCDATA)&gt;
&lt;!ELEMENT content  (#PCDATA)&gt;</programlisting>

            <calloutlist>
              <callout arearefs="programlisting_elemattorder_memoatt">
                <para>Two attributes <varname>date</varname> and
                <varname>priority</varname> are declared for the element <tag
                class="starttag">memo</tag>, which is itself declared only
                <emphasis>after</emphasis> this attribute list.</para>
              </callout>

              <callout arearefs="programlisting_elemattorder_memodecl">
                <para>The <tag class="element">memo</tag> type definition
                refers to the element types <tag class="element">from</tag>,
                <tag class="element">to</tag>, <tag
                class="element">subject</tag> and <tag
                class="element">content</tag> all being defined
                afterwards.</para>
              </callout>
            </calloutlist>

            <section xml:id="section_empty">
              <title>The <code>EMPTY</code> declaration</title>

              <para>Element nodes of content type <code>EMPTY</code> are
              familiar from e.g. HTML:</para>

              <programlisting>...
&lt;p&gt;We saw the picture &lt;img src="person.gif"&gt; of the officer.
...</programlisting>

              <para>This code fragment shows an image embedded <emphasis>in
              line</emphasis> with the current text flow. The <tag
              class="starttag">img</tag> tag is never closed. This is
              possible in HTML, being an SGML application, but it is
              <emphasis>not</emphasis> allowed in XML. Likewise, omitting <tag
              class="endtag">p</tag> to close the paragraph is disallowed. In
              XML one of the following two forms has to be chosen:</para>

              <itemizedlist>
                <listitem>
                  <para><code>&lt;p&gt;We saw the picture &lt;img
                  src="person.gif"&gt;&lt;/img&gt; of the
                  officer.&lt;/p&gt;</code></para>
                </listitem>

                <listitem>
                  <para><code>&lt;p&gt;We saw the picture &lt;img
                  src="person.gif"/&gt; of the
                  officer.&lt;/p&gt;</code></para>
                </listitem>
              </itemizedlist>

              <para>Using <tag class="starttag">img .../</tag> as a shorthand
              for an empty element is legal in XML but disallowed in SGML and
              thus HTML. This is one of the possible obstacles when migrating
              from SGML based HTML documents to an XML version of HTML like
              <link xlink:href="http://www.w3.org/MarkUp">XHTML</link>. From
              <xref linkend="productionset_element_decl"/> we can infer the
              corresponding DTD declaration:</para>

              <programlisting>&lt;!ELEMENT img EMPTY&gt;</programlisting>
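
              <para>Since an empty element carries its information in
              attributes, such a declaration is usually accompanied by an
              attribute list. The following is a minimal sketch. Treating
              <code>src</code> as a mandatory attribute of type
              <code>CDATA</code> is an assumption made for this illustration
              and is not taken from the HTML DTD:</para>

              <programlisting>&lt;!ELEMENT img EMPTY&gt;
&lt;!ATTLIST img src CDATA #REQUIRED&gt;</programlisting>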
            </section>

            <section xml:id="section_any">
              <title>The <code>ANY</code> declaration</title>

              <para>The <code>ANY</code> declaration allows every element of a
              given DTD to appear as a child of the element being defined
              including the element itself. It is not possible to exclude
              certain elements from an <code>ANY</code> rule:</para>

              <figure xml:id="figure_any_declaration">
                <title>The <code>ANY</code> declaration</title>

                <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE theater [
&lt;!ELEMENT theater ANY <co xml:id="figure_any_declaration_any"/> &gt;

&lt;!ELEMENT actor   (#PCDATA) <co xml:id="figure_any_declaration_actor"/> &gt;
&lt;!ELEMENT show    (#PCDATA) <co xml:id="figure_any_declaration_show"/>&gt; 
]&gt;
&lt;theater&gt;
    &lt;actor&gt;Peter Sun&lt;/actor&gt;
    some text <co xml:id="figure_any_declaration_doc_text"/>
    &lt;show&gt;Must go on&lt;/show&gt;
    &lt;theater&gt;Self referencing!&lt;/theater&gt; <co
                    xml:id="figure_any_declaration_actor_self_reference"/>
    &lt;!-- An error: --&gt;
    &lt;b&gt;Ooops, no such element defined in DTD&lt;/b&gt; <co
                    xml:id="figure_any_declaration_actor_undefined"/>
&lt;/theater&gt;</programlisting>

                <calloutlist>
                  <callout arearefs="figure_any_declaration_any">
                    <para>A <tag class="element">theater</tag> element may
                    consist of a sequence of arbitrary content. Every child
                    element must be defined in the DTD.</para>
                  </callout>

                  <callout arearefs="figure_any_declaration_actor figure_any_declaration_show">
                    <para>Two elements <tag class="element">actor</tag> and
                    <tag class="element">show</tag> consisting of mere textual
                    content.</para>
                  </callout>

                  <callout arearefs="figure_any_declaration_doc_text">
                    <para>Ordinary text may also be part of the <tag
                    class="starttag">theater</tag> element and may appear
                    anywhere.</para>
                  </callout>

                  <callout arearefs="figure_any_declaration_actor_self_reference">
                    <para>A <tag class="starttag">theater</tag> element may
                    appear as a child of itself. This gives rise to recursion
                    of arbitrary depth.</para>
                  </callout>

                  <callout arearefs="figure_any_declaration_actor_undefined">
                    <para>There is no element <tag class="starttag">b</tag>
                    defined in the DTD. Thus the current XML document is
                    invalid.</para>
                  </callout>
                </calloutlist>
              </figure>

              <para>Remark: The restriction to elements being defined in a DTD
              is common to other content model types as well. Actually every
              element being referenced by a definition in the DTD
              <emphasis>must</emphasis> itself be defined in order for the
              document to be valid.</para>
            </section>

            <section xml:id="section_mixed">
              <title>Mixed content</title>

              <para>Mixed content is similar to the <code>ANY</code>
              declaration, but the set of elements allowed to appear is
              restricted. We show an example:</para>

              <figure xml:id="figure_memo_content_mixed">
                <title>Extending the memo content type.</title>

                <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE memo [
...
&lt;!ELEMENT content  (#PCDATA|emphasis|url)*&gt;
&lt;!ELEMENT emphasis (#PCDATA)&gt;
&lt;!ELEMENT url (#PCDATA)&gt;
&lt;!ATTLIST url href     CDATA  #REQUIRED&gt;
]&gt;
...
&lt;content&gt;The &lt;url href="http://w3.org/XML"&gt;XML&lt;/url&gt; language
  is &lt;emphasis&gt;easy&lt;/emphasis&gt; to learn. However you need 
  some &lt;emphasis&gt;time&lt;/emphasis&gt;.&lt;/content&gt; ...</programlisting>

                <caption>
                  <para>This grammar allows authors to emphasize text passages
                  and to define hypertext links.</para>
                </caption>
              </figure>

              <para>The expected rendering is <quote>... The <link
              xlink:href="http://w3.org/XML">XML</link> language is
              <emphasis>easy</emphasis> to learn. However you need some
              <emphasis>time</emphasis>. ...</quote>. We may visualize this
              document instance as a tree:</para>

              <figure xml:id="extendContModelGraph">
                <title>Graphical representation of the extended
                <code>content</code> model.</title>

                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Fig/contentmixed.fig"/>
                  </imageobject>
                </mediaobject>
              </figure>

              <para>More formally the W3C specification defines mixed content
              models as:</para>

              <productionset xml:id="productionset_w3RecXml_NT-Mixed">
                <title>Mixed-content Declaration</title>

                <production xml:id="w3RecXml_NT-Mixed">
                  <lhs>Mixed</lhs>

                  <rhs>'(' <nonterminal def="#w3RecXml_NT-S">S</nonterminal>?
                  '#PCDATA' (<nonterminal
                  def="#w3RecXml_NT-S">S</nonterminal>? '|' <nonterminal
                  def="#w3RecXml_NT-S">S</nonterminal>? <nonterminal
                  def="#w3RecXml_NT-Name">Name</nonterminal>)* <nonterminal
                  def="#w3RecXml_NT-S">S</nonterminal>? ')*' | '('
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>? '#PCDATA'
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>? ')'</rhs>
                </production>
              </productionset>

              <para>We notice that our simple <code>&lt;!ELEMENT from
              (#PCDATA)&gt;</code> is also described by this definition. It is
              just a special case of a single text node and no element nodes
              being present.</para>

              <qandaset role="exercise">
                <title>Variations of mixed content models</title>

                <qandadiv>
                  <qandaentry xml:id="example_allowed_mixed">
                    <question>
                      <para>You may assume that the element types <tag
                      class="element">emphasize</tag> and <tag
                      class="element">URL</tag> are correctly defined. Are the
                      following definitions allowed?</para>

                      <itemizedlist>
                        <listitem>
                          <para><code>&lt;!ELEMENT mix
                          (#PCDATA)*&gt;</code></para>
                        </listitem>

                        <listitem>
                          <para><code>&lt;!ELEMENT mix
                          (emphasize|#PCDATA)*&gt;</code></para>
                        </listitem>

                        <listitem>
                          <para><code>&lt;!ELEMENT mix
                          (#PCDATA|URL)&gt;</code></para>
                        </listitem>

                        <listitem>
                          <para><code>&lt;!ELEMENT mix
                          (emphasize|#PCDATA)+&gt;</code></para>
                        </listitem>
                      </itemizedlist>
                    </question>

                    <answer>
                      <programlisting>&lt;!ELEMENT mix (#PCDATA)*&gt;</programlisting>

                      <para>Allowed. This matches the first alternative of the
                      production rule in <xref
                      linkend="productionset_w3RecXml_NT-Mixed"/> with zero
                      element names.</para>

                      <programlisting>&lt;!ELEMENT mix (emphasize|#PCDATA)*&gt;</programlisting>

                      <para>Not allowed. According to the production rule in
                      <xref linkend="productionset_w3RecXml_NT-Mixed"/> the
                      term <code>#PCDATA</code> <emphasis>must</emphasis> be
                      the first token.</para>

                      <programlisting>&lt;!ELEMENT mix (#PCDATA|URL)&gt;
&lt;!ELEMENT mix (emphasize|#PCDATA)+&gt;</programlisting>

                      <para>Both variants are disallowed: as soon as element
                      names appear in a mixed content model the indicator of
                      multiplicity <quote>*</quote> is mandatory, and it is
                      the only indicator permitted. The second variant
                      additionally violates the rule that
                      <code>#PCDATA</code> must be the first token.</para>
                    </answer>
                  </qandaentry>
                </qandadiv>
              </qandaset>
            </section>

            <section xml:id="section_element_content">
              <title>Element content</title>

              <para>We refer to our first version of our <link
              linkend="figure_memo_dtd">memo.dtd</link>. The <tag
              class="element">memo</tag> type declaration reads:</para>

              <programlisting>&lt;!ELEMENT memo     (from, to+, subject, content)&gt;</programlisting>

              <para>Basically this states that for valid document instances a
              <tag class="starttag">memo</tag> node consists of a sequence of
              other nodes. In this context we call <tag
              class="starttag">memo</tag> the <emphasis>parent</emphasis>
              node. <tag class="element">from</tag>, <tag
              class="element">to</tag>, <tag class="element">subject</tag> and
              <tag class="element">content</tag> are called
              <emphasis>child</emphasis> nodes or
              <emphasis>children</emphasis> for short.</para>

              <para>A sequence of elements is a special case of a more general
              definition of element content in the XML specification. We
              already used the <quote>+</quote> operator to allow a node to
              appear multiple times. The specification defines three such
              occurrence indicators:</para>

              <glosslist>
                <glossentry>
                  <glossterm>?</glossterm>

                  <glossdef>
                    <para>A node may appear once or never.</para>
                  </glossdef>
                </glossentry>

                <glossentry>
                  <glossterm>+</glossterm>

                  <glossdef>
                    <para>A node must appear <emphasis>at least</emphasis>
                    once.</para>
                  </glossdef>
                </glossentry>

                <glossentry>
                  <glossterm>*</glossterm>

                  <glossdef>
                    <para>A node may appear an arbitrary number of times,
                    possibly not at all.</para>
                  </glossdef>
                </glossentry>
              </glosslist>
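
              <para>As a sketch combining all three indicators in one content
              model, consider a hypothetical variant of the memo declaration.
              The element <code>cc</code> and the optional
              <code>subject</code> are assumptions made for this illustration
              only:</para>

              <programlisting>&lt;!ELEMENT memo (from, to+, cc*, subject?, content)&gt;</programlisting>

              <para>Under this declaration a valid memo has exactly one
              <code>from</code>, one or more <code>to</code>, any number of
              <code>cc</code>, at most one <code>subject</code> and exactly
              one <code>content</code> child, in this order.</para>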

              <para>So far we only talked about sequences of element nodes. We
              may also define mutually exclusive alternatives:</para>

              <figure xml:id="operatorContentAlt">
                <title>The operator <quote>|</quote> defining exclusive
                alternatives.</title>

                <programlisting>...
&lt;!ELEMENT address   (email|telephone|town)<co
                    xml:id="programlisting_alternative_address"/> &gt;
&lt;!ELEMENT email     (#PCDATA)&gt;
&lt;!ELEMENT telephone (#PCDATA)&gt;
&lt;!ELEMENT town      (#PCDATA)&gt;
...

  &lt;address&gt;<co xml:id="programlisting_alternative_emailchild"/>
    &lt;email&gt;goik@hdm-stuttgart.de&lt;/email&gt;
  &lt;/address&gt;
...
  &lt;address&gt;<co xml:id="programlisting_alternative_telephonechild"/>
    &lt;telephone&gt;+49 (0)711-8923-2164&lt;/telephone&gt;
  &lt;/address&gt;
...</programlisting>

                <calloutlist>
                  <callout arearefs="programlisting_alternative_address">
                    <para>An <tag class="element">address</tag> node has
                    <emphasis>either</emphasis> an <tag
                    class="starttag">email</tag> child <emphasis>or</emphasis>
                    a <tag class="starttag">telephone</tag> or a <tag
                    class="starttag">town</tag> child.</para>
                  </callout>

                  <callout arearefs="programlisting_alternative_emailchild">
                    <para>An <tag class="starttag">address</tag> node having
                    an <tag class="starttag">email</tag> child.</para>
                  </callout>

                  <callout arearefs="programlisting_alternative_telephonechild">
                    <para>An <tag class="starttag">address</tag> node having
                    a <tag class="starttag">telephone</tag> child.</para>
                  </callout>
                </calloutlist>
              </figure>

              <para>We have now collected the basic means for structuring XML
              documents. The three indicators <quote>?</quote>,
              <quote>+</quote> and <quote>*</quote> govern the multiplicity of
              nodes, while the two operators <quote>,</quote> and
              <quote>|</quote> allow us to define sequences and mutually
              exclusive alternatives of element nodes. The XML standard
              defines the notion of <emphasis>content particles</emphasis>
              (<command>cp</command>) which allows these two kinds of
              constructs to be grouped and nested:</para>

              <productionset>
                <title>Element-content Models</title>

                <production xml:id="w3RecXml_NT-children">
                  <lhs>children</lhs>

                  <rhs>(<nonterminal
                  def="#w3RecXml_NT-choice">choice</nonterminal> |
                  <nonterminal def="#w3RecXml_NT-seq">seq</nonterminal>) ('?'
                  | '*' | '+')?</rhs>
                </production>

                <production xml:id="w3RecXml_NT-cp">
                  <lhs>cp</lhs>

                  <rhs>(<nonterminal
                  def="#w3RecXml_NT-Name">Name</nonterminal> | <nonterminal
                  def="#w3RecXml_NT-choice">choice</nonterminal> |
                  <nonterminal def="#w3RecXml_NT-seq">seq</nonterminal>) ('?'
                  | '*' | '+')?</rhs>
                </production>

                <production xml:id="w3RecXml_NT-choice">
                  <lhs>choice</lhs>

                  <rhs>'(' <nonterminal def="#w3RecXml_NT-S">S</nonterminal>?
                  <nonterminal def="#w3RecXml_NT-cp">cp</nonterminal> (
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>? '|'
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>?
                  <nonterminal def="#w3RecXml_NT-cp">cp</nonterminal> )+
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>? ')'</rhs>
                </production>

                <production xml:id="w3RecXml_NT-seq">
                  <lhs>seq</lhs>

                  <rhs>'(' <nonterminal def="#w3RecXml_NT-S">S</nonterminal>?
                  <nonterminal def="#w3RecXml_NT-cp">cp</nonterminal> (
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>? ','
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>?
                  <nonterminal def="#w3RecXml_NT-cp">cp</nonterminal> )*
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>? ')'</rhs>
                </production>
              </productionset>

              <para>We give two examples:</para>

              <figure xml:id="pureElementContent">
                <title>Examples of pure element content models</title>

                <glosslist>
                  <glossentry>
                    <glossterm><code>&lt;!ELEMENT address
                    (email|(name,street,town,telephone?))&gt;</code></glossterm>

                    <glossdef>
                      <para>An <tag class="element">address</tag> is given
                      either by an email or by a postal address plus an
                      optional telephone number.</para>
                    </glossdef>
                  </glossentry>

                  <glossentry>
                    <glossterm><code>&lt;!ELEMENT figurelist (title,
                    ((table|image|animation),
                    caption?)+)&gt;</code></glossterm>

                    <glossdef>
                      <para>We call table, image and animation
                      <emphasis>block</emphasis> elements. The <tag
                      class="starttag">figurelist</tag> element defines a list
                      of figures. The whole list starts with an overall title,
                      followed by one or more groups, each consisting of a
                      block element and an optional caption.</para>
                    </glossdef>
                  </glossentry>
                </glosslist>
              </figure>
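
              <para>To illustrate the <code>figurelist</code> declaration, the
              following sketch shows a document fragment that matches its
              content model. It assumes that <code>title</code> and
              <code>caption</code> contain plain text and that
              <code>table</code>, <code>image</code> and
              <code>animation</code> are declared <code>EMPTY</code>. These
              declarations are hypothetical and not given in the
              example:</para>

              <programlisting>&lt;figurelist&gt;
  &lt;title&gt;Sample figures&lt;/title&gt;
  &lt;image/&gt;
  &lt;caption&gt;An image with a caption&lt;/caption&gt;
  &lt;table/&gt;
  &lt;animation/&gt;
  &lt;caption&gt;An animation; the preceding table has no caption&lt;/caption&gt;
&lt;/figurelist&gt;</programlisting>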

              <qandaset role="exercise">
                <title>Content models and operator priority</title>

                <qandadiv>
                  <qandaentry xml:id="example_operatorprecedence">
                    <question>
                      <para>Find and explain the error being buried in the
                      following DTD. After correcting the error construct a
                      valid document instance.</para>

                      <programlisting>&lt;!ELEMENT addresslist (address*) &gt;
&lt;!ELEMENT address (email | town,street) &gt;
&lt;!ELEMENT email (#PCDATA)&gt;
&lt;!ELEMENT town (#PCDATA)&gt;
&lt;!ELEMENT street (#PCDATA)&gt;</programlisting>
                    </question>

                    <answer>
                      <para>The following document uses the DTD:</para>

                      <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE addresslist SYSTEM "address.dtd"&gt;
&lt;addresslist&gt;
  &lt;address&gt;
    &lt;email&gt;bingo@cheat.com&lt;/email&gt;
  &lt;/address&gt;
  &lt;address&gt;
    &lt;town&gt;Paris&lt;/town&gt;
    &lt;street&gt;Avenue Kléber&lt;/street&gt;
  &lt;/address&gt;
&lt;/addresslist&gt;</programlisting>

                      <para>This yields the following parsing error:</para>

                      <programlisting><errortext>A ')' is required in the declaration of element type "address".</errortext></programlisting>

                      <para>Like many other error messages this one is not
                      really enlightening. We examine the content model of the
                      element <tag class="element">address</tag>:</para>

                      <programlisting>email | town,street</programlisting>

                      <para>We have three elements joined by two operators,
                      namely alternative and sequence. In contrast to e.g.
                      Boolean algebra the XML standard does not define any
                      operator priority with respect to <quote>|</quote> and
                      <quote>,</quote>. Instead a DTD author must use
                      parentheses to explicitly define the desired
                      grouping:</para>

                      <programlisting>&lt;!ELEMENT address (email | (town,street)) &gt;</programlisting>

                      <para>We note that the operators <quote>*</quote>,
                      <quote>+</quote> and <quote>?</quote> have precedence
                      over <quote>|</quote> and <quote>,</quote>. Thus we may
                      write <code>town,street+</code> instead of the clumsy
                      term <code>town,(street)+</code>.</para>
                    </answer>
                  </qandaentry>

                  <qandaentry xml:id="example_book_v2">
                    <question>
                      <label>Book documents with mixed content and itemized
                      lists</label>

                      <para>Extend the first version of <link
                      linkend="example_bookDtd">book.dtd</link> to support the
                      following features:</para>

                      <itemizedlist>
                        <listitem>
                          <para>Within a <tag class="starttag">chapter</tag>
                          node <tag class="starttag">para</tag> and <tag
                          class="starttag">itemizedlist</tag> elements in
                          arbitrary order shall be allowed.</para>
                        </listitem>

                        <listitem>
                          <para><tag class="starttag">itemizedlist</tag> nodes
                          shall contain at least one <tag
                          class="starttag">listitem</tag>.</para>
                        </listitem>

                        <listitem>
                          <para><tag class="starttag">listitem</tag> nodes
                          shall be composed of one or more <tag
                          class="starttag">para</tag> or nested <tag
                          class="starttag">itemizedlist</tag> elements.</para>
                        </listitem>

                        <listitem>
                          <para>Within a <tag class="starttag">para</tag> we
                          want to be able to emphasize text passages.</para>
                        </listitem>
                      </itemizedlist>

                      <para>The following sample document instance shall be
                      valid:</para>

                      <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE book SYSTEM "book.dtd"&gt;
&lt;book&gt;
  &lt;title&gt;Introduction to Java&lt;/title&gt;
  &lt;chapter&gt;
    &lt;title&gt;Introduction&lt;/title&gt;
    &lt;para&gt;Java supports &lt;emphasis&gt;lots&lt;/emphasis&gt; of concepts:&lt;/para&gt;
    &lt;itemizedlist&gt;
      &lt;listitem&gt;
        &lt;para&gt;Single &lt;emphasis&gt;implementation&lt;/emphasis&gt; inheritance.&lt;/para&gt;
      &lt;/listitem&gt;
      &lt;listitem&gt;
        &lt;para&gt;Multiple &lt;emphasis&gt;interface&lt;/emphasis&gt; inheritance.&lt;/para&gt;
        &lt;itemizedlist&gt;
          &lt;listitem&gt;&lt;para&gt;Built in types&lt;/para&gt;&lt;/listitem&gt;
          &lt;listitem&gt;&lt;para&gt;User defined types&lt;/para&gt;&lt;/listitem&gt;
        &lt;/itemizedlist&gt;
      &lt;/listitem&gt;
    &lt;/itemizedlist&gt;
  &lt;/chapter&gt;
&lt;/book&gt;</programlisting>
                    </question>

                    <answer>
                      <para>An extended DTD looks like:</para>

                      <figure xml:id="paraListEmphasize">
                        <title>Version 2 of book.dtd</title>

                        <programlisting>&lt;!ELEMENT book     (title, chapter+)&gt;
&lt;!ELEMENT chapter  (title, (para|itemizedlist)+ <co
                            xml:id="figure_book.dtd_v2_chapter"/>)&gt;
&lt;!ELEMENT title    (#PCDATA)&gt;
&lt;!ELEMENT para     (#PCDATA|emphasis)*<co xml:id="figure_book.dtd_v2_para"/>&gt;
&lt;!ELEMENT emphasis (#PCDATA)&gt;

&lt;!ELEMENT itemizedlist (listitem+)<co
                            xml:id="figure_book.dtd_v2_itemizedlist"/>&gt;
&lt;!ELEMENT listitem ((para|itemizedlist)<co
                            xml:id="figure_book.dtd_v2_listitem"/>+)&gt;</programlisting>

                        <caption>
                          <para>This version allows emphasized text within
                          <tag class="starttag">para</tag> nodes and adds
                          support for <tag class="starttag">itemizedlist</tag>
                          elements.</para>
                        </caption>
                      </figure>

                      <calloutlist>
                        <callout arearefs="figure_book.dtd_v2_chapter">
                          <para>We hook into <tag
                          class="starttag">chapter</tag> to allow arbitrary
                          sequences of at least one <tag
                          class="starttag">para</tag> or <tag
                          class="starttag">itemizedlist</tag> element
                          node.</para>
                        </callout>

                        <callout arearefs="figure_book.dtd_v2_para">
                          <para><tag class="starttag">para</tag> nodes now
                          allow mixed content.</para>
                        </callout>

                        <callout arearefs="figure_book.dtd_v2_itemizedlist">
                          <para>An itemized list contains at least one list
                          item.</para>
                        </callout>

                        <callout arearefs="figure_book.dtd_v2_listitem">
                          <para>A list item contains a sequence of at least
                          one <tag class="starttag">para</tag> or <tag
                          class="starttag">itemizedlist</tag> node. The latter
                          gives rise to nested lists. We find a similar
                          construct in HTML, namely unordered lists defined by
                          <code>&lt;UL&gt;&lt;LI&gt;...</code>
                          constructs.</para>
                        </callout>
                      </calloutlist>
                    </answer>
                  </qandaentry>
                </qandadiv>
              </qandaset>
            </section>

            <section xml:id="comments_processing">
              <title>Comments and processing instructions</title>

              <para>An XML comment uses the syntax <code>&lt;!-- This is a
              comment! I love comments! --&gt;</code>. Without going into
              details, comments may appear in many locations both within
              <abbrev
              xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
              and document instances:</para>

              <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE addresslist [
&lt;!-- An addresslist may contain an arbitrary number of address nodes --&gt;
&lt;!ELEMENT addresslist (address)*&gt;
&lt;!ELEMENT address     (#PCDATA)&gt;
]&gt;
&lt;addresslist&gt;
  &lt;!-- the document author --&gt;
  &lt;address&gt;goik@hdm-stuttgart.de&lt;/address&gt;
  &lt;address&gt;bingo@problemcompany.com&lt;/address&gt;
&lt;/addresslist&gt;</programlisting>

              <para>Newbies to XML are sometimes confused about so called
              <emphasis>processing instructions</emphasis> (PI). Similar to
              XML comments it is possible to embed processing instructions
              into XML documents. As an example we show an excerpt from the
              <link
              xlink:href="http://www.w3.org/TR/2006/REC-xml-20060816/REC-xml-20060816.xml">source
              file</link> of the XML specification:</para>

              <programlisting>&lt;?xml version='1.0' encoding='UTF-8'?&gt;
&lt;!DOCTYPE spec SYSTEM "xmlspec.dtd" [
    &lt;!ENTITY base.uri "http://www.w3.org/TR/2006/"&gt;
...
]&gt;
&lt;?xml-stylesheet type="text/xsl" href="REC-xml.xsl" <co
                  xml:id="programmlisting_xmlspecsrc_xsltref"/> ?&gt; <co
                  xml:id="programmlisting_xmlspecsrc_pi"/>
&lt;spec w3c-doctype="rec" xml:lang="en"&gt;
...
      &lt;title&gt;Extensible Markup Language (XML)&lt;/title&gt;
...
&lt;/spec&gt;</programlisting>

              <calloutlist>
                <callout arearefs="programmlisting_xmlspecsrc_xsltref">
                  <para>A reference to an external style sheet file.
                  The file <filename>REC-xml.xsl</filename> resides in the
                  same folder as the XML document itself. Thus a relative
                  <link xlink:href="http://www.w3.org/Addressing">URL</link>
                  is sufficient.</para>
                </callout>

                <callout arearefs="programmlisting_xmlspecsrc_pi">
                  <para>A processing instruction allowing a web browser to
                  render the XML file appropriately.</para>
                </callout>
              </calloutlist>

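              <para>We first note that neither XML comments nor processing
              instructions are part of a document's character data. According
              to the XML specification a parser must pass processing
              instructions on to the application, whereas passing on comments
              is optional. Software applications working with XML documents
              may thus inspect both types and interpret their content.</para>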

              <para>The purpose of the processing instruction in the above
              document is to enable web browsers to render its content in a
              meaningful way. In contrast to HTML an arbitrary XML document
              does not carry the semantics required to create meaningful
              renderings for end users. A <tag
              class="element">memo</tag> document may be interesting from a
              programmer's point of view but an end user will probably prefer
              either an HTML or a PDF document being
              <emphasis>generated</emphasis> from it. As we shall see in <xref
              linkend="xsl"/> the file <filename>REC-xml.xsl</filename>
              contains style sheet information adhering to the XSLT standard.
              Thus a browser capable of processing XSLT may visualize the XML
              document directly.</para>
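
              <para>Processing instructions are not limited to style sheet
              references. As a minimal sketch consider the following fragment
              addressed to some application other than a browser. Both the
              target name <code>spellcheck</code> and its content are
              hypothetical and merely illustrate the general <code>&lt;?target
              content?&gt;</code> syntax:</para>

              <programlisting>&lt;memo&gt;
  &lt;!-- Hypothetical instruction for a spell checking application --&gt;
  &lt;?spellcheck language="de"?&gt;
  ...
&lt;/memo&gt;</programlisting>

              <para>An application not knowing about the target
              <code>spellcheck</code> may simply skip this
              instruction.</para>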
            </section>

            <section xml:id="section_cdatasection">
              <title><acronym>CDATA</acronym> sections</title>

              <para>Editing XML documents with a plain text editor is tedious
              since we have to avoid XML markup characters within
              <code>#PCDATA</code> or attribute content. A computer scientist
              writing documentation on C++ code might want to express
              <emphasis>bit shift</emphasis> and <emphasis>address
              of</emphasis> operators:</para>

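              <programlisting>&lt;para&gt;If a &lt; <co
                  xml:id="programlisting_wrongmarkup_lt"/> b we set c = &amp; <co
                  xml:id="programlisting_wrongmarkup_amp"/> (a &gt;&gt; <co
                  xml:id="programlisting_wrongmarkup_gt"/> b); &lt;/para&gt;</programlisting>

              <calloutlist>
                <callout arearefs="programlisting_wrongmarkup_lt">
                  <para>First error: The character <quote>&lt;</quote> is
                  reserved to denote the start of markup like e.g. a start
                  tag.</para>
                </callout>

                <callout arearefs="programlisting_wrongmarkup_amp">
                  <para>Second error: The character <quote>&amp;</quote> is
                  reserved for <link linkend="chapter_entities">general entity
                  references</link> like e.g. <code>&amp;lt;</code>.</para>
                </callout>

                <callout arearefs="programlisting_wrongmarkup_gt">
                  <para>Strictly speaking the character <quote>&gt;</quote>
                  is permitted within <code>#PCDATA</code> content unless it
                  completes the sequence <quote>]]&gt;</quote>. It is
                  nevertheless commonly escaped as well for the sake of
                  symmetry.</para>
                </callout>
              </calloutlist>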

              <para>XML offers five predefined replacement entities for this
              purpose:</para>

              <table xml:id="xmlStandardEntities">
                <title>Replacement entities for XML markup characters</title>

                <?dbhtml table-width="15%" ?>

                <?dbfo table-width="15%" ?>

                <tgroup cols="2">
                  <colspec colwidth="1*"/>

                  <colspec colwidth="2*"/>

                  <tbody>
                    <row>
                      <entry>&lt;</entry>

                      <entry><tag class="genentity">lt</tag></entry>
                    </row>

                    <row>
                      <entry>&gt;</entry>

                      <entry><tag class="genentity">gt</tag></entry>
                    </row>

                    <row>
                      <entry>&amp;</entry>

                      <entry><tag class="genentity">amp</tag></entry>
                    </row>

                    <row>
                      <entry>"</entry>

                      <entry><tag class="genentity">quot</tag></entry>
                    </row>

                    <row>
                      <entry>'</entry>

                      <entry><tag class="genentity">apos</tag></entry>
                    </row>
                  </tbody>
                </tgroup>
              </table>

              <para>So without an appropriate editor our poor computer
              scientist will have to write:</para>

              <programlisting>&lt;para&gt;If a &amp;lt; b we set c = &amp;amp; (a &amp;gt;&amp;gt; b); &lt;/para&gt;</programlisting>

              <para>Looks promising, right? Actually the better alternative is
              to use an XML capable editor which allows an author to type
              <code>If a &lt; b we set c = &amp; (a &gt;&gt; b);</code>. The
              editor software will present this text to the author and
              <emphasis>internally</emphasis> save the correct XML code as
              presented before.</para>

              <para>If someone is forced to use a pure text editor
              <acronym>CDATA</acronym> sections are the second best
              alternative. A <acronym>CDATA</acronym> section encloses a text
              string which will not be interpreted as markup by an XML parser.
              It starts with the reserved sequence <code>&lt;![CDATA[</code>
              and terminates with <quote>]]&gt;</quote>. The example given
              before reads:</para>

              <programlisting>&lt;para&gt;If &lt;![CDATA[a &lt; b we set c = &amp; (a &gt;&gt; b);]]&gt; &lt;/para&gt;</programlisting>

              <para>The precise definition is:</para>

              <productionset>
                <title><acronym>CDATA</acronym> Sections</title>

                <production xml:id="w3RecXml_NT-CDSect">
                  <lhs>CDSect</lhs>

                  <rhs><nonterminal
                  def="#w3RecXml_NT-CDStart">CDStart</nonterminal>
                  <nonterminal def="#w3RecXml_NT-CData">CData</nonterminal>
                  <nonterminal
                  def="#w3RecXml_NT-CDEnd">CDEnd</nonterminal></rhs>
                </production>

                <production xml:id="w3RecXml_NT-CDStart">
                  <lhs>CDStart</lhs>

                  <rhs>'&lt;![CDATA['</rhs>
                </production>

                <production xml:id="w3RecXml_NT-CData">
                  <lhs>CData</lhs>

                  <rhs>(<nonterminal
                  def="#w3RecXml_NT-Char">Char</nonterminal>* - (<nonterminal
                  def="#w3RecXml_NT-Char">Char</nonterminal>* ']]&gt;'
                  <nonterminal
                  def="#w3RecXml_NT-Char">Char</nonterminal>*))</rhs>
                </production>

                <production xml:id="w3RecXml_NT-CDEnd">
                  <lhs>CDEnd</lhs>

                  <rhs>']]&gt;'</rhs>
                </production>
              </productionset>

              <para>Thus inside a <acronym>CDATA</acronym> section only the
              exact sequence <quote>]]&gt;</quote> is disallowed.</para>
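
              <para>If the text itself contains the sequence
              <quote>]]&gt;</quote> a common workaround is to split it into
              two adjacent <acronym>CDATA</acronym> sections such that
              <quote>]]</quote> ends the first section and
              <quote>&gt;</quote> starts the second one. The following sketch
              assumes a hypothetical C++ expression
              <code>a[b[i]]&gt;c</code>:</para>

              <programlisting>&lt;para&gt;&lt;![CDATA[a[b[i]]]]&gt;&lt;![CDATA[&gt;c]]&gt;&lt;/para&gt;</programlisting>

              <para>A parser terminates the first section at the first
              occurrence of <quote>]]&gt;</quote> and thus delivers the text
              <code>a[b[i]]</code> followed by <code>&gt;c</code> from the
              second section.</para>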
            </section>
          </section>

          <section xml:id="section_attributetypes">
            <title>Attribute types</title>

            <para>When discussing the content model type <link
            linkend="section_empty">EMPTY</link> we already mentioned the
            possibility of element nodes having attributes like <tag
            class="emptytag">img src="..."</tag>. We discuss two features of
            element attributes, namely their <emphasis>type</emphasis> and the
            way default values are specified:</para>

            <figure xml:id="attribImg">
              <title>Attributes of HTML <tag class="emptytag">img</tag>
              elements.</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/attribInElement.fig" scale="65"/>
                </imageobject>
              </mediaobject>
            </figure>

            <para>We already observed that content model definitions allow us
            to define <emphasis>composition</emphasis> rules. Thus a <tag
            class="starttag">chapter</tag> may consist of a <tag
            class="starttag">title</tag> node followed by <tag
            class="starttag">para</tag> and other nodes. This defines
            hierarchical, tree-like structures. But the
            <emphasis>actual</emphasis> string content is defined as
            <code>#PCDATA</code>. We are unable to specify a node's content to
            consist purely of numbers for example. In contrast XML DTD
            attribute definitions offer a limited set of predefined types to
            choose from.</para>

            <section xml:id="section_cdata">
              <title><code>CDATA</code></title>

              <para>An element type may be defined to have attributes of type
              <code>CDATA</code>:</para>

              <programlisting>&lt;!ATTLIST img <co
                  xml:id="programlisting_img_element"/>
   src<co xml:id="programlisting_img_att_src"/> CDATA<co
                  xml:id="programlisting_img_att_src_type"/> #REQUIRED<co
                  xml:id="programlisting_img_att_src_default"/> &gt;</programlisting>

              <calloutlist>
                <callout arearefs="programlisting_img_element">
                  <para>Start of the definition of a <emphasis>set</emphasis>
                  of attributes for the element type <tag
                  class="element">img</tag>.</para>
                </callout>

                <callout arearefs="programlisting_img_att_src">
                  <para>Start of the definition of the first attribute named
                  <tag class="attribute">src</tag>.</para>
                </callout>

                <callout arearefs="programlisting_img_att_src_type">
                  <para>The attribute <tag class="attribute">src</tag>'s type
                  is <code>CDATA</code>.</para>
                </callout>

                <callout arearefs="programlisting_img_att_src_default">
                  <para>The attribute <tag class="attribute">src</tag> is
                  mandatory, see <xref
                  linkend="section_attribute_default"/>.</para>
                </callout>
              </calloutlist>

              <para>We have to be careful here. The term <code>CDATA</code>
              resembles <code>#PCDATA</code> which has already been
              introduced for content models. Actually these two terms are
              completely distinct since <code>CDATA</code> refers to
              attribute values. It must not be confused with the
              <acronym>CDATA</acronym> sections from <xref
              linkend="section_cdatasection"/> either. Consider the following
              code snippet:</para>

              <programlisting>&lt;para&gt;We may use "quotes" here&lt;/para&gt;</programlisting>

              <para>This is completely legal since all characters being used
              conform to the production rule of <code>#PCDATA</code>. But
              using the same string as an attribute value causes
              trouble:</para>

              <programlisting>&lt;img src="bold.gif" alt="We may use "quotes" here" /&gt;</programlisting>

              <para>This is not even well formed XML. The two inner quotes
              embedding the substring <code>quotes</code> interfere with the
              two outer quotes delimiting the attribute <tag
              class="attribute">alt</tag>'s value. As we shall see in <xref
              linkend="example_quotes"/> there is a solution to this problem,
              but the current example shows that the production rules of
              <code>#PCDATA</code> and <code>CDATA</code> differ.</para>

              <qandaset role="exercise">
                <title>book.dtd and languages</title>

                <qandadiv>
                  <qandaentry xml:id="example_book.dtd_v3">
                    <question>
                      <para>We want to extend our DTD from <xref
                      linkend="example_book_v2"/> by allowing an author to
                      define the language used within the document. Add an
                      attribute declaration to the top level element <tag
                      class="element">book</tag>.</para>
                    </question>

                    <answer>
                      <para>We simply have to add a single line to our
                      DTD:</para>

                      <programlisting>&lt;!ELEMENT book     (title, chapter+)&gt;
<emphasis role="bold">&lt;!ATTLIST book lang CDATA #IMPLIED &gt;</emphasis>
...</programlisting>

                      <para>This allows us to globally set a language for a
                      document:</para>

                      <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE book SYSTEM "book.dtd"&gt;
&lt;book lang="english"&gt;
  &lt;title&gt;Introduction to Java&lt;/title&gt;
...</programlisting>
                    </answer>
                  </qandaentry>
                </qandadiv>
              </qandaset>

              <para>The XML specification defines attribute definitions
              belonging to element types as:</para>

              <productionset>
                <title>Attribute-list Declaration</title>

                <production xml:id="w3RecXml_NT-AttlistDecl">
                  <lhs>AttlistDecl</lhs>

                  <rhs>'&lt;!ATTLIST' <nonterminal
                  def="#w3RecXml_NT-S">S</nonterminal> <nonterminal
                  def="#w3RecXml_NT-Name">Name</nonterminal> <nonterminal
                  def="#w3RecXml_NT-AttDef">AttDef</nonterminal>* <nonterminal
                  def="#w3RecXml_NT-S">S</nonterminal>? '&gt;'</rhs>
                </production>

                <production xml:id="w3RecXml_NT-AttDef">
                  <lhs>AttDef</lhs>

                  <rhs><nonterminal def="#w3RecXml_NT-S">S</nonterminal>
                  <nonterminal def="#w3RecXml_NT-Name">Name</nonterminal>
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>
                  <nonterminal
                  def="#w3RecXml_NT-AttType">AttType</nonterminal>
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>
                  <nonterminal
                  def="#w3RecXml_NT-DefaultDecl">DefaultDecl</nonterminal></rhs>
                </production>
              </productionset>

              <para>The first rule tells us that multiple attributes may be
              defined for a given element. This is quite <quote>normal</quote>
              since the same applies for example when attributes are defined
              within <link
              linkend="gloss_Java"><trademark>Java</trademark></link> or C++
              classes. Actually in <link
              xlink:href="http://www.w3.org/MarkUp">XHTML</link> the <tag
              class="emptytag">img</tag> element's attribute list is defined
              as:</para>

              <programlisting>&lt;!ATTLIST img
      src          CDATA        #REQUIRED
      alt          CDATA        #REQUIRED
      longdesc     CDATA        #IMPLIED
      height       CDATA        #IMPLIED
      width        CDATA        #IMPLIED
      ...  &gt;</programlisting>

              <para>The second production rule tells us that attribute names
              like <tag class="attribute">src</tag> must match the <link
              linkend="w3RecXml_NT-Name">Name</link> production. For example
              <code>4element</code> would be an illegal name since attribute
              names may contain digits but must not start with one. This is
              quite common in most programming languages and corresponds to
              the notion of a legal identifier.</para>
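
              <para>As a brief sketch the following declarations contrast
              legal and illegal attribute names. The attribute names
              <code>src2</code> and <code>_alt</code> are hypothetical and
              only serve as an illustration:</para>

              <programlisting>&lt;!-- Legal: names may start with a letter or an underscore --&gt;
&lt;!ATTLIST img
      src2         CDATA        #IMPLIED
      _alt         CDATA        #IMPLIED &gt;

&lt;!-- Not well formed: a name must not start with a digit --&gt;
&lt;!ATTLIST img
      4element     CDATA        #IMPLIED &gt;</programlisting>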

              <para>The second rule also tells us that <code>CDATA</code> is
              only one among other possible attribute types:</para>

              <productionset>
                <title>Attribute Types</title>

                <production xml:id="w3RecXml_NT-AttType">
                  <lhs>AttType</lhs>

                  <rhs><nonterminal
                  def="#w3RecXml_NT-StringType">StringType</nonterminal> |
                  <nonterminal
                  def="#w3RecXml_NT-TokenizedType">TokenizedType</nonterminal>
                  | <nonterminal
                  def="#w3RecXml_NT-EnumeratedType">EnumeratedType</nonterminal></rhs>
                </production>

                <production xml:id="w3RecXml_NT-StringType">
                  <lhs>StringType</lhs>

                  <rhs>'CDATA'</rhs>
                </production>

                <production xml:id="w3RecXml_NT-TokenizedType">
                  <lhs>TokenizedType</lhs>

                  <rhs>'ID'| 'IDREF'| 'IDREFS'| 'ENTITY'| 'ENTITIES'|
                  'NMTOKEN'| 'NMTOKENS'</rhs>
                </production>
              </productionset>

              <para>The discussion of <code>ENTITY</code> types will be
              deferred till <xref linkend="chapter_entities"/>. Before
              discussing the remaining types we mention a topic common to all
              attribute types:</para>

              <qandaset role="exercise">
                <title>Enclosing quotes</title>

                <qandadiv>
                  <qandaentry xml:id="example_quotes">
                    <question>
                      <para>We recall the problem of nested quotes yielding
                      non-well formed XML code:</para>

                      <programlisting>&lt;img src="bold.gif" alt="We may use "quotes" here" /&gt;</programlisting>

                      <para>The XML specification defines legal attribute
                      value definitions as:</para>

                      <productionset>
                        <title>Literals</title>

                        <production xml:id="w3RecXml_NT-EntityValue">
                          <lhs>EntityValue</lhs>

                          <rhs>'"' ([^%&amp;"] | <nonterminal
                          def="#w3RecXml_NT-PEReference">PEReference</nonterminal>
                          | <nonterminal
                          def="#w3RecXml_NT-Reference">Reference</nonterminal>)*
                          '"' |  "'" ([^%&amp;'] | <nonterminal
                          def="#w3RecXml_NT-PEReference">PEReference</nonterminal>
                          | <nonterminal
                          def="#w3RecXml_NT-Reference">Reference</nonterminal>)*
                          "'"</rhs>
                        </production>

                        <production xml:id="w3RecXml_NT-AttValue">
                          <lhs>AttValue</lhs>

                          <rhs>'"' ([^&lt;&amp;"] | <nonterminal
                          def="#w3RecXml_NT-Reference">Reference</nonterminal>)*
                          '"' |  "'" ([^&lt;&amp;'] | <nonterminal
                          def="#w3RecXml_NT-Reference">Reference</nonterminal>)*
                          "'"</rhs>
                        </production>

                        <production xml:id="w3RecXml_NT-SystemLiteral">
                          <lhs>SystemLiteral</lhs>

                          <rhs>('"' [^"]* '"') | ("'" [^']* "'")</rhs>
                        </production>

                        <production xml:id="w3RecXml_NT-PubidLiteral">
                          <lhs>PubidLiteral</lhs>

                          <rhs>'"' <nonterminal
                          def="#w3RecXml_NT-PubidChar">PubidChar</nonterminal>*
                          '"' | "'" (<nonterminal
                          def="#w3RecXml_NT-PubidChar">PubidChar</nonterminal>
                          - "'")* "'"</rhs>
                        </production>

                        <production xml:id="w3RecXml_NT-PubidChar">
                          <lhs>PubidChar</lhs>

                          <rhs>#x20 | #xD | #xA | [a-zA-Z0-9]
                          | [-'()+,./:=?;!*#@$_%]</rhs>
                        </production>
                      </productionset>

                      <para>Find out how it is possible to set the attribute
                      <tag class="attribute">alt</tag>'s value to the string
                      <code>We may use "quotes" here</code>.</para>
                    </question>

                    <answer>
                      <para>The production rule for attribute values
                      reads:</para>

                      <productionset>
                        <productionrecap linkend="w3RecXml_NT-AttValue"/>
                      </productionset>

                      <para>This allows us to use either of two alternatives
                      to delimit attribute values:</para>

                      <glosslist>
                        <glossentry>
                          <glossterm><tag class="starttag">img ...
                          alt="..."/</tag></glossterm>

                          <glossdef>
                            <para><emphasis>Well-formedness
                            constraint:</emphasis> do not use <code>"</code>
                            inside the value string.</para>
                          </glossdef>
                        </glossentry>

                        <glossentry>
                          <glossterm><tag class="starttag">img ...
                          alt='...'/</tag></glossterm>

                          <glossdef>
                            <para><emphasis>Well-formedness
                            constraint:</emphasis> do not use <code>'</code>
                            inside the value string.</para>
                          </glossdef>
                        </glossentry>
                      </glosslist>

                      <para>We may take advantage of the second rule:</para>

                      <programlisting>&lt;img src="bold.gif" alt='We may use "quotes" here' /&gt;</programlisting>

                      <para>Notice that according to <xref
                      linkend="w3RecXml_NT-AttValue"/> the delimiting quotes
                      must not be mixed. The following code is thus not well
                      formed:</para>

                      <programlisting>&lt;img src="bold.gif'/&gt;</programlisting>
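
                      <para>As an alternative sketch we may keep the double
                      quote delimiters and replace the inner quotes by the
                      predefined entity <tag class="genentity">quot</tag>
                      from <xref linkend="xmlStandardEntities"/>. This is
                      permitted since the production rule allows references
                      inside attribute values:</para>

                      <programlisting>&lt;img src="bold.gif" alt="We may use &amp;quot;quotes&amp;quot; here" /&gt;</programlisting>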
                    </answer>
                  </qandaentry>
                </qandadiv>
              </qandaset>
            </section>

            <section xml:id="section_nmtoken">
              <title><code>NMTOKEN</code> / <code>NMTOKENS</code></title>

              <para>Name tokens are essentially strings composed of a
              restricted character set. A name token must not, for example,
              contain any whitespace. We already mentioned its production
              rule:</para>

              <productionset>
                <productionrecap linkend="w3RecXml_NT-Nmtoken"/>
              </productionset>

              <para>This may be used to restrict attribute values. We consider
              a configuration file containing a list of user accounts:</para>

              <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE userlist [
&lt;!ELEMENT userlist (account*)&gt;
&lt;!ELEMENT account EMPTY&gt;
&lt;!ATTLIST account
    username NMTOKEN #REQUIRED
    password CDATA   #IMPLIED
    &gt;
]&gt;
&lt;userlist&gt;
  &lt;account username="Joe"/&gt;
  &lt;account username="Mr. Bean"/&gt;
                &lt;!-- Whoops, an illegal space!--&gt;
&lt;/userlist&gt;</programlisting>
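
              <para>For comparison, a sketch of account names that do
              satisfy the <code>NMTOKEN</code> production. The names are
              hypothetical; note that a name token, unlike a name, may start
              with a digit, a dot or a hyphen:</para>

              <programlisting>&lt;userlist&gt;
  &lt;account username="Mr.Bean"/&gt;  &lt;!-- the dot is a legal name character --&gt;
  &lt;account username="joe_2"/&gt;    &lt;!-- letters, underscore and digits --&gt;
  &lt;account username="007"/&gt;      &lt;!-- a leading digit is fine for NMTOKEN --&gt;
&lt;/userlist&gt;</programlisting>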

              <para>We extend the above example by allowing each user to
              belong to a <emphasis>set</emphasis> of groups. We achieve this
              by adding an attribute <tag class="attribute">groups</tag> of
              type <code>NMTOKENS</code>:</para>

              <programlisting>...
&lt;!ATTLIST account
    username NMTOKEN #REQUIRED
    groups NMTOKENS #IMPLIED
    password CDATA   #IMPLIED
    &gt;
]&gt;
&lt;userlist&gt;
  &lt;account username="Joe" groups="admin staff team"/&gt;
&lt;/userlist&gt;</programlisting>

              <para>This defines a user <code>Joe</code> belonging to the
              three groups <code>admin</code>, <code>staff</code> and
              <code>team</code>. Informally we see a list of tokens separated
              by spaces. This is indeed the formal W3C specification:</para>

              <productionset>
                <productionrecap linkend="w3RecXml_NT-Nmtokens"/>
              </productionset>

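              <para>Read literally, this rule only permits single space
              characters (<code>#x20</code>) as separators. Validating
              parsers nevertheless accept more general whitespace, because
              attribute values of any type other than <code>CDATA</code> are
              normalized before they are checked: tabs, carriage returns and
              newlines are replaced by spaces, leading and trailing spaces
              are removed and sequences of spaces are collapsed into a single
              one.</para>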
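
              <para>A sketch of this behaviour: the hypothetical variant
              below separates the group tokens by a line break and several
              spaces. A validating parser is expected to accept it and to
              report the same three groups as before:</para>

              <programlisting>&lt;userlist&gt;
  &lt;account username="Joe" groups="admin
                                  staff    team"/&gt;
&lt;/userlist&gt;</programlisting>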
            </section>

            <section xml:id="section_name_token_group">
              <title>Enumeration values</title>

              <para>The XML standard allows us to define enumerations by
              restricting an attribute value to a predefined set of name
              tokens:</para>

              <productionset>
                <title>Enumerated Attribute Types</title>

                <production xml:id="w3RecXml_NT-EnumeratedType">
                  <lhs>EnumeratedType</lhs>

                  <rhs><nonterminal
                  def="#w3RecXml_NT-NotationType">NotationType</nonterminal> |
                  <nonterminal
                  def="#w3RecXml_NT-Enumeration">Enumeration</nonterminal></rhs>
                </production>

                <production xml:id="w3RecXml_NT-NotationType">
                  <lhs>NotationType</lhs>

                  <rhs>'NOTATION' <nonterminal
                  def="#w3RecXml_NT-S">S</nonterminal> '(' <nonterminal
                  def="#w3RecXml_NT-S">S</nonterminal>? <nonterminal
                  def="#w3RecXml_NT-Name">Name</nonterminal> (<nonterminal
                  def="#w3RecXml_NT-S">S</nonterminal>? '|' <nonterminal
                  def="#w3RecXml_NT-S">S</nonterminal>? <nonterminal
                  def="#w3RecXml_NT-Name">Name</nonterminal>)* <nonterminal
                  def="#w3RecXml_NT-S">S</nonterminal>? ')'</rhs>
                </production>

                <production xml:id="w3RecXml_NT-Enumeration">
                  <lhs>Enumeration</lhs>

                  <rhs>'(' <nonterminal def="#w3RecXml_NT-S">S</nonterminal>?
                  <nonterminal
                  def="#w3RecXml_NT-Nmtoken">Nmtoken</nonterminal>
                  (<nonterminal def="#w3RecXml_NT-S">S</nonterminal>? '|'
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>?
                  <nonterminal
                  def="#w3RecXml_NT-Nmtoken">Nmtoken</nonterminal>)*
                  <nonterminal def="#w3RecXml_NT-S">S</nonterminal>? ')'</rhs>
                </production>
              </productionset>

              <para>We start with an example of a <emphasis>Name Token
              Group</emphasis> aka enumeration:</para>

              <figure xml:id="figure_nametokengroup">
                <title>A name token group</title>

                <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE top [
&lt;!ELEMENT top        (chemical*)&gt;
&lt;!ELEMENT chemical   (#PCDATA)&gt;
&lt;!ATTLIST chemical state  (solid|liquid|gas) <co
                    xml:id="figure_nametokengroup_att_state"/> #REQUIRED <co
                    xml:id="figure_nametokengroup_att_state_required"/>&gt;
]&gt;
&lt;top&gt;
  &lt;chemical state="gas" <co xml:id="figure_nametokengroup_oxygen_state"/>&gt;Oxygen&lt;/chemical&gt;
  &lt;chemical state="liquid" <co xml:id="figure_nametokengroup_water_state"/>&gt;Water&lt;/chemical&gt;
  
  &lt;chemical state="superfluous" <co
                    xml:id="figure_nametokengroup_helium_state"/>&gt;Helium&lt;/chemical&gt;
                    &lt;!-- Ooops! --&gt;
&lt;/top&gt;</programlisting>

                <calloutlist>
                  <callout arearefs="figure_nametokengroup_att_state">
                    <para>The attribute <tag class="attribute">state</tag>
                    may only take values from the set {solid, liquid,
                    gas}.</para>
                  </callout>

                  <callout arearefs="figure_nametokengroup_att_state_required">
                    <para><tag class="attribute">state</tag> is
                    mandatory.</para>
                  </callout>

                  <callout arearefs="figure_nametokengroup_oxygen_state">
                    <para>A legal value.</para>
                  </callout>

                  <callout arearefs="figure_nametokengroup_water_state">
                    <para>Another legal value.</para>
                  </callout>

                  <callout arearefs="figure_nametokengroup_helium_state">
                    <para>The token value <tag
                    class="attvalue">superfluous</tag> does not belong to the
                    set of allowed values. The parser flags this error
                    as:</para>

                    <para><code>Attribute "state" with value "superfluous"
                    must have a value from the list "solid liquid gas
                    ".</code></para>
                  </callout>
                </calloutlist>
              </figure>

              <para>The rule defining an <link
              linkend="w3RecXml_NT-Enumeration">Enumeration</link> has to be
              supplemented by a validity constraint: the set of legal token
              values must not contain duplicates. Duplicates would contradict
              the attribute's property of allowing values to be chosen from a
              <emphasis>set</emphasis>.</para>
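
              <para>As a sketch, the following hypothetical variant of the
              declaration from <xref linkend="figure_nametokengroup"/> lists
              <code>solid</code> twice and thereby violates this
              constraint:</para>

              <programlisting>&lt;!-- Not valid: the token "solid" appears twice --&gt;
&lt;!ATTLIST chemical state (solid|liquid|gas|solid) #REQUIRED&gt;</programlisting>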

              <qandaset role="exercise">
                <title>Restriction of allowed languages</title>

                <qandadiv>
                  <qandaentry xml:id="example_book.dtd_v4">
                    <question>
                      <para xml:lang="">We extend our book.dtd version from
                      <xref linkend="example_book.dtd_v3"/>. So far the
                      attribute <tag class="attribute">lang</tag> accepts
                      arbitrary text. We want to restrict it to values from
                      the set {en,fr,de,it,es}.</para>
                    </question>

                    <answer>
                      <para>We restrict our attribute definition from type
                      <code>CDATA</code> to a name token group:</para>

                      <programlisting>&lt;!ATTLIST book lang (en|fr|de|it|es) #IMPLIED &gt;</programlisting>
                    </answer>
                  </qandaentry>
                </qandadiv>
              </qandaset>

              <para>The <link
              linkend="w3RecXml_NT-NotationType">NotationType</link> branch
              of the production rule is used in much the same way:</para>

              <figure xml:id="attributeNotation">
                <title>A notation attribute</title>

                <programlisting>&lt;!DOCTYPE doc [

&lt;!NOTATION <emphasis role="bold">cpp</emphasis>  SYSTEM "The ANSI C++ programming language"&gt;
&lt;!NOTATION <emphasis role="bold">perl</emphasis> SYSTEM "The PERL script programming language"&gt;
&lt;!NOTATION <emphasis role="bold">sql</emphasis>  SYSTEM "SQL 92 database query language"&gt;

&lt;!ELEMENT doc (code)*&gt;
&lt;!ELEMENT code (#PCDATA)&gt;
&lt;!ATTLIST code
   language NOTATION (<emphasis role="bold">cpp</emphasis>|<emphasis
                    role="bold">perl</emphasis>|<emphasis role="bold">sql</emphasis>)  #REQUIRED &gt;
]&gt;
&lt;doc&gt;
  &lt;code language="<emphasis role="bold">cpp</emphasis>"&gt;delete[] namelist;&lt;/code&gt;
  &lt;code language="<emphasis role="bold">sql</emphasis>"&gt;SELECT * FROM User;&lt;/code&gt;
&lt;/doc&gt;</programlisting>
              </figure>

              <para>The only difference in comparison to a Name Token Group is
              the keyword <code>NOTATION</code>. There are however additional
              validity constraints imposed by the XML specification.</para>

              <para>In the given example the content of <tag
              class="starttag">code</tag> nodes was declared as
              <code>#PCDATA</code>. Actually all types of element content
              except <code>EMPTY</code> may appear.</para>

              <itemizedlist>
                <listitem>
                  <para>Values of type <code>NOTATION</code>
                  <emphasis>must</emphasis> match one of the notation names
                  included in the declaration. In the given example this would
                  be either <tag class="attvalue">cpp</tag>, <tag
                  class="attvalue">perl</tag> or <tag
                  class="attvalue">sql</tag>. All notation names in the
                  declaration <emphasis>must</emphasis> be declared.</para>
                </listitem>

                <listitem>
                  <para>An element type <emphasis>must not</emphasis> have
                  more than one <code>NOTATION</code> attribute specified. A
                  <code>NOTATION</code> attribute value gives us a
                  <quote>promise</quote> about the expected content of the
                  element node in which it appears. So if the content of a
                  <tag class="starttag">code</tag> node is SQL code it cannot
                  in addition be declared to be of language category type
                  <emphasis>declarative</emphasis>.</para>
                </listitem>

                <listitem>
                  <para>For compatibility with SGML an attribute of type
                  <code>NOTATION</code> <emphasis>must not</emphasis> be
                  declared on an element declared <link
                  linkend="section_empty">EMPTY</link>.</para>
                </listitem>
              </itemizedlist>
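
              <para>The constraints above can be illustrated by a sketch of
              declarations that a validating parser is expected to reject.
              The notation name <code>java</code> and the elements <tag
              class="element">snippet</tag> and <tag
              class="element">marker</tag> are hypothetical and only serve as
              illustration:</para>

              <programlisting>&lt;!NOTATION sql SYSTEM "SQL 92 database query language"&gt;

&lt;!-- Error: the notation "java" is listed but never declared --&gt;
&lt;!ELEMENT snippet (#PCDATA)&gt;
&lt;!ATTLIST snippet language NOTATION (sql|java) #REQUIRED&gt;

&lt;!-- Error: a NOTATION attribute on an element declared EMPTY --&gt;
&lt;!ELEMENT marker EMPTY&gt;
&lt;!ATTLIST marker language NOTATION (sql) #IMPLIED&gt;</programlisting>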
            </section>

            <section xml:id="section_id_idref">
              <title><code>ID</code> and <code>IDREF / IDREFS</code></title>

              <para>The pair of attribute types <code>ID</code> and
              <code>IDREF</code> defines internal references within a given
              XML document instance. Before considering XML we recall the way
              document internal references are implemented in HTML. A
              reference originates from a <emphasis>source</emphasis> and
              leads to a <emphasis>target</emphasis>. In HTML the latter is
              frequently called an <emphasis>anchor</emphasis>:</para>

              <figure xml:id="figure_reference_html">
                <title>An internal reference within an HTML document</title>

                <programlisting>&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"&gt;
&lt;html&gt;
  &lt;head&gt;&lt;title&gt;Reference example&lt;/title&gt;&lt;/head&gt;
  &lt;body&gt;
    &lt;h1&gt;Reference example&lt;/h1&gt;
    &lt;p&gt;&lt;a name="foo" <co xml:id="figure_reference_html_anchor"/>&gt;&lt;/a&gt;This is the target.&lt;/p&gt;

    &lt;p&gt;There may be lots of text in between ...&lt;/p&gt;
    &lt;p&gt;There may be lots of text in between ...&lt;/p&gt;

    &lt;h1&gt;This is a different section&lt;/h1&gt;
    &lt;p&gt;Click &lt;a href="#foo" <co xml:id="figure_reference_html_link1"/>&gt;here&lt;/a&gt; to see the target.&lt;/p&gt;

    &lt;h1&gt;This is a third section&lt;/h1&gt;
    &lt;p&gt;Again &lt;a href="#foo" <co xml:id="figure_reference_html_link2"/>&gt;clicking&lt;/a&gt; yields the same target.&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;</programlisting>
              </figure>

              <calloutlist>
                <callout arearefs="figure_reference_html_anchor">
                  <para>An anchor <tag class="starttag">a name="foo"</tag>
                  with a given value must appear only once. It is thus an
                  error if a second tag <tag class="starttag">a name="foo"</tag>
                  appears within the same HTML file since the value <tag
                  class="attvalue">foo</tag> would no longer be unique.</para>
                </callout>

                <callout arearefs="figure_reference_html_link1">
                  <para>The <quote>#</quote> is a shorthand for a document
                  local reference. A full HTML reference looks like
                  <code>http://someserver.org/docs/intro.html#foo</code>,
                  defining a reference to the position indicated by <tag
                  class="starttag">a name="foo"</tag> within the document
                  with path <code>/docs/intro.html</code> on the server
                  <code>someserver.org</code> accessed by the <link
                  xlink:href="http://www.w3.org/Protocols">HTTP</link>
                  protocol. Thus <quote><code>#foo</code></quote> points to
                  the local target defined by <tag class="starttag">a
                  name="foo"</tag> in the document itself.</para>
                </callout>

                <callout arearefs="figure_reference_html_link2">
                  <para>A second link to the same destination.</para>
                </callout>
              </calloutlist>

              <para>In a database context we would call the value of <tag
              class="starttag">a name="foo"</tag> a <emphasis>primary key
              value</emphasis>. The element node <tag class="starttag">a
              href="#foo"</tag> would be considered a <emphasis>foreign
              key</emphasis> reference which may appear multiple times
              pointing to the same target.</para>

              <para>In HTML a node may at the same time be itself a reference
              target and define a reference to another target:</para>

              <programlisting>&lt;a name="thisTarget" href="linkToOtherTarget"&gt;click on me!&lt;/a&gt;</programlisting>

              <para>The XML standard adopts a different way to implement
              document internal references. We give an example:</para>

              <figure xml:id="figure_intern_reference_xml">
                <title>Internal references in XML</title>

                <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE catalog [

&lt;!ELEMENT catalog   (product*) <co
                    xml:id="figure_intern_reference_xml_catalog"/> &gt;
&lt;!ELEMENT product   (title, para*) <co
                    xml:id="figure_intern_reference_xml_product"/>&gt;
&lt;!ELEMENT title     (#PCDATA)&gt;
&lt;!ELEMENT para (#PCDATA|link)* <co
                    xml:id="figure_intern_reference_xml_para"/> &gt;
&lt;!ELEMENT link (#PCDATA)&gt;

&lt;!ATTLIST product id  ID <co
                    xml:id="figure_intern_reference_xml_att_product_id"/>    #IMPLIED&gt;
&lt;!ATTLIST link    ref IDREF <co
                    xml:id="figure_intern_reference_xml_att_link_ref"/> #REQUIRED&gt;
]&gt;
&lt;catalog&gt;
  &lt;product id="homeTrainer" <co
                    xml:id="figure_intern_reference_xml_define_id_hometrainer"/> &gt;
    &lt;title&gt;Home trainer&lt;/title&gt;
    &lt;para&gt;Like to torture yourself in front of your TV?&lt;/para&gt;
  &lt;/product&gt;
  &lt;product <co xml:id="figure_intern_reference_xml_product_no_id"/>&gt;
    &lt;title&gt;Mountain bike&lt;/title&gt;
    &lt;para&gt;If you hate rain look &lt;link ref="homeTrainer" <co
                    xml:id="figure_intern_reference_xml_define_ref1_hometrainer"/> &gt;here&lt;/link&gt;.&lt;/para&gt;
  &lt;/product&gt;
&lt;/catalog&gt;</programlisting>
              </figure>

              <calloutlist>
                <callout arearefs="figure_intern_reference_xml_catalog">
                  <para>Start of the DTD. A catalog consists of
                  products.</para>
                </callout>

                <callout arearefs="figure_intern_reference_xml_product">
                  <para>A product has a title and optional paragraphs to
                  describe it in detail.</para>
                </callout>

                <callout arearefs="figure_intern_reference_xml_para">
                  <para>A paragraph allows mixed content of text and
                  references to other parts of the document.</para>
                </callout>

                <callout arearefs="figure_intern_reference_xml_att_product_id">
                  <para>A <tag class="starttag">product</tag> node may have an
                  attribute <tag class="attribute">id</tag> with a unique
                  value within the document instance.</para>
                </callout>

                <callout arearefs="figure_intern_reference_xml_att_link_ref">
                  <para>A <tag class="starttag">link</tag>
                  <emphasis>must</emphasis> have an attribute <tag
                  class="attribute">ref</tag> with a value referring to an
                  element with a corresponding attribute value of type
                  <code>ID</code>.</para>
                </callout>

                <callout arearefs="figure_intern_reference_xml_define_id_hometrainer">
                  <para>A product with unique <code>id</code> value
                  <code>homeTrainer</code>.</para>
                </callout>

                <callout arearefs="figure_intern_reference_xml_product_no_id">
                  <para>A product without an <code>id</code> value. Thus it
                  cannot be referenced.</para>
                </callout>

                <callout arearefs="figure_intern_reference_xml_define_ref1_hometrainer">
                  <para>A reference to <emphasis>the</emphasis> element node
                  with a defined attribute of type <code>ID</code> and value
                  <code>homeTrainer</code>.</para>
                </callout>
              </calloutlist>

              <para>Based on this example we now present the syntax and
              validity constraints imposed by the XML specification:</para>

              <glosslist>
                <glossentry>
                  <glossterm><code>ID</code></glossterm>

                  <glossdef>
                    <para><itemizedlist>
                        <listitem>
                          <para>Values of type <code>ID</code>
                          <emphasis>must</emphasis> match the <link
                          linkend="w3RecXml_NT-Name">Name</link> production. A
                          name <emphasis>must not</emphasis> appear more than
                          once in an XML document as a value of this type;
                          i.e., <code>ID</code> values
                          <emphasis>must</emphasis> uniquely identify the
                          elements which bear them. In a database context this
                          would be considered a <emphasis>primary key
                          constraint</emphasis>.</para>
                        </listitem>

                        <listitem>
                          <para>An element type <emphasis>must not</emphasis>
                          have more than one <code>ID</code> attribute
                          specified.</para>
                        </listitem>

                        <listitem>
                          <para>An <code>ID</code> attribute
                          <emphasis>must</emphasis> have a declared default of
                          <code>#IMPLIED</code> or
                          <code>#REQUIRED</code>.</para>
                        </listitem>
                      </itemizedlist></para>
                  </glossdef>
                </glossentry>

                <glossentry>
                  <glossterm><code>IDREF</code></glossterm>

                  <glossdef>
                    <para>Values of type <code>IDREF</code>
                    <emphasis>must</emphasis> match the <link
                    linkend="w3RecXml_NT-Name">Name</link> production. Each
                    name <emphasis>must</emphasis> match the value of an
                    <code>ID</code> attribute on some element in the XML
                    document; i.e., an <code>IDREF</code> value
                    <emphasis>must not</emphasis> point to a nonexistent
                    target. In a database context this would be considered a
                    <emphasis>foreign key constraint</emphasis>.</para>
                  </glossdef>
                </glossentry>

                <glossentry>
                  <glossterm><code>IDREFS</code></glossterm>

                  <glossdef>
                    <para>Values of type <code>IDREFS</code> are sets of
                    <code>IDREF</code> values separated by spaces:</para>

                    <programlisting>&lt;!DOCTYPE gamelist [
&lt;!ELEMENT gamelist      (game+, gameCategory+)&gt;
&lt;!ELEMENT game      (#PCDATA)&gt;
&lt;!ATTLIST game id ID #REQUIRED&gt;

&lt;!ELEMENT gameCategory   (#PCDATA)&gt;
&lt;!ATTLIST gameCategory games IDREFS #REQUIRED&gt;
]&gt;
&lt;gamelist&gt;
    &lt;game id='chess'&gt;Chess&lt;/game&gt;
    &lt;game id='poker'&gt;Poker&lt;/game&gt;
    &lt;game id='bj'&gt;Black Jack&lt;/game&gt;
    
    &lt;gameCategory games="poker bj"&gt;Card games&lt;/gameCategory&gt;
&lt;/gamelist&gt;</programlisting>

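                    <para>Strictly speaking such a value is a list rather
                    than a <emphasis role="bold">set</emphasis>: the XML
                    specification only requires each name to match the value
                    of an existing <code>ID</code> attribute and does not
                    forbid duplicates. The following snippet containing two
                    identical references is therefore still valid, although
                    an application may choose to reject it:</para>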

                    <programlisting>...
&lt;gameCategory games="poker bj poker"&gt;Card games&lt;/gameCategory&gt;
...</programlisting>
                  </glossdef>
                </glossentry>
              </glosslist>
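
              <para>A sketch of a document instance violating these
              constraints, based on the catalog DTD from <xref
              linkend="figure_intern_reference_xml"/>. The value
              <code>treadmill</code> is hypothetical; a validating parser is
              expected to report both problems:</para>

              <programlisting>&lt;catalog&gt;
  &lt;product id="homeTrainer"&gt;
    &lt;title&gt;Home trainer&lt;/title&gt;
  &lt;/product&gt;
  &lt;!-- Error: the ID value "homeTrainer" is already in use --&gt;
  &lt;product id="homeTrainer"&gt;
    &lt;title&gt;Mountain bike&lt;/title&gt;
    &lt;!-- Error: no element carries an ID attribute of value "treadmill" --&gt;
    &lt;para&gt;See &lt;link ref="treadmill"&gt;here&lt;/link&gt;.&lt;/para&gt;
  &lt;/product&gt;
&lt;/catalog&gt;</programlisting>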

              <qandaset role="exercise">
                <title>Legal attribute values</title>

                <qandadiv>
                  <qandaentry xml:id="example_legal_attribute_values">
                    <question>
                      <para>Complete the following matrix. Enter a
                      <quote>+</quote> if the attribute value satisfies the
                      constraint being imposed by the attribute type and a
                      <quote>-</quote> otherwise.</para>

                      <informaltable xml:id="table_legal_attribute_matrix">
                        <?dbhtml table-width="40%" ?>

                        <?dbfo table-width="40%" ?>

                        <tgroup cols="4">
                          <colspec colwidth="3*"/>

                          <colspec colwidth="2*"/>

                          <colspec colwidth="2*"/>

                          <colspec colwidth="2*"/>

                          <tbody>
                            <row>
                              <entry/>

                              <entry><code>CDATA</code></entry>

                              <entry><code>NMTOKEN</code></entry>

                              <entry><code>ID</code></entry>
                            </row>

                            <row>
                              <entry><code>_foo</code></entry>

                              <entry/>

                              <entry/>

                              <entry/>
                            </row>

                            <row>
                              <entry><code>too small</code></entry>

                              <entry/>

                              <entry/>

                              <entry/>
                            </row>

                            <row>
                              <entry><code>2three4</code></entry>

                              <entry/>

                              <entry/>

                              <entry/>
                            </row>

                            <row>
                              <entry><code>-man</code></entry>

                              <entry/>

                              <entry/>

                              <entry/>
                            </row>

                            <row>
                              <entry><code>two3four</code></entry>

                              <entry/>

                              <entry/>

                              <entry/>
                            </row>

                            <row>
                              <entry><code>Uhh-oops</code></entry>

                              <entry/>

                              <entry/>

                              <entry/>
                            </row>

                            <row>
                              <entry><code>a+b</code></entry>

                              <entry/>

                              <entry/>

                              <entry/>
                            </row>

                            <row>
                              <entry><code>&amp;</code></entry>

                              <entry/>

                              <entry/>

                              <entry/>
                            </row>
                          </tbody>
                        </tgroup>
                      </informaltable>
                    </question>

                    <answer>
                      <para>We may use the following code to ask a
                      parser:</para>

                      <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE doc [
&lt;!ELEMENT doc  (testentry)*&gt;
&lt;!ELEMENT testentry EMPTY&gt;
&lt;!ATTLIST testentry 
   cd CDATA  #REQUIRED
   nm NMTOKEN #REQUIRED
   id ID #REQUIRED
   &gt;
]&gt;
&lt;doc&gt;
  &lt;testentry cd="_foo" nm="_foo" id="_foo"/&gt;
  &lt;testentry cd="too small" nm="too small" id="too small"/&gt;
  &lt;testentry cd="2three4" nm="2three4" id="2three4"/&gt;
  &lt;testentry cd="-man" nm="-man" id="-man"/&gt;
  &lt;testentry cd="two3four" nm="two3four" id="two3four"/&gt;
  &lt;testentry cd="Uhh-oops" nm="Uhh-oops" id="Uhh-oops"/&gt;
  &lt;testentry cd="a+b" nm="a+b" id="a+b"/&gt;
&lt;/doc&gt;</programlisting>

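                      <para>The value <code>&amp;</code> is not part of the
                      test document: a bare ampersand inside an attribute
                      value already violates well-formedness, regardless of
                      the declared attribute type. This yields:</para>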

                      <table xml:id="exerciseAtttypeLegalValue">
                        <title>Legal attribute values</title>

                        <?dbhtml table-width="40%" ?>

                        <?dbfo table-width="40%" ?>

                        <tgroup cols="4">
                          <colspec colwidth="3*"/>

                          <colspec colwidth="2*"/>

                          <colspec colwidth="2*"/>

                          <colspec colwidth="2*"/>

                          <tbody>
                            <row>
                              <entry/>

                              <entry><code>CDATA</code></entry>

                              <entry><code>NMTOKEN</code></entry>

                              <entry><code>ID</code></entry>
                            </row>

                            <row>
                              <entry><code>_foo</code></entry>

                              <entry>+</entry>

                              <entry>+</entry>

                              <entry>+</entry>
                            </row>

                            <row>
                              <entry><code>too small</code></entry>

                              <entry>+</entry>

                              <entry>-</entry>

                              <entry>-</entry>
                            </row>

                            <row>
                              <entry><code>2three4</code></entry>

                              <entry>+</entry>

                              <entry>+</entry>

                              <entry>-</entry>
                            </row>

                            <row>
                              <entry><code>-man</code></entry>

                              <entry>+</entry>

                              <entry>+</entry>

                              <entry>-</entry>
                            </row>

                            <row>
                              <entry><code>two3four</code></entry>

                              <entry>+</entry>

                              <entry>+</entry>

                              <entry>+</entry>
                            </row>

                            <row>
                              <entry><code>Uhh-oops</code></entry>

                              <entry>+</entry>

                              <entry>+</entry>

                              <entry>+</entry>
                            </row>

                            <row>
                              <entry><code>a+b</code></entry>

                              <entry>+</entry>

                              <entry>-</entry>

                              <entry>-</entry>
                            </row>

                            <row>
                              <entry><code>&amp;</code></entry>

                              <entry>-</entry>

                              <entry>-</entry>

                              <entry>-</entry>
                            </row>
                          </tbody>
                        </tgroup>
                      </table>
                    </answer>
                  </qandaentry>
                </qandadiv>
              </qandaset>

              <qandaset role="exercise">
                <title>book.dtd and internal references</title>

                <qandadiv>
                  <qandaentry xml:id="example_book.dtd_v5">
                    <question>
                      <para>We want to extend our DTD from <xref
                      linkend="example_book.dtd_v4"/> to allow document
                      internal references by:</para>

                      <itemizedlist>
                        <listitem>
                          <para>Allowing each <tag
                          class="starttag">chapter</tag>, <tag
                          class="starttag">para</tag> and <tag
                          class="starttag">itemizedlist</tag> to become
                          reference targets.</para>
                        </listitem>

                        <listitem>
                          <para>Extending the element <tag
                          class="element">para</tag>'s mixed content model by
                          a new element <tag class="element">link</tag> with
                          an attribute <tag class="attribute">linkend</tag>
                          being a reference to a target.</para>
                        </listitem>
                      </itemizedlist>
                    </question>

                    <answer>
                      <para>We extend our DTD:</para>

                      <programlisting>&lt;!ELEMENT book     (title, chapter+)&gt;
&lt;!ATTLIST book lang (en|fr|de|it|es) #IMPLIED &gt;
&lt;!ELEMENT chapter  (title, (para|itemizedlist)+)&gt;
&lt;!ATTLIST chapter 
    id <co xml:id="progamlisting_book_v5_chapter_id"/> ID #IMPLIED &gt;
&lt;!ELEMENT title    (#PCDATA)&gt;
&lt;!ELEMENT para     (#PCDATA|emphasis|link <co
                          xml:id="progamlisting_book_v5_mixed_link"/>)*&gt;
&lt;!ATTLIST para
    id <co xml:id="progamlisting_book_v5_para_id"/> ID #IMPLIED &gt;
&lt;!ELEMENT emphasis (#PCDATA)&gt;
&lt;!ELEMENT link (#PCDATA) <co xml:id="progamlisting_book_v5_link"/>&gt;
&lt;!ATTLIST link
    linkend <co xml:id="progamlisting_book_v5_link_linkend"/> IDREF #REQUIRED &gt;

&lt;!ELEMENT itemizedlist (listitem+)&gt;
&lt;!ATTLIST itemizedlist
    id <co xml:id="progamlisting_book_v5_itemizedList_id"/> ID #IMPLIED &gt;
&lt;!ELEMENT listitem ((para|itemizedlist)+)&gt;</programlisting>

                      <calloutlist>
                        <callout arch=""
                                 arearefs="progamlisting_book_v5_chapter_id progamlisting_book_v5_para_id progamlisting_book_v5_itemizedList_id">
                          <para>Defining an attribute <tag
                          class="attribute">id</tag> of type <code>ID</code>
                          for the elements <tag class="element">chapter</tag>,
                          <tag class="element">para</tag> and <tag
                          class="element">itemizedlist</tag>. This enables an
                          author to define internal reference targets.</para>
                        </callout>

                        <callout arearefs="progamlisting_book_v5_mixed_link">
                          <para>A link is part of the element <tag
                          class="element">para</tag>'s mixed content model.
                          Thus an author may define internal references along
                          with ordinary text.</para>
                        </callout>

                        <callout arearefs="progamlisting_book_v5_link">
                          <para>As in HTML, a link may contain text. When
                          converted to HTML it is expected to be rendered as a
                          hypertext link.</para>
                        </callout>

                        <callout arearefs="progamlisting_book_v5_link_linkend">
                          <para>The attribute <tag
                          class="attribute">linkend</tag> holds the reference
                          to an internal target being either a <tag
                          class="element">chapter</tag>, a <tag
                          class="element">para</tag> or an <tag
                          class="element">itemizedlist</tag>.</para>
                        </callout>
                      </calloutlist>
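
                      <para>A minimal sketch of a document instance using the
                      new declarations. It assumes the DTD is stored as
                      <filename>book.dtd</filename> next to the document; the
                      <tag class="attribute">id</tag> values and all text
                      content are made up for illustration:</para>

                      <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE book SYSTEM "book.dtd"&gt;
&lt;book lang="en"&gt;
  &lt;title&gt;Internal references&lt;/title&gt;
  &lt;chapter id="intro"&gt;
    &lt;title&gt;Introduction&lt;/title&gt;
    &lt;para id="firstPara"&gt;Details follow in
      &lt;link linkend="details"&gt;the second chapter&lt;/link&gt;.&lt;/para&gt;
  &lt;/chapter&gt;
  &lt;chapter id="details"&gt;
    &lt;title&gt;Details&lt;/title&gt;
    &lt;itemizedlist id="topics"&gt;
      &lt;listitem&gt;
        &lt;para&gt;See &lt;link linkend="firstPara"&gt;the opening
          paragraph&lt;/link&gt;.&lt;/para&gt;
      &lt;/listitem&gt;
    &lt;/itemizedlist&gt;
  &lt;/chapter&gt;
&lt;/book&gt;</programlisting>

                      <para>Since <tag class="attribute">linkend</tag> is of
                      type <code>IDREF</code>, a validating parser will reject
                      a value that matches no <code>ID</code> value in the
                      same document, and any two <code>ID</code> values must
                      differ.</para>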
                    </answer>
                  </qandaentry>
                </qandadiv>
              </qandaset>
            </section>
          </section>

          <section xml:id="section_attribute_default">
            <title>Attribute default values</title>

            <para>We have implicitly introduced attribute default values
            already. The formal production rule reads:</para>

            <productionset>
              <title>Attribute Defaults</title>

              <production xml:id="w3RecXml_NT-DefaultDecl">
                <lhs>DefaultDecl</lhs>

                <rhs>'#REQUIRED' | '#IMPLIED' | (('#FIXED' <nonterminal
                def="#w3RecXml_NT-S">S</nonterminal>)? <nonterminal
                def="#w3RecXml_NT-AttValue">AttValue</nonterminal>)</rhs>
              </production>
            </productionset>

            <para>We have already introduced <code>#REQUIRED</code> and
            <code>#IMPLIED</code> describing attribute values that
            <emphasis>must</emphasis> be specified and attribute values that
            <emphasis>may</emphasis> be specified. The default declaration
            <code>#FIXED</code> is typically used during DTD development and
            rarely for production systems. In a nutshell it enables a DTD
            author to define an attribute with a fixed value that cannot be
            overridden by an author in a document instance.</para>

            <figure xml:id="attTypeFixed">
              <title>The attribute default declaration <code>#FIXED</code></title>

              <programlisting xml:id="figure_fixed">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE configuration [
&lt;!ELEMENT configuration (property*)&gt;
&lt;!ELEMENT property EMPTY&gt;
&lt;!ATTLIST property
  version CDATA #FIXED "3.4" <co xml:id="programmlisting_fixed_attfixed"/>
  key NMTOKEN #REQUIRED
  value CDATA #IMPLIED &gt;
]&gt;
&lt;configuration&gt;
  &lt;property key="user" value="admin"/&gt; <co
                  xml:id="programmlisting_fixed_unset"/>
  &lt;property key="password" value="verySecret" version="3.4" <co
                  xml:id="programmlisting_fixed_correctlyset"/> /&gt;

    &lt;!-- Ooops! --&gt;
  &lt;property key="ldapHost" value="141.62.1.5" version="3.7" <co
                  xml:id="programmlisting_fixed_illdefined"/>/&gt;
&lt;/configuration&gt;</programlisting>
            </figure>

            <calloutlist>
              <callout arearefs="programmlisting_fixed_attfixed">
                <para>For each <tag class="element">property</tag> node the
                attribute <tag class="attribute">version</tag> with value <tag
                class="attvalue">3.4</tag> is automatically defined.</para>
              </callout>

              <callout arearefs="programmlisting_fixed_unset">
                <para>The attribute <tag class="attribute">version</tag> is
                not explicitly set. Any software acting on the document will
                see the value <tag class="attvalue">3.4</tag> though.</para>
              </callout>

              <callout arearefs="programmlisting_fixed_correctlyset">
                <para>The attribute <tag class="attribute">version</tag> is
                explicitly set to the value <tag class="attvalue">3.4</tag>
                being defined in the DTD.</para>
              </callout>

              <callout arearefs="programmlisting_fixed_illdefined">
                <para>The attribute <tag class="attribute">version</tag> is
                explicitly set to the value <tag class="attvalue">3.7</tag>
                differing from the value <tag class="attvalue">3.4</tag> being
                defined in the DTD. A validating parser will complain:</para>

                <programlisting><errortext>[Xerces] Attribute "version" with value "3.7" must have a value of "3.4".</errortext></programlisting>
              </callout>
            </calloutlist>

            <para>Next we discuss attributes with default value
            definitions:</para>

            <figure xml:id="attDefDefault">
              <title xml:id="figure_attribute_default">Attribute definitions
              with default values</title>

              <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE doc [
&lt;!ELEMENT doc (para*)&gt;
&lt;!ELEMENT para (#PCDATA)&gt;
&lt;!ATTLIST para
    language CDATA "english" <co
                  xml:id="programlisting_attribute_default_language"/>&gt;
]&gt;
&lt;doc&gt;
  &lt;para language="french" <co
                  xml:id="programlisting_attribute_default_french"/>&gt;Une maison&lt;/para&gt;
  &lt;para <co xml:id="programlisting_attribute_default_implicit"/>&gt;A house&lt;/para&gt;
  &lt;para language="english" <co
                  xml:id="programlisting_attribute_default_defaultoverride"/>&gt;Another house&lt;/para&gt;
&lt;/doc&gt;</programlisting>
            </figure>

            <calloutlist>
              <callout arearefs="programlisting_attribute_default_language">
                <para>Declaration of an attribute <tag
                class="attribute">language</tag> with default value <tag
                class="attvalue">english</tag>.</para>
              </callout>

              <callout arearefs="programlisting_attribute_default_french">
                <para>The attribute value may be overridden as long as the
                content conforms to the <code>CDATA</code> attribute
                type.</para>
              </callout>

              <callout arearefs="programlisting_attribute_default_implicit">
                <para>A <tag class="starttag">para</tag> node with implicit
                value <tag class="attribute">language="english"</tag>.</para>
              </callout>

              <callout arearefs="programlisting_attribute_default_defaultoverride">
                <para>Explicitly setting the DTD default value.</para>
              </callout>
            </calloutlist>
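
            <para>A sketch of the document as an application will see it
            after the parser has read the declarations and supplied the
            default. The second <tag class="starttag">para</tag> now carries
            the attribute although the author never wrote it:</para>

            <programlisting>&lt;doc&gt;
  &lt;para language="french"&gt;Une maison&lt;/para&gt;
  &lt;para language="english"&gt;A house&lt;/para&gt;
  &lt;para language="english"&gt;Another house&lt;/para&gt;
&lt;/doc&gt;</programlisting>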

            <para>So the difference between declaring an attribute value as
            <code>#FIXED</code> and declaring an ordinary default is that the
            latter may be overridden with a value differing from the default
            supplied in the DTD.</para>
          </section>
        </section>

        <section xml:id="catalogs">
          <title>Catalogs for <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>'s</title>

          <para>So far we have referenced a DTD from a document instance via
          a <code>SYSTEM</code> identifier:</para>

          <programlisting>&lt;!DOCTYPE book SYSTEM "ftp://someserver.com/book.dtd"&gt; ...</programlisting>

          <para>As mentioned before the DTD may be accessed from the file
          system or referenced by different protocols like http. As an example
          we consider the XML version of the hypertext markup language
          HTML:</para>

          <figure xml:id="figure_xhtmlbase">
            <title>A simple XHTML document</title>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head&gt;&lt;title&gt;A first start&lt;/title&gt;&lt;/head&gt;
  &lt;body&gt;
    &lt;h1&gt;A first start&lt;/h1&gt;
    &lt;p&gt;This is a very simple document&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;</programlisting>
          </figure>

          <para>In this example the DTD can be accessed via HTTP. This seems
          to be perfect: A parser reads the document and retrieves referenced
          resources. But what happens if the HTTP server
          <code>www.w3.org</code> is inaccessible? Or if someone wants to work
          offline or in a company's intranet with restricted access policies?
          In all these cases it is desirable to have a local copy of the DTD
          to become independent of a remote server. The simplest solution is
          to copy the complete DTD to the host's local file system:</para>

          <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html SYSTEM "C:\mystuff\xhtml1-strict.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
...</programlisting>

          <para>This seems to solve the problem of resources being
          unavailable. But what about interoperability? If we want to exchange
          documents with other people we cannot expect our partners to supply
          the DTD at the same location in the file system. For this reason XML
          supports the concept of <emphasis>public identifiers</emphasis>. We
          extend the current example:</para>

          <figure xml:id="figure_xhtml_public">
            <title>An XHTML document instance with public and system
            identifier</title>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head&gt;&lt;title&gt;A first start&lt;/title&gt;&lt;/head&gt;
  &lt;body&gt;
    &lt;h1&gt;A first start&lt;/h1&gt;
    &lt;p&gt;This is a very simple document&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;</programlisting>
          </figure>

          <para>The string <quote>-//W3C//DTD XHTML 1.0 Strict//EN</quote>
          should uniquely identify the given DTD. Thus a different XHTML DTD
          version or even a different XML DTD <emphasis>must have</emphasis> a
          different public identifier. Note that in the above example a
          <code>SYSTEM</code> identifier
          <code>http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</code> must
          still be present although the keyword <code>SYSTEM</code> is
          absent.</para>

          <para>Now a parser may use a <code>PUBLIC</code> identifier to find
          the DTD even if the resource being referenced by the
          <code>SYSTEM</code> identifier's value is unavailable. This is
          achieved by so called DTD catalogs. A catalog maps
          <code>PUBLIC</code> identifier values to physical resources. It may
          be conceived as a map:</para>

          <figure xml:id="publicSystemDict">
            <title>A catalog joining public identifiers with physical
            resources.</title>

            <programlisting>OVERRIDE YES <co
                xml:id="figure_emacs_catalog_preferpublic"/>   -- prefer public identifiers to system identifiers --
...
        -- XHTML 1.0 --
PUBLIC "-//W3C//DTD XHTML 1.0 Frameset//EN" <co
                xml:id="figure_emacs_catalog_pubid"/>   xhtml1-frameset.dtd <co
                xml:id="figure_emacs_catalog_resource"/> 
PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"        xhtml1-strict.dtd
PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"  xhtml1-transitional.dtd
...
        -- Docbook 3.1 --
PUBLIC "-//OASIS//DTD DocBook V3.1//EN"          docbook.dtd
...</programlisting>
          </figure>

          <calloutlist>
            <callout arearefs="figure_emacs_catalog_preferpublic">
              <para>As stated in the subsequent comment, public identifiers
              take precedence over system identifiers.</para>
            </callout>

            <callout arearefs="figure_emacs_catalog_pubid">
              <para>A public identifier with value <code>-//W3C//DTD XHTML 1.0
              Frameset//EN</code> ...</para>
            </callout>

            <callout arearefs="figure_emacs_catalog_resource">
              <para>... and the corresponding value
              <filename>${BASEDIR}/xhtml1-frameset.dtd</filename>.</para>
            </callout>
          </calloutlist>

          <para>The XML specification itself does not prescribe a catalog
          file format. Some applications prefer XML formats to store these
          mappings. We note that in the presence of a <code>PUBLIC</code>
          identifier an XML application is free to choose either of the two
          offered DTD files if both are accessible.</para>
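
          <para>As an illustration, the mappings above might be expressed in
          the XML catalog format published by OASIS roughly as follows. This
          is a sketch only: the relative <tag class="attribute">uri</tag>
          values assume that the DTD files reside in the catalog file's
          directory, and <code>prefer="public"</code> plays the role of
          <code>OVERRIDE YES</code>:</para>

          <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"
         prefer="public"&gt;
  &lt;public publicId="-//W3C//DTD XHTML 1.0 Frameset//EN"
          uri="xhtml1-frameset.dtd"/&gt;
  &lt;public publicId="-//W3C//DTD XHTML 1.0 Strict//EN"
          uri="xhtml1-strict.dtd"/&gt;
  &lt;public publicId="-//W3C//DTD XHTML 1.0 Transitional//EN"
          uri="xhtml1-transitional.dtd"/&gt;
  &lt;public publicId="-//OASIS//DTD DocBook V3.1//EN"
          uri="docbook.dtd"/&gt;
&lt;/catalog&gt;</programlisting>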

          <qandaset role="exercise">
            <title>Relation between public and system identifiers</title>

            <qandadiv>
              <qandaentry xml:id="example_public_system">
                <question>
                  <para>We recall <xref linkend="figure_xhtml_public"/>. The
                  public identifier uniquely identifies the DTD. Thus the
                  system identifier still being present seems to be
                  superfluous. How does a parser react if we omit it? Read the
                  XML specification and find the corresponding
                  definition.</para>
                </question>

                <answer>
                  <para>Omitting the <code>SYSTEM</code> identifier yields a
                  parsing error:</para>

                  <programlisting><errortext>The system identifier must begin with either a single or
double quote character.</errortext></programlisting>

                  <para>This message is a bit confusing. Actually the
                  <code>SYSTEM</code> identifier <emphasis>must</emphasis>
                  still be present and a better parser should complain about
                  its absence instead of only noting the missing opening
                  quote. The production rule indeed states that even for
                  <code>PUBLIC</code> identifiers a system literal is
                  mandatory:</para>

                  <productionset>
                    <title>External Entity Declaration</title>

                    <production xml:id="w3RecXml_NT-ExternalID">
                      <lhs>ExternalID</lhs>

                      <rhs>'SYSTEM' <nonterminal
                      def="#w3RecXml_NT-S">S</nonterminal> <nonterminal
                      def="#w3RecXml_NT-SystemLiteral">SystemLiteral</nonterminal>
                      <sbr/> | 'PUBLIC' <nonterminal
                      def="#w3RecXml_NT-S">S</nonterminal> <nonterminal
                      def="#w3RecXml_NT-PubidLiteral">PubidLiteral</nonterminal>
                      <nonterminal def="#w3RecXml_NT-S">S</nonterminal>
                      <nonterminal
                      def="#w3RecXml_NT-SystemLiteral">SystemLiteral</nonterminal></rhs>
                    </production>

                    <production xml:id="w3RecXml_NT-NDataDecl">
                      <lhs>NDataDecl</lhs>

                      <rhs><nonterminal def="#w3RecXml_NT-S">S</nonterminal>
                      'NDATA' <nonterminal
                      def="#w3RecXml_NT-S">S</nonterminal> <nonterminal
                      def="#w3RecXml_NT-Name">Name</nonterminal></rhs>
                    </production>
                  </productionset>
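
                  <para>For comparison, a sketch of both forms of the
                  declaration. Only the second one conforms to the
                  <code>ExternalID</code> production:</para>

                  <programlisting>&lt;!-- Not well-formed: the system literal is missing --&gt;
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"&gt;

&lt;!-- Well-formed: public identifier followed by a system literal --&gt;
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
   "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;</programlisting>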
                </answer>
              </qandaentry>

              <qandaentry xml:id="example_public_dtdlookup">
                <question>
                  <label>DTD lookup by PUBLIC identifier</label>

                  <para>Modify the document of the preceding exercise
                  by:</para>

                  <itemizedlist>
                    <listitem>
                      <para>Change the <code>PUBLIC</code> identifier from
                      <code>-//W3C//DTD XHTML 1.0 Strict//EN</code> to
                      <code>-//W3C//DTD XHTML 1.0
                      Transitional//EN</code>.</para>
                    </listitem>

                    <listitem>
                      <para>Change the <code>SYSTEM</code> identifier to a
                      resource name which cannot be retrieved.</para>
                    </listitem>
                  </itemizedlist>

                  <para>Use the Oxygen plug-in to check whether this document
                  instance is still valid. Which DTD is used for validation?
                  Hint: Check the
                  <option>Window-&gt;Preferences-&gt;oxyGen-&gt;XML-&gt;XML
                  Catalog</option> menu.</para>
                </question>

                <answer>
                  <para>We modify the <code>SYSTEM</code> identifier by
                  omitting the <filename>.dtd</filename> suffix. Thus the DTD
                  cannot be retrieved by this <link
                  xlink:href="http://www.w3.org/Addressing">URL</link> any
                  longer. But we observe that the document remains valid. We
                  conclude that the parser found a DTD via the
                  <code>PUBLIC</code> identifier.</para>

                  <para>This assumption is indeed true: In the indicated
                  options menu we find that a master catalog file
                  <filename>/usr/share/.../frameworks/catalog.xml</filename>
                  is used for looking up <code>PUBLIC</code>
                  identifiers:</para>

                  <programlisting>&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN" 
  "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
...
  &lt;nextCatalog catalog="xhtml/dtd/xhtmlcatalog.xml" /&gt;
  &lt;nextCatalog catalog="xhtml11/dtd/xhtmlcatalog.xml" /&gt;
  &lt;nextCatalog catalog="xhtml11/schema/xhtmlcatalog.xml" /&gt;
...
&lt;/catalog&gt;</programlisting>

                  <para>And in <filename>xhtml/dtd/xhtmlcatalog.xml</filename>
                  we find:</para>

                  <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE catalog PUBLIC "-//OASIS//DTD Entity Resolution XML Catalog V1.0//EN"
    "http://www.oasis-open.org/committees/entity/release/1.0/catalog.dtd"&gt;
&lt;catalog xmlns="urn:oasis:names:tc:entity:xmlns:xml:catalog"&gt;
...
  &lt;public publicId="<emphasis role="bold">-//W3C//DTD XHTML 1.0 Transitional//EN</emphasis>" uri="<emphasis
                      role="bold">xhtml1-transitional.dtd</emphasis>"/&gt;
  &lt;public publicId="-//W3C//DTD XHTML 1.0 Strict//EN" uri="xhtml1-strict.dtd"/&gt;
  &lt;public publicId="-//W3C//DTD XHTML 1.0 Frameset//EN" uri="xhtml1-frameset.dtd"/&gt;
...
&lt;/catalog&gt;</programlisting>

                  <para>We learn from this example that a standard describing
                  a catalog file's structure exists, namely the OASIS XML
                  Catalogs specification.</para>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>
      </section>

      <section xml:id="xhtml">
        <title>The XHTML DTD</title>

        <para>The XHTML standard is completely defined in terms of a family of
        <abbrev
        xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s.
        One member of this family is denoted as <emphasis>strict</emphasis>
        since it deviates most from <quote>traditional</quote> HTML. We start
        with a <quote>Hello, World</quote> example:</para>

        <figure xml:id="htmlHelloRender">
          <title>An XHTML Hello, World example and its rendering</title>

          <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
    &lt;head&gt;
        &lt;title&gt;Hello Example&lt;/title&gt;
    &lt;/head&gt;
    &lt;body&gt;
        &lt;h1&gt;Hello, World ...&lt;/h1&gt;
    &lt;/body&gt;
&lt;/html&gt;</programlisting>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Screen/hello.screen.png"/>
            </imageobject>
          </mediaobject>
        </figure>
      </section>
    </chapter>

    <chapter xml:id="xmlApis">
      <title><abbrev
      xlink:href="http://en.wikipedia.org/wiki/Api">API</abbrev>s for XML
      document processing</title>

      <section xml:id="sax">
        <title>The Simple API for XML</title>

        <section xml:id="saxPrinciple">
          <title>The principle of a <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym>
          application</title>

          <para>We are already familiar with transformations of XML document
          instances to other formats. Sometimes the capabilities being offered
          by a given transformation approach do not suffice for a given
          problem. Obviously a general purpose programming language like <link
          linkend="gloss_Java"><trademark>Java</trademark></link> offers
          superior means to perform advanced manipulations of XML document
          trees.</para>

          <para>Before diving into technical details we present an example
          exceeding the limits of our present transformation capabilities. We
          want to format an XML catalog document with article descriptions to
          HTML. The price information however resides outside the XML
          document in a database, namely an RDBMS:</para>

          <figure xml:id="saxRdbmsAccessPrinciple">
            <title>Generating HTML from an XML document and an RDBMS.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/saxxmlrdbms.fig" scale="65"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>Our catalog might look like:</para>

          <figure xml:id="simpleCatalog">
            <title>An <link linkend="gloss_XML"><abbrev>XML</abbrev></link>
            based catalog.</title>

            <programlisting>&lt;catalog&gt;
  &lt;item orderNo="<emphasis role="bold">3218</emphasis>"&gt;Swinging headset&lt;/item&gt;
  &lt;item orderNo="<emphasis role="bold">9921</emphasis>"&gt;200W Stereo Amplifier&lt;/item&gt;
&lt;/catalog&gt;</programlisting>
          </figure>

          <para>The RDBMS may hold some relation with a field
          <code>orderNo</code> as primary key and a corresponding attribute
          like <code>price</code>. In a real world application
          <code>orderNo</code> should probably be an integer typed
          <code>IDENTITY</code> attribute.</para>

          <figure xml:id="saxRdbmsSchema">
            <title>A Relation containing price information.</title>

            <programlisting>CREATE TABLE Product (
  orderNo CHAR(10) PRIMARY KEY
 ,price Money
);

INSERT INTO Product VALUES('<emphasis role="bold">3218</emphasis>', 42.57);
INSERT INTO Product VALUES('<emphasis role="bold">9921</emphasis>', 121.50);</programlisting>

            <caption>
              <para>Prices depend on order numbers.</para>
            </caption>
          </figure>

          <para>The intended HTML output with order numbers being highlighted
          looks like:</para>

          <figure xml:id="saxPriceOut">
            <title>HTML generated output.</title>

            <programlisting>&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"&gt;
        &lt;html&gt;
          &lt;head&gt;&lt;title&gt;Available products&lt;/title&gt;&lt;/head&gt;
          &lt;body&gt;
            &lt;table border="1"&gt;
              &lt;tbody&gt;
                &lt;tr&gt;
                  &lt;th&gt;<emphasis role="bold">Order number</emphasis>&lt;/th&gt;
                  &lt;th&gt;Price&lt;/th&gt;
                  &lt;th&gt;Product&lt;/th&gt;
                &lt;/tr&gt;
                &lt;tr&gt;
                  &lt;td&gt;<emphasis role="bold">3218</emphasis>&lt;/td&gt;
                  &lt;td&gt;42,57&lt;/td&gt;
                  &lt;td&gt;Swinging headset&lt;/td&gt;
                &lt;/tr&gt;
                &lt;tr&gt;
                  &lt;td&gt;<emphasis role="bold">9921</emphasis>&lt;/td&gt;
                  &lt;td&gt;121,50&lt;/td&gt;
                  &lt;td&gt;200W Stereo Amplifier&lt;/td&gt;
                &lt;/tr&gt;
              &lt;/tbody&gt;
            &lt;/table&gt;
          &lt;/body&gt;
        &lt;/html&gt;</programlisting>

            <caption>
              <para>The resulting HTML document contains content both from our
              XML document and from the database table
              <code>Product</code>.</para>
            </caption>
          </figure>

          <para>The intended transformation is beyond the XSLT standard's
          processing capabilities: XSLT does not enable us to access RDBMS
          content. However some XSLT processors provide extensions for this
          task.</para>

          <para>It is tempting to write a <link
          linkend="gloss_Java"><trademark>Java</trademark></link> application
          which might use e.g. <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          for database access. But how do we actually read and parse an XML
          file? Sticking to the <link
          linkend="gloss_Java"><trademark>Java</trademark></link> standard we
          might use a <link
          xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/java/io/FileInputStream.html">FileInputStream</link>
          instance to read from <code>catalog.xml</code> and write an XML
          parser ourselves. Fortunately <orgname>SUN</orgname>'s <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase">JDK</trademark>
          already includes an API denoted <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym>, the
          <emphasis>S</emphasis>imple <emphasis>A</emphasis>pi for
          <emphasis>X</emphasis>ml. The <productname
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdk-7-readme-429198.html">JDK</productname>
          also includes a corresponding parser implementation. In addition
          there are third party <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> parser
          implementations available like <productname
          xlink:href="http://xerces.apache.org">Xerces</productname> from the
          <orgname xlink:href="http://www.apache.org">Apache
          Foundation</orgname>.</para>

          <para>The <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> API is event
          based and will be illustrated by the relationship between customers
          and a software vendor company:</para>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/updateinfo.fig"/>
            </imageobject>
          </mediaobject>

          <para>After purchasing software customers are asked to register
          their software. This way the vendor receives the customer's address.
          Each time a new release is being completed all registered customers
          will receive a notification typically including a <quote>special
          offer</quote> to upgrade their software. From an abstract point of
          view the following two actions take place:</para>

          <variablelist>
            <varlistentry>
              <term>Registration</term>

              <listitem>
                <para>The customer registers at the company's site,
                indicating its interest in updated versions.</para>
              </listitem>
            </varlistentry>

            <varlistentry>
              <term>Notification</term>

              <listitem>
                <para>Upon completion of each new software release (considered
                to be an <emphasis>event</emphasis>) a message is sent to all
                registered customers.</para>
              </listitem>
            </varlistentry>
          </variablelist>

          <para>The same principle applies to GUI applications in software
          development. A key press <emphasis>event</emphasis> for example will
          be forwarded by an application's <emphasis>event handler</emphasis>
          to a callback function (sometimes called a
          <emphasis>handler</emphasis> method) being implemented by an
          application developer. The <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> API works the
          same way: A parser reads an XML document generating events which
          <emphasis>may</emphasis> be handled by an application. During
          document parsing the XML tree structure gets
          <quote>flattened</quote> to a sequence of events:</para>

          <figure xml:id="saxFlattenEvent">
            <title>Parsing an XML document creates a corresponding sequence of
            events.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/saxmodel.pdf"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>An application may register components to the parser:</para>

          <figure xml:id="figureSax">
            <title><acronym
            xlink:href="http://www.saxproject.org">SAX</acronym>
            Principle</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/saxapparch.pdf"/>
              </imageobject>

              <caption>
                <para>A <acronym
                xlink:href="http://www.saxproject.org">SAX</acronym>
                application consists of a <acronym
                xlink:href="http://www.saxproject.org">SAX</acronym> parser
                and handler implementations specific to the application. The
                application is developed by implementing the two handlers,
                namely the event handler and the error handler.</para>
              </caption>
            </mediaobject>
          </figure>

          <para>An Error Handler is required since the XML stream may contain
          errors. In order to implement a <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> application we
          have to:</para>

          <orderedlist>
            <listitem>
              <para>Instantiate required objects:</para>

              <itemizedlist>
                <listitem>
                  <para>Parser</para>
                </listitem>

                <listitem>
                  <para>Event Handler</para>
                </listitem>

                <listitem>
                  <para>Error Handler</para>
                </listitem>
              </itemizedlist>
            </listitem>

            <listitem>
              <para>Register handler instances</para>

              <itemizedlist>
                <listitem>
                  <para>register Event Handler to Parser</para>
                </listitem>

                <listitem>
                  <para>register Error Handler to Parser</para>
                </listitem>
              </itemizedlist>
            </listitem>

            <listitem>
              <para>Start the parsing process by calling the parser's
              appropriate method.</para>
            </listitem>
          </orderedlist>
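
          <para>The three steps above may be sketched as follows. This is a
          minimal, self-contained sketch rather than a reference
          implementation: The names <classname>SaxStepsSketch</classname>,
          <classname>CountingHandler</classname>,
          <classname>ReportingErrorHandler</classname> and
          <function>countElements</function> are hypothetical, and the XML
          document is supplied as a string rather than being read from a
          file:</para>

          <programlisting language="java">import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class SaxStepsSketch {

  /** Event handler (hypothetical name): counts opening elements. */
  static class CountingHandler extends DefaultHandler {
    int elementCount = 0;

    @Override
    public void startElement(String namespaceUri, String localName,
        String rawName, Attributes attrs) {
      elementCount++;
    }
  }

  /** Error handler (hypothetical name): reports problems on standard error. */
  static class ReportingErrorHandler implements ErrorHandler {
    @Override
    public void warning(SAXParseException e) {
      System.err.println("[Warning] " + e.getMessage());
    }
    @Override
    public void error(SAXParseException e) {
      System.err.println("[Error] " + e.getMessage());
    }
    @Override
    public void fatalError(SAXParseException e) throws SAXException {
      throw e; // parsing cannot continue after a fatal error
    }
  }

  static int countElements(final String xml) throws Exception {
    // Step 1: instantiate parser, event handler and error handler
    final XMLReader parser =
        SAXParserFactory.newInstance().newSAXParser().getXMLReader();
    final CountingHandler eventHandler = new CountingHandler();
    final ErrorHandler errorHandler = new ReportingErrorHandler();

    // Step 2: register both handlers with the parser
    parser.setContentHandler(eventHandler);
    parser.setErrorHandler(errorHandler);

    // Step 3: start the parsing process
    parser.parse(new InputSource(new StringReader(xml)));
    return eventHandler.elementCount;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(countElements(
        "&lt;catalog&gt;&lt;item&gt;Swinging headset&lt;/item&gt;&lt;/catalog&gt;"));
  }
}</programlisting>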
        </section>

        <section xml:id="saxIntroExample">
          <title>First steps</title>

          <para>Our first <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> toy application
          <classname>sax.stat.v1.ElementCount</classname> shall simply count
          the number of elements it finds in an arbitrary XML document. In
          addition the <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> events shall be
          written to standard output generating output sketched in <xref
          linkend="saxFlattenEvent"/>. The application's central
          implementation reads:</para>

          <figure xml:id="saxElementCount">
            <title>Counting XML elements.</title>

            <programlisting language="java">package sax.stat.v1;
...        
        
public class ElementCount {

  public void parse(final String uri) {
    try {
      final SAXParserFactory saxPf = SAXParserFactory.newInstance();
      final SAXParser saxParser = saxPf.newSAXParser();
      saxParser.parse(uri, eventHandler);
    } catch (ParserConfigurationException e){
      e.printStackTrace(System.err);
    } catch (org.xml.sax.SAXException e) {
      e.printStackTrace(System.err);
    } catch (IOException e){
      e.printStackTrace(System.err);
    }
  }

  public int getElementCount() {
    return eventHandler.getElementCount();
  }
  private final MyEventHandler eventHandler = new MyEventHandler();
}</programlisting>

            <caption>
              <para>This application works for arbitrary well-formed XML
              documents.</para>
            </caption>
          </figure>

          <para>We now explain this application in detail. The first part
          deals with the instantiation of a parser:</para>

          <programlisting language="java">try {
   final SAXParserFactory saxPf = <emphasis role="bold">SAXParserFactory</emphasis>.newInstance();
   final SAXParser saxParser = saxPf.newSAXParser();
   saxParser.parse(uri, eventHandler);
} catch (ParserConfigurationException e){
   e.printStackTrace(System.err);
} ...</programlisting>

          <para>In order to keep an application independent from a specific
          parser implementation the <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> API uses the
          so-called <link
          xlink:href="http://www.dofactory.com/Patterns/PatternAbstract.aspx">Abstract
          Factory Pattern</link> instead of simply calling a constructor from
          a vendor specific parser class.</para>
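
          <para>The effect of the factory may be made visible by a small
          sketch. The application code below refers only to the abstract
          types <classname>SAXParserFactory</classname> and
          <classname>SAXParser</classname>. The concrete classes are chosen at
          runtime. The names <classname>FactorySketch</classname> and
          <function>parserClassName</function> are hypothetical, and the
          class names being printed depend on the Java runtime in use:</para>

          <programlisting language="java">import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;

public class FactorySketch {

  /** Returns the name of the concrete parser class chosen by the factory. */
  static String parserClassName() throws Exception {
    // Only abstract types appear in the application's source code
    final SAXParserFactory factory = SAXParserFactory.newInstance();
    final SAXParser parser = factory.newSAXParser();
    return parser.getClass().getName();
  }

  public static void main(String[] args) throws Exception {
    System.out.println("Factory: "
        + SAXParserFactory.newInstance().getClass().getName());
    System.out.println("Parser:  " + parserClassName());
  }
}</programlisting>

          <para>A different parser implementation may typically be selected
          without touching the application's code, for example by setting the
          system property
          <code>javax.xml.parsers.SAXParserFactory</code>.</para>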

          <para>In order to be useful the parser has to be instructed to do
          something meaningful when an XML document gets parsed. For this
          purpose our application supplies an event handler instance:</para>

          <programlisting language="java">public void parse(final String uri) {
  try {
    final SAXParserFactory saxPf = SAXParserFactory.newInstance();
    final SAXParser saxParser = saxPf.newSAXParser();
    saxParser.parse(uri, <emphasis role="bold">eventHandler</emphasis>);
  } catch (org.xml.sax.SAXException e) {
 ...
  private final MyEventHandler <emphasis role="bold">eventHandler = new MyEventHandler()</emphasis>;
}</programlisting>

          <para>What does the event handler actually do? It offers methods
          which the parser calls during the parsing process:</para>

          <programlisting language="java">package sax.stat.v1;
...        
public class MyEventHandler extends <classname>org.xml.sax.helpers.DefaultHandler</classname> {
        
  public void <emphasis role="bold">startDocument()</emphasis><co
              xml:id="programlisting_eventhandler_startDocument"/> {
    System.out.println("Opening Document");
  }
  public void <emphasis role="bold">endDocument()</emphasis><co
              xml:id="programlisting_eventhandler_endDocument"/> {
    System.out.println("Closing Document");
  }
  public void <emphasis role="bold">startElement(String namespaceUri, String localName, String rawName,
                     Attributes attrs)</emphasis> <co
              xml:id="programlisting_eventhandler_startElement"/>{
    System.out.println("Opening \"" + rawName + "\"");
    elementCount++;
  }
  public void <emphasis role="bold">endElement(String namespaceUri, String localName,
    String rawName)</emphasis><co
              xml:id="programlisting_eventhandler_endElement"/>{
    System.out.println("Closing \"" + rawName + "\"");
  }
  public void <emphasis role="bold">characters(char[] ch, int start, int length)</emphasis><co
              xml:id="programlisting_eventhandler_characters"/>{
    System.out.println("Content \"" + new String(ch, start, length) + '"');
  }
  public int getElementCount() <co
              xml:id="programlisting_eventhandler_getElementCount"/>{
    return elementCount;
  }
  private int elementCount = 0;
}</programlisting>

          <calloutlist>
            <callout arearefs="programlisting_eventhandler_startDocument">
              <para>This method gets called exactly once, namely when the
              parser starts reading the XML document as a whole.</para>
            </callout>

            <callout arearefs="programlisting_eventhandler_endDocument">
              <para>After successfully parsing the whole document instance
              this method will finally be called.</para>
            </callout>

            <callout arearefs="programlisting_eventhandler_startElement">
              <para>This method gets called each time a new element is parsed.
              In the given catalog.xml example it will be called three times:
              First when the <tag class="starttag">catalog</tag> appears and
              then once for each of the two <tag class="starttag">item
              ...</tag> elements. The supplied parameters depend on whether
              or not namespace processing is enabled.</para>
            </callout>

            <callout arearefs="programlisting_eventhandler_endElement">
              <para>Called each time an element like <tag
              class="starttag">item ...</tag> gets closed by its counterpart
              <tag class="endtag">item</tag>.</para>
            </callout>

            <callout arearefs="programlisting_eventhandler_characters">
              <para>This method is responsible for the treatment of textual
              content i.e. handling <code>#PCDATA</code> element content. We
              will explain its uncommon signature a little bit later.</para>
            </callout>

            <callout arearefs="programlisting_eventhandler_getElementCount">
              <para><function>getElementCount()</function> is a getter method
              providing read-only access to the private field
              <varname>elementCount</varname> which gets incremented in <coref
              linkend="programlisting_eventhandler_startElement"/> each time
              an XML element opens.</para>
            </callout>
          </calloutlist>

          <para>The call <code>saxParser.parse(uri, eventHandler)</code>
          actually initiates the parsing process and tells the parser
          to:</para>

          <itemizedlist>
            <listitem>
              <para>Open the XML document being referenced by the URI
              argument.</para>
            </listitem>

            <listitem>
              <para>Forward XML events to the event handler instance supplied
              by the second argument.</para>
            </listitem>
          </itemizedlist>

          <para>A driver class containing a <code>main(...)</code> method may
          start the whole process and print out the desired number of elements
          upon completion of a parsing run:</para>

          <programlisting language="java">package sax.stat.v1;
        
public class ElementCountDriver {
  public static void main(String argv[]) {
    ElementCount xmlStats = new ElementCount();
    xmlStats.parse("<emphasis role="bold">Input/Sax/catalog.xml</emphasis>");
    System.out.println("Document contains " + xmlStats.<emphasis role="bold">getElementCount()</emphasis> + " elements");
  }
}</programlisting>

          <para>Processing the catalog example instance yields:</para>

          <programlisting>Opening Document
<emphasis role="bold">Opening "catalog"</emphasis> <co
              xml:id="programlisting_catalog_output"/>
Content "
  "
<emphasis role="bold">Opening "item"</emphasis> <co
              xml:id="programlisting_catalog_item1"/>
Content "Swinging headset"
Closing "item"
Content "
  "
<emphasis role="bold">Opening "item"</emphasis>  <co
              xml:id="programlisting_catalog_item2"/>
Content "200W Stereo Amplifier"
Closing "item"
Content "
"
Closing "catalog"
Closing Document
<emphasis role="bold">Document contains 3 elements</emphasis> <co
              xml:id="programlisting_catalog_elementcount"/></programlisting>

          <calloutlist>
            <callout arearefs="programlisting_catalog_output">
              <para>Start parsing element <tag
              class="starttag">catalog</tag>.</para>
            </callout>

            <callout arch="" arearefs="programlisting_catalog_item1">
              <para>Start parsing element <tag class="starttag">item
              orderNo="3218"</tag>Swinging headset<tag class="endtag"
              role="">item</tag>.</para>
            </callout>

            <callout arch="" arearefs="programlisting_catalog_item2">
              <para>Start parsing element <tag class="starttag">item
              orderNo="9921"</tag>200W Stereo Amplifier<tag class="endtag"
              role="">item</tag>.</para>
            </callout>

            <callout arearefs="programlisting_catalog_elementcount">
              <para>After the parsing process has completed the application
              outputs the total number of elements counted.</para>
            </callout>
          </calloutlist>

          <para>The output contains some lines of <quote>empty</quote>
          content. This content is due to whitespace being located between
          elements. For example a newline appears between the <tag
          class="starttag">catalog</tag> and the first <tag
          class="starttag">item</tag> element. The parser encapsulates this
          whitespace in a call to the <link
          xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/ContentHandler.html#characters(char[],%20int,%20int)">characters</link>
          method. An application will typically ignore such calls. Machine
          generated XML document instances frequently contain no whitespace
          between elements at all, so the whole document may consist of a
          single line. This impairs human readability, which is not required
          as long as the processing applications work well. In this case
          empty content as above will not appear.</para>
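
          <para>Ignoring whitespace-only content may be sketched as follows.
          The names <classname>QuietHandlerSketch</classname>,
          <classname>QuietHandler</classname> and
          <function>contentLog</function> are hypothetical. The sketch assumes
          that content consisting solely of whitespace carries no meaning for
          the application, which does not hold for every document
          type:</para>

          <programlisting language="java">import java.io.StringReader;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class QuietHandlerSketch {

  /** Hypothetical handler logging only content that is not pure whitespace. */
  static class QuietHandler extends DefaultHandler {
    final StringBuilder log = new StringBuilder();

    @Override
    public void characters(char[] ch, int start, int length) {
      final String content = new String(ch, start, length);
      if (content.trim().isEmpty()) {
        return; // whitespace between elements: nothing to report
      }
      log.append("Content \"").append(content).append("\"\n");
    }
  }

  static String contentLog(final String xml) throws Exception {
    final QuietHandler handler = new QuietHandler();
    SAXParserFactory.newInstance().newSAXParser()
        .parse(new InputSource(new StringReader(xml)), handler);
    return handler.log.toString();
  }

  public static void main(String[] args) throws Exception {
    System.out.print(contentLog(
        "&lt;catalog&gt;\n  &lt;item&gt;Swinging headset&lt;/item&gt;\n&lt;/catalog&gt;"));
  }
}</programlisting>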

          <para>The <code>characters(char[] ch, int start, int length)</code>
          method's signature looks somewhat strange regarding <link
          linkend="gloss_Java"><trademark>Java</trademark></link> conventions.
          One might expect <code>characters(String s)</code>. But this way the
          <acronym xlink:href="http://www.saxproject.org">SAX</acronym> API
          allows efficient parser implementations: A parser may initially
          allocate a reasonably large <code>char</code> array of say 128 bytes
          sufficient to hold 64 (<link
          xlink:href="http://unicode.org">Unicode</link>) characters. If this
          buffer gets exhausted the parser might allocate a second buffer of
          double size thus implementing an <quote>amortized doubling</quote>
          algorithm:</para>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/saxcharacter.pdf"/>
            </imageobject>
          </mediaobject>

          <para>In this example the first element content fits in the first
          buffer. The second content <code>200W Stereo Amplifier</code> and
          the third content <code>Earphone</code> both fit in the second
          buffer. Subsequent content may require further buffer allocations.
          Such a strategy minimizes the number of time-consuming <code>new
          </code> <link
          xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html">String</link>
          <code>(...)</code> constructor calls that would be necessary for the
          more convenient API variant <code>characters(String s)</code>.</para>
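
          <para>Since the <code>char</code> array belongs to the parser and
          may be reused, an application should copy the slice it is interested
          in rather than keeping a reference to the array. In addition a
          parser is free to deliver an element's text in several chunks. The
          following sketch therefore collects text in a
          <classname>StringBuilder</classname> and creates a single
          <classname>String</classname> per element. The names
          <classname>TextCollectorSketch</classname>,
          <classname>TextCollector</classname> and
          <function>itemTexts</function> are hypothetical:</para>

          <programlisting language="java">import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

public class TextCollectorSketch {

  /** Hypothetical handler collecting the text of each item element. */
  static class TextCollector extends DefaultHandler {
    final List&lt;String&gt; itemTexts = new ArrayList&lt;&gt;();
    private final StringBuilder buffer = new StringBuilder();

    @Override
    public void startElement(String namespaceUri, String localName,
        String rawName, Attributes attrs) {
      if ("item".equals(rawName)) {
        buffer.setLength(0); // start collecting afresh
      }
    }
    @Override
    public void characters(char[] ch, int start, int length) {
      // Copy the slice: the array itself may be reused by the parser
      buffer.append(ch, start, length);
    }
    @Override
    public void endElement(String namespaceUri, String localName,
        String rawName) {
      if ("item".equals(rawName)) {
        itemTexts.add(buffer.toString()); // one String per element
      }
    }
  }

  static List&lt;String&gt; itemTexts(final String xml) throws Exception {
    final TextCollector handler = new TextCollector();
    SAXParserFactory.newInstance().newSAXParser()
        .parse(new InputSource(new StringReader(xml)), handler);
    return handler.itemTexts;
  }

  public static void main(String[] args) throws Exception {
    System.out.println(itemTexts("&lt;catalog&gt;"
        + "&lt;item&gt;Swinging headset&lt;/item&gt;"
        + "&lt;item&gt;200W Stereo Amplifier&lt;/item&gt;"
        + "&lt;/catalog&gt;"));
  }
}</programlisting>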
        </section>

        <section xml:id="saxRegistry">
          <title>Event- and error handler registration</title>

          <para>Our first <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> application
          suffers from the following deficiencies:</para>

          <itemizedlist>
            <listitem>
              <para>The error handling is very sparse. It relies entirely on
              exceptions like <link
              xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/SAXException.html">SAXException</link>
              being thrown, which frequently do not supply meaningful error
              information.</para>
            </listitem>

            <listitem>
              <para>The application is not aware of namespaces. Thus when
              reading e.g. <abbrev
              xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> document
              instances it cannot distinguish between elements from
              different namespaces such as HTML.</para>
            </listitem>

            <listitem>
              <para>The parser will not validate a document instance against a
              DTD, even if one is present.</para>
            </listitem>
          </itemizedlist>

          <para>We now incrementally add these features to the <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> parsing
          process. <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> offers an
          interface <link
          xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/XMLReader.html">XMLReader</link>
          to conveniently <emphasis>register</emphasis> event- and error
          handler instances instead of passing them as a separate argument to
          the <link
          xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/parsers/SAXParser.html#parse(java.lang.String,%20org.xml.sax.helpers.DefaultHandler)">parse</link>
          method. We first code an error handler class by implementing the
          interface <classname>org.xml.sax.ErrorHandler</classname> being part
          of the <acronym xlink:href="http://www.saxproject.org">SAX</acronym>
          API:</para>

          <programlisting language="java">package sax.stat.v2;
...        
public class MyErrorHandler implements ErrorHandler {

  <emphasis role="bold">public void warning(SAXParseException e)</emphasis> {
    System.err.println("[Warning]" + getLocationString(e));
  }
  <emphasis role="bold">public void error(SAXParseException e)</emphasis> {
    System.err.println("[Error]" + getLocationString(e));
  }
  <emphasis role="bold">public void fatalError(SAXParseException e)</emphasis> throws SAXException{
    System.err.println("[Fatal Error]" + getLocationString(e));
  }
  private String getLocationString(SAXParseException e) {
    return " line " + e.getLineNumber() +
    ", column " + e.getColumnNumber()+ ":" +  e.getMessage();
  }
}</programlisting>

          <para>These three methods make up the
          <classname>org.xml.sax.ErrorHandler</classname> interface. The
          method <function>getLocationString</function> is used to supply
          precise parsing error locations by means of line- and column numbers
          within a document instance. If errors or warnings are encountered
          the parser will call the appropriate public method:</para>

          <figure xml:id="saxMissItem">
            <title>A document that is not well-formed.</title>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;catalog&gt;
  &lt;item orderNo="3218"&gt;Swinging headset&lt;/item&gt;
  &lt;item orderNo="9921"&gt;200W Stereo Amplifier 
&lt;/catalog&gt;</programlisting>

            <caption>
              <para>This document is not well-formed since a closing
              <tag class="endtag">item</tag> tag is missing.</para>
            </caption>
          </figure>

          <para>Our error handler method gets called yielding an informative
          message:</para>

          <programlisting>[Fatal Error] line 5, column -1:Expected "&lt;/item&gt;" to terminate
element starting on line 4.</programlisting>

          <para>This error output is achieved by
          <emphasis>registering</emphasis> an instance of
          <classname>sax.stat.v2.MyErrorHandler</classname> to the parser
          prior to starting the parsing process. In the following code snippet
          we also register a content handler instance to the parser and thus
          separate the parser's configuration from its invocation:</para>

          <programlisting language="java">package sax.stat.v2;
...
public class ElementCount {
  public ElementCount()
   throws SAXException, ParserConfigurationException{
      final SAXParserFactory saxPf = SAXParserFactory.newInstance();
      final SAXParser saxParser = saxPf.newSAXParser();
      xmlReader = saxParser.getXMLReader();
      xmlReader.setContentHandler(eventHandler); <co
              xml:id="programlisting_assemble_parser_setcontenthandler"/>
      xmlReader.setErrorHandler(errorHandler); <co
              xml:id="programlisting_assemble_parser_seterrorhandler"/>
  }
  public void parse(final String uri)
    throws IOException, SAXException{
    xmlReader.parse(uri); <co
              xml:id="programlisting_assemble_parser_invokeparse"/>
  }
  public int getElementCount() {
    return eventHandler.getElementCount(); <co
              xml:id="programlisting_assemble_parser_getelementcount"/>
  }
  private final XMLReader xmlReader;
  private final MyEventHandler eventHandler = new MyEventHandler(); <co
              xml:id="programlisting_assemble_parser_createeventhandler"/>
  private final MyErrorHandler errorHandler = new MyErrorHandler(); <co
              xml:id="programlisting_assemble_parser_createerrorhandler"/>
}</programlisting>

          <calloutlist>
            <callout arearefs="programlisting_assemble_parser_setcontenthandler programlisting_assemble_parser_seterrorhandler">
              <para>Referring to <xref linkend="figureSax" os=""/> these two
              calls attach the event- and error handler objects to the parser
              thus implementing the two arrows from the parser to the
              application's implementation.</para>
            </callout>

            <callout arearefs="programlisting_assemble_parser_invokeparse">
              <para>The parser is invoked. Note that in this example we only
              pass a document's URI but no reference to a handler
              object.</para>
            </callout>

            <callout arearefs="programlisting_assemble_parser_getelementcount">
              <para>The method <function>getElementCount()</function> is
              needed to allow a calling object to access the private
              <varname>eventHandler</varname> object's
              <function>getElementCount()</function> method.</para>
            </callout>

            <callout arearefs="programlisting_assemble_parser_createeventhandler programlisting_assemble_parser_createerrorhandler">
              <para>An event handling and an error handling object are created
              to handle events during the parsing process.</para>
            </callout>
          </calloutlist>

          <para>The careful reader might notice a subtle difference between
          the content- and the error handler implementation: The class
          <classname>sax.stat.v2.MyErrorHandler</classname> implements the
          interface <classname>org.xml.sax.ErrorHandler</classname>. But
          <classname>sax.stat.v2.MyEventHandler</classname> is derived from
          <classname>org.xml.sax.helpers.DefaultHandler</classname> which
          itself implements the
          <classname>org.xml.sax.ContentHandler</classname> interface.
          Actually one might as well start from the latter interface, which
          requires implementing all of its 11 methods. In most circumstances this only
          complicates the application's code since it is unnecessary to react
          to events belonging for example to processing instructions. For this
          reason it is good coding practice to use the empty default
          implementations in
          <classname>org.xml.sax.helpers.DefaultHandler</classname> and to
          redefine only those methods corresponding to events actually being
          handled by the application in question.</para>

          <qandaset role="exercise">
            <title>Reading XML attributes</title>

            <qandadiv>
              <qandaentry xml:id="exercise_saxAttrib">
                <question>
                  <label>Reading an element's set of attributes.</label>

                  <para>The example document instance does include <tag
                  class="attribute">orderNo</tag> attribute values for each
                  <tag class="starttag">item</tag> element. Our application
                  does not yet show these attribute names and their corresponding
                  values. Read the documentation for <classname
                  xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/Attributes.html">org.xml.sax.Attributes</classname>
                  and extend the given code to use it.</para>
                </question>

                <answer>
                  <para>For the given example it would suffice to read the
                  known <tag class="attribute">orderNo</tag> attribute's value.
                  A generic solution may ask for the set of all defined
                  attributes and show their values:</para>

                  <programlisting language="java">package sax;

import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class AttribEventHandler extends DefaultHandler {

  // Print each element's name followed by all of its attributes
  @Override
  public void startElement(String namespaceUri, String localName,
      String rawName, Attributes attrs) {
    System.out.println("Opening Element " + rawName);
    for (int i = 0; i &lt; attrs.getLength(); i++){
      System.out.println(attrs.getQName(i) + "=" + attrs.getValue(i) + "\n");
    }
  }
}</programlisting>
                </answer>
              </qandaentry>

              <qandaentry xml:id="saxRdbms">
                <question>
                  <label>SAX processing with RDBMS access.</label>

                  <para>Implement the example given in <xref
                  linkend="saxRdbmsAccessPrinciple"/> to produce the output
                  sketched in <xref linkend="saxPriceOut"/>. You may start by
                  implementing <emphasis>and testing</emphasis> the following
                  methods of an RDBMS interfacing class using <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>:</para>

                  <programlisting language="java">package sax.rdbms;
        
public class RdbmsAccess {

  public void connect(final String host, final int port,
      final String userName, final String password) {
     // <emphasis role="bold">open connection to a database</emphasis>
  }
  public String readPrice(final String articleNumber) {
    return "0";  // <emphasis role="bold">To be implemented as access to a ResultSet object</emphasis>
  }
  public void close() {
    // <emphasis role="bold">close database connection</emphasis>
  }
}</programlisting>

                  <para>You may find it helpful to write a small testbed for
                  the RDBMS access functionality prior to integrating it into
                  your <acronym
                  xlink:href="http://www.saxproject.org">SAX</acronym>
                  application producing HTML output.</para>
                </question>

                <answer>
                  <para>We start by creating a suitable RDBMS Table:</para>

                  <programlisting>CREATE SCHEMA AUTHORIZATION midb2
CREATE TABLE Product(
  orderNo CHAR(10) NOT NULL PRIMARY KEY
 ,price DECIMAL (9,2) NOT NULL 
)</programlisting>

                  <para>Next we feed some toy data:</para>

                  <programlisting>INSERT INTO Product VALUES('x-223', 330.20);
INSERT INTO Product VALUES('w-124', 110.40);</programlisting>

                  <para>Now we implement our RDBMS access class:</para>

                  <programlisting language="java">package dom.xsl;
...
public class DbAccess {

  public void connect(final String jdbcUrl, 
      final String userName, final String password) {
    try {
      conn = DriverManager.getConnection(jdbcUrl, userName, password);
      priceQuery = conn.prepareStatement(sqlPriceQuery);
    } catch (SQLException e) {
      System.err.println("Unable to open connection to database:" + e);}
  }
  public String readPrice(final String articleNumber) {
    if (priceQuery == null) { // connect() failed or was never called
      return "No database connection available";
    }
    try {
      priceQuery.setString(1, articleNumber);
      // try-with-resources releases the ResultSet after reading
      try (ResultSet rs = priceQuery.executeQuery()) {
        if (rs.next()) {
          return rs.getString("price");
        }
        return "No price available for article '" + articleNumber + "'";
      }
    } catch (SQLException e) {
      return "Error reading price for article '" + articleNumber + "':" + e;
    }
  }
  public void close() {
    if (conn == null) { // nothing to close
      return;
    }
    try {conn.close();} catch (SQLException e) {
      System.err.println("Error closing database connection:" + e);
    }
  }
  static {
    try { Class.forName("com.ibm.db2.jcc.DB2Driver");
    } catch (ClassNotFoundException e) {
      System.err.println("Unable to register Driver:" + e);}
  }
  private static final String sqlPriceQuery = 
      "SELECT price FROM Product WHERE orderNo = ?";
  private PreparedStatement priceQuery = null;
  private Connection conn = null;
}</programlisting>

                  <para>This access layer may be tested independently from
                  handling catalog instances:</para>

                  <programlisting language="java">package dom.xsl;

public class DbAccessDriver {

  public static void main(String[] args) {
    final DbAccess dbaccess = new DbAccess();
    dbaccess.connect("jdbc:db2://db2.mi.hdm-stuttgart.de:10000/hdm",
        "midb2", "password");
    System.out.println(dbaccess.readPrice("x-223"));
    System.out.println(dbaccess.readPrice("..aaargh!"));
    dbaccess.close();
  }
}</programlisting>

                  <para>If the above test succeeds we may embed the RDBMS
                  access layer into our <acronym
                  xlink:href="http://www.saxproject.org">SAX</acronym>
                  handler:</para>

                  <programlisting language="java">package sax.rdbms;
...
public class HtmlEventHandler extends DefaultHandler{
  public void startDocument() {
    dbaccess.connect("jdbc:db2://db2.mi.hdm-stuttgart.de:10000/hdm",
      "midb2", "password");
    System.out.println("&lt;html&gt;&lt;head&gt;&lt;title&gt;Catalog&lt;/title&gt;&lt;/head&gt;");
  }
  public void endDocument() {
    System.out.println("&lt;/html&gt;");
    dbaccess.close();
  }
  public void startElement(String namespaceUri, String localName,
      String rawName, Attributes attrs){
    if (rawName.equals("catalog")){
      System.out.println("&lt;body&gt;&lt;H1&gt;A catalog&lt;/H1&gt;"
          +"&lt;table border='1'&gt;&lt;tbody&gt;");
      System.out.println("&lt;tr&gt;&lt;th&gt;Order number&lt;/th&gt;\n"
          + "&lt;th&gt;Price&lt;/th&gt;\n"
          +" &lt;th&gt;Product&lt;/th&gt;&lt;/tr&gt;");
    } else if (rawName.equals("item")){
      final String orderNo = attrs.getValue("orderNo");
      System.out.print("&lt;tr&gt;&lt;td&gt;" + orderNo 
          + "&lt;/td&gt;\n&lt;td&gt;" + dbaccess.readPrice(orderNo)
          + "&lt;/td&gt;\n&lt;td&gt;");
    } else {
      System.err.println("Element '" + rawName + "' unknown");
    }
  }
  public void endElement(String namespaceUri, String localName,
      String rawName) {
    if (rawName.equals("catalog")){
      System.out.println("&lt;/tbody&gt;&lt;/table&gt;&lt;/body&gt;");
    } else if (rawName.equals("item")){
      System.out.println("&lt;/td&gt;&lt;/tr&gt;\n");
    } 
  }
  public void characters(char[] ch, int start, int length) {
    System.out.print(new String(ch, start, length));
  }
  private DbAccess dbaccess = new DbAccess();
}</programlisting>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="saxValidate">
          <title><acronym xlink:href="http://www.saxproject.org">SAX</acronym>
          validation</title>

          <para>So far we have only been concerned with well-formedness. Our
          current parser also accepts well-formed instances which are not
          valid:</para>

          <figure xml:id="saxNotValid">
            <title>An invalid XML document.</title>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE catalog [
  &lt;!ELEMENT catalog (item) &gt;
  &lt;!ELEMENT item (#PCDATA) &gt;
&lt;!ATTLIST item orderNo NMTOKEN #REQUIRED &gt;
]&gt;
&lt;catalog&gt;
  &lt;item orderNo="3218"&gt;Swinging headset&lt;/item&gt;
  &lt;item orderNo="9921"&gt;200W Stereo Amplifier&lt;/item&gt;
&lt;/catalog&gt;</programlisting>

            <caption>
              <para>In contrast to <xref linkend="saxMissItem"/> this document
              is well-formed. But it is not <emphasis
              role="bold">valid</emphasis> with respect to the DTD grammar
              since more than one <tag class="starttag">item</tag> element
              is present.</para>
            </caption>
          </figure>

          <para>Although this document instance is not valid, our parser in
          its current configuration will not report any error or warning. In
          order to enable validation we need to configure our parser:</para>

          <programlisting language="java">xmlReader.setFeature("http://xml.org/sax/features/validation", true);</programlisting>

          <para>The string <code>http://xml.org/sax/features/validation</code>
          serves as a key identifying the feature. Since such keys are
          ordinary string values, a given parser may or may not implement the
          corresponding feature. The <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> standard
          therefore defines two exception classes for dealing with
          feature-related errors:</para>

          <variablelist>
            <varlistentry>
              <term><link
              xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/SAXNotRecognizedException.html">SAXNotRecognizedException</link></term>

              <listitem>
                <para>The feature is not known to the parser.</para>
              </listitem>
            </varlistentry>

            <varlistentry>
              <term><link
              xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/org/xml/sax/SAXNotSupportedException.html">SAXNotSupportedException</link></term>

              <listitem>
                <para>The feature is known to the parser, but the parser
                either does not support it or does not support the requested
                value.</para>
              </listitem>
            </varlistentry>
          </variablelist>
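
          <para>A minimal sketch of how an application might guard the
          feature activation against both exceptions. The class name
          <classname>ValidationFeatureDemo</classname>, the helper method
          <methodname>enableFeature()</methodname> and the key
          <code>http://example.org/no-such-feature</code> are hypothetical and
          serve illustration only. We also assume that the parser is obtained
          from the JDK's
          <classname>javax.xml.parsers.SAXParserFactory</classname>:</para>

          <programlisting language="java">import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.XMLReader;

public class ValidationFeatureDemo {

  /** Try to switch a feature on. Report the problem instead of
   *  failing if the parser refuses. (Hypothetical helper method.) */
  public static boolean enableFeature(final XMLReader xmlReader,
                                      final String featureKey) {
    try {
      xmlReader.setFeature(featureKey, true);
      return true;
    } catch (SAXNotRecognizedException e) {
      System.err.println("Feature unknown to this parser: " + featureKey);
    } catch (SAXNotSupportedException e) {
      System.err.println("Feature known but value not supported: " + featureKey);
    }
    return false;
  }

  public static void main(String[] args) throws Exception {
    final XMLReader xmlReader =
        SAXParserFactory.newInstance().newSAXParser().getXMLReader();

    System.out.println("validation enabled: "
        + enableFeature(xmlReader, "http://xml.org/sax/features/validation"));

    // A made-up key to provoke the "not recognized" case
    System.out.println("made-up feature enabled: "
        + enableFeature(xmlReader, "http://example.org/no-such-feature"));
  }
}</programlisting>

          <para>Catching the two exceptions separately allows an application
          to distinguish a misspelled or unknown key from a feature the parser
          knows but cannot provide.</para>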
        </section>

        <section xml:id="saxNamespace">
          <title>Namespaces</title>

          <para>In order to make a <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> parser
          application namespace-aware we have to activate two <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> parsing
          features:</para>

          <programlisting language="java">xmlReader = saxParser.getXMLReader();
xmlReader.setFeature("http://xml.org/sax/features/namespaces", true);
xmlReader.setFeature("http://xml.org/sax/features/namespace-prefixes", true);</programlisting>

          <para>This instructs the parser to pass the namespace's name for
          each element. Namespace prefixes like <code>xsl</code> in <tag
          class="starttag">xsl:for-each</tag> are also passed and may be used
          by an application:</para>

          <programlisting language="java">package sax;
...
public class NamespaceEventHandler extends DefaultHandler {
...
  public void startElement(String <emphasis role="bold">namespaceUri</emphasis>, String localName,
                           String rawName, Attributes attrs) {
    System.out.println("Opening Element rawName='" + rawName + "'\n"
        + "namespaceUri='" + <emphasis role="bold">namespaceUri</emphasis> + "'\n"
        + "localName='" + localName
        + "'\n--------------------------------------------");
  }
}</programlisting>

          <para>As an example we take an XSLT script:</para>

          <programlisting>&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;xsl:stylesheet version="1.0"
  xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
  xmlns:fo='http://www.w3.org/1999/XSL/Format'&gt;

  &lt;xsl:template match="/"&gt;
    &lt;fo:block&gt;A block&lt;/fo:block&gt;
    &lt;HTML/&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;</programlisting>

          <para>This XSLT script, viewed as an XML document instance,
          contains elements belonging to two different namespaces, namely
          <code>http://www.w3.org/1999/XSL/Transform</code> and
          <code>http://www.w3.org/1999/XSL/Format</code>. The script also
          contains a <quote>raw</quote> <tag audience=""
          class="emptytag">HTML</tag> element, introduced only for
          demonstration purposes. Since the script declares no default
          namespace, this element does not belong to any namespace. The
          result reads:</para>

          <programlisting>Opening Element rawName='xsl:stylesheet'
namespaceUri='http://www.w3.org/1999/XSL/Transform'
localName='stylesheet'
--------------------------------------------
Opening Element rawName='xsl:template'
namespaceUri='http://www.w3.org/1999/XSL/Transform'
localName='template'
--------------------------------------------
Opening Element rawName='fo:block'
namespaceUri='http://www.w3.org/1999/XSL/Format'
localName='block'
--------------------------------------------
Opening Element rawName='HTML'
namespaceUri=''
localName='HTML'
--------------------------------------------</programlisting>

          <para>Now the parser tells us which namespace a given element node
          belongs to. An XSLT engine for example uses this information to
          build two classes of elements:</para>

          <itemizedlist>
            <listitem>
              <para>Elements belonging to the namespace
              <code>http://www.w3.org/1999/XSL/Transform</code> like <tag
              class="emptytag">xsl:value-of select="..."</tag> have to be
              interpreted as instructions by the processor.</para>
            </listitem>

            <listitem>
              <para>Elements <emphasis role="bold">not</emphasis> belonging to
              the namespace <code>http://www.w3.org/1999/XSL/Transform</code>
              like <tag class="emptytag">html</tag> or <tag
              class="starttag">fo:block</tag> are copied <quote>as is</quote>
              to the output.</para>
            </listitem>
          </itemizedlist>
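
          <para>The following sketch illustrates how such a classification
          might be implemented by a <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> handler. It is
          not taken from an actual XSLT engine. The class name
          <classname>XslElementClassifier</classname>, its two lists and the
          <methodname>classify()</methodname> method are hypothetical, and we
          assume a parser obtained from the JDK's
          <classname>javax.xml.parsers.SAXParserFactory</classname>:</para>

          <programlisting language="java">import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.parsers.SAXParserFactory;

import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

public class XslElementClassifier extends DefaultHandler {

  static final String XSL_NS = "http://www.w3.org/1999/XSL/Transform";

  // Raw names of elements to be interpreted as instructions
  public final List&lt;String&gt; instructions = new ArrayList&lt;&gt;();
  // Raw names of elements to be copied "as is"
  public final List&lt;String&gt; copied = new ArrayList&lt;&gt;();

  @Override
  public void startElement(String namespaceUri, String localName,
                           String rawName, Attributes attrs) {
    if (XSL_NS.equals(namespaceUri)) {
      instructions.add(rawName);
    } else {
      copied.add(rawName);
    }
  }

  /** Parse the given XML text and sort its elements into the two lists. */
  public static XslElementClassifier classify(final String xml) throws Exception {
    final SAXParserFactory factory = SAXParserFactory.newInstance();
    factory.setNamespaceAware(true); // activates the "namespaces" feature
    final XMLReader xmlReader = factory.newSAXParser().getXMLReader();
    xmlReader.setFeature("http://xml.org/sax/features/namespace-prefixes", true);

    final XslElementClassifier handler = new XslElementClassifier();
    xmlReader.setContentHandler(handler);
    xmlReader.parse(new InputSource(new StringReader(xml)));
    return handler;
  }

  public static void main(String[] args) throws Exception {
    final XslElementClassifier result = classify(
        "&lt;xsl:stylesheet version='1.0'"
        + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'"
        + " xmlns:fo='http://www.w3.org/1999/XSL/Format'&gt;"
        + "&lt;xsl:template match='/'&gt;"
        + "&lt;fo:block&gt;A block&lt;/fo:block&gt;&lt;HTML/&gt;"
        + "&lt;/xsl:template&gt;&lt;/xsl:stylesheet&gt;");
    System.out.println("Instructions: " + result.instructions);
    System.out.println("Copied as is: " + result.copied);
  }
}</programlisting>

          <para>The decision is based on the namespace name only. The prefix
          <code>xsl</code> is a mere abbreviation and may be chosen freely by
          the author of a script.</para>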
        </section>
      </section>

      <section xml:id="dom">
        <title>The Document Object Model (<acronym
        xlink:href="http://www.w3.org/DOM">DOM</acronym>)</title>

        <titleabbrev><acronym
        xlink:href="http://www.w3.org/DOM">DOM</acronym></titleabbrev>

        <section xml:id="domBase">
          <title>Language independent specification</title>

          <titleabbrev>Language independence</titleabbrev>

          <para>XML documents allow for automated content processing. We
          already discussed the <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> API for
          accessing XML documents from <link
          linkend="gloss_Java"><trademark>Java</trademark></link>
          applications. There are however situations where <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> is not
          appropriate:</para>

          <itemizedlist>
            <listitem>
              <para><acronym
              xlink:href="http://www.saxproject.org">SAX</acronym> is
              event-based: XML nodes are passed to handler methods one at a
              time. Sometimes we want to access neighbouring nodes of a
              context node in our handler methods, for example a <tag
              class="starttag">title</tag> following a <tag
              class="starttag">chapter</tag> node. <acronym
              xlink:href="http://www.saxproject.org">SAX</acronym> does not
              offer any support for this. If we need references to
              neighbouring nodes we have to create them ourselves during a
              <acronym xlink:href="http://www.saxproject.org">SAX</acronym>
              parsing run. This is tedious and leads to code that is hard to
              understand.</para>
            </listitem>

            <listitem>
              <para>Some applications may want to select node sets by <acronym
              xlink:href="http://www.w3.org/TR/xpath">XPath</acronym>
              expressions. <acronym
              xlink:href="http://www.saxproject.org">SAX</acronym> offers no
              support for this since it never holds the complete document
              tree.</para>
            </listitem>

            <listitem>
              <para>We may want to move subtrees within a document itself (for
              example exchanging two <tag class="starttag">chapter</tag>
              nodes) or even transfer them to a different document.</para>
            </listitem>
          </itemizedlist>

          <para>The greatest deficiency of <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> is the fact
          that an XML instance is not represented as a tree-like structure but
          as a succession of events. The <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> allows us to
          represent XML document instances as tree-like structures and thus
          enables navigational operations between nodes.</para>
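
          <para>As a first impression of such navigational operations, the
          following sketch selects nodes by an <acronym
          xlink:href="http://www.w3.org/TR/xpath">XPath</acronym> expression
          and then moves from each selected node to its parent. It assumes the
          <acronym xlink:href="http://www.w3.org/DOM">DOM</acronym> and
          <acronym xlink:href="http://www.w3.org/TR/xpath">XPath</acronym>
          implementations bundled with the JDK. The class name
          <classname>DomNavigationDemo</classname>, the method
          <methodname>titlesOf()</methodname> and the sample document are
          hypothetical:</para>

          <programlisting language="java">import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class DomNavigationDemo {

  /** Returns "chapterId=title;" for each chapter title found. */
  public static String titlesOf(final String xml) throws Exception {

    // Parse the complete document into an in-memory tree
    final Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

    // Select a node set by an XPath expression
    final NodeList titles = (NodeList) XPathFactory.newInstance().newXPath()
        .evaluate("/book/chapter/title", doc, XPathConstants.NODESET);

    final StringBuilder result = new StringBuilder();
    for (int i = 0; i &lt; titles.getLength(); i++) {
      final Element title = (Element) titles.item(i);
      // Navigate from the context node to its parent
      final Element chapter = (Element) title.getParentNode();
      result.append(chapter.getAttribute("id")).append('=')
            .append(title.getTextContent()).append(';');
    }
    return result.toString();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(titlesOf(
        "&lt;book&gt;"
        + "&lt;chapter id='c1'&gt;&lt;title&gt;Intro&lt;/title&gt;&lt;/chapter&gt;"
        + "&lt;chapter id='c2'&gt;&lt;title&gt;SAX&lt;/title&gt;&lt;/chapter&gt;"
        + "&lt;/book&gt;"));
  }
}</programlisting>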

          <para>In order to achieve language <emphasis>and</emphasis> software
          vendor independence the <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> approach uses two
          stages:</para>

          <itemizedlist>
            <listitem>
              <para>The <acronym
              xlink:href="http://www.w3.org/DOM">DOM</acronym> is formulated
              in an Interface Definition Language (<abbrev
              xlink:href="http://en.wikipedia.org/wiki/Interface_description_language">IDL</abbrev>).</para>
            </listitem>

            <listitem>
              <para>In order to use the <acronym
              xlink:href="http://www.w3.org/DOM">DOM</acronym> API from a
              concrete programming language a so-called <emphasis>language
              binding</emphasis> is required. In languages like <link
              linkend="gloss_Java"><trademark>Java</trademark></link> the
              language binding will still be a set of (<link
              linkend="gloss_Java"><trademark>Java</trademark></link>)
              interfaces. Thus for actually coding an application an
              implementation of these interfaces is needed.</para>
            </listitem>
          </itemizedlist>

          <para>So what exactly is an <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Interface_description_language">IDL</abbrev>?
          The programming language <link
          linkend="gloss_Java"><trademark>Java</trademark></link> already
          allows pure interface definitions without any implementation. In C++
          the same result can be achieved by abstract classes containing only
          <emphasis>pure virtual</emphasis> functions. An <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Interface_description_language">IDL</abbrev>
          offers extended features to describe such interfaces independently
          of a specific programming language. For <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> the <productname
          xlink:href="http://www.omg.org/gettingstarted/corbafaq.htm">CORBA
          2.2</productname> <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Interface_description_language">IDL</abbrev>
          was chosen to describe an XML document programming interface.
          As a first example we take an excerpt from the <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym>'s <link
          xlink:href="http://www.w3.org/TR/DOM-Level-3-Core/core.html#ID-1950641247">Node</link>
          interface definition:</para>

          <programlisting>interface Node {
          // NodeType
          const unsigned short      ELEMENT_NODE       = 1;
          const unsigned short      ATTRIBUTE_NODE     = 2;
          const unsigned short      TEXT_NODE          = 3;
   ...

          readonly attribute  DOMString   nodeName;
                   attribute  DOMString   nodeValue;
                                                    // raises(DOMException) on setting
                                                    // raises(DOMException) on retrieval
          readonly attribute  unsigned short       nodeType;
          readonly attribute  Node                 parentNode;
   ...
          readonly attribute  NodeList             childNodes;
          readonly attribute  Node                 firstChild;
   ...
          Node                      insertBefore(in Node newChild, 
                                                 in Node refChild)
                                                 raises(DOMException);
   ...</programlisting>

          <para>If we want to use the <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Interface_description_language">IDL</abbrev>
          <classname>Node</classname> specification in e.g. <link
          linkend="gloss_Java"><trademark>Java</trademark></link>, a language
          binding has to be defined. This means writing <link
          linkend="gloss_Java"><trademark>Java</trademark></link> code which
          closely resembles the <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Interface_description_language">IDL</abbrev>
          specification. Obviously this task depends on and is restricted by
          the constructs offered by the target programming language. The
          W3C <link
          xlink:href="http://www.w3.org/TR/DOM-Level-3-Core/java-binding.html">defines</link>
          the <link linkend="gloss_Java"><trademark>Java</trademark></link>
          <classname>org.w3c.dom.Node</classname> interface by:</para>

          <programlisting language="java">package org.w3c.dom;

public interface Node {
   public static final short           ELEMENT_NODE         = 1; // Node Types
   public static final short           ATTRIBUTE_NODE       = 2;
   public static final short           TEXT_NODE            = 3;
      ...
   public String             getNodeName();
   public String             getNodeValue() throws DOMException;
   public void               setNodeValue(String nodeValue) throws DOMException;
   public short              getNodeType();
   public Node               getParentNode();
   public NodeList           getChildNodes();
   public Node               getFirstChild();
   ...
   public Node               insertBefore(Node newChild, 
                                          Node refChild)
                                          throws DOMException;
   ...
 }</programlisting>

          <para>We take
          <methodname>org.w3c.dom.Node.getChildNodes()</methodname> as an
          example:</para>

          <figure xml:id="domRetrieveChildren">
            <title>Retrieving child nodes of a given context node</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/domtree.fig" scale="65"/>
              </imageobject>
            </mediaobject>
          </figure>
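
          <para>A small sketch of
          <methodname>org.w3c.dom.Node.getChildNodes()</methodname> in action,
          assuming the <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> implementation
          bundled with the JDK. The class name
          <classname>ChildNodesDemo</classname> and the method
          <methodname>countChildren()</methodname> are hypothetical. Note that
          the returned list contains <emphasis>all</emphasis> child nodes
          including whitespace text nodes, so the node type constants from
          above are used to pick the elements:</para>

          <programlisting language="java">import java.io.StringReader;

import javax.xml.parsers.DocumentBuilderFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

public class ChildNodesDemo {

  /** Returns { number of all child nodes, number of element child nodes }
   *  with respect to the document's root element. */
  public static int[] countChildren(final String xml) throws Exception {
    final Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().parse(new InputSource(new StringReader(xml)));

    // The root element acts as context node
    final NodeList children = doc.getDocumentElement().getChildNodes();

    int elements = 0;
    for (int i = 0; i &lt; children.getLength(); i++) {
      final Node child = children.item(i);
      if (child.getNodeType() == Node.ELEMENT_NODE) {
        elements++;
        System.out.println("Element child: " + child.getNodeName());
      }
    }
    return new int[] {children.getLength(), elements};
  }

  public static void main(String[] args) throws Exception {
    final int[] counts = countChildren(
        "&lt;catalog&gt;\n"
        + "  &lt;item orderNo='3218'&gt;Swinging headset&lt;/item&gt;\n"
        + "  &lt;item orderNo='9921'&gt;200W Stereo Amplifier&lt;/item&gt;\n"
        + "&lt;/catalog&gt;");
    System.out.println(counts[0] + " child nodes, "
        + counts[1] + " of them being elements");
  }
}</programlisting>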

          <para>The <classname>org.w3c.dom.Node</classname> interface offers a
          set of common operations for objects being part of an XML document.
          But an XML document tree contains different types of nodes such
          as:</para>

          <itemizedlist>
            <listitem>
              <para>Elements</para>
            </listitem>

            <listitem>
              <para>Attributes</para>
            </listitem>

            <listitem>
              <para>Entities</para>
            </listitem>
          </itemizedlist>

          <para>An XML API may address this issue by offering data types to
          represent these different kinds of nodes. The <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> <link
          linkend="gloss_Java"><trademark>Java</trademark></link> Binding
          defines an inheritance hierarchy of interfaces for this
          purpose:</para>

          <figure xml:id="domJavaNodeInterfaces">
            <title>Interface inheritance hierarchy in the <acronym
            xlink:href="http://www.w3.org/DOM">DOM</acronym> <link
            linkend="gloss_Java"><trademark>Java</trademark></link>
            binding</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/nodeHierarchy.svg"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>Two commonly used <link
          linkend="gloss_Java"><trademark>Java</trademark></link>
          implementations of these interfaces are:</para>

          <variablelist>
            <varlistentry>
              <term>Xerces</term>

              <listitem>
                <para><orgname
                xlink:href="http://xml.apache.org/xerces2-j">Apache Software
                Foundation</orgname></para>
              </listitem>
            </varlistentry>

            <varlistentry>
              <term>JAXP</term>

              <listitem>
                <para><orgname xlink:href="http://java.sun.com/xml/jaxp">Sun
                Microsystems</orgname></para>
              </listitem>
            </varlistentry>
          </variablelist>

          <para>Both implementations offer additional interfaces beyond the
          <acronym xlink:href="http://www.w3.org/DOM">DOM</acronym>'s
          scope.</para>

          <para>Going back to the <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> itself the
          specification is divided into <link
          xlink:href="http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/introduction.html#DOMArchitecture-h2">modules</link>:</para>

          <figure xml:id="figureDomModules">
            <title><acronym xlink:href="http://www.w3.org/DOM">DOM</acronym>
            modules.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Screen/dom-architecture.screen.png"/>
              </imageobject>
            </mediaobject>
          </figure>
        </section>

        <section xml:id="domCreate">
          <title>Creating a new document from scratch</title>

          <titleabbrev>New document</titleabbrev>

          <para>If we want to export non-XML content (e.g. from an RDBMS) into
          XML we may achieve this by the following recipe:</para>

          <orderedlist>
            <listitem>
              <para>Create a document builder instance.</para>
            </listitem>

            <listitem>
              <para>Create an empty <link
              xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/org/w3c/dom/Document.html">Document</link>
              instance.</para>
            </listitem>

            <listitem>
              <para>Fill in the desired Elements and Attributes.</para>
            </listitem>

            <listitem>
              <para>Create a serializer.</para>
            </listitem>

            <listitem>
              <para>Serialize the resulting tree to a stream.</para>
            </listitem>
          </orderedlist>

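          <para>An introductory piece of code illustrates these steps. It is
          based on the JDOM library, which allows elements to be created
          directly without an explicit document builder:</para>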

          <figure xml:id="simpleDomCreate">
            <title>Creation of an XML document instance from scratch.</title>

            <programlisting language="java">package dom;
...
public class CreateDoc {
   public static void main(String[] args) throws Exception {

      // Create the root element
      <emphasis role="bold">final Element titel = new Element("titel");</emphasis>

      // Set a date
      <emphasis role="bold">titel.setAttribute("date", "23.02.2000");</emphasis>

      // Append a text node as child
      <emphasis role="bold">titel.addContent(new Text("Versuch 1"));</emphasis>
      
      
      // Set formatting for the XML output
      <emphasis role="bold">final Format outFormat = Format.getPrettyFormat();</emphasis>
      
      // Serialize to console
      <emphasis role="bold">final XMLOutputter printer = new XMLOutputter(outFormat);
      printer.output(titel, System.out);</emphasis>
   }
}</programlisting>
          </figure>

          <para>We get the following result:</para>

          <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;titel date="23.02.2000"&gt;Versuch 1&lt;/titel&gt;</programlisting>
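
          <para>The listing above relies on classes like
          <classname>XMLOutputter</classname> from the JDOM library. As a
          hedged sketch, the five steps of the recipe may alternatively be
          expressed with the W3C <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> interfaces and the
          JAXP classes shipped with the JDK. The class name
          <classname>CreateDocW3c</classname> and the method
          <methodname>createTitel()</methodname> are hypothetical. Using an
          identity <classname>javax.xml.transform.Transformer</classname> as
          serializer is one possible choice among others:</para>

          <programlisting language="java">import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class CreateDocW3c {

  public static String createTitel() throws Exception {

    // Step 1: Create a document builder instance
    final DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();

    // Step 2: Create an empty Document instance
    final Document doc = builder.newDocument();

    // Step 3: Fill in the desired elements and attributes
    final Element titel = doc.createElement("titel");
    titel.setAttribute("date", "23.02.2000");
    titel.appendChild(doc.createTextNode("Versuch 1"));
    doc.appendChild(titel);

    // Step 4: Create a serializer (an identity transformation)
    final Transformer serializer =
        TransformerFactory.newInstance().newTransformer();
    serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

    // Step 5: Serialize the resulting tree to a stream
    final StringWriter out = new StringWriter();
    serializer.transform(new DOMSource(doc), new StreamResult(out));
    return out.toString();
  }

  public static void main(String[] args) throws Exception {
    System.out.println(createTitel());
  }
}</programlisting>

          <para>In contrast to the JDOM variant, nodes are not created by
          constructors but by factory methods of the
          <classname>org.w3c.dom.Document</classname> instance they belong
          to.</para>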
        </section>

        <section xml:id="domCreateExercises">
          <title>Exercises</title>

          <qandaset role="exercise">
            <title>A sub structured <tag class="starttag">title</tag></title>

            <qandadiv>
              <qandaentry xml:id="createDocModify">
                <question>
                  <label>Creation of an extended XML document instance</label>

                  <para>In order to run the examples given during the lecture
                  the <filename
                  xlink:href="http://www.jdom.org/downloads">jdom2.jar</filename>
                  library must be added to the
                  <envar>CLASSPATH</envar>.</para>

                  <para>The example given before, which creates a <acronym
                  xlink:href="http://www.w3.org/DOM">DOM</acronym> tree, may
                  be used as a starting point. Extend
                  the <acronym
                  xlink:href="http://www.w3.org/DOM">DOM</acronym> tree
                  created in <xref linkend="simpleDomCreate"/> to produce an
                  extended XML document:</para>

                  <programlisting>&lt;title&gt;
  &lt;long&gt;The long version of this title&lt;/long&gt;
  &lt;short&gt;Short version&lt;/short&gt;
&lt;/title&gt;</programlisting>
                </question>

                <answer>
                  <programlisting language="java">package dom;
...
public class CreateExtended {
  /**
   * @param args
   * @throws IOException 
   */
  public static void main(String[] args) throws IOException {
     
     final Element titel = new Element("title"),
           tLong = new Element("long"),
           tShort = new Element("short");
     
     <emphasis role="bold">// Append &lt;long&gt; and &lt;short&gt; to parent &lt;title&gt;</emphasis>
     titel.addContent(tLong).addContent(tShort);
     
     <emphasis role="bold">// Append text to &lt;long&gt; and &lt;short&gt;</emphasis>
     tLong.addContent(new Text("The long version of this title"));
     tShort.addContent(new Text("Short version"));
     
     <emphasis role="bold">// Set formatting for the XML output</emphasis>
     Format outFormat = Format.getPrettyFormat();
     
     <emphasis role="bold">// Serialize to console</emphasis>
     final XMLOutputter printer = new XMLOutputter(outFormat);
     printer.output(titel, System.out);
  }
}</programlisting>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="domParse">
          <title>Parsing existing XML documents</title>

          <titleabbrev>Parsing</titleabbrev>

          <para>We already used <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> to parse an XML
          document. Rather than handling <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> events
          ourselves these events may be used to construct a <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> representation of
          our document. This work is done by a builder object. We use our
          catalog example from <xref linkend="simpleCatalog"/> as an
          introductory example.</para>

          <para>We already noticed the need for an
          <classname>org.xml.sax.ErrorHandler</classname> object during
          <acronym xlink:href="http://www.saxproject.org">SAX</acronym>
          processing. A <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> parser requires a
          similar type of object in order to react to parsing errors in a
          meaningful way. In principle the implementor of a <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> parser
          is free to choose any implementation strategy, but most
          implementations are built on top of a <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym> parser. For
          this reason it was natural to choose a <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> error handling
          interface which is similar to a <acronym
          xlink:href="http://www.saxproject.org">SAX</acronym>
          <classname>org.xml.sax.ErrorHandler</classname>. The following code
          serves the needs described before:</para>

          <figure xml:id="domTreeTraversal">
            <title>Accessing an XML tree purely by <acronym
            xlink:href="http://www.w3.org/DOM">DOM</acronym> methods.</title>

            <programlisting language="java">package dom;
...
public class ArticleOrder {

<emphasis role="bold">  // Though we are playing DOM here, a <acronym
                  xlink:href="http://www.saxproject.org">SAX</acronym> parser still
  // assembles our DOM tree.</emphasis>
   private SAXBuilder builder = new SAXBuilder();
   
   public ArticleOrder() {
      <emphasis role="bold">// Though an ErrorHandler is not strictly required it allows
      // for easier localization of XML document errors</emphasis>
      builder.setErrorHandler(new MySaxErrorHandler(System.out));<co
                linkends="domSetSaxErrorHandler-co"
                xml:id="domSetSaxErrorHandler"/>
   }

   /** Descending a catalog till its &lt;item&gt; elements. For each product
    *  its name and order number are being written to the output.
    * @throws ...
    */
   public void process(final String filename) throws JDOMException, IOException {
      
      <emphasis role="bold">// Parsing our XML file</emphasis>
      final Document docInput = builder.build(filename);
      
      <emphasis role="bold">// Accessing the document's root element</emphasis>
      final Element docRoot = docInput.getRootElement();
      
      <emphasis role="bold">// Accessing the &lt;item&gt; children of parent element &lt;catalog&gt;</emphasis>
      final List&lt;Element&gt; items = docRoot.getChildren(); // Element nodes only
      for (final Element item : items) {
         System.out.println("Article: " + item.getText()
                   + ", order number: " + item.getAttributeValue("orderNo"));
      } ...</programlisting>

            <para>Note <coref linkend="domSetSaxErrorHandler"
            xml:id="domSetSaxErrorHandler-co"/>: This is our standard <acronym
            xlink:href="http://www.saxproject.org">SAX</acronym> error handler
            implementing the <classname>org.xml.sax.ErrorHandler</classname>
            interface.</para>
          </figure>

          <para>Executing this method needs a driver instance providing an
          input XML filename:</para>

          <programlisting language="java">package dom;
...
public class ArticleOrderDriver {
  public static void main(String[] argv) throws Exception {
    final ArticleOrder ao = new ArticleOrder();
    ao.process("<emphasis role="bold">Input/article.xml</emphasis>");
  }
}</programlisting>

          <para>This yields:</para>

          <programlisting>Article: Swinging headset, order number: 3218
Article: 200W Stereo Amplifier, order number: 9921</programlisting>

          <para>To illustrate the internal processes we take a look at the
          sequence diagram:</para>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/sequenceDomParser.svg"/>
            </imageobject>
          </mediaobject>

          <qandaset role="exercise">
            <title>Creating HTML output</title>

            <qandadiv>
              <qandaentry xml:id="exercise_domHtmlSimple">
                <question>
                  <label>Simple HTML output</label>

                  <para>Instead of exporting simple text output as in <xref
                  linkend="domTreeTraversal"/> we may also create HTML pages
                  like:</para>

                  <programlisting>&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"&gt;
&lt;html&gt;
  &lt;head&gt;
    &lt;title&gt;Available articles&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;h1&gt;Available articles&lt;/h1&gt;
    &lt;table&gt;
      &lt;tbody&gt;
        &lt;tr&gt;
          &lt;th align="left"&gt;Article Description&lt;/th&gt;&lt;th&gt;Order Number&lt;/th&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;td align="left"&gt;<emphasis role="bold">Swinging headset</emphasis>&lt;/td&gt;&lt;td&gt;<emphasis
                      role="bold">3218</emphasis>&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;td align="left"&gt;<emphasis role="bold">200W Stereo Amplifier</emphasis>&lt;/td&gt;&lt;td&gt;<emphasis
                      role="bold">9921</emphasis>&lt;/td&gt;
        &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/body&gt;
&lt;/html&gt;</programlisting>

                  <para>Instead of simply writing
                  <code>...println(&lt;html&gt;\n\t&lt;head&gt;...)</code>
                  statements you are expected to code a more sophisticated
                  solution. We may combine <xref linkend="createDocModify"/>
                  and <xref linkend="domTreeTraversal"/>. The idea is to read
                  the XML catalog instance into a <acronym
                  xlink:href="http://www.w3.org/DOM">DOM</acronym> tree as
                  before. Then construct a <emphasis>second</emphasis> <acronym
                  xlink:href="http://www.w3.org/DOM">DOM</acronym> tree for
                  the desired HTML output and fill in the article information
                  from the first <acronym
                  xlink:href="http://www.w3.org/DOM">DOM</acronym> tree
                  accordingly.</para>
                </question>

                <answer>
                  <para>We introduce a class
                  <classname>solve.dom.HtmlTree</classname>:</para>

                  <programlisting language="java">package solve.dom;

import java.io.IOException;
import java.io.PrintStream;

import org.jdom2.DocType;
import org.jdom2.Document;
import org.jdom2.Element;
import org.jdom2.Text;
import org.jdom2.output.Format;
import org.jdom2.output.XMLOutputter;

/**
 * Holding a HTML DOM to produce output.
 * @author goik
 */
public class HtmlTree {

   private Document htmlOutput;
   private Element tableBody;

   public HtmlTree(final String titleText,
         final String[] tableHeaderFields) { <co
                      linkends="programlisting_catalog2html_htmlskel_co"
                      xml:id="programlisting_catalog2html_htmlskel"/>

      DocType doctype =  new DocType("html",
            "-//W3C//DTD XHTML 1.0 Strict//EN", 
            "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd");

      final Element htmlRoot = new Element("html"); <co
                      linkends="programlisting_catalog2html_tablehead_co"
                      xml:id="programlisting_catalog2html_tablehead"/>
      htmlOutput = new Document(htmlRoot);
      htmlOutput.setDocType(doctype);

      // We create a HTML skeleton including an "empty" table
      final Element head = new Element("head"),
            body = new Element("body"),
            table = new Element("table");

      htmlRoot.addContent(head).addContent(body);

      head.addContent(new Element("title").addContent(new Text(titleText)));
      
      body.addContent(new Element("h1").addContent(new Text(titleText)));

      body.addContent(table);


      tableBody = new Element("tbody");
      table.addContent(tableBody);
      
      // addContent() returns the parent, so the row is created separately
      final Element tr = new Element("tr");
      tableBody.addContent(tr);
      for (final String headerField:  tableHeaderFields) {
         tr.addContent(new Element("th").addContent(new Text(headerField)));
      }
   }
   
   public void appendItem(final String itemName, final String orderNo) {<co
                      linkends="programlisting_catalog2html_insertproduct_co"
                      xml:id="programlisting_catalog2html_insertproduct"/>
      final Element tr = new Element("tr");
      tableBody.addContent(tr);
      tr.addContent(new Element("td").addContent(new Text(itemName)));
      tr.addContent(new Element("td").addContent(new Text(orderNo)));
   }
   public void serialize(PrintStream out){

      // Set formatting for the XML output
      final Format outFormat = Format.getPrettyFormat();

      // Serialize to the given stream
      final XMLOutputter printer = new XMLOutputter(outFormat);
      try {
         printer.output(htmlOutput, out);
      } catch (IOException e) {
         e.printStackTrace();
         System.exit(1);
      }
   }
   /**
    * @return the table's &lt;tbody&gt; element
    */
   public Element getTable() {
      return tableBody;
   }
}

   </programlisting>

                  <calloutlist>
                    <callout arearefs="programlisting_catalog2html_htmlskel"
                             xml:id="programlisting_catalog2html_htmlskel_co">
                      <para>A basic HTML skeleton is being created:</para>

                      <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
    &lt;head&gt;
        &lt;title&gt;Available articles&lt;/title&gt;
    &lt;/head&gt;
    &lt;body&gt;
        &lt;h1&gt;Available articles&lt;/h1&gt;
        &lt;table&gt;
            <emphasis role="bold">&lt;tbody&gt;</emphasis> &lt;!-- Data to be inserted here in next step --&gt;
            <emphasis role="bold">&lt;/tbody&gt;</emphasis>
        &lt;/table&gt;
    &lt;/body&gt;
&lt;/html&gt;</programlisting>

                      <para>The table containing the product's data is empty
                      at this point and thus invalid.</para>
                    </callout>

                    <callout arearefs="programlisting_catalog2html_tablehead"
                             xml:id="programlisting_catalog2html_tablehead_co">
                      <para>The table's header is appended but the actual data
                      from our two products is still missing:</para>

                      <programlisting>...     &lt;h1&gt;Available articles&lt;/h1&gt;
        &lt;table&gt;
            &lt;tbody&gt;
                &lt;tr&gt;
                    &lt;th&gt;Article Description&lt;/th&gt;
                    &lt;th&gt;Order Number&lt;/th&gt;
                <emphasis role="bold">&lt;/tr&gt;</emphasis>&lt;!-- Data to be appended after this row in next step --&gt;
            <emphasis role="bold">&lt;/tbody&gt;</emphasis>
        &lt;/table&gt; ...</programlisting>
                    </callout>

                    <callout arearefs="programlisting_catalog2html_insertproduct"
                             xml:id="programlisting_catalog2html_insertproduct_co">
                      <para>Calling
                      <methodname>solve.dom.HtmlTree.appendItem(String,String)</methodname>
                      once per product completes the creation of our HTML DOM
                      tree:</para>

                      <programlisting>...             &lt;/tr&gt;
                &lt;tr&gt;
                    &lt;td&gt;Swinging headset&lt;/td&gt;
                    &lt;td&gt;3218&lt;/td&gt;
                &lt;/tr&gt;
                &lt;tr&gt;
                    &lt;td&gt;200W Stereo Amplifier&lt;/td&gt;
                    &lt;td&gt;9921&lt;/td&gt;
                &lt;/tr&gt;
            &lt;/tbody&gt; ...</programlisting>
                    </callout>
                  </calloutlist>

                  <para>The class
                  <classname>solve.dom.Article2Html</classname> reads the
                  catalog data:</para>

                  <programlisting language="java">package solve.dom;
...
public class Article2Html {
   
  private final SAXBuilder builder = new SAXBuilder();
  private final HtmlTree htmlResult;

  public Article2Html() {
    
     builder.setErrorHandler(new MySaxErrorHandler(System.out));
    
     htmlResult = new HtmlTree("Available articles", new String[] { <co
                      linkends="programlisting_catalog2html_glue_createhtmldom_co"
                      xml:id="programlisting_catalog2html_glue_createhtmldom"/>
        "Article Description", "Order Number" });
  }

  /** Read an XML catalog instance and insert product names along with their
   * order numbers into the HTML DOM. Then serialize the HTML tree to a stream. 
   * 
   * @param filename
   *    The filename of the XML source.
   * @param out
   *    The output stream for HTML serialization. 
   * @throws IOException 
   * @throws JDOMException
   */
  public void process(final String filename, final PrintStream out) throws JDOMException, IOException{
    final List&lt;Element&gt; items = 
      builder.build(filename).getRootElement().getChildren();
    
    for (final Element item : items) { <co
                      linkends="programlisting_catalog2html_glue_prodloop_co"
                      xml:id="programlisting_catalog2html_glue_prodloop"/>
       htmlResult.appendItem(item.getText(), item.getAttributeValue("orderNo")); <co
                      linkends="programlisting_catalog2html_glue_insertprod_co"
                      xml:id="programlisting_catalog2html_glue_insertprod"/>
    }
    htmlResult.serialize(out); <co
                      linkends="programlisting_catalog2html_glue_serialize_co"
                      xml:id="programlisting_catalog2html_glue_serialize"/>
  }
}</programlisting>

                  <calloutlist>
                    <callout arearefs="programlisting_catalog2html_glue_createhtmldom"
                             xml:id="programlisting_catalog2html_glue_createhtmldom_co">
                      <para>Create an instance holding an HTML <acronym
                      xlink:href="http://www.w3.org/DOM">DOM</acronym> with a
                      table header containing the strings <emphasis>Article
                      Description</emphasis> and <emphasis>Order
                      Number</emphasis>.</para>
                    </callout>

                    <callout arearefs="programlisting_catalog2html_glue_prodloop"
                             xml:id="programlisting_catalog2html_glue_prodloop_co">
                      <para>Iterate over all product nodes.</para>
                    </callout>

                    <callout arearefs="programlisting_catalog2html_glue_insertprod"
                             xml:id="programlisting_catalog2html_glue_insertprod_co">
                      <para>Insert the product's name and order number into the
                      HTML <acronym
                      xlink:href="http://www.w3.org/DOM">DOM</acronym>.</para>
                    </callout>

                    <callout arearefs="programlisting_catalog2html_glue_serialize"
                             xml:id="programlisting_catalog2html_glue_serialize_co">
                      <para>Serialize the completed HTML <acronym
                      xlink:href="http://www.w3.org/DOM">DOM</acronym> tree to
                      the output stream.</para>
                    </callout>
                  </calloutlist>
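                  <para>When building trees like this, note that JDOM's
                  <methodname>addContent()</methodname> returns the element it
                  was called on, which enables the chained calls used above. The
                  W3C <acronym xlink:href="http://www.w3.org/DOM">DOM</acronym>
                  API bundled with the JDK behaves differently:
                  <methodname>appendChild()</methodname> returns the node that
                  was added. The following is a minimal sketch of the same table
                  construction based on the JDK only. The class name
                  <classname>W3cHtmlTableSketch</classname> and its methods are
                  assumptions made for illustration and not part of the
                  exercise solution:</para>

                  <programlisting language="java"><![CDATA[import java.io.StringWriter;

import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;

import org.w3c.dom.Document;
import org.w3c.dom.Element;

/** Hypothetical sketch: a table built with the JDK's W3C DOM instead of JDOM. */
public class W3cHtmlTableSketch {

   /** Builds a table having one header row and one row per data record. */
   public static String buildTable(final String[] headerFields,
         final String[][] rows) throws Exception {
      final Document doc = DocumentBuilderFactory.newInstance()
            .newDocumentBuilder().newDocument();

      // appendChild() returns the node just added, so the cast result is the child
      final Element table = (Element) doc.appendChild(doc.createElement("table"));
      final Element tbody = (Element) table.appendChild(doc.createElement("tbody"));

      appendRow(doc, tbody, "th", headerFields);
      for (final String[] row : rows) {
         appendRow(doc, tbody, "td", row);
      }

      // An identity transformation serializes the tree
      final Transformer serializer = TransformerFactory.newInstance().newTransformer();
      serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
      final StringWriter out = new StringWriter();
      serializer.transform(new DOMSource(doc), new StreamResult(out));
      return out.toString();
   }

   /** Appends a row to tbody creating one cell per value. */
   private static void appendRow(final Document doc, final Element tbody,
         final String cellName, final String[] values) {
      final Element tr = (Element) tbody.appendChild(doc.createElement("tr"));
      for (final String value : values) {
         tr.appendChild(doc.createElement(cellName)).setTextContent(value);
      }
   }

   public static void main(final String[] args) throws Exception {
      System.out.println(buildTable(
            new String[] {"Article Description", "Order Number"},
            new String[][] {{"Swinging headset", "3218"},
                            {"200W Stereo Amplifier", "9921"}}));
   }
}]]></programlisting>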
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="domJavaScript">
          <title>Using <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> with
          HTML/Javascript</title>

          <para>Due to script language support in a variety of browsers we may
          also use the <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> to implement client
          side event handling. As an example we <link
          xlink:href="Ref/src/tablesort.html">demonstrate</link> how an HTML
          table can be made sortable by clicking on a column's header. The
          example code along with the code description can be found at <uri
          xlink:href="http://www.kryogenix.org/code/browser/sorttable">http://www.kryogenix.org/code/browser/sorttable</uri>.</para>

          <para>Remarkably, only a few ingredients are required to enrich an
          ordinary static HTML table with this functionality:</para>

          <itemizedlist>
            <listitem>
              <para>An external Javascript library has to be included via
              <code>&lt;script type="text/javascript"
              src="sorttable.js"&gt;</code></para>
            </listitem>

            <listitem>
              <para>Each sortable HTML table needs:</para>

              <itemizedlist>
                <listitem>
                  <para>A unique <code>id</code> attribute</para>
                </listitem>

                <listitem>
                  <para>A <code>class="sortable"</code> attribute</para>
                </listitem>
              </itemizedlist>
            </listitem>
          </itemizedlist>
        </section>

        <section xml:id="domXpath">
          <title>Using <acronym
          xlink:href="http://www.w3.org/TR/xpath">XPath</acronym></title>

          <para><xref linkend="domTreeTraversal"/> demonstrated the
          possibility to traverse trees solely by using <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> method calls.
          Though this approach is possible, it will in general not lead to
          stable applications. Real-world examples are often based on large
          XML documents with complex hierarchical structures. With this
          rather primitive approach, deeply nested method calls are necessary
          to access the desired sets of nodes. In addition, changing a DTD
          will require rewriting large portions of code.</para>

          <para>As we already know from <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev>
          transformations, <code>XPath</code> allows us to address node sets
          inside an XML tree. The role of <acronym
          xlink:href="http://www.w3.org/TR/xpath">XPath</acronym> can be
          compared to that of SQL queries when working with relational
          databases. <acronym
          xlink:href="http://www.w3.org/TR/xpath">XPath</acronym> may also be
          used within <link
          linkend="gloss_Java"><trademark>Java</trademark></link> code. As a
          first example we show an application extracting image filenames
          from XHTML documents. The following document contains three <tag
          class="starttag">img</tag> elements:</para>

          <figure xml:id="htmlGallery">
            <title>An HTML document containing <code>img</code> elements.</title>

            <programlisting>&lt;?xml version="1.0"?&gt;
&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"&gt; 
&lt;html&gt;
  &lt;head&gt;
    &lt;title&gt;Picture gallery&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;h1&gt;Picture gallery&lt;/h1&gt;
    &lt;p&gt;Images may appear inline:<emphasis role="bold">&lt;img src="inline.gif" alt="none"/&gt;</emphasis>&lt;/p&gt;
    &lt;table&gt;
      &lt;tbody&gt;
        &lt;tr&gt;
          &lt;td&gt;Number one:&lt;/td&gt;
          &lt;td&gt;<emphasis role="bold">&lt;img src="one.gif" alt="none"/&gt;</emphasis>&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;td&gt;Number two:&lt;/td&gt;
          &lt;td&gt;<emphasis role="bold">&lt;img src="http://www.hdm-stuttgart.de/favicon.ico" alt="none"/&gt;</emphasis>&lt;/td&gt;
        &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/body&gt;
&lt;/html&gt;
</programlisting>
          </figure>

          <para>A given HTML document may contain <tag
          class="starttag">img</tag> elements at
          <emphasis>arbitrary</emphasis> positions. It is sometimes desirable
          to check the existence and accessibility of such external objects,
          which are necessary for the page's correct rendering. A simple XSL
          script will do the first part of the job, namely extracting the <tag
          class="starttag">img</tag> elements:</para>

          <figure xml:id="gallery2imagelist">
            <title>An <abbrev
            xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> script for
            image name extraction.</title>

            <programlisting>&lt;xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                              xmlns:html="http://www.w3.org/1999/xhtml"&gt;
  &lt;xsl:output method="text"/&gt;

  &lt;xsl:template match="/"&gt;
    &lt;xsl:for-each select="//html:img"&gt;
      &lt;xsl:value-of select="@src"/&gt;
      &lt;xsl:text&gt; &lt;/xsl:text&gt;
    &lt;/xsl:for-each&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;</programlisting>
          </figure>

          <para>Note the necessity of including the <code>html</code>
          namespace prefix in the <acronym
          xlink:href="http://www.w3.org/TR/xpath">XPath</acronym> expression
          in <code>&lt;xsl:for-each select="//html:img"&gt;</code>. A simple
          <code>select="//img"</code> results in an empty node set.
          Executing the <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> script yields
          a list of the image references contained in the HTML page, i.e.
          <code>inline.gif one.gif
          http://www.hdm-stuttgart.de/favicon.ico</code>.</para>

          <para>Now we want to write a <link
          linkend="gloss_Java"><trademark>Java</trademark></link> application
          which checks whether these referenced image files exist and are
          readable. A simple approach would pipe the <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> output to our
          application, which then executes the readability checks. Instead we
          want to incorporate the <acronym
          xlink:href="http://www.w3.org/TR/xpath">XPath</acronym> based search
          into the application itself. Ignoring namespaces and mirroring the
          <abbrev xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev>
          actions as closely as possible, our application searches for <link
          xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/org/w3c/dom/Element.html">Element</link>
          nodes by the <acronym
          xlink:href="http://www.w3.org/TR/xpath">XPath</acronym> expression
          <code>//img</code>:</para>

          <figure xml:id="domFindImages">
            <title>Extracting <tag class="emptytag">img</tag> element image
            references from a HTML document.</title>

            <programlisting language="java">package dom.xpath;
...
public class DomXpath {
   private final SAXBuilder builder = new SAXBuilder();

   public DomXpath() {
      builder.setErrorHandler(new MySaxErrorHandler(System.err));
   }
   public void process(final String xhtmlFilename) throws JDOMException, IOException {

      final Document htmlInput = builder.build(xhtmlFilename);<co
                linkends="programlisting_java_searchimg_parse_co"
                xml:id="programlisting_java_searchimg_parse"/>
      final XPathExpression&lt;Object&gt; xpath = XPathFactory.instance().compile( "//img" ); <co
                linkends="programlisting_java_searchimg_pf_co"
                xml:id="programlisting_java_searchimg_pf"/> <co
                linkends="programlisting_java_searchimg_newxpath_co"
                xml:id="programlisting_java_searchimg_newxpath"/>
      final List&lt;Object&gt; images = xpath.evaluate(htmlInput);<co
                linkends="programlisting_java_searchimg_execquery_co"
                xml:id="programlisting_java_searchimg_execquery"/>

      for (Object o: images) { <co
                linkends="programlisting_java_searchimg_loop_co"
                xml:id="programlisting_java_searchimg_loop"/>
         final Element image = (Element ) o;<co
                linkends="programlisting_java_searchimg_cast_co"
                xml:id="programlisting_java_searchimg_cast"/>
         System.out.print(image.getAttributeValue("src") + " "); 
      }
   }
}</programlisting>

            <caption>
              <para>This application searches for <tag
              class="emptytag">img</tag> elements and shows their
              <code>src</code> attribute value.</para>
            </caption>
          </figure>

          <calloutlist>
            <callout arearefs="programlisting_java_searchimg_parse"
                     xml:id="programlisting_java_searchimg_parse_co">
              <para>Parse an XHTML document instance into a DOM tree.</para>
            </callout>

            <callout arearefs="programlisting_java_searchimg_pf"
                     xml:id="programlisting_java_searchimg_pf_co">
              <para>Obtain an <acronym
              xlink:href="http://www.w3.org/TR/xpath">XPath</acronym>
              factory.</para>
            </callout>

            <callout arearefs="programlisting_java_searchimg_newxpath"
                     xml:id="programlisting_java_searchimg_newxpath_co">
              <para>Compile an <acronym
              xlink:href="http://www.w3.org/TR/xpath">XPath</acronym> query
              instance. It may be used to search for a set of nodes starting
              from a context node.</para>
            </callout>

            <callout arearefs="programlisting_java_searchimg_execquery"
                     xml:id="programlisting_java_searchimg_execquery_co">
              <para>Using the document's root node as the context node we
              search for <tag class="starttag">img</tag> elements appearing at
              arbitrary positions in our document.</para>
            </callout>

            <callout arearefs="programlisting_java_searchimg_loop"
                     xml:id="programlisting_java_searchimg_loop_co">
              <para>We iterate over the retrieved list of images.</para>
            </callout>

            <callout arearefs="programlisting_java_searchimg_cast"
                     xml:id="programlisting_java_searchimg_cast_co">
              <para>Casting to the correct type.</para>
            </callout>
          </calloutlist>

          <para>The result is a list of image filename references:</para>

          <programlisting>inline.gif one.gif http://www.hdm-stuttgart.de/favicon.ico </programlisting>
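          <para>For comparison, the same search can be sketched using only
          the <code>javax.xml.xpath</code> API shipped with the JDK instead
          of JDOM, this time taking namespaces into account. This is a
          minimal sketch and not part of the lecture code: the class name
          <classname>JdkXpathSketch</classname>, the method
          <methodname>imageSources()</methodname> and the inline sample
          document are assumptions made for illustration. The prefix
          <code>html</code> is bound to the XHTML namespace, so
          <code>//html:img</code> matches the namespace-qualified elements
          whereas <code>//img</code> does not:</para>

          <programlisting language="java"><![CDATA[import java.io.StringReader;
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

import javax.xml.XMLConstants;
import javax.xml.namespace.NamespaceContext;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: namespace aware XPath search using JDK classes only. */
public class JdkXpathSketch {

   static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

   /** Assumed sample input. No DTD is referenced, so nothing is fetched. */
   static final String SAMPLE =
        "<html xmlns='" + XHTML_NS + "'><body>"
      + "<p><img src='inline.gif' alt='none'/></p>"
      + "<p><img src='one.gif' alt='none'/></p>"
      + "</body></html>";

   /** Evaluates the expression and collects the src values of all matches. */
   public static List<String> imageSources(final String xhtml,
         final String expression) throws Exception {
      final DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
      factory.setNamespaceAware(true); // required for prefix based matching
      final Document doc = factory.newDocumentBuilder()
            .parse(new InputSource(new StringReader(xhtml)));

      final XPath xpath = XPathFactory.newInstance().newXPath();
      xpath.setNamespaceContext(new NamespaceContext() {
         @Override public String getNamespaceURI(final String prefix) {
            return "html".equals(prefix) ? XHTML_NS : XMLConstants.NULL_NS_URI;
         }
         // Reverse lookups are not needed for evaluating expressions
         @Override public String getPrefix(final String uri) { return null; }
         @Override public Iterator<String> getPrefixes(final String uri) { return null; }
      });

      final NodeList images =
            (NodeList) xpath.evaluate(expression, doc, XPathConstants.NODESET);
      final List<String> result = new ArrayList<>();
      for (int i = 0; i < images.getLength(); i++) {
         result.add(((Element) images.item(i)).getAttribute("src"));
      }
      return result;
   }

   public static void main(final String[] args) throws Exception {
      System.out.println(imageSources(SAMPLE, "//html:img")); // both references
      System.out.println(imageSources(SAMPLE, "//img"));      // empty list
   }
}]]></programlisting>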

          <qandaset role="exercise">
            <title>Legal casting?</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>Why is the cast in <coref
                  linkend="programlisting_java_searchimg_cast"/> in <xref
                  linkend="domFindImages"/> guaranteed to never cause a
                  <classname>java.lang.ClassCastException</classname>?</para>
                </question>

                <answer>
                  <para>The <acronym
                  xlink:href="http://www.w3.org/TR/xpath">XPath</acronym>
                  <code>//img</code> expression is guaranteed to return only
                  <tag class="starttag">img</tag> elements. Thus within our
                  <link
                  linkend="gloss_Java"><trademark>Java</trademark></link>
                  context we are sure to find only
                  <classname>org.jdom2.Element</classname> instances.</para>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <qandaset role="exercise">
            <title>Verifying the readability of referenced images</title>

            <qandadiv>
              <qandaentry xml:id="exercise_htmlImageVerify">
                <question>
                  <para>We want to extend the example given in <xref
                  linkend="domFindImages"/> by testing the existence and
                  checking for readability of referenced images. The following
                  HTML document contains <quote>dead</quote> image
                  references:</para>

                  <programlisting xml:id="domCheckImageAccessibility">&lt;!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt; ...
  &lt;body&gt;
    &lt;h1&gt;External Pictures&lt;/h1&gt;
    &lt;p&gt;A local image reference:&lt;img src="inline.gif" alt="none"/&gt;&lt;/p&gt;
    &lt;table&gt;
      &lt;tbody&gt;
        &lt;tr&gt;
          &lt;td&gt;An existing picture:&lt;/td&gt;
          &lt;td&gt;&lt;img
            src="http://www.hdm-stuttgart.de/bilder_navigation/laptop.gif"
            alt="none"/&gt;&lt;/td&gt;
        &lt;/tr&gt;
        &lt;tr&gt;
          &lt;td&gt;A non-existing picture:&lt;/td&gt;
          &lt;td&gt;&lt;img src="<emphasis role="bold">http://www.hdm-stuttgart.de/rotfl.gif</emphasis>" alt="none"/&gt;&lt;/td&gt;
        &lt;/tr&gt;
      &lt;/tbody&gt;
    &lt;/table&gt;
  &lt;/body&gt;
&lt;/html&gt;</programlisting>

                  <para>Write an application which checks the readability of
                  <abbrev
                  xlink:href="http://www.ietf.org/rfc/rfc1738.txt">URL</abbrev>
                  image references to <emphasis>external</emphasis> servers
                  starting with either <code>http://</code> or
                  <code>ftp://</code>, ignoring other protocol types. Internal
                  image references referring to the <quote>current</quote>
                  server typically look like <code>&lt;img
                  src="/images/test.gif"</code>. To distinguish these two
                  types of references we may use the built-in <acronym
                  xlink:href="http://www.w3.org/TR/xpath">XPath</acronym>
                  function <link
                  xlink:href="http://www.cafeconleche.org/books/bible2/chapters/ch17.html">starts-with()</link>
                  to test for the <code>http</code> or <code>ftp</code>
                  protocol part of a <abbrev
                  xlink:href="http://www.ietf.org/rfc/rfc1738.txt">URL</abbrev>.
                  A possible output for the given example is:</para>

                  <programlisting>Received 'sun.awt.image.URLImageSource' from
                    http://www.hdm-stuttgart.de/bilder_navigation/laptop.gif
Unable to open 'http://www.hdm-stuttgart.de/rotfl.gif'</programlisting>

                  <para>The following code snippet shows a helpful class
                  method which checks both the correctness of a <abbrev
                  xlink:href="http://www.ietf.org/rfc/rfc1738.txt">URL</abbrev>
                  and the accessibility of the referenced object:</para>

                  <programlisting language="java">package dom.xpath;
...
public class CheckUrl {
  public static void checkReadability(final String urlRef) {
    try {
      final URL url = new URL(urlRef);
      try {
        final Object imgCandidate = url.getContent();
        if (null == imgCandidate) {
          System.err.println("Unable to open '" + urlRef + "'");
        } else {
          System.out.println("Received '"
              + imgCandidate.getClass().getName() + "' from "
              + urlRef);
        }
      } catch (IOException e) {
        System.err.println("Unable to open '" + urlRef + "'");
      }
    } catch (MalformedURLException e) {
      System.err.println("Address '" + urlRef + "' is malformed");
    }
  }
}</programlisting>
                </question>

                <answer>
                  <para>We are interested in the set of images within a given
                  HTML document containing an <link
                  xlink:href="http://www.w3.org/Addressing">URL</link>
                  reference starting either with <code>http://</code> or
                  <code>ftp://</code>. This is achieved by the following
                  <acronym
                  xlink:href="http://www.w3.org/TR/xpath">XPath</acronym>
                  expression:</para>

                  <programlisting>//html:img[starts-with(@src, 'http://') or starts-with(@src, 'ftp://')]</programlisting>

                  <para>The application only needs to pass the corresponding
                  <abbrev
                  xlink:href="http://www.ietf.org/rfc/rfc1738.txt">URL</abbrev>s
                  to the method <link
                  xlink:href="domCheckUrlObjectExistence">CheckUrl.checkReadability()</link>.
                  The rest of the code is identical to the <link
                  linkend="domFindImages">introductory example</link>:</para>

                  <informalfigure xml:id="solutionFintExtImgRef">
                    <programlisting language="java">package dom.xpath;
...
public class CheckExtImage {
   private final SAXBuilder builder = new SAXBuilder();

   public CheckExtImage() {
      builder.setErrorHandler(new MySaxErrorHandler(System.err));
   }
   public void process(final String xhtmlFilename) throws JDOMException, IOException {

      final Document htmlInput = builder.build(xhtmlFilename);
      final XPathExpression&lt;Object&gt; xpath = XPathFactory.instance().compile(
            "<emphasis role="bold">//img[starts-with(@src, 'http://') or starts-with(@src, 'ftp://')]</emphasis>");
      final List&lt;Object&gt; images = xpath.evaluate(htmlInput);

      for (Object o: images) {
         final Element image = (Element ) o;
         <emphasis role="bold">CheckUrl.checkReadability(image.getAttributeValue("src"));</emphasis>
      }
   }
}</programlisting>
                  </informalfigure>
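                  <para>The predicate can also be tried in isolation. The
                  following minimal sketch uses the JDK's
                  <code>javax.xml.xpath</code> API rather than JDOM. The class
                  name <classname>ExternalRefSketch</classname>, the sample
                  input and the host <code>example.org</code> are assumptions
                  made for illustration. The input carries no namespace
                  declaration, so the unprefixed step <code>img</code>
                  matches, and appending <code>/@src</code> selects the
                  attribute nodes directly:</para>

                  <programlisting language="java"><![CDATA[import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;

import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;

import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical sketch: selecting external image references by starts-with(). */
public class ExternalRefSketch {

   /** Assumed sample input without any namespace declaration. */
   static final String SAMPLE =
        "<html><body>"
      + "<img src='inline.gif'/>"
      + "<img src='http://www.hdm-stuttgart.de/rotfl.gif'/>"
      + "<img src='ftp://example.org/pic.gif'/>"
      + "<img src='/images/test.gif'/>"
      + "</body></html>";

   /** Selects the src attributes of images located on external servers. */
   static final String EXTERNAL_SRC =
      "//img[starts-with(@src, 'http://') or starts-with(@src, 'ftp://')]/@src";

   /** Returns all external image references in document order. */
   public static List<String> externalSources(final String xml) throws Exception {
      final NodeList attributes = (NodeList) XPathFactory.newInstance().newXPath()
            .evaluate(EXTERNAL_SRC, new InputSource(new StringReader(xml)),
                  XPathConstants.NODESET);
      final List<String> result = new ArrayList<>();
      for (int i = 0; i < attributes.getLength(); i++) {
         result.add(attributes.item(i).getNodeValue());
      }
      return result;
   }

   public static void main(final String[] args) throws Exception {
      // Each returned reference could be passed on to a readability check
      for (final String src : externalSources(SAMPLE)) {
         System.out.println(src);
      }
   }
}]]></programlisting>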
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="domXsl">
          <title><acronym xlink:href="http://www.w3.org/DOM">DOM</acronym> and
          <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev></title>

          <para><link linkend="gloss_Java"><trademark>Java</trademark></link>
          based <link linkend="gloss_XML"><abbrev>XML</abbrev></link>
          applications may use XSL style sheets for processing. A <acronym
          xlink:href="http://www.w3.org/DOM">DOM</acronym> tree may for
          example be transformed into another tree. The package <link
          xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/transform/package-frame.html">javax.xml.transform</link>
          provides interfaces and classes for this purpose. We consider the
          following product catalog example:</para>

          <figure xml:id="climbingCatalog">
            <title>A simplified <link
            linkend="gloss_XML"><abbrev>XML</abbrev></link> product
            catalog</title>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE <emphasis role="bold">catalog</emphasis> SYSTEM "<emphasis
                role="bold">catalog.dtd</emphasis>"&gt;
&lt;catalog&gt;
  &lt;title&gt;Climbing gear&lt;/title&gt;
  &lt;introduction&gt;
    &lt;para&gt;We offer a great variety of basic stuff for mountaineering 
          such as ropes, harnesses and runners.&lt;/para&gt;
    &lt;para&gt;Our shop is proud of its large number of sleeping bags 
          available.&lt;/para&gt;
  &lt;/introduction&gt;
  &lt;product id="x-223"&gt;
    &lt;title&gt;Multi freezing bag  Nightmare camper&lt;/title&gt;
    &lt;description&gt;
      &lt;para&gt;You will feel comfortable till  minus 20 degrees - At 
            least if you are a penguin or a polar bear.&lt;/para&gt;
    &lt;/description&gt;
  &lt;/product&gt;
  &lt;product id="r-334"&gt;
    &lt;title&gt;Rope 40m&lt;/title&gt;
    &lt;description&gt;
      &lt;para&gt;Excellent for indoor climbing.&lt;/para&gt;
    &lt;/description&gt;
  &lt;/product&gt;
&lt;/catalog&gt;</programlisting>

            <para>A corresponding DTD is straightforward:</para>

            <programlisting>&lt;!ELEMENT catalog      (title, introduction, product+) &gt;
&lt;!ELEMENT introduction (para+) &gt;
&lt;!ELEMENT title        (#PCDATA) &gt;
&lt;!ELEMENT product      (title, description) &gt;
&lt;!ATTLIST product 
               id      ID      #REQUIRED
               price   NMTOKEN #IMPLIED&gt;
&lt;!ELEMENT description  (para+) &gt;
&lt;!ELEMENT para         (#PCDATA) &gt;</programlisting>
          </figure>

          <para>An <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> style sheet
          may be used to transform this document into HTML:</para>

          <figure xml:id="catalog2html">
            <title>An <abbrev
            xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> style sheet
            for catalog transformation to HTML.</title>

            <programlisting>&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" 
  version="2.0" xmlns="http://www.w3.org/1999/xhtml"&gt;

  &lt;xsl:template match="/catalog"&gt;
    &lt;html&gt;
      &lt;head&gt;&lt;title&gt;&lt;xsl:value-of select="title"/&gt;&lt;/title&gt;&lt;/head&gt;
      &lt;body style="background-color:#FFFFFF"&gt;
        &lt;h1&gt;&lt;xsl:value-of select="title"/&gt;&lt;/h1&gt;
        &lt;xsl:apply-templates select="product"/&gt;
      &lt;/body&gt;
    &lt;/html&gt;
  &lt;/xsl:template&gt;

  &lt;xsl:template match="product"&gt;
    &lt;h3&gt;&lt;xsl:value-of select="title"/&gt;&lt;/h3&gt;
    &lt;xsl:for-each select="description/para"&gt;
      &lt;p&gt;&lt;xsl:value-of select="."/&gt;&lt;/p&gt;
    &lt;/xsl:for-each&gt;
    &lt;xsl:if test="price"&gt;
      &lt;p&gt;
        &lt;xsl:text&gt;Price:&lt;/xsl:text&gt;
        &lt;xsl:value-of select="price/@value"/&gt;
      &lt;/p&gt;
    &lt;/xsl:if&gt;
  &lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;</programlisting>
          </figure>

          <para>As a preparation for <xref linkend="exercise_catalogRdbms"/>
          we now demonstrate the usage of <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> within a <link
          linkend="gloss_Java"><trademark>Java</trademark></link> application.
          This is done by a <link
          xlink:href="http://java.sun.com/j2se/1.5.0/docs/api/javax/xml/transform/Transformer.html">Transformer</link>
          instance:</para>

          <figure xml:id="xml2xml">
            <title>Transforming an XML document instance to HTML by an XSL
            style sheet.</title>

            <programlisting language="java">package dom.xsl;
...
public class Xml2Html {
   private final SAXBuilder builder = new SAXBuilder();

   final XSLTransformer transformer;
   
  public Xml2Html(final String xslFilename) throws XSLTransformException {
     builder.setErrorHandler(new MySaxErrorHandler(System.err));
     transformer =  new XSLTransformer(xslFilename);
  }
  public void transform(final String xmlInFilename,
      final String resultFilename) throws JDOMException, IOException {
    
    final Document inDoc = builder.build(xmlInFilename);
    Document result = transformer.transform(inDoc);
    
    // Set formatting for the XML output
    final Format outFormat = Format.getPrettyFormat();
    
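    // Serialize to the result file (java.io.FileOutputStream and
    // java.io.OutputStream belong to the elided imports)
    final XMLOutputter printer = new XMLOutputter(outFormat);
    try (OutputStream out = new FileOutputStream(resultFilename)) {
      printer.output(result, out);
    }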

  }
}</programlisting>
          </figure>

          <para>A corresponding driver class is needed to invoke a
          transformation:</para>

          <figure xml:id="xml2xmlDriver">
            <title>A driver class for the xml2xml transformer.</title>

            <programlisting language="java">package dom.xsl;
...
public class Xml2HtmlDriver {
...
  public static void main(String[] args) {
    final String 
     inFilename = "Input/Dom/climbing.xml",
     xslFilename = "Input/Dom/catalog2html.xsl", 
     htmlOutputFilename = "Input/Dom/climbing.html";
    try {
      final Xml2Html converter = new Xml2Html(xslFilename);
      converter.transform(inFilename, htmlOutputFilename);
    } catch (Exception e) {
      System.err.println("The conversion of '" + inFilename
          + "' by stylesheet '" + xslFilename
          + "' to output HTML file '" + htmlOutputFilename
          + "' failed with the following error:" + e);
      e.printStackTrace();
    }
  }
}</programlisting>
          </figure>

          <qandaset role="exercise">
            <title>HTML from XML and relational data</title>

            <qandadiv>
              <qandaentry xml:id="exercise_catalogRdbms">
                <question>
                  <label>Catalogs and RDBMS</label>

                  <para>We want to extend the transformation described in
                  <xref linkend="xml2xml"/> by reading price information from
                  an RDBMS. Consider the following schema and
                  <code>INSERT</code>s:</para>

                  <programlisting>CREATE TABLE Product(
  orderNo CHAR(10)
 ,price NUMERIC(10,2) 
);

INSERT INTO Product VALUES('x-223', 330.20);
INSERT INTO Product VALUES('w-124', 110.40);</programlisting>

                  <para>Adding prices may be implemented the following
                  way:</para>

                  <mediaobject>
                    <imageobject>
                      <imagedata fileref="Ref/Fig/xml2html.fig"/>
                    </imageobject>
                  </mediaobject>

                  <para>You may implement this by following these
                  steps:</para>

                  <orderedlist>
                    <listitem>
                      <para>You may reuse class
                      <classname>sax.rdbms.RdbmsAccess</classname> from <xref
                      linkend="saxRdbms"/>.</para>
                    </listitem>

                    <listitem>
                      <para>Use the previous class to modify <xref
                      linkend="xml2xml"/> by introducing a new method
                      <code>addPrices(final Document catalog)</code> which
                      adds prices to the <acronym
                      xlink:href="http://www.w3.org/DOM">DOM</acronym> tree
                      accordingly. The insertion points may be reached by an
                      <acronym
                      xlink:href="http://www.w3.org/TR/xpath">XPath</acronym>
                      expression.</para>
                    </listitem>
                  </orderedlist>
                </question>

                <answer>
                  <para>The additional functionality on top of <xref
                  linkend="xml2xml"/> is represented by a method
                  <methodname>dom.xsl.XmlRdbms2Html.addPrices()</methodname>.
                  This method modifies the <acronym
                  xlink:href="http://www.w3.org/DOM">DOM</acronym> input tree
                  prior to applying the XSL stylesheet. Prices are inserted based
                  on data received from an RDBMS via <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>:</para>

                  <programlisting language="java">package dom.xsl;
...
public class XmlRdbms2Html {
   private final SAXBuilder builder = new SAXBuilder();

   DbAccess db = new DbAccess();
   
   final XSLTransformer transformer;
   Document catalog;
   
   final org.jdom2.xpath.XPathExpression&lt;Object&gt; selectProducts = 
         XPathFactory.instance().compile("/catalog/product");

   /**
    * @param xslFilename the stylesheet being used for subsequent
    * transformations by {@link #transform(String, String)}.
    * 
    * @throws XSLTransformException
    */
   public XmlRdbms2Html(final String xslFilename) throws XSLTransformException {
      builder.setErrorHandler(new MySaxErrorHandler(System.err));
      transformer =  new XSLTransformer(xslFilename);
   }
   
   /**
    * The actual workhorse carrying out the transformation
    * and adding prices from the database table.
    * 
    * @param xmlInFilename input file to be transformed
    * @param resultFilename the result file holding the generated HTML document
    * @throws JDOMException The transformation may fail for various reasons.
    * @throws IOException
    */
   public void transform(final String xmlInFilename,
         final String resultFilename) throws JDOMException, IOException {

      catalog = builder.build(xmlInFilename);

      addPrices();

      final Document htmlResult = transformer.transform(catalog);

      // Set formatting for the XML output
      final Format outFormat = Format.getPrettyFormat();

      // Serialize to console
      final XMLOutputter printer = new XMLOutputter(outFormat);
      printer.output(htmlResult, System.out);

   }
   private void addPrices() {
      final List&lt;Object&gt; products = selectProducts.evaluate(catalog.getRootElement());
      
      db.connect("jdbc:mysql://localhost:3306/hdm", "hdmuser", "XYZ");
      for (Object p: products) {
         final Element product = (Element ) p;
         final String productId = product.getAttributeValue("id");
         product.setAttribute("price", db.readPrice(productId));
      }
      db.close();
   }
}</programlisting>

                  <para>The method <code>addPrices(...)</code> utilizes our
                  RDBMS access class:</para>

                  <programlisting language="java">package dom.xsl;
...
public class DbAccess {
  public void connect(final String jdbcUrl, 
      final String userName, final String password) {
    try {
      conn = DriverManager.getConnection(jdbcUrl, userName, password);
      priceQuery = conn.prepareStatement(sqlPriceQuery);
    } catch (SQLException e) {
      System.err.println("Unable to open connection to database:" + e);
    }
  }
  public String readPrice(final String articleNumber) {
    String result;
    try {
      priceQuery.setString(1, articleNumber);
      final ResultSet rs = priceQuery.executeQuery();
      if (rs.next()) {
        result = rs.getString("price");
      } else {
        result = "No price available for article '" + articleNumber + "'";
      }
    } catch (SQLException e) {
      result = "Error reading price for article '" + articleNumber + "':" + e;
    }
    return result;
  }
  ...
}</programlisting>

                  <para>Of course the connection details should be moved to a
                  configuration file.</para>
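
                  <para>A minimal sketch of such a configuration, assuming a
                  properties file offering the hypothetical keys
                  <code>jdbc.url</code>, <code>jdbc.user</code> and
                  <code>jdbc.password</code>. The class name
                  <classname>DbConfig</classname> is likewise an assumption
                  and not part of the code shown above. To keep the example
                  self-contained the settings are read from a string rather
                  than from a file:</para>

                  <programlisting language="java"><![CDATA[
```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.util.Properties;

/** Hypothetical helper holding JDBC connection settings read from a properties source. */
final class DbConfig {
  final String jdbcUrl;
  final String userName;
  final String password;

  private DbConfig(final String jdbcUrl, final String userName, final String password) {
    this.jdbcUrl = jdbcUrl;
    this.userName = userName;
    this.password = password;
  }

  /** Reads the three (assumed) keys jdbc.url, jdbc.user and jdbc.password. */
  static DbConfig load(final Reader source) {
    final Properties props = new Properties();
    try {
      props.load(source);
    } catch (IOException e) {
      throw new IllegalStateException("Unable to read database configuration", e);
    }
    return new DbConfig(
        require(props, "jdbc.url"),
        require(props, "jdbc.user"),
        require(props, "jdbc.password"));
  }

  /** Fails early if a key is missing instead of connecting with a null value. */
  private static String require(final Properties props, final String key) {
    final String value = props.getProperty(key);
    if (value == null) {
      throw new IllegalStateException("Missing configuration key '" + key + "'");
    }
    return value;
  }

  public static void main(String[] args) {
    // A real application would pass e.g. a FileReader for a properties file instead.
    final String sample = "jdbc.url=jdbc:mysql://localhost:3306/hdm\n"
        + "jdbc.user=hdmuser\n"
        + "jdbc.password=XYZ\n";
    final DbConfig config = DbConfig.load(new StringReader(sample));
    System.out.println(config.jdbcUrl + " as " + config.userName);
  }
}
```
]]></programlisting>

                  <para>The three values may then be passed to
                  <code>db.connect(...)</code> in place of the string literals
                  used in <code>addPrices()</code>.</para>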
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>
      </section>
    </chapter>

    <chapter xml:id="xsl">
      <title>The Extensible Stylesheet Language XSL</title>

      <para>XSL is a <link xlink:href="http://www.w3.org/Style/XSL">W3C
      standard</link> which defines a language to transform XML documents into
      the following output formats:</para>

      <itemizedlist>
        <listitem>
          <para>Ordinary text, e.g. in <link
          xlink:href="http://unicode.org">Unicode</link> encoding.</para>
        </listitem>

        <listitem>
          <para>XML.</para>
        </listitem>

        <listitem>
          <para>HTML</para>
        </listitem>

        <listitem>
          <para>XHTML</para>
        </listitem>
      </itemizedlist>

      <para>Transforming a source XML document into a target XML document may
      be required if:</para>

      <itemizedlist>
        <listitem>
          <para>The target document expresses similar semantics but uses a
          different XML dialect i.e. different tag names.</para>
        </listitem>

        <listitem>
          <para>The target document is only a view on the source document. We
          may for example extract the chapter names from a <tag
          class="starttag">book</tag> document to create a table of
          contents.</para>
        </listitem>
      </itemizedlist>

      <section xml:id="xsl_helloworld">
        <title>A <quote>Hello, world</quote> <abbrev
        xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> example</title>

        <para>We start from an extended version of our
        <filename>memo.dtd</filename>:</para>

        <programlisting>&lt;!ELEMENT memo     (from, to+, subject, content)&gt;
&lt;!ATTLIST memo date     CDATA             #REQUIRED
               priority (low|medium|high) #IMPLIED&gt;
&lt;!ELEMENT from     (#PCDATA)&gt;
&lt;!ATTLIST from id ID #IMPLIED &gt;
&lt;!ELEMENT to       (#PCDATA)&gt;
&lt;!ATTLIST to   id ID #IMPLIED &gt;

&lt;!ELEMENT subject  (#PCDATA)&gt;
&lt;!ELEMENT content  (para)+&gt;
&lt;!ELEMENT para     (#PCDATA|link)*&gt;
&lt;!ELEMENT link (#PCDATA) &gt;
&lt;!ATTLIST link linkend IDREF #REQUIRED &gt;</programlisting>

        <para>This DTD allows a memo's document content to be structured into
        paragraphs. A paragraph may contain links either to the sender or to
        one of the memo's recipients.</para>

        <figure xml:id="figure_memoref_instance">
          <title>A memo document instance with an internal reference.</title>

          <programlisting>&lt;?xml version="1.0" ?&gt;
&lt;!DOCTYPE memo SYSTEM "memo.dtd"&gt;
&lt;memo date="9.9.2099" priority="high"&gt;
  &lt;from id="goik"&gt;Martin Goik&lt;/from&gt;
  &lt;to&gt;Adam Hacker&lt;/to&gt;
  &lt;to id="eve"&gt;Eve Intruder&lt;/to&gt;
  &lt;subject&gt;Firewall problems&lt;/subject&gt;
  &lt;content&gt;
    &lt;para&gt;Thanks for your excellent work.&lt;/para&gt;
    &lt;para&gt;Our firewall is definitely broken! This bug has been reported by 
      the &lt;link linkend="goik"&gt;sender&lt;/link&gt;.&lt;/para&gt;
  &lt;/content&gt;
&lt;/memo&gt;</programlisting>
        </figure>

        <para>We want to extract the sender's name from an arbitrary <tag
        class="element">memo</tag> document instance. Using <abbrev
        xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> this task can be
        accomplished by a script <filename>memo2sender.xsl</filename>:</para>

        <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0"&gt;
  
  &lt;xsl:output method="text"/&gt;
  
  &lt;xsl:template match="/memo"&gt;
    &lt;xsl:value-of select="from"/&gt;
  &lt;/xsl:template&gt;
  
&lt;/xsl:stylesheet&gt;</programlisting>

        <para>Before examining this code more closely we first show its
        effect. We need a piece of software called an <abbrev
        xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> processor. It
        reads both a <tag>memo</tag> document instance and a style sheet and
        produces the following output:</para>

        <programlisting><computeroutput>[goik@mupter Memoref]$ xml2xml message.xml  memo2sender.xsl
Martin Goik</computeroutput></programlisting>

        <para>The result is the sender's name <computeroutput>Martin
        Goik</computeroutput>. We may sketch the transformation
        principle:</para>

        <figure xml:id="figure_xsl_principle">
          <title>An <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> processor
          transforming an XML document into a result using a stylesheet</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/xslconvert.fig"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>The executable <filename>xml2xml</filename> defined at the MI
        department is actually a script wrapping the <productname
        xlink:href="http://saxon.sourceforge.net">Saxon XSLT
        processor</productname>. We may also use the Eclipse/Oxygen plug-in
        <!-- goik
     and <uri
      xlink:href="src/viewlet/xslt_config/xslt_config_viewlet_swf.html">
      and define
      a transformation scenario</uri> thus --> replacing the shell command by
        a GUI. Next we examine the <abbrev
        xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> example code
        more closely:</para>

        <programlisting>&lt;xsl:stylesheet <co
            xml:id="programlisting_helloxsl_stylesheet"/> xmlns:xsl <co
            xml:id="programlisting_helloxsl_namespace_abbv"/> ="http://www.w3.org/1999/XSL/Transform" 
                version="2.0" <co xml:id="programlisting_helloxsl_xsl_version"/> &gt;
  
  &lt;xsl:output method="text" <co
            xml:id="programlisting_helloxsl_method_text"/>/&gt;
  
  &lt;xsl:template <co xml:id="programlisting_helloxsl_template"/> match <co
            xml:id="programlisting_helloxsl_match"/> ="/memo"&gt;
    &lt;xsl:value-of <co xml:id="programlisting_helloxsl_value-of"/> select <co
            xml:base="" xml:id="programlisting_helloxsl_valueof_select_att"/> ="from" /&gt;
  &lt;/xsl:template&gt;
  
&lt;/xsl:stylesheet&gt;</programlisting>

        <calloutlist>
          <callout arearefs="programlisting_helloxsl_stylesheet">
            <para>The element <tag class="element">stylesheet</tag> belongs
            to the namespace
            <code>http://www.w3.org/1999/XSL/Transform</code>. This namespace
            is <emphasis>represented</emphasis> by the prefix
            <literal>xsl</literal>. As an alternative we might also use <tag
            class="starttag">stylesheet
            xmlns="http://www.w3.org/1999/XSL/Transform"</tag> instead of <tag
            class="starttag">xsl:stylesheet ...</tag>. The value of the
            namespace itself gets defined next.</para>
          </callout>

          <callout arearefs="programlisting_helloxsl_namespace_abbv">
            <para>The keyword <code>xmlns</code> is reserved by the <link
            xlink:href="http://www.w3.org/TR/REC-xml-names/">Namespaces in
            XML</link> specification. In <quote>pure</quote> XML the whole
            term <code>xmlns:xsl</code> would simply define an attribute. In
            the presence of a namespace-aware XML parser, however, the prefix
            <literal>xsl</literal> represents the attribute value <tag
            class="attvalue">http://www.w3.org/1999/XSL/Transform</tag>. This
            value <emphasis>must not</emphasis> be changed! Otherwise an XSL
            processor will fail since it can no longer distinguish XSL
            instructions from other XML elements. This distinction matters
            since an element <tag class="starttag">stylesheet</tag> belonging
            to a different namespace
            <code>http://someserver.org/SomeNamespace</code> may have to be
            generated.</para>
          </callout>

          <callout arearefs="programlisting_helloxsl_xsl_version">
            <para>The <link xlink:href="http://www.w3.org/TR/xslt20">XSL
            standard</link> is still evolving. The version number identifies
            the conformance level for the subsequent code.</para>
          </callout>

          <callout arearefs="programlisting_helloxsl_method_text">
            <para>The <tag class="attribute">method</tag> attribute in the
            <link
            xlink:href="http://www.w3.org/TR/xslt20/#element-output">&lt;xsl:output&gt;</link>
            element specifies the type of output to be generated. Depending on
            this type we may also define indentation depths and/or encoding.
            Allowed <tag class="attvalue">method</tag> values are:</para>

            <glosslist>
              <glossentry>
                <glossterm>text</glossterm>

                <glossdef>
                  <para>Ordinary text.</para>
                </glossdef>
              </glossentry>

              <glossentry>
                <glossterm>html</glossterm>

                <glossdef>
                  <para><link
                  xlink:href="http://www.w3.org/TR/html4">HTML</link>
                  markup.</para>
                </glossdef>
              </glossentry>

              <glossentry>
                <glossterm>xhtml</glossterm>

                <glossdef>
                  <para><link
                  xlink:href="http://www.w3.org/TR/xhtml1">XHTML</link> markup
                  differing from the former by e.g. the closing
                  <quote>/&gt;</quote> in <tag>&lt;img
                  src="..."/&gt;</tag>.</para>
                </glossdef>
              </glossentry>

              <glossentry>
                <glossterm>xml</glossterm>

                <glossdef>
                  <para>XML code. This is most commonly used to create views
                  on or different dialects of an XML document instance.</para>
                </glossdef>
              </glossentry>
            </glosslist>
          </callout>

          <callout arearefs="programlisting_helloxsl_template">
            <para>An <tag class="starttag">xsl:template</tag> defines the
            output that will be created for those document nodes selected by
            its <tag class="attribute">match</tag> pattern.</para>
          </callout>

          <callout arearefs="programlisting_helloxsl_match">
            <para>The attribute <tag class="attribute">match</tag> tells us
            for which nodes of a document instance the given <tag
            class="starttag">xsl:template</tag> is appropriate. In the given
            example the value <code>/memo</code> tells us that the template is
            only responsible for <tag class="element">memo</tag> nodes
            appearing at top level i.e. being the root element of the document
            instance.</para>
          </callout>

          <callout arch=""
                   arearefs="programlisting_helloxsl_value-of programlisting_helloxsl_valueof_select_att">
            <para>A <tag class="element">value-of</tag> element writes content
            to the <abbrev
            xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> processor's
            output. In this example the <code>#PCDATA</code> content from the
            element <tag class="element">from</tag> will be written to the
            output.</para>
          </callout>
        </calloutlist>
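
        <para>The transformation may also be started from Java rather than
        by the <filename>xml2xml</filename> script. The following sketch uses
        the <code>javax.xml.transform</code> API shipped with the JDK and
        rests on a few assumptions: Both documents are embedded as strings,
        the <code>DOCTYPE</code> line has been dropped to avoid depending on
        a <filename>memo.dtd</filename> file, and the stylesheet declares
        <code>version="1.0"</code> since the JDK's built-in processor
        implements XSLT 1.0. The class name
        <classname>MemoSender</classname> is hypothetical:</para>

        <programlisting language="java"><![CDATA[
```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical demo class applying memo2sender.xsl to a memo instance. */
final class MemoSender {

  // The memo instance without its DOCTYPE line, so no memo.dtd file is required.
  static final String MEMO =
        "<memo date='9.9.2099' priority='high'>"
      + "  <from id='goik'>Martin Goik</from>"
      + "  <to>Adam Hacker</to>"
      + "  <to id='eve'>Eve Intruder</to>"
      + "  <subject>Firewall problems</subject>"
      + "  <content><para>Thanks for your excellent work.</para></content>"
      + "</memo>";

  // memo2sender.xsl, declared as version 1.0 (an assumption of this sketch).
  static final String STYLESHEET =
        "<xsl:stylesheet xmlns:xsl='http://www.w3.org/1999/XSL/Transform' version='1.0'>"
      + "  <xsl:output method='text'/>"
      + "  <xsl:template match='/memo'>"
      + "    <xsl:value-of select='from'/>"
      + "  </xsl:template>"
      + "</xsl:stylesheet>";

  /** Applies the stylesheet to the XML document and returns the serialized result. */
  static String transform(final String xml, final String xsl) {
    try {
      final Transformer transformer = TransformerFactory.newInstance()
          .newTransformer(new StreamSource(new StringReader(xsl)));
      final StringWriter out = new StringWriter();
      transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
      return out.toString();
    } catch (TransformerException e) {
      throw new IllegalStateException("Transformation failed", e);
    }
  }

  public static void main(String[] args) {
    System.out.println(transform(MEMO, STYLESHEET)); // prints: Martin Goik
  }
}
```
]]></programlisting>

        <para>Replacing the prefix <literal>xsl</literal> throughout the
        stylesheet string by a different one leaves the result unchanged as
        long as the namespace value itself stays untouched.</para>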
      </section>

      <section xml:id="xpath">
        <title><link xlink:href="http://www.w3.org/TR/xpath">XPath</link> and
        node sets</title>

        <para>The <acronym
        xlink:href="http://www.w3.org/TR/xpath">XPath</acronym> standard
        allows us to retrieve node sets from XML documents by predicate-based
        queries. Thus its role may be compared to <acronym
        xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym>
        <code>SELECT</code> ... <code>FROM</code> ... <code>WHERE</code>
        queries. Some simple examples:</para>

        <figure xml:id="fig_Xpath">
          <title>Simple <acronym
          xlink:href="http://www.w3.org/TR/xpath">XPath</acronym>
          queries</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/xpath.fig" scale="65"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>We are now interested in a list of all recipients being defined
        in a <tag class="element">memo</tag> element. We introduce the element
        <tag class="element">xsl:for-each</tag> which iterates over a result
        set of nodes:</para>

        <figure xml:id="programlisting_tolist_xpath">
          <title>Iterating over the list of recipient nodes.</title>

          <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;

&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0"&gt;
  
  &lt;xsl:output method="text"/&gt;
  
  &lt;xsl:template match="/" <co xml:id="programlisting_tolist_match_root"/>&gt;
    &lt;xsl:for-each select="memo/to" <co
              xml:id="programlisting_tolist_xpath_memo_to"/> &gt;
      &lt;xsl:value-of select="." <co xml:id="programlisting_tolist_value_of"/> /&gt;
      &lt;xsl:text&gt;,&lt;/xsl:text&gt; <co
              xml:id="programlisting_tolist_xsl_text"/>    
    &lt;/xsl:for-each&gt;
  &lt;/xsl:template&gt;
  
&lt;/xsl:stylesheet&gt;</programlisting>
        </figure>

        <calloutlist>
          <callout arearefs="programlisting_tolist_match_root">
            <para>This template matches the XML document instance,
            <emphasis>not</emphasis> the visible <tag
            class="element">&lt;memo&gt;</tag> node.</para>
          </callout>

          <callout arearefs="programlisting_tolist_xpath_memo_to">
            <para>The <link
            xlink:href="http://www.w3.org/TR/xpath">XPath</link> expression
            <tag class="attvalue">memo/to</tag> gets evaluated starting from
            the invisible top level document node being the context node. For
            the given document instance this will define a result set
            containing both <tag class="element">&lt;to&gt;</tag> recipient
            nodes, see <xref linkend="figure_memo_xpath_memo_to"/>.</para>
          </callout>

          <callout arearefs="programlisting_tolist_value_of">
            <para>The dot <quote>.</quote> denotes the context node, here the
            current <tag class="element">to</tag> element. Its
            <code>#PCDATA</code> content is written to the output.</para>
          </callout>

          <callout arearefs="programlisting_tolist_xsl_text">
            <para>A comma is appended. This is not quite correct since it
            should be absent for the last element.</para>
          </callout>
        </calloutlist>

        <figure xml:id="figure_recipientlist_trailing_comma">
          <title>A list of recipients.</title>

          <para>The <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> presented
          before yields:</para>

          <programlisting><computeroutput>Adam Hacker,Eve Intruder</computeroutput><emphasis
              role="bold">,</emphasis></programlisting>
        </figure>

        <para>Right now we do not bother about the trailing <quote>,</quote>
        after the last recipient. The element <tag
        class="starttag">xsl:text</tag> is used to append static text to the
        output. This way we append a separator after each recipient. The
        surrounding
        <code>&lt;xsl:text&gt;</code>,<code>&lt;/xsl:text&gt;</code> tags
        <emphasis>may</emphasis> be omitted. We encourage the reader to leave
        them in place since they increase readability when a template's body
        gets more complex.</para>

        <para>We now discuss the role of the two attributes <tag
        class="attribute">match="/"</tag> and <tag
        class="attribute">select="memo/to"</tag>. Both are examples of
        so-called <link xlink:href="http://www.w3.org/TR/xpath">XPath</link>
        expressions. They allow us to define <emphasis>node sets</emphasis>,
        i.e. subsets of the set of all nodes of a given document
        instance.</para>

        <para>Conceptually <link
        xlink:href="http://www.w3.org/TR/xpath">XPath</link> expressions may
        be compared to the <acronym
        xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym> language
        the latter allowing the retrieval of data<emphasis>sets</emphasis>
        from a relational database. We illustrate the current example by a
        figure:</para>

        <figure xml:id="figure_memo_xpath_memo_to">
          <title>Selecting node sets from <tag class="element">memo</tag>
          document instances</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/memoxpath.fig"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>This figure needs some explanation. We observe an additional
        node <quote>above</quote> <tag class="starttag">memo</tag> being
        represented as <quote>filled</quote>. This node represents the
        document instance as a whole and has got <tag>memo</tag> as its only
        child. We will rediscover this additional root node when we discuss
        the <abbrev
        xlink:href="http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407">DOM</abbrev>
        application programming interface.</para>

        <para>As already mentioned the expression <code>memo/to</code>
        evaluates to a <emphasis>set</emphasis> of nodes. In our example this
        set consists of two nodes of type <tag class="starttag">to</tag> each
        of them representing a recipient of the memo. We observe a subtle
        difference between the two <abbrev
        xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
        expressions:</para>

        <glosslist>
          <glossentry>
            <glossterm><code>match="/"</code></glossterm>

            <glossdef>
              <para>The expression starts with and actually consists of the
              string <quote>/</quote>. Thus it can be called an
              <emphasis>absolute</emphasis> <abbrev
              xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
              expression. Like a file specification
              <filename>C:\dos\myprog.exe</filename> it starts on top level
              and needs no further context information to get
              evaluated.</para>

              <para>An <abbrev
              xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> style
              sheet <emphasis>must</emphasis> have an <link
              xlink:href="http://www.w3.org/TR/xslt20/#initiating">initial
              context node</link> to start the transformation. This is
              achieved by providing exactly one <tag
              class="starttag">xsl:template</tag> with an absolute <abbrev
              xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev> value for
              its <tag class="attribute">match</tag> attribute like <tag
              class="attvalue">/memo</tag>.</para>
            </glossdef>
          </glossentry>

          <glossentry>
            <glossterm><code>select="memo/to"</code></glossterm>

            <glossdef>
              <para>This expression can be compared to a
              <emphasis>relative</emphasis> file path specification like e.g.
              <filename>../images/hdm.gif</filename>. We need to add the base
              (context) directory in order for a relative file specification
              to become meaningful. If the base directory is
              <filename>/home/goik/xml</filename> then this
              <emphasis>relative</emphasis> file specification will address
              the file <filename>/home/goik/images/hdm.gif</filename>.</para>

              <para>Likewise we have to define a <emphasis>context</emphasis>
              node if we want to evaluate a relative <abbrev
              xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
              expression. In our example this is the root node. The XSL
              specification introduces the term <link
              xlink:href="http://www.w3.org/TR/xslt20/#context">evaluation
              context</link> for this purpose.</para>
            </glossdef>
          </glossentry>
        </glosslist>

        <para>In order to explain relative <abbrev
        xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev> expressions we
        consider <code>content/para</code> starting from the (unique!) <tag
        class="element">memo</tag> node:</para>

        <figure xml:id="memoXpathPara">
          <title>The node set represented by <code>content/para</code>
          starting at the context node <tag
          class="starttag">memo</tag>.</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/memorelativexpath.fig"/>
            </imageobject>

            <caption>
              <para>The dashed lines represent the relative <abbrev
              xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
              expressions starting from the context node to each of the nodes
              in the result set.</para>
            </caption>
          </mediaobject>
        </figure>
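
        <para>The role of the context node may also be explored outside of a
        stylesheet. The following sketch uses the
        <code>javax.xml.xpath</code> API of the JDK and counts the nodes
        selected by an expression with respect to different context nodes.
        The class name <classname>XPathContextDemo</classname> and its helper
        methods are assumptions made for illustration only. The memo is
        embedded as a string without its <code>DOCTYPE</code> line:</para>

        <programlisting language="java"><![CDATA[
```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

/** Hypothetical demo class: the same XPath expression seen from different context nodes. */
final class XPathContextDemo {

  static final String MEMO =
        "<memo date='9.9.2099' priority='high'>"
      + "  <from id='goik'>Martin Goik</from>"
      + "  <to>Adam Hacker</to>"
      + "  <to id='eve'>Eve Intruder</to>"
      + "  <subject>Firewall problems</subject>"
      + "  <content>"
      + "    <para>Thanks for your excellent work.</para>"
      + "    <para>Our firewall is definitely broken!</para>"
      + "  </content>"
      + "</memo>";

  /** Parses the given string into a DOM tree. */
  static Document parse(final String xml) {
    try {
      return DocumentBuilderFactory.newInstance().newDocumentBuilder()
          .parse(new InputSource(new StringReader(xml)));
    } catch (Exception e) {
      throw new IllegalStateException("Unable to parse document", e);
    }
  }

  /** Counts the nodes selected by the expression when evaluated at the given context node. */
  static int count(final String expression, final Node context) {
    try {
      final NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
          .evaluate(expression, context, XPathConstants.NODESET);
      return nodes.getLength();
    } catch (Exception e) {
      throw new IllegalStateException("Unable to evaluate " + expression, e);
    }
  }

  public static void main(String[] args) {
    final Document document = parse(MEMO);            // the document node above memo
    final Node memo = document.getDocumentElement();  // the memo element itself

    System.out.println(count("memo/to", document));   // 2: both recipients
    System.out.println(count("content/para", memo));  // 2: both paragraphs
    System.out.println(count("memo/to", memo));       // 0: memo has no memo child
    System.out.println(count("/memo/to", memo));      // 2: absolute, context is irrelevant
  }
}
```
]]></programlisting>

        <para>The third call illustrates why a relative expression is
        meaningless without its context node: Evaluated at the <tag
        class="element">memo</tag> element the expression
        <code>memo/to</code> selects nothing, whereas the absolute variant
        <code>/memo/to</code> yields both recipients regardless of the
        context.</para>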
      </section>

      <section xml:id="xsl_important_elements">
        <title>Some important <abbrev
        xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> elements</title>

        <section xml:id="xsl_if">
          <title><tag class="starttag">xsl:if</tag></title>

          <para>Sometimes we need conditional processing rules. We might want
          to create a list of the sender and those recipients having a
          defined value for the attribute <tag class="attribute">id</tag>. In
          the <link linkend="figure_memoref_instance">given example</link>
          this only holds for the (unique) sender and the recipient
          <code>&lt;to id="eve"&gt;Eve Intruder&lt;/to&gt;</code>. We assume
          this set of persons shall be inserted into a relational database
          table <code>Customer</code> consisting of two <code>NOT NULL</code>
          columns <code>id</code> and <code>name</code>. Thus both column
          values <emphasis>must</emphasis> be specified and we must exclude
          <tag class="starttag">from</tag> or <tag class="starttag">to</tag>
          nodes with undefined <tag class="attribute">id</tag>
          attributes:</para>

          <figure xml:id="programlisting_memo_export_sql">
            <title>Exporting SQL statements.</title>

            <programlisting>...
&lt;xsl:variable name="newline" <co xml:id="programlisting_xsl_if_definevar"/>&gt; &lt;!-- A newline \n --&gt;
  &lt;xsl:text&gt;
&lt;/xsl:text&gt;
&lt;/xsl:variable&gt;

&lt;xsl:template match="/memo"&gt;
  &lt;xsl:for-each select="from|to" <co xml:id="programlisting_xsl_if_foreach"/>&gt;
    &lt;xsl:if <emphasis role="bold">test="@id"</emphasis> <co
                xml:id="programlisting_xsl_if_test"/>&gt;
      &lt;xsl:text&gt;INSERT INTO Customer (id, name) VALUES ('&lt;/xsl:text&gt;
      &lt;xsl:value-of select="@id" <co
                xml:id="programlisting_xsl_if_select_idattrib"/>/&gt;
      &lt;xsl:text&gt;', '&lt;/xsl:text&gt;
      &lt;xsl:value-of select="." <co
                xml:id="programlisting_xsl_if_selectcontent"/>/&gt;
      &lt;xsl:text&gt;')&lt;/xsl:text&gt;
      &lt;xsl:value-of select="$newline" <co
                xml:id="programlisting_xsl_if_usevar"/>/&gt;
    &lt;/xsl:if&gt;
  &lt;/xsl:for-each&gt;
&lt;/xsl:template&gt;</programlisting>

            <caption>
              <para>We want to export data from XML documents to a database
              server. For this purpose <code>INSERT</code> statements are
              crafted from an XML document containing relevant data.</para>
            </caption>
          </figure>

          <calloutlist>
            <callout arearefs="programlisting_xsl_if_definevar">
              <para>Define a file local variable <code>newline</code>. Dealing
              with text output frequently requires the insertion of newlines.
              Due to the syntax of the <tag class="element">xsl:text</tag>
              elements this tends to clutter the code.</para>
            </callout>

            <callout arearefs="programlisting_xsl_if_foreach">
              <para>Iterate over the set of the sender node and all recipient
              nodes.</para>
            </callout>

            <callout arearefs="programlisting_xsl_if_test">
              <para>The attribute value of <tag class="attribute">test</tag>
              will be <link
              xlink:href="http://www.w3.org/TR/xslt20/#xsl-if">evaluated</link>
              as a boolean. In this example it evaluates to <code>true</code>
              iff the attribute <tag class="attribute">id</tag> is defined for
              the context node. Since we are inside the <tag
              class="element">xsl:for-each</tag> block all context nodes are
              either of type <tag class="starttag">from</tag> or <tag
              class="starttag">to</tag> and thus <emphasis>may</emphasis> have
              an <tag class="attribute">id</tag> attribute.</para>
            </callout>

            <callout arearefs="programlisting_xsl_if_select_idattrib">
              <para>The <tag class="attribute">id</tag> attribute's value is
              copied to the output. The <quote>@</quote> character in
              <code>select="@id"</code> tells the <abbrev
              xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> processor
              to read the value of an <emphasis>attribute</emphasis> with name
              <tag class="attribute">id</tag> rather than the content of a
              nested sub<emphasis>element</emphasis> like in <code>&lt;to
              id="foo"&gt;&lt;id&gt;I am
              nested!&lt;/id&gt;&lt;/to&gt;</code>.</para>
            </callout>

            <callout arearefs="programlisting_xsl_if_selectcontent">
              <para>As stated earlier the dot <quote>.</quote> denotes the
              current context element. In this example simply the
              <code>#PCDATA</code> content is copied to the output.</para>
            </callout>

            <callout arearefs="programlisting_xsl_if_usevar">
              <para>The <quote>$</quote> sign in front of <code>newline</code>
              tells the <abbrev
              xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> processor
              to access the variable <varname>newline</varname> previously
              defined in <coref linkend="programlisting_xsl_if_definevar"/>
              rather than interpreting it as the name of a sub element or an
              attribute.</para>
            </callout>
          </calloutlist>

          <para>As expected the recipient entry <quote>Adam Hacker</quote>
          does not appear due to the fact that no <tag
          class="attribute">id</tag> attribute is defined in its <tag
          class="starttag">to</tag> element:</para>

          <programlisting><computeroutput>INSERT INTO Customer (id, name) VALUES ('goik', 'Martin Goik')
INSERT INTO Customer (id, name) VALUES ('eve', 'Eve intruder')</computeroutput></programlisting>

          <qandaset role="exercise">
            <title>The XPath functions position() and last()</title>

            <qandadiv>
              <qandaentry xml:id="example_position_last">
                <question>
                  <para>We return to our recipient list in <xref
                  linkend="figure_recipientlist_trailing_comma"/>. We are
                  interested in a list of recipients avoiding the trailing
                  comma:</para>

                  <programlisting><computeroutput>Adam Hacker,Eve Intruder</computeroutput></programlisting>

                  <para>We may use an <tag class="element">xsl:if</tag> to
                  insert a comma for all but the very last recipient node.
                  This can be achieved by using the <abbrev
                  xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
                  functions <link
                  xlink:href="http://www.w3.org/TR/xpath#function-position">position()</link>
                  and <link
                  xlink:href="http://www.w3.org/TR/xpath#function-last">last()</link>.
                  Hint: The comparison operator <quote>&lt;</quote> may be
                  used in <abbrev
                  xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
                  expressions to compare two numbers. However, inside an
                  attribute value it must be escaped as <code>&amp;lt;</code>
                  to keep the style sheet well-formed XML.</para>
                </question>

                <answer>
                  <para>We have to exclude the comma for the last node of the
                  recipient list. If we have e.g. 10 recipients the function
                  <code>position()</code> will return integer values starting
                  at 1 and ending with 10, whereas <code>last()</code> always
                  returns 10. So for the last node the comparison <code>10
                  &lt; 10</code> will evaluate to false:</para>

                  <programlisting>&lt;xsl:for-each select="memo/to"&gt;
  &lt;xsl:value-of select="."/&gt;
  &lt;xsl:if test="position() &amp;lt; last()"&gt;
    &lt;xsl:text&gt;,&lt;/xsl:text&gt;     
  &lt;/xsl:if&gt;
&lt;/xsl:for-each&gt;</programlisting>
                </answer>
              </qandaentry>

              <qandaentry xml:id="example_avoid_xsl_if">
                <question>
                  <label>Avoiding xsl:if</label>

                  <para>In <xref linkend="programlisting_memo_export_sql"/> we
                  used the <abbrev
                  xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev> value
                  <quote>from|to</quote> to select the desired sender and
                  recipient nodes. Inside the <tag
                  class="element">xsl:for-each</tag> block we permitted only
                  those nodes which have an <tag class="attribute">id</tag>
                  attribute. These two steps may be combined into a single
                  <abbrev
                  xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
                  expression, making the <tag class="element">xsl:if</tag>
                  obsolete.</para>
                </question>

                <answer>
                  <para>We simply need a modified <abbrev
                  xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
                  expression in the <tag
                  class="element">xsl:for-each</tag>:</para>

                  <programlisting>&lt;xsl:for-each select="<emphasis
                      role="bold">from[@id]|to[@id]</emphasis>"&gt;
  &lt;xsl:text&gt;INSERT INTO Customer (id, name) VALUES ('&lt;/xsl:text&gt;
  &lt;xsl:value-of select="@id"/&gt;
  &lt;xsl:text&gt;', '&lt;/xsl:text&gt;
  &lt;xsl:value-of select="."/&gt;
  &lt;xsl:text&gt;')&lt;/xsl:text&gt;
  &lt;xsl:value-of select="$newline"/&gt;
&lt;/xsl:for-each&gt;</programlisting>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="xsl_apply_templates">
          <title><tag class="starttag">xsl:apply-templates</tag></title>

          <para>We already used <tag class="element">xsl:for-each</tag> to
          iterate over a list of element nodes. <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> offers a
          different possibility for this purpose. The idea is to define the
          formatting rules at a centralized location. The solution to <xref
          linkend="example_position_last"/> may thus be rewritten in an
          equivalent way:</para>

          <programlisting>&lt;xsl:template match="/"&gt;
  &lt;xsl:apply-templates select="memo/to" <co
              xml:id="programlisting_apply_templates_apply"/>/&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="to" <co xml:id="programlisting_apply_templates_match"/>&gt;
  &lt;xsl:value-of select="."/&gt;
  &lt;xsl:if test="<emphasis role="bold">position()</emphasis> &amp;lt; <emphasis
              role="bold">last()</emphasis>"&gt;
    &lt;xsl:text&gt;,&lt;/xsl:text&gt;     
  &lt;/xsl:if&gt;    
&lt;/xsl:template&gt;</programlisting>

          <calloutlist>
            <callout arearefs="programlisting_apply_templates_apply">
              <para>Definition of the recipient node list. Each element of
              this list shall be processed further.</para>
            </callout>

            <callout arearefs="programlisting_apply_templates_match">
              <para>This template <emphasis>may</emphasis> be used by an XSL
              processor to format nodes of type <tag
              class="starttag">to</tag>. Since the processor is asked to do
              exactly this in <xref
              linkend="programlisting_apply_templates_apply"/> the current
              template will <emphasis>really</emphasis> be used in this
              example.</para>
            </callout>
          </calloutlist>

          <para>The procedure outlined above may have the following
          advantages:</para>

          <itemizedlist>
            <listitem>
              <para>Some elements that are central to a DTD may appear in
              different places. For example a <tag
              class="starttag">title</tag> element is likely to appear as a
              child of chapters, sections, tables, figures and so on. It may
              be sufficient to define a single template with a
              <code>match="title"</code> attribute which contains all the
              required rules.</para>
            </listitem>

            <listitem>
              <para>Sometimes the body of a <tag
              class="starttag">xsl:for-each</tag> ... <tag
              class="endtag">xsl:for-each</tag> spans multiple screens thus
              limiting code readability. Factoring out the body into a
              template may avoid this obstacle.</para>
            </listitem>
          </itemizedlist>
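
          <para>The first point may be sketched as follows. The element names
          <tag class="starttag">book</tag>, <tag
          class="starttag">chapter</tag> and <tag class="starttag">title</tag>
          are merely assumed here for the sake of illustration, and the
          variable <varname>newline</varname> is expected to be defined as
          before. Both calling templates delegate to the same
          <code>match="title"</code> template:</para>

          <programlisting>&lt;xsl:template match="/book"&gt;
  &lt;!-- The book's own title --&gt;
  &lt;xsl:apply-templates select="title"/&gt;
  &lt;xsl:apply-templates select="chapter"/&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="chapter"&gt;
  &lt;!-- A chapter's title: handled by the very same rule --&gt;
  &lt;xsl:apply-templates select="title"/&gt;
&lt;/xsl:template&gt;

&lt;!-- Single rule for all title elements regardless of their parent --&gt;
&lt;xsl:template match="title"&gt;
  &lt;xsl:value-of select="."/&gt;
  &lt;xsl:value-of select="$newline"/&gt;
&lt;/xsl:template&gt;</programlisting>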

          <para>This method is well known from programming languages: If the
          code inside a loop is needed multiple times or reaches a painful
          line count <emphasis>good</emphasis> programmers tend to define a
          separate method. For example:</para>

          <programlisting language="java">for (int i = 0; i &lt; 10; i++){
  if (a[i] &lt; b[i]){
    max[i] = b[i];
  } else {
    max[i] = a[i];
  }
  ...
}</programlisting>

          <para>Inside the loop's body the relative maximum value of two
          variables gets computed. This may be needed at several locations and
          thus it is convenient to centralize this code into a method:</para>

          <programlisting language="java">// cf. &lt;xsl:template match="..."&gt;
static int maximum(int a, int b){
  if (a &lt; b){
    return b;
  } else {
    return a;
  }
}
...
// cf. &lt;xsl:apply-templates select="..."/&gt;
for (int i = 0; i &lt; 10; i++){
  max[i] = maximum(a[i], b[i]);
}</programlisting>

          <para>So far calling a static method in <link
          linkend="gloss_Java"><trademark>Java</trademark></link> may be
          compared to an <tag class="starttag">xsl:apply-templates</tag>.
          There is however one big difference: In <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> the
          <quote>method</quote> being called may not exist at all. An <tag
          class="starttag">xsl:apply-templates</tag> instructs a processor to
          format a set of nodes. It does not contain any information about
          the rules defined to do this job:</para>

          <programlisting>&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
  version="2.0"&gt;
  
  &lt;xsl:output method="text"/&gt;
  
  &lt;xsl:template match="/memo"&gt;
    &lt;xsl:apply-templates <emphasis role="bold">select="content"</emphasis>/&gt;
  &lt;/xsl:template&gt;
  
&lt;/xsl:stylesheet&gt;</programlisting>

          <para>Since no suitable template supplying rules for <tag
          class="starttag">content</tag> nodes exists, an <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> processor uses
          a default formatting rule instead:</para>

          <programlisting><computeroutput>Thanks for your excellent work.Our firewall is definitely
broken! This bug has been reported by the sender.</computeroutput></programlisting>

          <para>We observe that the <code>#PCDATA</code> content strings of
          the element itself and all (recursive) sub elements get glued
          together into one string. In most cases this is definitely not
          intended. Omitting a necessary template is usually a programming
          error. It is thus good programming practice during style sheet
          development to define a special template catching forgotten
          rules:</para>

          <programlisting>&lt;xsl:template match="/memo"&gt;
  &lt;xsl:apply-templates select="content"/&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="*"&gt;
  &lt;xsl:message&gt;
    &lt;xsl:text&gt;Error: No template defined matching element '&lt;/xsl:text&gt;
    &lt;xsl:value-of select="name(.)"/&gt;
    &lt;xsl:text&gt;'&lt;/xsl:text&gt;
  &lt;/xsl:message&gt;
&lt;/xsl:template&gt;</programlisting>

          <para>The <quote>*</quote> matches any element if there is no <link
          xlink:href="http://www.w3.org/TR/xslt20/#conflict">better
          matching</link> rule defined. Since we did not supply any template
          for <tag class="starttag">content</tag> nodes at all this catch-all
          template will match nodes of type <tag
          class="starttag">content</tag>. The function <code>name()</code> is
          predefined in <abbrev
          xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev> and returns
          the element type name of a node. During the formatting process we
          will now see the following message:</para>

          <programlisting><computeroutput>Error: No template defined matching element 'content'</computeroutput></programlisting>

          <para>We note that for document nodes <tag
          class="starttag">xyz</tag><code>foo</code><tag
          class="endtag">xyz</tag> containing only <code>#PCDATA</code> a
          simple <tag class="emptytag">xsl:apply-templates select="xyz"</tag>
          is sufficient: An <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> processor uses
          its default rule and copies the node's content <code>foo</code> to
          its output.</para>
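
          <para>The default behaviour observed above follows from the built-in
          template rules of the XSLT specification. As a rough sketch a
          processor acts as if the following templates had been defined with
          the lowest possible precedence. The listing is for illustration only
          and need not be added to a style sheet:</para>

          <programlisting>&lt;!-- Elements and the document root: descend into all child nodes --&gt;
&lt;xsl:template match="*|/"&gt;
  &lt;xsl:apply-templates/&gt;
&lt;/xsl:template&gt;

&lt;!-- Text and attribute nodes: copy the string value to the output --&gt;
&lt;xsl:template match="text()|@*"&gt;
  &lt;xsl:value-of select="."/&gt;
&lt;/xsl:template&gt;

&lt;!-- Comments and processing instructions: produce no output --&gt;
&lt;xsl:template match="processing-instruction()|comment()"/&gt;</programlisting>

          <para>Any template defined in a style sheet, including a catch-all
          <code>match="*"</code> rule, takes precedence over these built-in
          rules.</para>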

          <qandaset role="exercise">
            <title>Extending the export to an RDBMS</title>

            <qandadiv>
              <qandaentry xml:id="example_rdbms_person">
                <question>
                  <para>We assume that our RDBMS table <code>Customer</code>
                  from <xref linkend="programlisting_memo_export_sql"/> shall
                  be replaced by a table <code>Person</code>. We expect the
                  senders of memo documents to be employees of a given
                  company. Conversely the recipients of memos are expected to
                  be customers. Our <code>Person</code> table shall have a
                  <quote>tag</quote> like column named <code>type</code>
                  having exactly two allowed values <code>customer</code> or
                  <code>employee</code> being controlled by a
                  <code>CHECK</code> constraint, see <xref
                  linkend="table_person"/>. Create a style sheet generating
                  the necessary SQL statements from a memo document instance.
                  Hint: Define two different templates for <tag
                  class="starttag">from</tag> and <tag
                  class="starttag">to</tag> nodes.</para>
                </question>

                <answer>
                  <para>We define two templates differing only in the static
                  string value for a person's type. The relevant <abbrev
                  xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev>
                  portion reads:<programlisting>&lt;xsl:template match="/memo"&gt;
  &lt;xsl:apply-templates select="from|to"/&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="from"&gt;
  &lt;xsl:text&gt;INSERT INTO Person (name, type) VALUES('&lt;/xsl:text&gt;
  &lt;xsl:value-of select="."/&gt;
  &lt;xsl:text&gt;', <emphasis role="bold">'employee'</emphasis>)&lt;/xsl:text&gt;
  &lt;xsl:value-of select="$newline"/&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="to"&gt;
  &lt;xsl:text&gt;INSERT INTO Person (name, type) VALUES('&lt;/xsl:text&gt;
  &lt;xsl:value-of select="."/&gt;
  &lt;xsl:text&gt;', <emphasis role="bold">'customer'</emphasis>)&lt;/xsl:text&gt;
  &lt;xsl:value-of select="$newline"/&gt;
&lt;/xsl:template&gt;</programlisting></para>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <table xml:id="table_person">
            <title>The Person table</title>

            <?dbhtml table-width="30%" ?>

            <?dbfo table-width="40%" ?>

            <tgroup cols="2">
              <colspec colwidth="3*"/>

              <colspec colwidth="2*"/>

              <thead>
                <row>
                  <entry>name</entry>

                  <entry>type</entry>
                </row>
              </thead>

              <tbody>
                <row>
                  <entry>Martin Goik</entry>

                  <entry>employee</entry>
                </row>

                <row>
                  <entry>Adam Hacker</entry>

                  <entry>customer</entry>
                </row>

                <row>
                  <entry>Eve intruder</entry>

                  <entry>customer</entry>
                </row>
              </tbody>
            </tgroup>
          </table>
        </section>

        <section xml:id="xsl_choose">
          <title><tag class="starttag">xsl:choose</tag></title>

          <para>We already described the <tag class="starttag">xsl:if</tag>
          which can be compared to an <code>if(..){...}</code> statement in
          many programming languages. The <tag
          class="starttag">xsl:choose</tag> element can be compared to a chain
          of <code>if ... else if</code> conditions including an optional
          final <code>else</code> block being reached if all boolean tests
          fail:</para>

          <programlisting language="java">if (condition a){
  ... // block a
} else if (condition b){
  ... // block b
} ...
...
else {
  ... // code being reached when all conditions evaluate to false
}</programlisting>

          <para>We want to generate a list of memo recipient names numbered
          by Roman numerals up to 10. Higher numbers shall be displayed in
          ordinary decimal notation:</para>

          <programlisting><computeroutput>I:Adam Hacker
II:Eve intruder
III: ...
IV: ...
...</computeroutput></programlisting>

          <para>Though <abbrev
          xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> offers <link
          xlink:href="http://www.w3.org/TR/xslt20/#convert">a better
          way</link> we may generate these number literals by:</para>

          <programlisting>&lt;xsl:template match="/memo"&gt;
  &lt;xsl:apply-templates select="to"/&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="to"&gt;
  &lt;xsl:choose&gt;
    &lt;xsl:when test="1 = position()"&gt;I&lt;/xsl:when&gt;
    &lt;xsl:when test="2 = position()"&gt;II&lt;/xsl:when&gt;
    &lt;xsl:when test="3 = position()"&gt;III&lt;/xsl:when&gt;
    &lt;xsl:when test="4 = position()"&gt;IV&lt;/xsl:when&gt;
    &lt;xsl:when test="5 = position()"&gt;V&lt;/xsl:when&gt;
    &lt;xsl:when test="6 = position()"&gt;VI&lt;/xsl:when&gt;
    &lt;xsl:when test="7 = position()"&gt;VII&lt;/xsl:when&gt;
    &lt;xsl:when test="8 = position()"&gt;VIII&lt;/xsl:when&gt;
    &lt;xsl:when test="9 = position()"&gt;IX&lt;/xsl:when&gt;
    &lt;xsl:when test="10 = position()"&gt;X&lt;/xsl:when&gt;
    &lt;xsl:otherwise&gt;
      &lt;xsl:value-of select="position()"/&gt;
    &lt;/xsl:otherwise&gt;
  &lt;/xsl:choose&gt;

  &lt;xsl:text&gt;:&lt;/xsl:text&gt;
  &lt;xsl:value-of select="."/&gt;
  &lt;xsl:value-of select="$newline"/&gt;
&lt;/xsl:template&gt;</programlisting>

          <para>Note that this conversion is incomplete: If the number in
          question is larger than 10 it will be formatted in ordinary decimal
          style according to the <tag class="starttag">xsl:otherwise</tag>
          clause.</para>
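
          <para>The <quote>better way</quote> referred to above is presumably
          the <tag class="element">xsl:number</tag> instruction with its <tag
          class="attribute">format</tag> attribute. A minimal sketch, assuming
          the variable <varname>newline</varname> is defined as before, might
          read:</para>

          <programlisting>&lt;xsl:template match="to"&gt;
  &lt;!-- format="I" requests upper case Roman numerals --&gt;
  &lt;xsl:number value="position()" format="I"/&gt;
  &lt;xsl:text&gt;:&lt;/xsl:text&gt;
  &lt;xsl:value-of select="."/&gt;
  &lt;xsl:value-of select="$newline"/&gt;
&lt;/xsl:template&gt;</programlisting>

          <para>Unlike the <tag class="starttag">xsl:choose</tag> variant this
          sketch does not switch to decimal notation: Positions above 10 are
          rendered as Roman numerals as well.</para>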
        </section>

        <section xml:id="section_html_book">
          <title>A complete HTML formatting example</title>

          <para>We now present a series of exercises showing how to format
          <tag class="starttag">book</tag> document instances to XHTML. This
          is done step by step, each time showing corresponding code snippets
          for our <filename>memo.dtd</filename>.</para>

          <section xml:id="section_memo_to_list">
            <title>Listing the recipients of a memo</title>

            <para>In order to generate an XHTML <link
            xlink:href="http://www.w3.org/TR/html401/struct/lists.html#h-10.2">list</link>
            of all recipients of a <tag class="starttag">memo</tag> we have
            to use <tag class="starttag">xsl:output method="xhtml"</tag>
            and embed the required HTML tags in our <abbrev
            xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> style
            sheet:</para>

            <programlisting>&lt;xsl:output method="xhtml" indent="yes"/&gt;

&lt;xsl:template match="/memo"&gt;
  &lt;html&gt;
    &lt;head&gt;
      &lt;title&gt;Recipient list&lt;/title&gt;
    &lt;/head&gt;
    &lt;body&gt;
      &lt;ul&gt;
        &lt;xsl:apply-templates select="to"/&gt;
      &lt;/ul&gt;
    &lt;/body&gt;
  &lt;/html&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="to"&gt;
  &lt;li&gt;
    &lt;xsl:value-of select="."/&gt;
  &lt;/li&gt;    
&lt;/xsl:template&gt;</programlisting>

            <para>Processing this style sheet for a <tag
            class="starttag">memo</tag> document instance yields:</para>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;html&gt;
   &lt;head&gt;
      &lt;title&gt;Recipient list&lt;/title&gt;
   &lt;/head&gt;
   &lt;body&gt;
      &lt;ul&gt;
         &lt;li&gt;Adam Hacker&lt;/li&gt;
         &lt;li&gt;Eve intruder&lt;/li&gt;
      &lt;/ul&gt;
   &lt;/body&gt;
&lt;/html&gt;</programlisting>

            <para>The generated XHTML code does not contain a reference to a
            DTD. We may supply this reference by modifying our <tag
            class="emptytag">xsl:output</tag> directive:</para>

            <programlisting>&lt;xsl:output method="xhtml" indent="yes"
    <emphasis role="bold">doctype-public</emphasis>="-//W3C//DTD XHTML 1.0 Strict//EN"
    <emphasis role="bold">doctype-system</emphasis>="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/&gt;</programlisting>

            <para>This adds a corresponding document type declaration which
            allows the generated XHTML to be validated:</para>

            <programlisting>&lt;!DOCTYPE html
  PUBLIC "<emphasis role="bold">-//W3C//DTD XHTML 1.0 Strict//EN</emphasis>"
     "<emphasis role="bold">http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd</emphasis>"&gt;
&lt;html&gt;&lt;head&gt; ...</programlisting>

            <para>This may be improved further by instructing the XSL
            formatter to use <uri
            xlink:href="http://www.w3.org/1999/xhtml">http://www.w3.org/1999/xhtml</uri>
            as default namespace:</para>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;xsl:stylesheet  <emphasis role="bold">xmlns="http://www.w3.org/1999/xhtml"</emphasis>
                 xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"&gt;

&lt;xsl:output method="xhtml" indent="yes"
    doctype-public="-//W3C//DTD XHTML 1.0 Strict//EN"
    doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"/&gt;
    
    &lt;xsl:template match="/"&gt;
        &lt;html&gt;&lt;head&gt; ...
     &lt;/xsl:template&gt;
...
&lt;/xsl:stylesheet&gt;</programlisting>

            <para>This yields the following output:</para>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html
  PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
     "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"&gt;

&lt;html <emphasis role="bold">xmlns="http://www.w3.org/1999/xhtml"</emphasis>&gt;
   &lt;head&gt; ...
&lt;/html&gt;</programlisting>

            <para>The top level element <tag class="element">html</tag> is now
            declared to belong to the namespace
            <code>http://www.w3.org/1999/xhtml</code>. This default namespace
            declaration is inherited by all inner XHTML elements.</para>

            <qandaset role="exercise">
              <title>Transforming book instances to XHTML</title>

              <qandadiv>
                <qandaentry xml:id="example_xsl_book_1_dtd">
                  <question>
                    <para>Create an <abbrev
                    xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev>
                    style sheet to transform instances of the first version of
                    <link endterm="example_bookDtd"
                    linkend="example_bookDtd">book.dtd</link> (<xref
                    linkend="example_bookDtd"/>) into <uri
                    xlink:href="http://www.w3.org/TR/xhtml1/#a_dtd_XHTML-1.0-Strict">XHTML
                    1.0 strict</uri>.</para>

                    <para>You should first construct an XHTML document
                    <emphasis>manually</emphasis> before coding the XSL. Once
                    you have a <quote>working</quote> XHTML example document
                    create an <abbrev
                    xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev>
                    style sheet which transforms arbitrary
                    <filename>book.dtd</filename> document instances into a
                    corresponding XHTML file.</para>
                  </question>

                  <answer>
                    <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="2.0"&gt;
  
  &lt;xsl:output indent="yes" method="xhtml"/&gt;
  
  &lt;xsl:template match="/book"&gt;
    &lt;html&gt;
      &lt;head&gt;
        &lt;title&gt;&lt;xsl:value-of select="title"/&gt;&lt;/title&gt;
      &lt;/head&gt;
      &lt;body&gt;
        &lt;h1&gt;&lt;xsl:value-of select="title"/&gt;&lt;/h1&gt;
        &lt;xsl:apply-templates select="chapter"/&gt;
      &lt;/body&gt;
    &lt;/html&gt;
  &lt;/xsl:template&gt;
  
  &lt;xsl:template match="chapter"&gt;
    &lt;h2&gt;&lt;xsl:value-of select="title"/&gt;&lt;/h2&gt;
    &lt;xsl:apply-templates select="para"/&gt;
  &lt;/xsl:template&gt;

  &lt;xsl:template match="para"&gt;
    &lt;p&gt;&lt;xsl:value-of select="."/&gt;&lt;/p&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;</programlisting>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>
          </section>

          <section xml:id="section_xsl_attribute">
            <title><tag class="starttag">xsl:attribute</tag></title>

            <para>Sometimes we want to set attribute values in a generated XML
            document. For example we might want to set the background color
            <quote>red</quote> if a memo has a priority value of <tag
            class="attvalue">high</tag>:</para>

            <programlisting>&lt;h1 style="background:red"&gt;Firewall problems&lt;/h1&gt;</programlisting>

            <para>Regarding our memo example this may be achieved by:</para>

            <programlisting>&lt;xsl:template match="/memo"&gt;
  &lt;html&gt;
   ...
    &lt;body&gt;
      &lt;xsl:variable name="<emphasis role="bold">messageColor</emphasis>" <co
                xml:id="programlisting_priority_lolor_vardef"/>&gt;
        &lt;xsl:choose&gt;
          &lt;xsl:when test="@priority = 'low'"&gt;green&lt;/xsl:when&gt;
          &lt;xsl:when test="@priority = 'medium'"&gt;yellow&lt;/xsl:when&gt;
          &lt;xsl:when test="@priority = 'high'"&gt;red&lt;/xsl:when&gt;
        &lt;/xsl:choose&gt;
      &lt;/xsl:variable&gt;
      &lt;h1 style="background:{<emphasis role="bold">$messageColor</emphasis>};" <co
                xml:id="programlisting_priority_lolor_usevar"/>&gt;
        &lt;xsl:value-of select="subject"/&gt;
      &lt;/h1&gt;
    &lt;/body&gt;
  &lt;/html&gt;
&lt;/xsl:template&gt;</programlisting>

            <calloutlist>
              <callout arearefs="programlisting_priority_lolor_vardef">
                <para>Definition of a color name depending on the attribute
                <tag class="attribute">priority</tag>'s value. The set of
                possible attribute values (low, medium, high) is mapped to the
                color names (green, yellow, red).</para>
              </callout>

              <callout arearefs="programlisting_priority_lolor_usevar">
                <para>The color variable is used to compose the attribute <tag
                class="attribute">style</tag>'s value. The curly
                <code>{...}</code> braces are part of the <abbrev
                xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev>
                standard's syntax. They are required here to instruct the
                <abbrev xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev>
                processor to substitute the local variable
                <code>messageColor</code>'s value instead of simply copying
                the literal string <quote><code>$messageColor</code></quote>
                itself to the output document e.g. generating <tag
                class="starttag">h1 style =
                "background:$messageColor;"</tag>.</para>
              </callout>
            </calloutlist>

            <para>Instead of constructing an extra variable <abbrev
            xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> offers a
            slightly more compact way for the same purpose. The <tag
            class="starttag">xsl:attribute</tag> element allows us to define
            the name of an attribute to be added together with an attribute
            value specification:</para>

            <programlisting>&lt;xsl:template match="/memo"&gt;
  &lt;html&gt;
   ...
    &lt;body&gt;
      &lt;h1&gt;
        &lt;xsl:attribute name="<emphasis role="bold">style</emphasis>"&gt;
          &lt;xsl:text&gt;background:&lt;/xsl:text&gt;
          &lt;xsl:choose&gt;
            &lt;xsl:when test="@priority = 'low'"&gt;green&lt;/xsl:when&gt;
            &lt;xsl:when test="@priority = 'medium'"&gt;yellow&lt;/xsl:when&gt;
            &lt;xsl:when test="@priority = 'high'"&gt;red&lt;/xsl:when&gt;
          &lt;/xsl:choose&gt;
        &lt;/xsl:attribute&gt;
        &lt;xsl:value-of select="subject"/&gt;
      &lt;/h1&gt;
    &lt;/body&gt;
  &lt;/html&gt;
&lt;/xsl:template&gt;</programlisting>
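
            <para>Should the <tag class="attribute">priority</tag> attribute
            be absent or carry a value not covered by any <tag
            class="starttag">xsl:when</tag> clause, both variants emit an
            empty color value. Whether this can happen depends on the
            attribute's declaration in the DTD. As a sketch, an <tag
            class="starttag">xsl:otherwise</tag> clause may supply a fallback.
            The color <quote>white</quote> is an arbitrary assumption and not
            part of the original example:</para>

            <programlisting>&lt;h1&gt;
  &lt;xsl:attribute name="style"&gt;
    &lt;xsl:text&gt;background:&lt;/xsl:text&gt;
    &lt;xsl:choose&gt;
      &lt;xsl:when test="@priority = 'low'"&gt;green&lt;/xsl:when&gt;
      &lt;xsl:when test="@priority = 'medium'"&gt;yellow&lt;/xsl:when&gt;
      &lt;xsl:when test="@priority = 'high'"&gt;red&lt;/xsl:when&gt;
      &lt;!-- Assumed fallback for missing or unknown priority values --&gt;
      &lt;xsl:otherwise&gt;white&lt;/xsl:otherwise&gt;
    &lt;/xsl:choose&gt;
  &lt;/xsl:attribute&gt;
  &lt;xsl:value-of select="subject"/&gt;
&lt;/h1&gt;</programlisting>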

            <qandaset role="exercise">
              <title>Adding a table of contents (toc)</title>

              <qandadiv>
                <qandaentry xml:id="example_book_toc">
                  <question>
                    <para>For larger document instances it is convenient to
                    add a table of contents to the generated Xhtml document.
                    <!-- We
                  demonstrate the desired result as an <uri
                  xlink:href="src/viewlet/bookhtmltoc/bookhtmltoc_viewlet_swf.html">animation</uri>.--></para>

                    <para>For this exercise you need a unique string value for
                    each <tag class="starttag">chapter</tag> node. If a <tag
                    class="starttag">chapter</tag>'s <tag
                    class="attribute">id</tag> attribute had been declared as
                    <code>#REQUIRED</code> its value would do this job
                    perfectly. Unfortunately you cannot rely on its existence
                    since it is declared to be <code>#IMPLIED</code> and may
                    thus be absent.</para>

                    <para>XSL offers a standard function for this purpose
                    namely <link
                    xlink:href="http://www.w3.org/TR/xslt20/#generate-id">generate-id(...)</link>.
                    In a nutshell this function takes an XML node as an
                    argument (or, when called without arguments, uses the
                    context node) and creates a string value being unique with
                    respect to <emphasis>all</emphasis> other nodes in the
                    document. For a given node the function may be called
                    repeatedly and is guaranteed to always return the same
                    value during the <emphasis>same</emphasis> transformation
                    run. So it suffices to add something like <tag
                    class="starttag">a href="#{generate-id(...)}"</tag> or use
                    it in conjunction with <tag
                    class="starttag">xsl:attribute</tag>.</para>
                  </question>

                  <answer>
                    <para>We use the <code>generate-id()</code> function to
                    create a unique identity string for each chapter node.
                    Since we also want to define links to the table of
                    contents we need another unique string value. It is
                    tempting to simply use a static value like
                    <quote>__toc__</quote> for this purpose. However we cannot
                    be sure that this value does not coincide with one of the
                    <code>generate-id()</code> function's return
                    values.</para>

                    <para>A cleaner solution uses the <tag
                    class="starttag">book</tag> node's generated identity
                    string for this purpose. As stated before this value is
                    definitively unique:</para>

                    <programlisting>&lt;xsl:template match="/book"&gt;
...
    &lt;body&gt;
      &lt;h1&gt;&lt;xsl:value-of select="title"/&gt;&lt;/h1&gt;
      &lt;h2 id="{generate-id(.)}" <co xml:base=""
                        xml:id="programlisting_book_toc_def_toc"/>&gt;Table of contents&lt;/h2&gt;
      &lt;ul&gt;
        &lt;xsl:for-each select="chapter"&gt;
          &lt;li&gt;
            &lt;a href="#{generate-id(.)}" <co xml:base=""
                        xml:id="programlisting_book_toc_ref_chap"/>&gt;&lt;xsl:value-of select="title"/&gt;&lt;/a&gt;
          &lt;/li&gt;
        &lt;/xsl:for-each&gt;
      &lt;/ul&gt;
      &lt;xsl:apply-templates select="chapter"/&gt;
    &lt;/body&gt;
  &lt;/html&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="chapter"&gt;
  &lt;h2 id="{generate-id(.)}" <co xml:base=""
                        xml:id="programlisting_book_toc_def_chap"/>&gt;
    &lt;a href="#{generate-id(/book)}" <co xml:base=""
                        xml:id="programlisting_book_toc_ref_toc"/>&gt;
      &lt;xsl:value-of select="title"/&gt;
    &lt;/a&gt;
  &lt;/h2&gt;
  &lt;xsl:apply-templates select="para"/&gt;
&lt;/xsl:template&gt;
...</programlisting>

                    <calloutlist>
                      <callout arearefs="programlisting_book_toc_def_toc">
                        <para>The current context node is <tag
                        class="starttag">book</tag>. We use it as argument to
                        <code>generate-id()</code> to create a unique identity
                        string.</para>
                      </callout>

                      <callout arearefs="programlisting_book_toc_ref_chap">
                        <para>The <tag class="starttag">xsl:for-each</tag>
                        iterates over all <tag class="starttag">chapter</tag>
                        nodes. We reference the corresponding target nodes
                        being created in <xref
                        linkend="programlisting_book_toc_def_chap"/>.</para>
                      </callout>

                      <callout arearefs="programlisting_book_toc_def_chap">
                        <para>Each <tag class="starttag">chapter</tag>'s
                        heading is supplied with a unique identity string
                        being referenced from <xref
                        linkend="programlisting_book_toc_ref_chap"/>.</para>
                      </callout>

                      <callout arearefs="programlisting_book_toc_ref_toc">
                        <para>Clicking on a chapter's title shall take us back
                        to the table of contents (toc). So we create a
                        hypertext link referencing our toc heading's identity
                        string being defined in <xref
                        linkend="programlisting_book_toc_def_toc"/>.</para>
                      </callout>
                    </calloutlist>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>
          </section>

          <section xml:id="section_xsl_mixed">
            <title>XSL and mixed content</title>

            <para>We come back to our memo example from <xref
            linkend="figure_memo_content_mixed"/> and ask ourselves how to
            format mixed content. In the example the following part of a
            document instance was given:</para>

            <programlisting>&lt;content&gt;The <emphasis role="bold">&lt;url href="http://w3.org/XML"&gt;XML&lt;/url&gt;</emphasis> language
  is <emphasis role="bold">&lt;emphasis&gt;easy&lt;/emphasis&gt;</emphasis> to learn. However you need 
  some <emphasis role="bold">&lt;emphasis&gt;time&lt;/emphasis&gt;</emphasis>.&lt;/content&gt;</programlisting>

            <para>Embedded element nodes have been set to bold style in order
            to distinguish them from <code>#PCDATA</code> text nodes. We may
            also use <xref linkend="figure_memo_content_mixed"/> to help
            understand the formatting process of mixed content. First we
            show what our Xhtml output might look like:</para>

            <programlisting>&lt;p&gt;The <emphasis role="bold">&lt;a href="http://w3.org/XML"&gt;XML&lt;/a&gt;</emphasis> language is <emphasis role="bold">&lt;em&gt;easy&lt;/em&gt;</emphasis> to learn. However you
need some <emphasis role="bold">&lt;em&gt;time&lt;/em&gt;</emphasis>.&lt;/p&gt;</programlisting>

            <para>We start with a first version of an <abbrev
            xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev>
            template:</para>

            <programlisting>  &lt;xsl:template match="content"&gt;
    &lt;p&gt;
      &lt;xsl:value-of select="."/&gt;
    &lt;/p&gt;
  &lt;/xsl:template&gt;</programlisting>

            <para>As mentioned earlier all <code>#PCDATA</code> text nodes of
            the whole subtree are glued together leading to:</para>

            <programlisting>&lt;p&gt;The XML language is easy to learn. However you need some time.&lt;/p&gt;</programlisting>

            <para>Our next attempt is to define templates to format the
            elements <tag class="starttag">url</tag> and <tag
            class="starttag">emphasis</tag>:</para>

            <programlisting>...
&lt;xsl:template match="content"&gt;
  &lt;p&gt;
    &lt;xsl:apply-templates select="emphasis|url"/&gt;
  &lt;/p&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="url"&gt;
  &lt;a href="{@href}"&gt;&lt;xsl:value-of select="."/&gt;&lt;/a&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="emphasis"&gt;
  &lt;em&gt;&lt;xsl:value-of select="."/&gt;&lt;/em&gt;
&lt;/xsl:template&gt;
...</programlisting>

            <para>As expected the sub elements are formatted correctly.
            Unfortunately the <code>#PCDATA</code> text nodes between the
            element nodes are lost:</para>

            <programlisting>&lt;p&gt;
  &lt;a href="http://w3.org/XML"&gt;XML&lt;/a&gt;
  &lt;em&gt;easy&lt;/em&gt;
  &lt;em&gt;time&lt;/em&gt;
&lt;/p&gt;</programlisting>

            <para>To correct this transformation script we have to tell the
            formatting processor to include bare text nodes into the output.
            The <abbrev xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
            standard defines the node test <link
            xlink:href="http://www.w3.org/TR/xpath#path-abbrev">text()</link>
            for this purpose. Despite its function-like syntax it is not a
            function but a node test. Used as a location step it selects all
            text node children of the context node:</para>

            <programlisting>...
&lt;xsl:template match="content"&gt;
 &lt;p&gt;
   &lt;xsl:apply-templates select="<emphasis role="bold">text()</emphasis>|emphasis|url"/&gt;
 &lt;/p&gt;
&lt;/xsl:template&gt;
...</programlisting>

            <para>This yields the desired output. The text nodes copied to the
            result are shown in bold style:</para>

            <programlisting>&lt;p&gt;<emphasis role="bold">The </emphasis>&lt;a href="http://w3.org/XML"&gt;XML&lt;/a&gt;<emphasis
                role="bold"> language is </emphasis>&lt;em&gt;easy&lt;/em&gt;<emphasis
                role="bold"> to learn. However
you need some </emphasis>&lt;em&gt;time&lt;/em&gt;<emphasis role="bold">.</emphasis>&lt;/p&gt;</programlisting>

            <para>Some remarks:</para>

            <orderedlist>
              <listitem>
                <para>The <abbrev
                xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
                expression <code>select="text()|emphasis|url"</code>
                corresponds nicely to the content model definition in the
                DTD:</para>

                <programlisting>&lt;!ELEMENT content  (#PCDATA|emphasis|url)*&gt;</programlisting>
              </listitem>

              <listitem>
                <para>In most mixed content models <emphasis>all</emphasis>
                sub elements of e.g. <tag class="starttag"
                role="">content</tag> have to be formatted. During development
                some of the elements defined in a DTD are likely to be omitted
                by accident. For this reason the <quote>typical</quote>
                <abbrev xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
                expression acting on mixed content models is defined to match
                <emphasis>any</emphasis> sub element node:</para>

                <programlisting>select="text()|<emphasis role="bold">*</emphasis>"</programlisting>
              </listitem>

              <listitem>
                <para>Regarding <code>select="text()|emphasis|url"</code> we
                have defined two templates for element nodes <tag
                class="starttag">emphasis</tag> and <tag
                class="starttag">url</tag>. What happens to the text nodes
                selected by <code>text()</code>? These are subject to a
                built-in default rule: The content of bare text nodes is
                written to the output. We may however override this default
                rule by adding a template:</para>

                <programlisting>&lt;xsl:template match="text()"&gt;
  <emphasis role="bold">&lt;span style="color:red"&gt;
    &lt;xsl:value-of select="."/&gt;
  &lt;/span&gt;</emphasis>
&lt;/xsl:template&gt;</programlisting>

                <para>This yields:</para>

                <programlisting>&lt;p&gt;
   <emphasis role="bold">&lt;span style="color:red"&gt;The &lt;/span&gt;</emphasis>
   &lt;a href="http://w3.org/XML"&gt;XML&lt;/a&gt;
   <emphasis role="bold">&lt;span style="color:red"&gt; language is &lt;/span&gt;</emphasis>
   &lt;em&gt;easy&lt;/em&gt;
   <emphasis role="bold">&lt;span style="color:red"&gt; to learn. However you need some &lt;/span&gt;</emphasis>
   &lt;em&gt;time&lt;/em&gt;
   <emphasis role="bold">&lt;span style="color:red"&gt;.&lt;/span&gt;</emphasis>
&lt;/p&gt;</programlisting>

                <para>In most cases it is not desirable to reformat all text
                nodes throughout the whole document. In the current example we
                might only want to format text nodes being
                <emphasis>immediate</emphasis> children of <tag
                class="starttag">content</tag>. This may be achieved by
                restricting the <abbrev
                xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
                match pattern: <tag class="starttag">xsl:template
                match="content/text()"</tag>.</para>
              </listitem>
            </orderedlist>
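
            <para>The restriction to immediate children can be sketched as
            follows. This is a minimal illustration assuming the memo's <tag
            class="starttag">content</tag> model from above; the <tag
            class="starttag">emphasis</tag> template is changed to use <tag
            class="emptytag">xsl:apply-templates</tag> merely to show that
            nested text nodes still reach the built-in rule:</para>

            <programlisting>&lt;xsl:template match="content"&gt;
  &lt;p&gt;&lt;xsl:apply-templates select="text()|*"/&gt;&lt;/p&gt;
&lt;/xsl:template&gt;

&lt;!-- Matches only text nodes whose parent is a content element --&gt;
&lt;xsl:template match="content/text()"&gt;
  &lt;span style="color:red"&gt;&lt;xsl:value-of select="."/&gt;&lt;/span&gt;
&lt;/xsl:template&gt;

&lt;!-- Text inside emphasis has emphasis as its parent: the pattern
     above does not match, so the built-in rule copies it unchanged --&gt;
&lt;xsl:template match="emphasis"&gt;
  &lt;em&gt;&lt;xsl:apply-templates select="text()|*"/&gt;&lt;/em&gt;
&lt;/xsl:template&gt;</programlisting>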
          </section>

          <section xml:id="section_xsl_functionid">
            <title>The function <code>id()</code></title>

            <para>In <abbrev
            xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> we sometimes
            want to look up nodes by an attribute value of type <link
            linkend="section_id_idref">ID</link>. We consider our product
            catalog from <xref linkend="figure_intern_reference_xml"/>. The
            following <abbrev
            xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> may be used
            to create Xhtml documents from <tag class="starttag">catalog</tag>
            instances:</para>

            <programlisting xml:lang="">&lt;xsl:template match="/catalog"&gt;
  &lt;html&gt;
    &lt;head&gt;&lt;title&gt;Product catalog&lt;/title&gt;&lt;/head&gt;
    &lt;body&gt;
      &lt;h1&gt;List of Products&lt;/h1&gt;
      &lt;xsl:apply-templates select="product"/&gt;
    &lt;/body&gt;
  &lt;/html&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="product"&gt;
  &lt;h2 id="{@id}" <co xml:base=""
                xml:id="programlisting_catalog2html_v1_defid"/>&gt;&lt;xsl:value-of select="title"/&gt;&lt;/h2&gt;
  &lt;xsl:apply-templates select="para"/&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="para"&gt;
  &lt;p&gt;&lt;xsl:apply-templates select="text()|*" <co
                xml:id="programlisting_catalog2html_v1_mixed"/>/&gt;&lt;/p&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="link"&gt;
  &lt;a href="#{@ref}" <co xml:id="programlisting_catalog2html_v1_refid"/>&gt;&lt;xsl:value-of select="."/&gt;&lt;/a&gt;
&lt;/xsl:template&gt;</programlisting>

            <calloutlist>
              <callout arearefs="programlisting_catalog2html_v1_defid">
                <para>The <code>ID</code> attribute <tag
                class="starttag">product id="foo"</tag> is unique within the
                document instance. We may thus use it as a unique string
                value in the generated Xhtml, too.</para>
              </callout>

              <callout arearefs="programlisting_catalog2html_v1_mixed">
                <para>Mixed content consisting of text and <tag
                class="starttag">link</tag> nodes.</para>
              </callout>

              <callout arearefs="programlisting_catalog2html_v1_refid">
                <para>We define a file local Xhtml reference to a
                product.</para>
              </callout>
            </calloutlist>

            <para>The <tag class="starttag">para</tag> element from the
            example document instance containing a <tag class="starttag">link
            ref="homeTrainer"</tag> reference will be formatted as:</para>

            <programlisting>&lt;p&gt;If you hate rain look &lt;a href="#homeTrainer"&gt;here&lt;/a&gt;.&lt;/p&gt;</programlisting>

            <para>Now suppose we want to add the product's title
            <emphasis>Home trainer</emphasis> here to give the reader an idea
            about the product without clicking the hypertext link:</para>

            <programlisting>&lt;p&gt;If you hate rain look &lt;a href="#homeTrainer"&gt;here&lt;/a&gt; <emphasis
                role="bold">(Home trainer)</emphasis>.&lt;/p&gt;</programlisting>

            <para>This title text node is part of the <tag
            class="starttag">product</tag> node being referenced from the
            current <tag class="starttag">para</tag>:</para>

            <figure xml:id="linkIdrefProduct">
              <title>A graphical representation of our <tag
              class="starttag">catalog</tag>.</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/xsl_id.fig"/>
                </imageobject>

                <caption>
                  <para>The dashed line shows the <code>IDREF</code> based
                  reference from the <tag class="starttag">link</tag> to the
                  <tag class="starttag">product</tag> node.</para>
                </caption>
              </mediaobject>
            </figure>

            <para>In <abbrev
            xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> we may
            follow an <code>IDREF</code> reference by means of the built-in function
            <link
            xlink:href="http://www.w3.org/TR/xpath#function-id">id(...)</link>:</para>

            <programlisting>&lt;xsl:template match="link"&gt;
  &lt;a href="#{@ref}"&gt;&lt;xsl:value-of select="."/&gt;&lt;/a&gt;
  &lt;xsl:text&gt; (&lt;/xsl:text&gt;
  &lt;xsl:value-of select="<emphasis role="bold">id(@ref)</emphasis>/title" <co
                xml:id="programlisting_xsl_id_follow"/>/&gt;
  &lt;xsl:text&gt;)&lt;/xsl:text&gt;
&lt;/xsl:template&gt;</programlisting>

            <para>Evaluating <code>id(@ref)</code> at <xref
            linkend="programlisting_xsl_id_follow"/> returns the <tag
            class="starttag">product</tag> <emphasis>node</emphasis> whose
            <code>ID</code> attribute equals the value of <code>@ref</code>.
            We simply take its <tag class="starttag">title</tag> value and
            embed it into a pair of parentheses. This way the desired text
            portion <emphasis role="bold">(Home trainer)</emphasis> gets added
            after the hypertext link.</para>
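
            <para>The <code>id()</code> function relies on the parser having
            read the DTD that declares the attribute to be of type
            <code>ID</code>. If no matching element is found it returns an
            empty node set. A defensive variant of the template might
            therefore look like the following sketch, which only adds the
            parenthesized title if the lookup succeeds:</para>

            <programlisting>&lt;xsl:template match="link"&gt;
  &lt;a href="#{@ref}"&gt;&lt;xsl:value-of select="."/&gt;&lt;/a&gt;
  &lt;!-- An empty node set converts to false: nothing is appended
       if the referenced product or its title cannot be found --&gt;
  &lt;xsl:if test="id(@ref)/title"&gt;
    &lt;xsl:text&gt; (&lt;/xsl:text&gt;
    &lt;xsl:value-of select="id(@ref)/title"/&gt;
    &lt;xsl:text&gt;)&lt;/xsl:text&gt;
  &lt;/xsl:if&gt;
&lt;/xsl:template&gt;</programlisting>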

            <qandaset role="exercise">
              <title>Extending the book style sheet by mixed content and
              itemized lists</title>

              <qandadiv>
                <qandaentry xml:id="example_book_xsl_mixed">
                  <question>
                    <para>In <xref linkend="example_book.dtd_v5"/> we
                    constructed a DTD allowing itemized lists and mixed
                    content for <tag class="starttag">book</tag> instances.
                    This DTD also allows <tag
                    class="starttag">emphasis</tag>, <tag
                    class="starttag">table</tag> and <tag
                    class="starttag">link</tag> elements to be part of a mixed
                    content definition. Extend the current book2html.xsl to
                    account for these extensions.</para>

                    <para
                    xlink:href="http://www.w3.org/TR/xslt20/#element-copy-of">As
                    we already saw in our memo example itemized lists in Xhtml
                    are represented by the element <tag
                    class="starttag">ul</tag> containing <tag
                    class="starttag">li</tag> elements. Since <tag
                    class="starttag">p</tag> elements are also allowed to
                    appear as children our itemized lists can be easily mapped
                    to Xhtml tags. A <tag class="starttag">link</tag> node may
                    be transformed into an <tag class="starttag">a
                    href="..."</tag> Xhtml node.</para>

                    <para>The table model is a simplified version of the Xhtml
                    table model. Read the <abbrev
                    xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev>
                    documentation of the element <tag
                    class="emptytag">xsl:copy-of</tag> at <link
                    xlink:href="http://www.w3.org/TR/xslt20/#element-copy-of">copy-of</link>
                    for processing tables.</para>
                  </question>

                  <answer>
                    <para>The full source code of the solution is available at
                    <link
                    xlink:href="Ref/src/Dtd/book/v5/book2html.1.xsl">(Online
                    HTML version) ... book2html.1.xsl</link>. We discuss some
                    important aspects. The following table provides mapping
                    rules from <filename>book.dtd</filename> to Xhtml:</para>

                    <table xml:id="table_book2xhtml_element_mappings">
                      <title>Mapping elements from
                      <filename>book.dtd</filename> to Xhtml</title>

                      <?dbhtml table-width="50%" ?>

                      <?dbfo table-width="50%" ?>

                      <tgroup cols="2">
                        <colspec colwidth="3*"/>

                        <colspec colwidth="2*"/>

                        <thead>
                          <row>
                            <entry>book.dtd</entry>

                            <entry>Xhtml</entry>
                          </row>
                        </thead>

                        <tbody>
                          <row>
                            <entry><tag class="starttag">book</tag>/<tag
                            class="starttag">title</tag></entry>

                            <entry><tag class="starttag">h1</tag></entry>
                          </row>

                          <row>
                            <entry><tag class="starttag">chapter</tag>/<tag
                            class="starttag">title</tag></entry>

                            <entry><tag class="starttag">h2</tag></entry>
                          </row>

                          <row>
                            <entry><tag class="starttag">para</tag> (mixed
                            content)</entry>

                            <entry><tag class="starttag">p</tag></entry>
                          </row>

                          <row>
                            <entry><tag class="starttag">link
                            href="foo"</tag></entry>

                            <entry><tag class="starttag">a
                            href="foo"</tag></entry>
                          </row>

                          <row>
                            <entry><tag class="starttag">emphasis</tag></entry>

                            <entry><tag class="starttag">em</tag></entry>
                          </row>

                          <row>
                            <entry><tag
                            class="starttag">itemizedlist</tag></entry>

                            <entry><tag class="starttag">ul</tag></entry>
                          </row>

                          <row>
                            <entry><tag class="starttag">listitem</tag></entry>

                            <entry><tag class="starttag">li</tag></entry>
                          </row>

                          <row>
                            <entry><tag class="starttag">table</tag>, <tag
                            class="starttag">caption</tag>, <tag
                            class="starttag">tr</tag>, <tag
                            class="starttag">td</tag> along with all
                            attributes</entry>

                            <entry>Identity copy</entry>
                          </row>
                        </tbody>
                      </tgroup>
                    </table>

                    <para>Since our table model is a subset of the HTML table
                    model we may simply copy corresponding nodes to the
                    output:</para>

                    <programlisting>&lt;xsl:template match="table"&gt;
  &lt;xsl:copy-of select="."/&gt;
&lt;/xsl:template&gt;</programlisting>

                    <para>Next we need rules for itemized lists and
                    paragraphs. Our model already implements lists in a way
                    that closely resembles XHTML lists. Since the structures
                    are compatible we only have to provide a mapping:</para>

                    <programlisting>&lt;xsl:template match="para"&gt;
  &lt;p id="{generate-id(.)}"&gt;&lt;xsl:apply-templates select="text()|*" /&gt;&lt;/p&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="itemizedlist"&gt;
  &lt;ul&gt;&lt;xsl:apply-templates select="listitem"/&gt;&lt;/ul&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="listitem"&gt;
  &lt;li&gt;&lt;xsl:apply-templates select="*"/&gt;&lt;/li&gt;
&lt;/xsl:template&gt;</programlisting>

                    <para>Since <emphasis>all</emphasis> chapters are
                    reachable via hypertext links from the table of contents
                    we <emphasis>must</emphasis> supply a unique
                    <code>id</code> value <xref
                    linkend="programlisting_book2html_single_chapterid"/> for
                    <emphasis>all</emphasis> of them. Chapters and paragraphs
                    may be referenced by <tag class="starttag">link</tag>
                    elements and thus <emphasis>both</emphasis> need a unique
                    identity value. For simplicity we create both of them via
                    <code>generate-id()</code>. In a more sophisticated
                    solution the strategy would be slightly different:</para>

                    <itemizedlist>
                      <listitem>
                        <para>If a <tag class="starttag">chapter</tag> node
                        does have an <code>id</code> attribute defined then
                        take its value.</para>
                      </listitem>

                      <listitem>
                        <para>If a <tag class="starttag">chapter</tag> node
                        does <emphasis>not</emphasis> have an <code>id</code>
                        attribute defined then use
                        <code>generate-id()</code>.</para>
                      </listitem>

                      <listitem>
                        <para><tag class="starttag">para</tag> nodes only get
                        an <code>id</code> value in XHTML if they do have an
                        <code>id</code> attribute defined. This is consistent
                        since these nodes are never referenced from the table
                        of contents. Thus an identity is only required if the
                        <tag class="starttag">para</tag> node is referenced by
                        a <tag class="starttag">link</tag>. If that is the
                        case the <tag class="starttag">para</tag> surely does
                        have a defined identity value.</para>
                      </listitem>
                    </itemizedlist>
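
                    <para>This more sophisticated strategy is not part of the
                    solution's source code. It might be sketched as follows,
                    assuming the attribute is named <code>id</code> on both
                    elements. The named template <code>targetId</code> is a
                    hypothetical helper introduced for illustration only, and
                    the link back to the table of contents is omitted for
                    brevity:</para>

                    <programlisting>&lt;!-- Hypothetical helper: the node's own id if present,
     a generated identity string otherwise --&gt;
&lt;xsl:template name="targetId"&gt;
  &lt;xsl:param name="node" select="."/&gt;
  &lt;xsl:choose&gt;
    &lt;xsl:when test="$node/@id"&gt;
      &lt;xsl:value-of select="$node/@id"/&gt;
    &lt;/xsl:when&gt;
    &lt;xsl:otherwise&gt;
      &lt;xsl:value-of select="generate-id($node)"/&gt;
    &lt;/xsl:otherwise&gt;
  &lt;/xsl:choose&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="chapter"&gt;
  &lt;h2&gt;
    &lt;xsl:attribute name="id"&gt;
      &lt;xsl:call-template name="targetId"/&gt;
    &lt;/xsl:attribute&gt;
    &lt;xsl:value-of select="title"/&gt;
  &lt;/h2&gt;
  &lt;xsl:apply-templates select="para|itemizedlist|table"/&gt;
&lt;/xsl:template&gt;

&lt;!-- Paragraphs receive an id only if the source defines one --&gt;
&lt;xsl:template match="para"&gt;
  &lt;p&gt;
    &lt;xsl:if test="@id"&gt;
      &lt;xsl:attribute name="id"&gt;&lt;xsl:value-of select="@id"/&gt;&lt;/xsl:attribute&gt;
    &lt;/xsl:if&gt;
    &lt;xsl:apply-templates select="text()|*"/&gt;
  &lt;/p&gt;
&lt;/xsl:template&gt;</programlisting>

                    <para>One caveat of this sketch: an author-supplied
                    <code>id</code> value could in principle coincide with a
                    string returned by <code>generate-id()</code> for another
                    node, which is one reason the simpler solution generates
                    all identity values.</para>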

                    <para>We also have to provide a hypertext link <xref
                    linkend="programlisting_book2html_single_toclink"/> to the
                    table of contents:</para>

                    <programlisting>&lt;xsl:template match="chapter"&gt;
  &lt;h2 id="{<emphasis role="bold">generate-id(.)</emphasis>}" <co
                        xml:base=""
                        xml:id="programlisting_book2html_single_chapterid"/>&gt;
    &lt;a href="#{<emphasis role="bold">generate-id(/book)</emphasis>}" <co
                        xml:base=""
                        xml:id="programlisting_book2html_single_toclink"/>&gt;&lt;xsl:value-of select="title"/&gt;&lt;/a&gt;
  &lt;/h2&gt;
  &lt;xsl:apply-templates select="para|itemizedlist|table"/&gt;
&lt;/xsl:template&gt;</programlisting>

                    <para>Implementing the <tag class="starttag">link</tag>
                    element is somewhat more complicated. We cannot use the
                    <code>@linkend</code> attribute value itself as <tag
                    class="starttag">a href="..."</tag> attribute value since
                    the target's identity string is generated via
                    <code>generate-id()</code>. But we may follow the
                    reference via the <abbrev
                    xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
                    <link linkend="section_xsl_functionid">id()</link>
                    function and then use the target's identity value:</para>

                    <programlisting>&lt;xsl:template match="link"&gt;
  &lt;a href="#{generate-id(id(@linkend))}"&gt;
    &lt;xsl:value-of select="."/&gt;
  &lt;/a&gt;
&lt;/xsl:template&gt;</programlisting>

                    <para>The call to <code>id(@linkend)</code> returns either
                    a <tag class="starttag">chapter</tag> or a <tag
                    class="starttag">para</tag> node since according to the
                    DTD attributes of type <code>ID</code> are only defined
                    for these two elements. Using this node as input to
                    <code>generate-id()</code> returns the desired identity
                    value for the generated Xhtml.</para>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>
          </section>

          <section xml:id="xslAxis">
            <title>XSL axis definitions</title>

            <para>XSL allows us to traverse a document instance's graph in
            different directions. We start with a memo document
            instance:</para>

            <programlisting>&lt;!DOCTYPE memo SYSTEM "memo.dtd"&gt;
&lt;memo date="9.9.2099"&gt;
  &lt;from&gt;Joe&lt;/from&gt;
  &lt;to&gt;Jack&lt;/to&gt;
  &lt;to&gt;Eve&lt;/to&gt;
  &lt;to&gt;Jude&lt;/to&gt;
  &lt;to&gt;Tolstoi&lt;/to&gt;
  &lt;subject&gt;Ignore me!&lt;/subject&gt;
  &lt;content&gt;
    &lt;para&gt;Dumb text.&lt;/para&gt;
  &lt;/content&gt;
&lt;/memo&gt;</programlisting>

            <para>This instance defines four nodes of type <tag
            class="starttag">to</tag>. For each of these we want to create a
            line of text showing also the preceding and the following
            recipients:</para>

            <programlisting> &lt;----Jack----&gt; Eve Jude Tolstoi <co
                xml:id="programlisting_axis_jack"/>
Jack  &lt;----Eve----&gt; Jude Tolstoi <co xml:id="programlisting_axis_eve"/>
Jack Eve  &lt;----Jude----&gt; Tolstoi <co xml:id="programlisting_axis_jude"/>
Jack Eve Jude  &lt;----Tolstoi----&gt; <co
                xml:id="programlisting_axis_tolstoi"/></programlisting>

            <calloutlist>
              <callout arearefs="programlisting_axis_jack">
                <para>Jack has no predecessor and 3 successors</para>
              </callout>

              <callout arearefs="programlisting_axis_eve">
                <para>Eve has 1 predecessor and 2 successors</para>
              </callout>

              <callout arearefs="programlisting_axis_jude">
                <para>Jude has 2 predecessors and 1 successor</para>
              </callout>

              <callout arearefs="programlisting_axis_tolstoi">
                <para><personname>Tolstoi</personname> has 3 predecessors and
                no successor</para>
              </callout>
            </calloutlist>

            <para>XSL supports this type of transformation by supplying
            <acronym xlink:href="http://www.w3.org/TR/xpath">XPath</acronym>
            axis definitions. We consider a memo document with 9 <tag
            class="starttag">to</tag> nodes:</para>

            <figure xml:id="memo9recipients">
              <title>A memo with 9 recipients</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/memofour.fig"/>
                </imageobject>
              </mediaobject>
            </figure>

            <para>We marked the fourth recipient to represent the context
            node. All three <tag class="starttag">to</tag> nodes to the
            <quote>left</quote> belong to the <emphasis>set</emphasis> of
            preceding siblings with respect to the context node. Likewise the
            five neighbours to the right are called following siblings.
            Returning to our <quote>four recipient</quote> example we may
            create the desired output by:</para>

            <programlisting>&lt;xsl:template match="/"&gt;
  &lt;xsl:apply-templates select="memo/to"/&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="to"&gt;

  &lt;xsl:for-each select="preceding-sibling::to" <co
                xml:id="programlisting_memo_four_xsl_preceding"/>&gt;
    &lt;xsl:value-of select="."/&gt;
    &lt;xsl:text&gt; &lt;/xsl:text&gt;
  &lt;/xsl:for-each&gt;

  &lt;xsl:text&gt; &amp;lt;----&lt;/xsl:text&gt;
  &lt;xsl:value-of select="."/&gt; <co
                xml:id="programlisting_memo_four_xsl_context"/>
  &lt;xsl:text&gt;----&amp;gt; &lt;/xsl:text&gt;

  &lt;xsl:for-each select="following-sibling::to"&gt; <co
                xml:id="programlisting_memo_four_xsl_following"/>
    &lt;xsl:value-of select="."/&gt;
    &lt;xsl:text&gt; &lt;/xsl:text&gt;
  &lt;/xsl:for-each&gt;
  &lt;xsl:value-of select="$newline"/&gt;
&lt;/xsl:template&gt;</programlisting>

            <calloutlist>
              <callout arearefs="programlisting_memo_four_xsl_preceding">
                <para>Iterate over the set of recipients <quote>left</quote>
                of the context node.</para>
              </callout>

              <callout arearefs="programlisting_memo_four_xsl_context">
                <para>Output the context node's value embedded in
                <code>&lt;---- ... ----&gt;</code>.</para>
              </callout>

              <callout arearefs="programlisting_memo_four_xsl_following">
                <para>Iterate over the set of recipients <quote>right</quote>
                of the context node.</para>
              </callout>
            </calloutlist>

            <para>More formally, the set of preceding siblings is defined to
            be the set of all nodes having the same parent as the context
            node and appearing <quote>before</quote> the context node. The
            notion <quote>before</quote> refers to document order, i.e. the
            order in which nodes are visited by a <link
            xlink:href="http://en.wikipedia.org/wiki/Depth-first_search">depth-first</link>
            traversal of the document tree. <abbrev
            xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev> provides
            further axis definitions, see <uri
            xlink:href="http://www.w3.org/TR/xpath#axes">http://www.w3.org/TR/xpath#axes</uri>
            for details. We provide an illustration here:</para>

            <figure xml:id="disjointAxeSets">
              <title>Disjoint <acronym
              xlink:href="http://www.w3.org/TR/xpath">XPath</acronym> axis
              definitions.</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/preceding.fig"/>
                </imageobject>

                <caption>
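                  <para>The sets defined by <code>ancestor</code>,
                  <code>descendant</code>, <code>following</code>,
                  <code>preceding</code> and <code>self</code> are disjoint.
                  Ignoring attribute and namespace nodes, their union forms
                  the set of all document nodes.</para>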
                </caption>
              </mediaobject>
            </figure>

            <para>Some remarks:<itemizedlist>
                <listitem>
                  <para>If the context node is already the topmost node, i.e.
                  the root node, then the sets defined by
                  <code>ancestor</code> and <code>parent</code> are
                  empty.</para>
                </listitem>

                <listitem>
                  <para>The <code>parent</code> set
                  <emphasis>always</emphasis> contains at most one
                  node.</para>
                </listitem>
              </itemizedlist></para>
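
            <para>The sibling sets described above can also be inspected
            outside of a stylesheet. The following sketch is not part of the
            memo example itself: It assumes a memo with nine recipients
            carrying the made-up names <code>recipient1</code> to
            <code>recipient9</code>, builds this document by means of the
            <link linkend="gloss_Java"><trademark>Java</trademark></link> DOM
            API and counts both sibling sets with the fourth <tag
            class="starttag">to</tag> acting as context node. The class
            <classname>SiblingAxes</classname> and its two helper methods are
            hypothetical names chosen for illustration:</para>

            <programlisting language="java">import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPath;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

class SiblingAxes {

  // Builds a memo document holding the given number of 'to' children.
  // The recipient names are invented for this sketch.
  static Document memo(int recipients) throws Exception {
    Document doc = DocumentBuilderFactory.newInstance()
        .newDocumentBuilder().newDocument();
    Element memo = doc.createElement("memo");
    doc.appendChild(memo);
    for (int i = 0; i != recipients; i++) {
      Element to = doc.createElement("to");
      to.setTextContent("recipient" + (i + 1));
      memo.appendChild(to);
    }
    return doc;
  }

  // Counts the 'to' nodes found on the given axis, using the
  // recipient at the given (1-based) position as context node.
  static int count(Document doc, int position, String axis) throws Exception {
    XPath xpath = XPathFactory.newInstance().newXPath();
    Number result = (Number) xpath.evaluate(
        "count(/memo/to[" + position + "]/" + axis + "::to)",
        doc, XPathConstants.NUMBER);
    return result.intValue();
  }

  public static void main(String[] args) throws Exception {
    Document doc = memo(9);
    System.out.println("preceding siblings: " + count(doc, 4, "preceding-sibling"));
    System.out.println("following siblings: " + count(doc, 4, "following-sibling"));
  }
}</programlisting>

            <para>Under these assumptions the program reports three preceding
            and five following siblings, matching the numbers given for the
            marked context node.</para>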
          </section>

          <section xml:id="xslChunking">
            <title>Splitting documents into chunks</title>

            <para>Sometimes we want to generate multiple output documents from
            a single XML source. It may for example be a bad idea to transform
            a book of 200 printed pages into a <emphasis>single</emphasis>
            online HTML page. Instead we may split each chapter into a
            separate HTML file and create navigation links between
            them.</para>

            <para>We consider a memo document instance. We want to generate
            one text file for each memo recipient containing just the
            recipient's name using the <abbrev
            xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> element
            <link
            xlink:href="http://www.w3.org/TR/xslt20/#element-result-document">&lt;xsl:result-document&gt;</link>:</para>

            <programlisting>&lt;xsl:template match="/memo"&gt;
  &lt;xsl:apply-templates select="to"/&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="to"&gt;
  <emphasis role="bold">&lt;xsl:result-document</emphasis> <co xml:id="programlisting_xsl_result_document_main"/>
      <emphasis role="bold">href="file_{position()}.txt"</emphasis> <co xml:id="programlisting_xsl_result_document_href"/>
      <emphasis role="bold">method="text"&gt;</emphasis> <co xml:id="programlisting_xsl_result_document_method"/>
    &lt;xsl:value-of select="."/&gt; <co
                xml:id="programlisting_xsl_result_document_content"/>
  <emphasis role="bold">&lt;/xsl:result-document&gt;</emphasis>
&lt;/xsl:template&gt;</programlisting>

            <calloutlist>
              <callout arearefs="programlisting_xsl_result_document_main">
                <para>The output from all generating <abbrev
                xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev>
                directives will be redirected from standard output to another
                output channel.</para>
              </callout>

              <callout arearefs="programlisting_xsl_result_document_href">
                <para>The output will be written to a file named
                <filename>file_i.txt</filename> with the decimal number
                <code>i</code> ranging from the value 1 to the number of
                recipients.</para>
              </callout>

              <callout arearefs="programlisting_xsl_result_document_method">
                <para>The <code>method</code> attribute may override a value
                given in the <tag class="starttag">xsl:output</tag> element.
                We may also redefine <link
                xlink:href="http://www.w3.org/TR/xslt20/#element-result-document">other
                attributes</link> from <tag class="starttag">xsl:output</tag>
                like <code>doctype-public</code>,
                <code>doctype-system</code> and <code>encoding</code>.</para>
              </callout>

              <callout arearefs="programlisting_xsl_result_document_content">
                <para>All output being generated in this region gets
                redirected to the channel specified in <xref
                linkend="programlisting_xsl_result_document_href"/>.</para>
              </callout>
            </calloutlist>

            <qandaset role="exercise">
              <title>Splitting book into chapter files</title>

              <qandadiv>
                <qandaentry xml:id="example_book_chunk">
                  <question>
                    <para>Extend your solution of <xref
                    linkend="example_book_xsl_mixed"/> by writing each <tag
                    class="starttag">chapter</tag>'s content into a separate
                    XHTML file. In addition create a file
                    <filename>index.html</filename> which contains references
                    to the corresponding <tag class="starttag">chapter</tag>
                    documents. For a document instance with two chapters the
                    overall navigation structure is illustrated by <xref
                    linkend="figure_book_navigation"/>.</para>

                    <para>Implementing the <tag class="starttag">link</tag>
                    tag may cause a problem: An internal link may reference a
                    <tag class="starttag">para</tag>. You need to identify the
                    <tag class="starttag">chapter</tag> node embedding this
                    paragraph. This may be done by using a suitable <abbrev
                    xlink:href="http://www.w3.org/TR/xpath">XPath</abbrev>
                    axis.</para>
                  </question>

                  <answer>
                    <para>The full source code of the solution is available at
                    <link
                    xlink:href="Ref/src/Dtd/book/v5/book2chunks.1.xsl">(Online
                    HTML version) ... book2chunks.1.xsl</link>. First we
                    generate the table of contents as the file
                    <filename>index.html</filename>:</para>

                    <programlisting>&lt;xsl:template match="/"&gt;
  &lt;xsl:result-document href="index.html"&gt;
    &lt;xsl:apply-templates select="book"/&gt;
  &lt;/xsl:result-document&gt;
  
  &lt;xsl:for-each select="book/chapter"&gt;
    &lt;xsl:result-document href="{generate-id(.)}.html"&gt;
      &lt;xsl:apply-templates select="."/&gt;
    &lt;/xsl:result-document&gt;
  &lt;/xsl:for-each&gt;
&lt;/xsl:template&gt;

&lt;xsl:template match="book"&gt;
  &lt;html&gt;
    &lt;head&gt;&lt;title&gt;&lt;xsl:value-of select="title"/&gt;&lt;/title&gt;&lt;/head&gt;
    &lt;body&gt;
      &lt;h1&gt;&lt;xsl:value-of select="title"/&gt;&lt;/h1&gt;
      &lt;h2&gt;Table of contents&lt;/h2&gt;
      &lt;ul&gt;
        &lt;xsl:for-each select="<emphasis role="bold">chapter</emphasis>"&gt;
          &lt;li&gt;&lt;a href="{<emphasis role="bold">generate-id(.)</emphasis>}.html"&gt;&lt;xsl:value-of select="title"/&gt;&lt;/a&gt;&lt;/li&gt;
        &lt;/xsl:for-each&gt;
      &lt;/ul&gt;
    &lt;/body&gt;
  &lt;/html&gt;
&lt;/xsl:template&gt;</programlisting>

                    <para>The <tag class="starttag">link ref="..."</tag> may
                    reference a <tag class="starttag">chapter</tag> or a <tag
                    class="starttag">para</tag>. So we may need to <quote>step
                    up</quote> from a paragraph to the corresponding chapter
                    node:</para>

                    <programlisting>&lt;xsl:template match="link"&gt;
  &lt;xsl:variable name="reftargetNode" select="id(@linkend)"/&gt;
  &lt;xsl:variable name="reftargetParentChapter"
    select="$reftargetNode/ancestor-or-self::chapter"/&gt;
  
  &lt;a href="{generate-id($reftargetParentChapter)}.html#{
    generate-id($reftargetNode)}"&gt;
    &lt;xsl:value-of select="."/&gt;
  &lt;/a&gt;
&lt;/xsl:template&gt;</programlisting>

                    <para>This is consistent since <emphasis>all</emphasis>
                    <tag class="starttag">p</tag> nodes in the generated XHTML
                    receive a unique <code>id</code> value regardless of
                    whether the originating <tag class="starttag">para</tag>
                    node has one.</para>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>

            <figure xml:id="figure_book_navigation">
              <title>A <tag class="starttag">book</tag> document with two
              chapters</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/booknavigate.fig"/>
                </imageobject>
              </mediaobject>
            </figure>
          </section>
        </section>
      </section>
    </chapter>

    <chapter xml:id="introPersistence">
      <title>Accessing Relational Data</title>

      <section xml:id="persistence">
        <title>Persistence in Object Oriented languages</title>

        <para>Following <xref linkend="Bauer05"/> we may define persistence
        by:</para>

        <blockquote>
          <para>persistence allows an object to outlive the process that
          created it. The state of the object may be stored to disk and an
          object with the same state re-created at some point in the
          future.</para>
        </blockquote>

        <para>The notion of <quote>process</quote> refers to operating
        systems. Let us start with a simple example assuming a <link
        linkend="gloss_Java"><trademark>Java</trademark></link> class
        <code>User</code>:</para>

        <programlisting>public class User {
  String cname;  //The user's common name e.g. 'Joe Bix'
  String uid;    //The user's unique system ID (login name) e.g. 'bix'

// getters, setters and other stuff
  ...
}</programlisting>

        <para>A relational implementation might look like:</para>

        <programlisting>CREATE TABLE User(
  cname CHAR(80)
 ,uid CHAR(10) PRIMARY KEY
)</programlisting>

        <para>Now a <link
        linkend="gloss_Java"><trademark>Java</trademark></link> application
        may create instances of class <code>User</code> and save these to a
        database:</para>

        <figure xml:id="processObjPersist">
          <title>Persistence across process boundaries</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/persistence.fig" scale="65"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>Both the <trademark
        xlink:href="http://www.oracle.com/technetwork/java/javase">JRE</trademark>
        instances and the RDBMS database server are processes (or sets of
        processes) typically existing in different address spaces. The two
        <trademark
        xlink:href="http://www.oracle.com/technetwork/java/javase">JRE</trademark>
        processes mentioned here may as well be started in disjoint address
        spaces. In fact we might even run two entirely different applications
        implemented in different programming languages like <abbrev
        xlink:href="http://www.php.net">PHP</abbrev>.</para>

        <para>Note that the two arrows <quote>save</quote> and
        <quote>load</quote> thus typically denote communication across
        machine boundaries.</para>
      </section>

      <section xml:id="jdbcIntro">
        <title>Introduction to <trademark
        xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark></title>

        <section xml:id="jdbcWrite">
          <title>Write access, principles</title>

          <para>Connecting an application to a database means establishing a
          connection from a client to a database server:</para>

          <figure xml:id="jdbcClientServer">
            <title>Networking between clients and database servers</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/clientserv.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>So <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
          is just one among a whole bunch of protocol implementations
          connecting database servers and applications. Consequently
          <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
          is expected to appear in the lower layer of multi-tier applications.
          We take a three-tier application as a starting point:</para>

          <figure xml:id="jdbcThreeTier">
            <title>The role of <trademark
            xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
            in a three-tier application</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/jdbcThreeTier.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>We may add an additional layer. Web applications are typically
          built on top of an application server (<productname
          xlink:href="http://www.ibm.com/software/de/websphere/">WebSphere</productname>,
          <productname
          xlink:href="http://glassfish.java.net">GlassFish</productname>,
          <productname
          xlink:href="http://www.jboss.org/jbossas">JBoss</productname>, ...)
          providing additional services:</para>

          <figure xml:id="jdbcFourTier">
            <title><trademark
            xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
            connecting application server and database.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/jdbcFourTier.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>So what is actually required to connect to a database server?
          A client requires the following parameter values to open a
          connection:</para>

          <orderedlist>
            <listitem xml:id="ItemJdbcProtocol">
              <para>The type of database server, e.g. <productname
              xlink:href="http://www.oracle.com/us/products/database">Oracle</productname>,
              <productname
              xlink:href="http://www.ibm.com/software/data/db2">DB2</productname>,
              <productname
              xlink:href="http://www-01.ibm.com/software/data/informix">Informix</productname>,
              <productname
              xlink:href="http://www.mysql.com">Mysql</productname> etc. This
              information is needed because of vendor dependent <trademark
              xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
              protocol implementations.</para>
            </listitem>

            <listitem>
              <para>The server's <link
              xlink:href="http://en.wikipedia.org/wiki/Domain_Name_System">DNS</link>
              name or IP address.</para>
            </listitem>

            <listitem>
              <para>The database service's port number at the previously
              defined host. The database server process listens for
              connections on this port.</para>
            </listitem>

            <listitem xml:id="itemJdbcDatabaseName">
              <para>The database name within the given database server</para>
            </listitem>

            <listitem>
              <para>Optional: A database user's account name and
              password.</para>
            </listitem>
          </orderedlist>

          <para>Items <xref linkend="ItemJdbcProtocol"/> - <xref
          linkend="itemJdbcDatabaseName"/> will be encapsulated into a
          so-called <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          <link
          xlink:href="http://en.wikipedia.org/wiki/Uniform_Resource_Locator">URL</link>.
          We consider a typical example corresponding to the previous
          parameter list:</para>

          <figure xml:id="jdbcUrlComponents">
            <title>Components of a <trademark
            xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
            URL</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/jdbcurl.fig" scale="65"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>In fact this <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          URL example closely resembles other types of URL strings as
          defined in <uri
          xlink:href="http://www.ietf.org/rfc/rfc2396.txt">http://www.ietf.org/rfc/rfc2396.txt</uri>.
          Look for <code>opaque_part</code> to understand the second
          <quote>:</quote> in the protocol definition part of a <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          URL. Common examples of <abbrev
          xlink:href="http://www.ietf.org/rfc/rfc1738.txt">URL</abbrev>s
          are:</para>

          <itemizedlist>
            <listitem>
              <para><code>http://www.hdm-stuttgart.de/aaa</code></para>
            </listitem>

            <listitem>
              <para><code>http://someserver.com:8080/someResource</code></para>
            </listitem>

            <listitem>
              <para><code>ftp://mirror.mi.hdm-stuttgart.de/Firmen</code></para>
            </listitem>
          </itemizedlist>

          <para>Note the explicit port number 8080 in the second example. The
          default <abbrev
          xlink:href="http://www.w3.org/Protocols">http</abbrev> protocol port
          number is 80. So if a web server accepts connections at port 80 we
          do not have to specify this value. A web browser will automatically
          use this default port.</para>

          <para>Actually the notion <quote><code>jdbc:mysql</code></quote>
          denotes a sub protocol implementation, namely
          <orgname>Mysql</orgname>'s implementation of <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>.
          Connecting to an IBM DB2 server would require
          <code>jdbc:db2</code> for this protocol part.</para>

          <para>In contrast to <abbrev
          xlink:href="http://www.w3.org/Protocols">http</abbrev> no standard
          ports are <quote>officially</quote> assigned for <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          protocol variants. Due to vendor specific implementations a single
          default would not make any sense. Some drivers fall back to a
          vendor specific default port if none is given. We nevertheless
          <emphasis role="bold">always</emphasis> specify the port number
          when opening <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          connections.</para>
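
          <para>The components of such a URL can be taken apart using the
          standard class <classname>java.net.URI</classname>. The following
          is a minimal sketch rather than a general purpose parser: It
          assumes a URL of the form
          <code>jdbc:subprotocol://host:port/database</code> like
          <code>jdbc:mysql://localhost:3306/hdm</code>. Other vendors' URL
          formats may differ. The class <classname>JdbcUrlParts</classname>
          and its method <methodname>parts()</methodname> are hypothetical
          names. The remainder following the leading <code>jdbc:</code> is
          the opaque part mentioned above and is parsed a second time:</para>

          <programlisting language="java">import java.net.URI;

class JdbcUrlParts {

  // Splits a JDBC URL of the assumed form
  // jdbc:subprotocol://host:port/database into its four components.
  static String[] parts(String jdbcUrl) {
    // Outer URI: scheme "jdbc" followed by an opaque remainder
    URI outer = URI.create(jdbcUrl);
    // Inner URI: e.g. mysql://localhost:3306/hdm
    URI inner = URI.create(outer.getSchemeSpecificPart());
    return new String[] {
        outer.getScheme() + ":" + inner.getScheme(), // protocol and sub protocol
        inner.getHost(),                             // DNS name or IP address
        String.valueOf(inner.getPort()),             // port number
        inner.getPath().substring(1)                 // database name without '/'
    };
  }

  public static void main(String[] args) {
    for (String part : parts("jdbc:mysql://localhost:3306/hdm")) {
      System.out.println(part);
    }
  }
}</programlisting>

          <para>For the given URL the four lines of output read
          <code>jdbc:mysql</code>, <code>localhost</code>, <code>3306</code>
          and <code>hdm</code>.</para>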

          <para>Writing <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          based applications follows a simple scheme:</para>

          <figure xml:id="jdbcArchitecture">
            <title>Architecture of JDBC</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/jdbcarch.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>From a programmer's point of view the
          <classname>java.sql.DriverManager</classname> class serves as a
          bootstrapping entry point: Other objects like
          <classname>java.sql.Statement</classname> instances are created,
          directly or indirectly, starting from this central class.</para>

          <para>The first instance being created by the
          <classname>java.sql.DriverManager</classname> is an object of type
          <classname>java.sql.Connection</classname>. In <xref
          linkend="exerciseJdbcWhyInterface"/> we discuss the way vendor
          specific implementation details are hidden by interfaces. We can
          distinguish between:</para>

          <orderedlist>
            <listitem>
              <para>Vendor neutral parts of a <trademark
              xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
              environment. These are the components shipped by Oracle or
              other organizations providing <link
              linkend="gloss_Java"><trademark>Java</trademark></link>
              runtimes. The class
              <classname>java.sql.DriverManager</classname> belongs to this
              domain.</para>
            </listitem>

            <listitem>
              <para>Vendor specific parts. In <xref
              linkend="jdbcArchitecture"/> this starts with the
              <classname>java.sql.Connection</classname> object.</para>
            </listitem>
          </orderedlist>

          <para>The <classname>java.sql.Connection</classname> object thus
          marks the boundary between a <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase">JDK</trademark>
          / <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase">JRE</trademark>
          and a <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
          Driver implementation from e.g. Oracle or other institutions.</para>

          <para><xref linkend="jdbcArchitecture"/> does not show details about
          the relations between <classname>java.sql.Connection</classname>,
          <classname>java.sql.Statement</classname> and
          <classname>java.sql.ResultSet</classname> objects. We start by
          giving a rough description of the tasks and responsibilities these
          three types have:</para>

          <glosslist>
            <glossentry>
              <glossterm><classname>java.sql.Connection</classname></glossterm>

              <glossdef>
                <para>Holding a permanent connection to a database server.
                Both client and server can contact each other. The database
                server may for example terminate a transaction if problems
                like deadlocks occur.</para>
              </glossdef>
            </glossentry>

            <glossentry>
              <glossterm><classname>java.sql.Statement</classname></glossterm>

              <glossdef>
                <para>We have two distinct classes of actions:</para>

                <orderedlist>
                  <listitem>
                    <para>Instructions to modify data on the database server.
                    These include <code>INSERT</code>, <code>UPDATE</code> and
                    <code>DELETE</code> operations as far as
                    <abbrev>SQL-DML</abbrev> is concerned. <trademark
                    xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
                    acts as a means of transport and merely returns integer
                    values to the client, like the number of rows affected by
                    an <code>UPDATE</code>.</para>
                  </listitem>

                  <listitem>
                    <para>Instructions reading data from the server. This is
                    done by sending <code>SELECT</code> statements. It is not
                    sufficient to just return integer values: Instead
                    <trademark
                    xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
                    needs to copy complete datasets back to the client to fill
                    containers accessible by applications. This is discussed
                    in <xref linkend="jdbcRead"/>.</para>
                  </listitem>
                </orderedlist>
              </glossdef>
            </glossentry>
          </glosslist>

          <para>We shed some light on the relationship between these important
          <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
          components and their respective creation:<figure
              xml:id="jdbcObjectCreation">
              <title>Important <trademark
              xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
              instances and relationships.</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/jdbcObjectRelation.fig"/>
                </imageobject>
              </mediaobject>
            </figure></para>
        </section>

        <section xml:id="writeAccessCoding">
          <title>Write access, coding!</title>

          <para>So how does it actually work with respect to coding? You may
          want to read <xref linkend="toolingConfigJdbc"/> before starting
          your exercises. We first prepare a database table using Eclipse's
          database tools:</para>

          <figure xml:id="figSchemaPerson">
            <title>A relation <code>Person</code> containing names and email
            addresses</title>

            <programlisting><emphasis role="strong">CREATE</emphasis> <emphasis
                role="strong">TABLE</emphasis> Person (
   name CHAR(20) 
  ,email CHAR(20) <emphasis>UNIQUE</emphasis>)</programlisting>
          </figure>

          <para>Our actual (toy) <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          application will insert a single object ('Jim', 'jim@foo.org') into
          the <code>Person</code> relation. This is simpler than reading data
          since no client <classname>java.sql.ResultSet</classname> container
          is needed:</para>

          <figure xml:id="figJdbcSimpleWrite">
            <title>A simple <trademark
            xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
            application inserting data into a relational table.</title>

            <programlisting language="java">01  package sda.jdbc.intro.v1;
02  
03  import java.sql.Connection;
04  import java.sql.DriverManager;
05  import java.sql.SQLException;
06  import java.sql.Statement;
07  
08  public class SimpleInsert {
09  
10     public static void main(String[] args) throws SQLException {
11        // Step 1: Open a connection to the database server
12        final Connection conn = DriverManager.getConnection(
13              "jdbc:mysql://localhost:3306/hdm", "hdmuser", "XYZ");
14        // Step 2: Create a Statement instance
15        final Statement stmt = conn.createStatement();
16        // Step 3: Execute the desired INSERT
17        final int updateCount = stmt.executeUpdate(
18                 "INSERT INTO Person VALUES('Jim', 'jim@foo.org')");
19        // Step 4: Give feedback to the enduser
20        System.out.println("Successfully inserted " + updateCount + " dataset(s)");
21     }
22  }</programlisting>
          </figure>

          <para>Looks simple? Unfortunately it does not (yet) work:</para>

          <programlisting>Exception in thread "main" java.sql.SQLException: <emphasis
              role="bold">
             No suitable driver found for jdbc:mysql://localhost:3306/hdm</emphasis>
  at java.sql.DriverManager.getConnection(DriverManager.java:604)
  at java.sql.DriverManager.getConnection(DriverManager.java:221)
  at sda.jdbc.intro.SimpleInsert.main(SimpleInsert.java:12)</programlisting>

          <para>What's wrong here? In <xref linkend="figureConfigJdbcDriver"/>
          we needed a <productname
          xlink:href="http://www.mysql.com">Mysql</productname> <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          Driver implementation <filename>mysql-connector-java.jar</filename>
          as a prerequisite to open connections to a database server. This
          implementation is mandatory for our toy application as well. All we
          have to do is add <filename>mysql-connector-java.jar</filename>
          to our <link linkend="gloss_Java"><trademark>Java</trademark></link>
          <varname>CLASSPATH</varname> at <emphasis
          role="bold">runtime</emphasis>.</para>

          <para>Depending on our <link
          linkend="gloss_Java"><trademark>Java</trademark></link> environment
          this will be achieved by different means. Eclipse requires the
          definition of a run configuration as described in <uri
          xlink:href="http://help.eclipse.org/juno/index.jsp?topic=/org.eclipse.jdt.doc.user/tasks/tasks-java-local-configuration.htm">http://help.eclipse.org/juno/index.jsp?topic=/org.eclipse.jdt.doc.user/tasks/tasks-java-local-configuration.htm</uri>.
          When creating a run configuration for
          <classname>sda.jdbc.intro.v1.SimpleInsert</classname> we have to add
          <filename>mysql-connector-java.jar</filename> on the
          <varname>Classpath</varname> tab. The following screenshot shows a
          working configuration:</para>

          <figure xml:id="figureConfigRunExtJar">
            <title>Creating an Eclipse run time configuration containing a
            <productname xlink:href="http://www.mysql.com">Mysql</productname>
            <trademark
            xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
            Driver Jar marked red.</title>

            <screenshot>
              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Screen/runConfigJarAnnot.screen.png"
                             scale="70"/>
                </imageobject>
              </mediaobject>
            </screenshot>
          </figure>

          <para>This time execution works as expected:</para>

          <programlisting>Successfully inserted 1 dataset(s)</programlisting>
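
          <para>Whether a suitable driver is available may also be checked
          programmatically before trying to connect. The following sketch
          relies on the standard method
          <methodname>DriverManager.getDriver(String)</methodname>, which
          throws a <classname>java.sql.SQLException</classname> if no
          registered driver accepts the given URL. No connection is opened,
          so no database server is needed. The class
          <classname>DriverCheck</classname> and its method
          <methodname>driverAvailable()</methodname> are hypothetical names
          used for illustration only:</para>

          <programlisting language="java">import java.sql.DriverManager;
import java.sql.SQLException;

class DriverCheck {

  // Returns true if a registered JDBC driver accepts the given URL.
  // No connection is opened, so no database server is required.
  static boolean driverAvailable(String jdbcUrl) {
    try {
      DriverManager.getDriver(jdbcUrl);
      return true;
    } catch (SQLException e) {
      // Thrown if no registered driver understands the URL
      return false;
    }
  }

  public static void main(String[] args) {
    String url = "jdbc:mysql://localhost:3306/hdm";
    if (driverAvailable(url)) {
      System.out.println("Driver found for " + url);
    } else {
      System.out.println("No driver for " + url + ", check your CLASSPATH");
    }
  }
}</programlisting>

          <para>The result depends on the runtime
          <varname>CLASSPATH</varname>: Without
          <filename>mysql-connector-java.jar</filename> we expect the second
          message to appear.</para>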

          <qandaset role="exercise">
            <title>Exception on inserting objects</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>A second invocation of
                  <classname>sda.jdbc.intro.v1.SimpleInsert</classname> yields
                  the following runtime error. Explain its cause:</para>

                  <programlisting>Exception in thread "main" 
   com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException:
       <emphasis role="bold">Duplicate entry 'jim@foo.org' for key 'email'</emphasis>
...
 at com.mysql.jdbc.StatementImpl.executeUpdate(StatementImpl.java:1617)
 at sda.jdbc.intro.SimpleInsert.main(SimpleInsert.java:17)</programlisting>
                </question>

                <answer>
                  <para>This expected error is easy to understand: The
                  exception's message text <emphasis role="bold">Duplicate
                  entry 'jim@foo.org' for key 'email'</emphasis> informs us
                  about a UNIQUE key constraint violation with respect to the
                  attribute <code>email</code> in our schema definition in
                  <xref linkend="figSchemaPerson"/>. We cannot add a second
                  entry with the same value <code>'jim@foo.org'</code>.</para>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <para>It is worth mentioning that the <productname
          xlink:href="http://www.mysql.com">Mysql</productname> driver
          implementation does not have to be available at compile time. JDBC
          favours interfaces over concrete classes: our code only refers to
          types from <code>java.sql</code>. Concrete driver classes are
          needed only at runtime.</para>
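
          <para>The following minimal sketch illustrates this point. It
          compiles against the <code>java.sql</code> interfaces only, so the
          compiler does not need any driver jar. The class name
          <classname>CompileTimeIndependence</classname> and the vendor name
          <code>nosuchvendor</code> are hypothetical; URL and credentials are
          the placeholder values used in this section:</para>

          <programlisting language="java">import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;

public class CompileTimeIndependence {

   /**
    * Tries to connect and returns a short diagnosis. No vendor specific
    * class is referenced, so this compiles without any driver jar.
    */
   public static String tryConnect(final String jdbcUrl) {
      try (Connection conn = DriverManager.getConnection(jdbcUrl, "hdmuser", "XYZ")) {
         // Only at runtime do we learn which concrete class is in use
         return "Connected using " + conn.getClass().getName();
      } catch (SQLException e) {
         // Thrown e.g. if no suitable driver is on the runtime classpath
         // or if the database server cannot be reached
         return "No connection: " + e.getMessage();
      }
   }

   public static void main(String[] args) {
      System.out.println(tryConnect("jdbc:mysql://localhost:3306/hdm"));
      System.out.println(tryConnect("jdbc:nosuchvendor://localhost/hdm"));
   }
}</programlisting>

          <para>Without a matching driver on the runtime classpath both calls
          end in the <code>catch</code> clause. Compilation succeeds
          regardless.</para>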

          <para>On the other hand, when working with Eclipse we need a
          separate run configuration for each runnable <link
          linkend="gloss_Java"><trademark>Java</trademark></link> application.
          This becomes tedious after some time. So you may want to follow the
          author and just add <filename>mysql-connector-java.jar</filename> to
          your compile time <envar>CLASSPATH</envar>.</para>

          <para>We now discuss some important methods defined in the
          relevant <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          interfaces:</para>

          <glosslist>
            <glossentry>
              <glossterm><classname>java.sql.Connection</classname></glossterm>

              <glossdef>
                <itemizedlist>
                  <listitem>
                    <para><link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#createStatement()">createStatement()</link></para>
                  </listitem>

                  <listitem>
                    <para><link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#setAutoCommit(boolean)">setAutoCommit()</link>,
                    <link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#getAutoCommit()">getAutoCommit()</link></para>
                  </listitem>

                  <listitem>
                    <para><link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#getWarnings()">getWarnings()</link></para>
                  </listitem>

                  <listitem>
                    <para><link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#isClosed()">isClosed()</link>,
                    <link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#isValid(int)">isValid(int
                    timeout)</link></para>
                  </listitem>

                  <listitem>
                    <para><link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#rollback()">rollback()</link>,
                    <link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#commit()">commit()</link></para>
                  </listitem>

                  <listitem>
                    <para><link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#close()">close()</link></para>
                  </listitem>
                </itemizedlist>
              </glossdef>
            </glossentry>

            <glossentry>
              <glossterm><classname>java.sql.Statement</classname></glossterm>

              <glossdef>
                <itemizedlist>
                  <listitem>
                    <para><link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#executeUpdate(java.lang.String)">executeUpdate(String
                    sql)</link></para>
                  </listitem>

                  <listitem>
                    <para><link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#getConnection()">getConnection()</link></para>
                  </listitem>

                  <listitem>
                    <para><link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#getResultSet()">getResultSet()</link></para>
                  </listitem>

                  <listitem>
                    <para><link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#close()">close()</link>
                    and <link
                    xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Statement.html#isClosed()">isClosed()</link></para>
                  </listitem>
                </itemizedlist>
              </glossdef>
            </glossentry>
          </glosslist>

          <qandaset role="exercise">
            <title><trademark
            xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
            and transactions</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para><link
                  xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#setAutoCommit(boolean)">How
                  does the method setAutoCommit()</link> relate to <link
                  xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#commit()">commit()</link>
                  and <link
                  xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html#rollback()">rollback()</link>?</para>
                </question>

                <answer>
                  <para>A connection's default state is <code>autocommit ==
                  true</code>. This means that each individual SQL statement
                  is executed as a separate transaction.</para>

                  <para>If we want to group two or more statements into a
                  transaction we have to:</para>

                  <orderedlist>
                    <listitem>
                      <para>Call
                      <code>connection.setAutoCommit(false)</code></para>
                    </listitem>

                    <listitem>
                      <para>From now on subsequent SQL statements implicitly
                      become part of a transaction until one of the following
                      three events occurs:</para>

                      <orderedlist numeration="loweralpha">
                        <listitem>
                          <para><code>connection.commit()</code></para>
                        </listitem>

                        <listitem>
                          <para><code>connection.rollback()</code></para>
                        </listitem>

                        <listitem>
                          <para>The transaction gets aborted by the database
                          server. This may for example happen in case of a
                          deadlock conflict with a second transaction.</para>
                        </listitem>
                      </orderedlist>

                      <para>Note that the first two events are initiated by
                      our client software. The third is triggered by the
                      database server.</para>
                    </listitem>
                  </orderedlist>
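
                  <para>The steps above may be sketched as follows. This is a
                  minimal sketch under stated assumptions rather than a
                  definitive implementation: the class name
                  <classname>TransactionSketch</classname> and the method
                  <code>executeAtomically()</code> are hypothetical, and both
                  SQL statements are supplied by the caller:</para>

                  <programlisting language="java">import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

public class TransactionSketch {

   /**
    * Executes two SQL statements as a single transaction: either both
    * take effect or none of them does.
    */
   public static void executeAtomically(final Connection conn,
         final String firstSql, final String secondSql) throws SQLException {

      final boolean previousMode = conn.getAutoCommit();
      conn.setAutoCommit(false);            // Step 1: start grouping statements
      try (Statement stmt = conn.createStatement()) {
         stmt.executeUpdate(firstSql);      // Step 2: both statements belong
         stmt.executeUpdate(secondSql);     //         to the same transaction
         conn.commit();                     // Event (a): make changes permanent
      } catch (SQLException e) {
         conn.rollback();                   // Event (b): undo partial changes
         throw e;                           // Let the caller decide what to do
      } finally {
         conn.setAutoCommit(previousMode);  // Restore the connection's state
      }
   }
}</programlisting>

                  <para>Restoring the previous auto-commit mode in the
                  <code>finally</code> clause is a design choice: callers get
                  the connection back in the state they passed it in.</para>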
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <qandaset role="exercise">
            <title>Closing <trademark
            xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
            connections</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>Why is it very important to call the close() method
                  for <classname>java.sql.Connection</classname> and / or
                  <classname>java.sql.Statement</classname> instances?</para>
                </question>

                <answer>
                  <para>A <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  connection ties up network resources (socket connections).
                  These may be exhausted if e.g. new connections are
                  established within a loop without being closed.</para>

                  <para>The situation is comparable to memory leaks when using
                  programming languages lacking a garbage collector.</para>
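
                  <para>Since Java 7 the try-with-resources statement closes
                  such resources automatically. The following sketch assumes
                  the <code>Person</code> table and the INSERT statement from
                  <classname>sda.jdbc.intro.SimpleInsert</classname>; the
                  class name <classname>ClosingResources</classname> and the
                  method <code>insertJim()</code> are hypothetical:</para>

                  <programlisting language="java">import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.SQLException;
import java.sql.Statement;

public class ClosingResources {

   /**
    * Inserts a single record and releases all JDBC resources afterwards.
    *
    * @return The number of inserted datasets
    */
   public static int insertJim(final String jdbcUrl, final String user,
         final String password) throws SQLException {

      // Both resources are closed in reverse order (stmt first, then conn)
      // when leaving the block, even if executeUpdate() throws an exception.
      try (Connection conn = DriverManager.getConnection(jdbcUrl, user, password);
            Statement stmt = conn.createStatement()) {
         return stmt.executeUpdate(
               "INSERT INTO Person VALUES('Jim', 'jim@foo.org')");
      }
   }
}</programlisting>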
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <qandaset role="exercise">
            <title>Aborted transactions</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>In the previous exercise we mentioned the possibility
                  of a transaction abort issued by the database server. Which
                  responsibility arises for an application programmer? Hint:
                  How may an implementation become aware of such a transaction
                  abort?</para>
                </question>

                <answer>
                  <para>If a database server aborts a transaction a
                  <classname>java.sql.SQLException</classname> will be thrown.
                  An application must be aware of this possibility and thus
                  implement a sensible <code>catch(...)</code> clause
                  accordingly.</para>
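
                  <para>A <code>catch</code> clause may want to distinguish a
                  transaction abort from other errors, for example in order
                  to retry the transaction. The following heuristic is a
                  sketch: the SQL standard reserves SQLState values of class
                  <code>40</code> for transaction rollbacks, but whether a
                  given driver reports aborts this way or throws a
                  <classname>java.sql.SQLTransactionRollbackException</classname>
                  is an assumption to be checked against its documentation.
                  The class name <classname>AbortDetection</classname> is
                  hypothetical:</para>

                  <programlisting language="java">import java.sql.SQLException;
import java.sql.SQLTransactionRollbackException;

public class AbortDetection {

   /**
    * Heuristic test whether an exception signals a transaction abort
    * rather than e.g. a constraint violation or a syntax error.
    */
   public static boolean isTransactionAbort(final SQLException e) {
      if (e instanceof SQLTransactionRollbackException) {
         return true;
      }
      final String sqlState = e.getSQLState();
      if (sqlState == null) {
         return false;
      }
      return sqlState.startsWith("40"); // SQLState class 40: rollback
   }

   public static void main(String[] args) {
      final SQLException deadlock = new SQLException("Deadlock found", "40001");
      final SQLException duplicate = new SQLException("Duplicate entry", "23000");
      System.out.println(isTransactionAbort(deadlock));  // prints true
      System.out.println(isTransactionAbort(duplicate)); // prints false
   }
}</programlisting>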
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <qandaset role="exercise">
            <title>Interfaces and classes in <trademark
            xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark></title>

            <qandadiv>
              <qandaentry xml:id="exerciseJdbcWhyInterface">
                <question>
                  <para>The <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  standard mostly defines interfaces such as
                  <classname>java.sql.Connection</classname> and
                  <classname>java.sql.Statement</classname>. Why are these not
                  defined as classes? Moreover, why is
                  <classname>java.sql.DriverManager</classname> defined as a
                  class rather than an interface?</para>

                  <para>You may want to supply code examples to support your
                  argument.</para>
                </question>

                <answer>
                  <para>Figure <xref linkend="jdbcArchitecture"/> tells us
                  about the vendor independent architecture of <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>.
                  Oracle for example may implement a class
                  <code>com.oracle.jdbc.OracleConnection</code>:</para>

                  <programlisting annotations="nojavadoc">package com.oracle.jdbc;

import java.sql.Connection;
import java.sql.Statement;
import java.sql.SQLException;

public class OracleConnection implements Connection {

...

@Override
public Statement createStatement(int resultSetType,
                                 int resultSetConcurrency)
                                   throws SQLException {
  // Implementation omitted here due to
  // limited personal hacking capabilities
  ...
}
...
}</programlisting>

                  <para>If a programmer only uses the <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  interfaces rather than a vendor's classes it is much easier
                  to make the resulting application work with different
                  databases from other vendors. This way a company's
                  implementation is not exposed to our own <link
                  linkend="gloss_Java"><trademark>Java</trademark></link>
                  code.</para>

                  <para>Regarding the special role of
                  <classname>java.sql.DriverManager</classname> we notice the
                  need for a starting point: We have to create an initial
                  instance of some class. In theory (<emphasis role="bold">BUT
                  NOT IN PRACTICE!!!</emphasis>) the following (ugly) code
                  might be possible:</para>

                  <programlisting>package my.personal.application;

import java.sql.Connection;
import java.sql.Statement;
import java.sql.SQLException;

public class someClass {

  public void someMethod(){

      Connection conn = <emphasis role="bold">new OracleConnection()</emphasis>; // bad idea!
      ...
  }
 ...
}</programlisting>

                  <para>The problem with this approach is the explicit
                  constructor call: Whenever we want to use another database
                  we have two possibilities:</para>

                  <itemizedlist>
                    <listitem>
                      <para>Rewrite our code.</para>
                    </listitem>

                    <listitem>
                      <para>Introduce some sort of switch statement to provide
                      a fixed number of databases beforehand:</para>

                      <programlisting>public void someMethod(final String vendor){

  final Connection conn;

  switch(vendor) {
     case "ORACLE":
        conn = new OracleConnection();
        break;

     case "DB2":
        conn = new Db2Connection();
        break;

     default:
        conn = null;
        break;
  }
  ...
}</programlisting>

                      <para>Adding a new database still requires code
                      rewriting.</para>
                    </listitem>
                  </itemizedlist>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <qandaset role="exercise">
            <title>Driver dispatch mechanism</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>In exercise <xref linkend="exerciseJdbcWhyInterface"/>
                  we saw a hypothetical way to choose a concrete class for an
                  interface by using a switch clause. How is this
                  <code>switch</code> clause's logic actually realized in a
                  <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  based application? (<quote>behind the scenes</quote>)</para>

                  <para>Hint: Read the documentation of
                  <classname>java.sql.DriverManager</classname>.</para>
                </question>

                <answer>
                  <para>Prior to opening a Connection a <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  driver registers itself with the
                  <classname>java.sql.DriverManager</classname> class. For
                  this purpose the standard defines the static method
                  <link
                  xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/DriverManager.html#registerDriver(java.sql.Driver)">registerDriver(Driver)</link>.
                  On success the <classname>java.sql.DriverManager</classname>
                  adds the driver to an internal dictionary:</para>

                  <informaltable border="1">
                    <col width="20%"/>

                    <col width="30%"/>

                    <tr>
                      <th>protocol</th>

                      <th>driver instance</th>
                    </tr>

                    <tr>
                      <td>jdbc:mysql</td>

                      <td>mysqlDriver instance</td>
                    </tr>

                    <tr>
                      <td>jdbc:oracle</td>

                      <td>oracleDriver instance</td>
                    </tr>

                    <tr>
                      <td>...</td>

                      <td>...</td>
                    </tr>
                  </informaltable>

                  <para>So whenever the method <link
                  xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/DriverManager.html#getConnection(java.lang.String,%20java.lang.String,%20java.lang.String)">getConnection()</link>
                  is being called the
                  <classname>java.sql.DriverManager</classname> will scan the
                  <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  URL and isolate the protocol part. If we start with
                  <code>jdbc:mysql://someserver.com:3306/someDatabase</code>
                  this is just <code>jdbc:mysql</code>. The value is then
                  looked up in the above table of registered drivers to
                  choose an appropriate instance or null otherwise. This way
                  our hypothetical switch including the default value null is
                  actually implemented.</para>
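
                  <para>The dispatch may be observed with stub drivers that
                  never open a real connection. The sketch below rests on
                  assumptions: the class
                  <classname>DispatchSketch.StubDriver</classname> and the
                  vendor names <code>vendora</code> and <code>vendorb</code>
                  are hypothetical. It also shows that the table above is a
                  simplified picture: according to its documentation
                  <classname>java.sql.DriverManager</classname> asks the
                  registered drivers in turn whether they accept a given URL
                  and throws a <classname>java.sql.SQLException</classname>
                  if none of them does:</para>

                  <programlisting language="java">import java.sql.Connection;
import java.sql.Driver;
import java.sql.DriverManager;
import java.sql.DriverPropertyInfo;
import java.sql.SQLException;
import java.sql.SQLFeatureNotSupportedException;
import java.util.Properties;
import java.util.logging.Logger;

public class DispatchSketch {

   /** Hypothetical driver accepting all URLs starting with a given prefix. */
   public static class StubDriver implements Driver {
      private final String urlPrefix;

      public StubDriver(final String urlPrefix) {
         this.urlPrefix = urlPrefix;
      }
      @Override public boolean acceptsURL(final String url) {
         return url.startsWith(urlPrefix);
      }
      @Override public Connection connect(final String url, final Properties info) {
         return null; // A real driver opens a network connection here
      }
      @Override public DriverPropertyInfo[] getPropertyInfo(final String url,
            final Properties info) {
         return new DriverPropertyInfo[0];
      }
      @Override public int getMajorVersion() { return 1; }
      @Override public int getMinorVersion() { return 0; }
      @Override public boolean jdbcCompliant() { return false; }
      @Override public Logger getParentLogger() throws SQLFeatureNotSupportedException {
         throw new SQLFeatureNotSupportedException();
      }
      @Override public String toString() { return "StubDriver for " + urlPrefix; }
   }

   /** Describes the registered driver accepting the URL, "none" otherwise. */
   public static String driverFor(final String jdbcUrl) {
      try {
         return DriverManager.getDriver(jdbcUrl).toString();
      } catch (SQLException e) {
         return "none"; // Corresponds to the default branch of our switch
      }
   }

   public static void main(String[] args) throws SQLException {
      // Real drivers register themselves, typically when being loaded
      DriverManager.registerDriver(new StubDriver("jdbc:vendora:"));
      DriverManager.registerDriver(new StubDriver("jdbc:vendorb:"));

      System.out.println(driverFor("jdbc:vendorb://someserver.com/someDatabase"));
      System.out.println(driverFor("jdbc:unknown://someserver.com/someDatabase"));
   }
}</programlisting>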
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="propertiesFile">
          <title>Connection properties</title>

          <para>So far our application depicted in <xref
          linkend="figJdbcSimpleWrite"/> suffers both from missing error
          handling and hard-coded parameters.</para>

          <para>Professional applications must be configurable. Changing the
          password currently requires source code modification and
          recompilation. <link
          linkend="gloss_Java"><trademark>Java</trademark></link> offers a
          standard procedure to externalize parameters like
          <varname>username</varname>, <varname>password</varname> and the
          <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          connection URL present in <xref
          linkend="figJdbcSimpleWrite"/>: We may move these parameters to
          so-called properties files:</para>

          <figure xml:id="propertyExternalization">
            <title>Externalize a single string <code>"User name"</code> to a
            separate file <filename>message.properties</filename>.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/externalize.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>The figure above shows the externalization of just a single
          property. The file <filename>message.properties</filename> contains
          key-value pairs. The key <code>PropHello.uname</code> maps to the
          value <code>User name</code>. Multiple strings may be externalized
          to the same properties file.</para>
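
          <para>The key-value mechanism itself may be sketched without any
          tooling. The example below parses the file content from a string
          instead of reading an actual <filename>message.properties</filename>
          file in order to be self-contained. The class name
          <classname>PropertiesSketch</classname> and the key
          <code>PropHello.missing</code> are hypothetical:</para>

          <programlisting language="java">import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

public class PropertiesSketch {

   /**
    * Looks up a key in properties file content. Missing keys are marked
    * by enclosing exclamation marks instead of throwing an exception.
    */
   public static String lookup(final String fileContent, final String key)
         throws IOException {
      final Properties props = new Properties();
      props.load(new StringReader(fileContent)); // Parses key=value lines
      return props.getProperty(key, '!' + key + '!');
   }

   public static void main(String[] args) throws IOException {
      final String fileContent = "PropHello.uname=User name\n";
      System.out.println(lookup(fileContent, "PropHello.uname"));
      System.out.println(lookup(fileContent, "PropHello.missing"));
   }
}</programlisting>

          <para>Running <code>main()</code> prints <code>User name</code>
          followed by <code>!PropHello.missing!</code>.</para>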

          <para>Eclipse offers tool support for externalization. Simply choose
          Source --&gt; Externalize Strings from the context menu. This
          activates a wizard to define property keys, rename the generated
          helper class and finally create the actual
          <filename>message.properties</filename> file.</para>

          <qandaset role="exercise">
            <title>Moving <trademark
            xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
            <abbrev
            xlink:href="http://www.ietf.org/rfc/rfc1738.txt">URL</abbrev> and
            credentials to a property file</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>Start by executing the code given in <xref
                  linkend="figJdbcSimpleWrite"/>. Then extend this example by
                  externalizing all <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  related connection parameters to a
                  <filename>jdbc.properties</filename> file like:</para>

                  <programlisting>SimpleInsert.jdbcUrl=jdbc:mysql://localhost:3306/hdm
SimpleInsert.password=XYZ
SimpleInsert.username=hdmuser</programlisting>

                  <para>As stated earlier, the Eclipse wizard assists you by
                  generating both the properties file and a helper class
                  reading that file at runtime.</para>
                </question>

                <answer>
                  <para>The current exercise is mostly related to tooling.
                  From our <link
                  linkend="gloss_Java"><trademark>Java</trademark></link> code
                  the context menu allows us to choose the desired
                  wizard:</para>

                  <informalfigure>
                    <mediaobject>
                      <imageobject>
                        <imagedata fileref="Ref/Screen/externalize.screen.png"/>
                      </imageobject>
                    </mediaobject>
                  </informalfigure>

                  <para>We may now:</para>

                  <itemizedlist>
                    <listitem>
                      <para>Select the strings to be externalized.</para>
                    </listitem>

                    <listitem>
                      <para>Supply key names. In the subsequent screenshot
                      this task has already been started by manually replacing
                      the default <code>SimpleInsert.1</code> by
                      <code>SimpleInsert.jdbc</code>.</para>
                    </listitem>

                    <listitem>
                      <para>Redefine other parameters like prefix, properties
                      file name etc. In the following screenshot only the
                      first of three keys has been manually renamed to the
                      sensible value
                      <varname>SimpleInsert.jdbc</varname>.</para>
                    </listitem>
                  </itemizedlist>

                  <informalfigure>
                    <mediaobject>
                      <imageobject>
                        <imagedata fileref="Ref/Screen/externalize2.screen.png"/>
                      </imageobject>
                    </mediaobject>
                  </informalfigure>

                  <para>The wizard also generates a class
                  <classname>sda.jdbc.intro.v1.DbProps</classname> to actually
                  access our properties:</para>

                  <programlisting language="java">package sda.jdbc.intro.v1;
...
public class DbProps {
   private static final String BUNDLE_NAME = "sda.jdbc.intro.v1.database";

   private static final ResourceBundle RESOURCE_BUNDLE = ResourceBundle
         .getBundle(BUNDLE_NAME);

   private DbProps() {
   }

   public static String getString(String key) {
      try {
         return RESOURCE_BUNDLE.getString(key);
      } catch (MissingResourceException e) {
         return '!' + key + '!';
      }
   }
}</programlisting>

                  <para>Our <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  related code now contains three references to external
                  properties:</para>

                  <programlisting language="java">package sda.jdbc.intro.v1;
...
public class SimpleInsert {

 
   public static void main(String[] args) throws SQLException {
      // Step 1: Open a connection to the database server
      final Connection conn = DriverManager.getConnection (
            <emphasis role="bold">DbProps.getString("PersistenceHandler.jdbcUrl"), </emphasis> 
            <emphasis role="bold">DbProps.getString("PersistenceHandler.username")</emphasis>,
            <emphasis role="bold">DbProps.getString("PersistenceHandler.password")</emphasis>);
      // Step 2: Create a Statement instance
      final Statement stmt = conn.createStatement();
      // Step 3: Execute the desired INSERT
      final int updateCount = stmt.executeUpdate(
               "INSERT INTO Person VALUES('Jim', 'jim@foo.org')");
      // Step 4: Give feedback to the enduser
      System.out.println("Successfully inserted " + updateCount + " dataset(s)"); 
   }
}</programlisting>

                  <para>The key prefix <code>PersistenceHandler</code>, named
                  after the class
                  <classname>sda.jdbc.intro.v1.PersistenceHandler</classname>,
                  is related to a later exercise.</para>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="sectSimpleInsertGui">
          <title>A first GUI sketch</title>

          <para>So far all data records transferred to the database server
          are still hard-coded in our application. In practice a user wants
          to enter data of persons to be submitted to the database.</para>

          <para>We now guide you to develop a first version of a simple GUI
          for this task. A more <link linkend="figureDataInsert2">elaborate
          version</link> will be presented in a follow-up exercise. The
          screenshot illustrates the intended application behaviour:</para>

          <figure xml:id="simpleInsertGui">
            <title>A simple GUI to insert data into a database server.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Screen/simpleInsertGui.screen.png"/>
              </imageobject>

              <caption>
                <para>After clicking <quote>Insert</quote> a message is
                presented to the user. This message may also indicate a
                failure.</para>
              </caption>
            </mediaobject>
          </figure>

          <para>Implementing Swing GUI applications requires knowledge as
          taught e.g. in <link
          xlink:href="http://www.hdm-stuttgart.de/studenten/stundenplan/vorlesungsverzeichnis/vorlesung_detail?vorlid=5212221">113300
          Entwicklung von Web-Anwendungen</link>. If you do not (yet) feel
          comfortable writing <productname
          xlink:href="http://docs.oracle.com/javase/tutorial/uiswing/index.html">Swing</productname>
          applications you may want to read <uri
          xlink:href="http://www.javamex.com/tutorials/swing">http://www.javamex.com/tutorials/swing</uri>
          and <emphasis role="bold">really</emphasis> understand the examples
          presented therein.</para>

          <qandaset role="exercise">
            <title>GUI for inserting Person data to a database server</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>Write a GUI application as outlined in <xref
                  linkend="simpleInsertGui"/>. You may proceed as
                  follows:</para>

                  <orderedlist>
                    <listitem>
                      <para>Write a dummy GUI without any database
                      functionality. Only present the two labels and input
                      fields and the Insert button.</para>
                    </listitem>

                    <listitem>
                      <para>Add a
                      <classname>java.awt.event.ActionListener</classname>
                      which generates an SQL INSERT statement when the Insert
                      button is clicked. Show this string to the user as in
                      the message window of <xref
                      linkend="simpleInsertGui"/>.</para>

                      <para>At this point you still do not need a database
                      connection. The message shown to the user is just a
                      fake, so the GUI <emphasis
                      role="bold">appears</emphasis> to be working.</para>
                    </listitem>

                    <listitem>
                      <para>Establish a
                      <classname>java.sql.Connection</classname> and create a
                      <classname>java.sql.Statement</classname> instance when
                      launching your application. Use the latter in your
                      <classname>java.awt.event.ActionListener</classname> to
                      actually insert datasets into your database.</para>
                    </listitem>
                  </orderedlist>
                </question>

                <answer>
                  <para>The complete implementation resides in
                  <classname>sda.jdbc.intro.v01.InsertPerson</classname>:</para>

                  <programlisting language="java">package sda.jdbc.intro.v01;

import ...

public class InsertPerson extends JFrame {

   ...

   public InsertPerson () throws SQLException{
      super ("Add a person's data");

      setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);

      final JPanel databaseFieldPanel = new JPanel();
      databaseFieldPanel.setLayout(new GridLayout(0,2));
      add(databaseFieldPanel, BorderLayout.CENTER);

      databaseFieldPanel.add(new JLabel("Name:"));
      final JTextField nameField = new JTextField(15);
      databaseFieldPanel.add(nameField);

      databaseFieldPanel.add(new JLabel("E-mail:"));
      final JTextField emailField = new JTextField(15);
      databaseFieldPanel.add(emailField);

      final JButton insertButton = new JButton("Insert");
      add(insertButton, BorderLayout.SOUTH);

      final Connection conn = DriverManager.getConnection(
            "jdbc:mysql://localhost:3306/hdm", "hdmuser", "XYZ");
      final Statement stmt = conn.createStatement();

      insertButton.addActionListener(new ActionListener() {
         // Linking the GUI to the database server. We assume an open
         // connection and a correctly initialized Statement instance
         @Override
         public void actionPerformed(ActionEvent event) {
            final String sql = "INSERT INTO Person VALUES('" + nameField.getText()+ "', '" 
                  + emailField.getText() +  "')";
            // We have to catch this Exception because an ActionListener's signature
            // prohibits the existence of a "throws" clause.
            try {
               final int updateCount = stmt.executeUpdate(sql);
               JOptionPane.showMessageDialog(null, "Successfully executed \n'" + sql + "'\nand inserted "
                     + updateCount + " dataset");
            } catch (SQLException e) {
               e.printStackTrace();
            }
         }
      });
      pack();
   }
}</programlisting>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="jdbcExceptions">
          <title>Handling possible exceptions</title>

          <para>Our current code lacks proper error handling: Exceptions are
          either not caught at all, which terminates the program, or merely
          dumped as a stack trace. This is of course inadequate for
          professional software. In case of problems we have to:</para>

          <itemizedlist>
            <listitem>
              <para>Gracefully recover or shut down our application. We may
              for example show a pop-up window <quote>Terminating due to an
              internal error</quote>.</para>
            </listitem>

            <listitem>
              <para>Enable the customer to supply the development team with
              helpful information. The user may for example be asked to submit
              a log file in case of errors.</para>
            </listitem>
          </itemizedlist>
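          <para>Both requirements may be sketched as follows. The class
          <classname>ErrorReporter</classname> and its method names are
          hypothetical and not part of the lecture's code base. The sketch
          merely assumes that a short message is presented to the user while
          the complete stack trace is handed to a logger:</para>

          <programlisting language="java">import java.sql.SQLException;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper, not part of the lecture's code base
class ErrorReporter {

   private static final Logger log = Logger.getLogger(ErrorReporter.class.getName());

   /** Builds a short text suitable for a pop-up window. */
   static String userMessage(final SQLException e) {
      return "Terminating due to an internal error (SQL state "
            + e.getSQLState() + ", vendor code " + e.getErrorCode() + ")";
   }

   /** Writes the full stack trace to the log and returns the user message. */
   static String report(final SQLException e) {
      log.log(Level.SEVERE, "Database operation failed", e);
      return userMessage(e);
   }

   public static void main(String[] args) {
      // Simulated failure, the values are made up for illustration
      final SQLException e = new SQLException("Duplicate entry", "23000", 1062);
      System.err.println(report(e));
   }
}</programlisting>

          <para>Attaching a <classname>java.util.logging.FileHandler</classname>
          to the logger would be one way to obtain a log file the user can
          submit to the development team.</para>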

          <para>In addition the solution
          <classname>sda.jdbc.intro.v01.InsertPerson</classname> contains an
          ugly mix of GUI components and database related code. We take a
          first step to decouple these two distinct concerns:</para>

          <qandaset role="exercise" xml:id="exercicseGuiStateful">
            <title>Handling the database layer</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>Implement a class <code>PersistenceHandler</code> to
                  be later used as a component of our next step GUI
                  application prototype. This class should have the following
                  methods:</para>

                  <programlisting language="java">...
/**
 * Handle database communication. There are two
 * distinct internal states &lt;q&gt;disconnected&lt;/q&gt; and &lt;q&gt;connected&lt;/q&gt;, see
 * {@link #isConnected()}. These two states may be toggled by invoking
 * {@link #connect()} and {@link #disconnect()} respectively.
 * 
 * The following snippet illustrates the intended usage:
 * &lt;pre&gt;   public static void main(String[] args) {
      final PersistenceHandler ph = new PersistenceHandler();
      if (ph.connect()) {
         if (!ph.add("Jim", "jim@foo.com")) {
            System.err.println("Insert Error:" + ph.getErrorMessage());
         }
      } else {
         System.err.println("Connect error:" + ph.getErrorMessage());
      }
   }&lt;/pre&gt;
 * 
 * @author goik
 */
public class PersistenceHandler {
   ...
   /**
    * Instance in &lt;q&gt;disconnected&lt;/q&gt; state. See {@link #isConnected()}
    */
   public PersistenceHandler() {/* only present here to supply Javadoc comment */}

   /**
    * Inserting a (name, email) record into the database server. In case of
    * errors corresponding messages may subsequently be retrieved by calling
    * {@link #getErrorMessage()}.
    * 
    * &lt;dt&gt;&lt;b&gt;Precondition:&lt;/b&gt;&lt;/dt&gt; &lt;dd&gt;must be in
    * &lt;q&gt;connected&lt;/q&gt; state, see {@link #isConnected()}&lt;/dd&gt;
    * 
    * @param name
    *           A person's name
    * @param email
    *           A person's email address
    * 
    * @return true if the current data record has been successfully inserted
    *         into the database server. false in case of error(s).
    */
   public boolean add(final String name, final String email){
      ...
   }

   /**
    * Retrieving error messages in case a call to {@link #add(String, String)},
    * {@link #connect()}, or {@link #disconnect()} yields an error.
    * 
    * @return the error explanation corresponding to the latest failed
    *         operation, null if no error yet occurred.
    */
   public String getErrorMessage() {
      return ...;
   }

   /**
    * Open a connection to a database server.
    * 
    * &lt;dt&gt;&lt;b&gt;Precondition:&lt;/b&gt;&lt;/dt&gt;
    *  &lt;dd&gt;must be in &lt;q&gt;disconnected&lt;/q&gt; state, see {@link #isConnected()}&lt;/dd&gt;
    *  
    * &lt;dt&gt;&lt;b&gt;Precondition:&lt;/b&gt;&lt;/dt&gt;
    *  &lt;dd&gt;The following properties must be set:
    *  &lt;pre&gt;PersistenceHandler.jdbcUrl=jdbc:mysql://localhost:3306/hdm
PersistenceHandler.password=XYZ
PersistenceHandler.username=foo&lt;/pre&gt;
    *  &lt;/dd&gt;
    * 
    * @return true if connecting was successful
    */
   public boolean connect () {
      ...
   }

   /**
    * Close a connection to a database server and clean up JDBC related resources
    * 
    * Error messages in case of failure may subsequently be retrieved by
    * calling {@link #getErrorMessage()}.
    * 
    * &lt;dt&gt;&lt;b&gt;Precondition:&lt;/b&gt;&lt;/dt&gt;
    * &lt;dd&gt;must be in &lt;q&gt;connected&lt;/q&gt; state, see {@link #isConnected()}&lt;/dd&gt;
    * 
    * @return true if disconnecting was successful, false in case error(s) occur.
    */
   public boolean disconnect() {
      ...
   }

   /**
    * An instance can either be in &lt;q&gt;connected&lt;/q&gt; or &lt;q&gt;disconnected&lt;/q&gt; state. The
    * state can be toggled by invoking {@link #connect()} or
    * {@link #disconnect()} respectively.
    * 
    * @return true if connected, false otherwise
    */
   public boolean isConnected() {
      return ...;
   }
}</programlisting>

                  <para>Notice the two internal states
                  <quote>disconnected</quote> and
                  <quote>connected</quote>:</para>

                  <figure xml:id="figPersistenceHandlerStates">
                    <title>Possible states and transitions for instances of
                    <code>PersistenceHandler</code>.</title>

                    <mediaobject>
                      <imageobject>
                        <imagedata fileref="Ref/Fig/persistHandlerStates.fig"/>
                      </imageobject>
                    </mediaobject>
                  </figure>

                  <para>According to the above documentation a newly created
                  <code>PersistenceHandler</code> instance should be in
                  <quote>disconnected</quote> state. As shown in the <link
                  linkend="gloss_Java"><trademark>Java</trademark></link>
                  class description you may test your implementation without
                  any GUI code. If you are already familiar with unit testing,
                  writing a unit test is a good alternative.</para>
                </question>

                <answer>
                  <para>We show a possible implementation of
                  <classname>sda.jdbc.intro.v1.PersistenceHandler</classname>:</para>

                  <programlisting language="java">package sda.jdbc.intro.v1;
...

public class PersistenceHandler {

   Connection conn = null;
   Statement stmt = null;

   String errorMessage = null;

   /**
    * New instances are in &lt;q&gt;disconnected&lt;/q&gt; state. See {@link #isConnected()}
    */
   public PersistenceHandler() {/* only present here to supply Javadoc comment */}

   /**
    * Inserting a (name, email) record into the database server. In case of
    * errors corresponding messages may subsequently be retrieved by calling
    * {@link #getErrorMessage()}.
    * 
    * &lt;dt&gt;&lt;b&gt;Precondition:&lt;/b&gt;&lt;/dt&gt; &lt;dd&gt;must be in
    * &lt;q&gt;connected&lt;/q&gt; state, see {@link #isConnected()}&lt;/dd&gt;
    * 
    * @param name
    *           A person's name
    * @param email
    *           A person's email address
    * 
    * @return true if the current data record has been successfully inserted
    *         into the database server. false in case of error(s).
    */
   public boolean add(final String name, final String email){
      final String sql = "INSERT INTO Person VALUES('" + name + "', '" + 
            email + "')"; 
      try {
         stmt.executeUpdate(sql);
         return true;
      } catch (SQLException e) {
         errorMessage = "Unable to execute '" + sql + "': '" + e.getMessage() + "'";
         return false;
      }
   }

   /**
    * Retrieving error messages in case a call to {@link #add(String, String)},
    * {@link #connect()}, or {@link #disconnect()} yields an error.
    * 
    * @return the error explanation corresponding to the latest failed
    *         operation, null if no error yet occurred.
    */
   public String getErrorMessage() {
      return errorMessage;
   }

   /**
    * Open a connection to a database server.
    * 
    * &lt;dt&gt;&lt;b&gt;Precondition:&lt;/b&gt;&lt;/dt&gt;
    *  &lt;dd&gt;must be in &lt;q&gt;disconnected&lt;/q&gt; state, see {@link #isConnected()}&lt;/dd&gt;
    *  
    * &lt;dt&gt;&lt;b&gt;Precondition:&lt;/b&gt;&lt;/dt&gt;
    *  &lt;dd&gt;The following properties must be set:
    *  &lt;pre&gt;PersistenceHandler.jdbcUrl=jdbc:mysql://localhost:3306/hdm
PersistenceHandler.password=XYZ
PersistenceHandler.username=foo&lt;/pre&gt;
    *  &lt;/dd&gt;
    * 
    * @return true if connecting was successful
    */
   public boolean connect () {
      try {
         conn = DriverManager.getConnection(
               DbProps.getString("PersistenceHandler.jdbcUrl"),
               DbProps.getString("PersistenceHandler.username"),
               DbProps.getString("PersistenceHandler.password"));
         try {
            stmt = conn.createStatement();
            return true; 
         } catch (SQLException e) {
            errorMessage = "Connection opened but Statement creation failed:\"" + e.getMessage() + "\"."; 
            try {
               conn.close();
            } catch (SQLException ee) {
               // Report the exception raised by close(), not the outer one
               errorMessage += " Closing connection failed:\"" + ee.getMessage() + "\".";
            }
            conn = null;
         }

      } catch (SQLException e) {
         errorMessage = "Unable to open connection:\"" + e.getMessage() + "\".";
      }
      return false;
   }

   /**
    * Close a connection to a database server and clean up JDBC related resources
    * 
    * Error messages in case of failure may subsequently be retrieved by
    * calling {@link #getErrorMessage()}.
    * 
    * &lt;dt&gt;&lt;b&gt;Precondition:&lt;/b&gt;&lt;/dt&gt;
    * &lt;dd&gt;must be in &lt;q&gt;connected&lt;/q&gt; state, see {@link #isConnected()}&lt;/dd&gt;
    * 
    * @return true if disconnecting was successful, false in case error(s) occur.
    */
   public boolean disconnect() {
      boolean resultStatus = true;
      final StringBuilder messageCollector = new StringBuilder();
      try {
         stmt.close();
      } catch (SQLException e) {
         resultStatus = false;
         messageCollector.append("Unable to close Statement:\"" + e.getMessage() + "\".");
      }
      stmt = null;
      try {
         conn.close();
      } catch (SQLException e) {
         resultStatus = false;
         messageCollector.append("Unable to close connection:\"" + e.getMessage() + "\".");
      }
      conn = null;
      if (!resultStatus) {
         errorMessage = messageCollector.toString();
      }
      return resultStatus;
   }

   /**
    * An instance can either be in &lt;q&gt;connected&lt;/q&gt; or &lt;q&gt;disconnected&lt;/q&gt; state. The
    * state can be toggled by invoking {@link #connect()} or
    * {@link #disconnect()} respectively.
    * 
    * @return true if connected, false otherwise
    */
   public boolean isConnected() {
      return null != conn;
   }
}</programlisting>
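                  <para>The helper class <classname>DbProps</classname> used
                  by <code>connect()</code> is not shown above. The following
                  sketch is a hypothetical stand-in. It assumes that the
                  three properties are simple key/value pairs and inlines
                  them to remain self-contained, whereas a real application
                  would rather read them from a properties file:</para>

                  <programlisting language="java">import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;

// Hypothetical stand-in for the DbProps helper used by connect()
class DbProps {

   private static final Properties props = new Properties();

   static {
      // Inlined configuration, keys as required by connect()'s Javadoc
      final String config =
            "PersistenceHandler.jdbcUrl=jdbc:mysql://localhost:3306/hdm\n"
          + "PersistenceHandler.username=foo\n"
          + "PersistenceHandler.password=XYZ\n";
      try {
         props.load(new StringReader(config));
      } catch (IOException e) {
         throw new ExceptionInInitializerError(e);
      }
   }

   /** Returns the value for the given key, or !key! if it is missing. */
   static String getString(final String key) {
      final String value = props.getProperty(key);
      return null == value ? '!' + key + '!' : value;
   }

   public static void main(String[] args) {
      System.out.println(getString("PersistenceHandler.jdbcUrl"));
   }
}</programlisting>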
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <para>We may now complete the next enhancement step of our GUI
          database client.</para>

          <qandaset role="exercise">
            <title>Connection on user action</title>

            <qandadiv>
              <qandaentry xml:id="exerciseGuiWriteTakeTwo">
                <question>
                  <label>An application writing records to a database
                  server</label>

                  <para>Our aim is to enhance the first GUI prototype
                  described in <xref linkend="simpleInsertGui"/>. The
                  application shall start in <quote>disconnected</quote>
                  state. Prior to entering data the user shall be guided to
                  open a connection. The following video illustrates the
                  desired user interface:</para>

                  <figure xml:id="figureDataInsert2">
                    <title>A GUI frontend for adding personal data to a
                    server.</title>

                    <mediaobject>
                      <videoobject>
                        <videodata fileref="Ref/Video/dataInsert.mp4"/>
                      </videoobject>
                    </mediaobject>
                  </figure>

                  <para>If a user closes the main window while still being
                  connected, the application shall disconnect from the
                  database server. For this purpose we must handle the event
                  raised when the user clicks on the closing button within
                  the window decoration. An exit handler method is required to
                  close a potentially open database connection.</para>
                </question>

                <answer>
                  <para>Our implementation uses the class
                  <classname>sda.jdbc.intro.v1.PersistenceHandler</classname>
                  for handling all database communication. The GUI needs to
                  visualize the two different states
                  <quote>disconnected</quote> and <quote>connected</quote>. In
                  <quote>disconnected</quote> state the whole input pane for
                  entering datasets and clicking the <quote>Insert</quote>
                  button is locked. So the user is forced to actively open a
                  database connection.</para>

                  <para>Notice also the
                  <classname>java.awt.event.WindowAdapter</classname>
                  implementation being executed when closing the application's
                  main window. The
                  <methodname>java.awt.event.WindowAdapter.windowClosing(java.awt.event.WindowEvent)</methodname>
                  method disconnects any existing database connection thus
                  freeing resources.</para>

                  <programlisting language="java">package sda.jdbc.intro.v1;

import ...

public class InsertPerson extends JFrame {
   
   private static final long serialVersionUID = 6815975741605247675L;
   
   final PersistenceHandler persistenceHandler = new PersistenceHandler();
   
   final JTextField nameField = new JTextField(15),
                   emailField = new JTextField(20);

   final JButton toggleConnectButton = new JButton(),
                        insertButton = new JButton("Insert");

   final JPanel databaseFieldPanel = new JPanel();

   private void setGuiConnectionState(final boolean state) {
      if (state) {
         toggleConnectButton.setText("Disconnect");
      } else {
         toggleConnectButton.setText("Connect");
      }
      for (final Component c: databaseFieldPanel.getComponents()){
         c.setEnabled(state);
      }
   }

   public static void main(String[] args) throws SQLException {
      InsertPerson app = new InsertPerson();
      app.setVisible(true);
   }
   
   public InsertPerson (){
      super ("Add a person's data");
      
      setSize(500, 500);

      addWindowListener(new WindowAdapter() {
         // In case a user closes our application window while still being connected
         // we have to close the database connection.
         @Override
         public void windowClosing(WindowEvent e) {
            super.windowClosing(e);
            if (persistenceHandler.isConnected() &amp;&amp; !persistenceHandler.disconnect()) {
               System.exit(1);
            } else {
               System.exit(0);
            }
         }
      });
      Box top = Box.createHorizontalBox();
      add(top, BorderLayout.NORTH);
      top.add(toggleConnectButton);
      
      toggleConnectButton.addActionListener(new ActionListener() {
         
         @Override
         public void actionPerformed(ActionEvent e) {
            if (persistenceHandler.isConnected()) {
               if (persistenceHandler.disconnect()){
                  setGuiConnectionState(false);
               } else {
                  JOptionPane.showMessageDialog(null, persistenceHandler.getErrorMessage());
               }
            } else {
               if (persistenceHandler.connect()){
                  setGuiConnectionState(true);
               } else {
                  JOptionPane.showMessageDialog(null, persistenceHandler.getErrorMessage());
               }
            }
         }
      });
      
      databaseFieldPanel.setLayout(new GridLayout(0,2));
      add(databaseFieldPanel);

      databaseFieldPanel.add(new JLabel("Name:"));
      databaseFieldPanel.add(nameField);
      
      databaseFieldPanel.add(new JLabel("E-mail:"));
      databaseFieldPanel.add(emailField);
      
      insertButton.addActionListener(new ActionListener() {
        @Override
        public void actionPerformed(ActionEvent e) {
           if (persistenceHandler.add(nameField.getText(), emailField.getText())) {
              nameField.setText("");
              emailField.setText("");
              JOptionPane.showMessageDialog(null, "Successfully inserted dataset");
           } else {
              JOptionPane.showMessageDialog(null, persistenceHandler.getErrorMessage());
           }
        }
      });
      databaseFieldPanel.add(Box.createGlue());
      databaseFieldPanel.add(insertButton);   
      setGuiConnectionState(false);
      pack();
   }
}</programlisting>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="jdbcSecurity">
          <title><trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          and security</title>

          <section xml:id="jdbcSecurityNetwork">
            <title>Network sniffing</title>

            <para>Sniffing <trademark
            xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
            network traffic is one way for intruders to compromise database
            applications. This requires access to at least one of the
            following:</para>

            <itemizedlist>
              <listitem>
                <para>Server host</para>
              </listitem>

              <listitem>
                <para>Client host</para>
              </listitem>

              <listitem>
                <para>Intermediate hub, switch or router</para>
              </listitem>
            </itemizedlist>

            <figure xml:id="figJdbcSniffing">
              <title>Sniffing a <trademark
              xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
              connection by an intruder.</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/jdbcSniffing.fig"/>
                </imageobject>
              </mediaobject>
            </figure>

            <para>We demonstrate a possible attack by analyzing the network
            traffic between our application shown in <xref
            linkend="figJdbcSimpleWrite"/> and the <productname
            xlink:href="http://www.mysql.com">MySQL</productname> database
            server. Prior to starting the application we set up <productname
            xlink:href="http://www.wireshark.org">Wireshark</productname> for
            filtered capturing:</para>

            <itemizedlist>
              <listitem>
                <para>Capturing on the <varname>loopback</varname> (lo)
                interface only. This is sufficient since our client connects
                to <varname>localhost</varname>.</para>
              </listitem>

              <listitem>
                <para>Discarding all packets except <acronym
                xlink:href="http://en.wikipedia.org/wiki/Transmission_Control_Protocol">TCP</acronym>
                packets using port number 3306.</para>
              </listitem>
            </itemizedlist>

            <para>This yields the following capture, shortened for
            brevity:</para>

            <programlisting>[...
5.5.24-0ubuntu0.12.04.1.%...X*e?I1ZQ...................e,F[yoA5$T[N.mysql_native_password.
 A...........!.......................hdmuser <co xml:id="tcpCaptureUsername"/>......U.&gt;S.%..~h...!.xhdm............j..../* 

 ... INSERT INTO Person VALUES('Jim', 'jim@foo.org') <co
                xml:id="tcpCaptureSqlInsert"/>6...
  .&amp;.#23000Duplicate entry 'jim@foo.org' for key 'email' <co
                xml:id="tcpCaptureErrmsg"/></programlisting>

            <calloutlist>
              <callout arearefs="tcpCaptureUsername">
                <para>The <varname>username</varname> initiating the
                connection to the database server.</para>
              </callout>

              <callout arearefs="tcpCaptureSqlInsert">
                <para>The <code>INSERT ...</code> statement.</para>
              </callout>

              <callout arearefs="tcpCaptureErrmsg">
                <para>The resulting error message sent back to the
                client.</para>
              </callout>
            </calloutlist>

            <para>Something seems to be missing here: the user's password. Our
            code in <xref linkend="figJdbcSimpleWrite"/> contains the password
            <quote><varname>XYZ</varname></quote> in clear text. But even the
            search function of <productname
            xlink:href="http://www.wireshark.org">Wireshark</productname> does
            not find any such string within the above capture. The
            <productname xlink:href="http://www.mysql.com">MySQL</productname>
            documentation however <link
            xlink:href="http://dev.mysql.com/doc/refman/5.0/en/security-against-attack.html">reveals</link>
            that everything but the password is transmitted in clear text. So
            all we might identify is a hash value derived from
            <code>XYZ</code>.</para>
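            <para>The capture above names the authentication method
            <code>mysql_native_password</code>. To our understanding this
            method sends a token computed from the password and a random
            challenge supplied by the server. The following sketch
            illustrates the idea. The class name is hypothetical and the
            all-zero challenge is a placeholder, since real drivers perform
            this step internally using the server's random bytes:</para>

            <programlisting language="java">import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Illustration only: real drivers perform this step internally
class NativePasswordSketch {

   /** SHA-1 digest of the concatenation of all given byte arrays. */
   static byte[] sha1(final byte[]... parts) {
      try {
         final MessageDigest md = MessageDigest.getInstance("SHA-1");
         for (final byte[] part : parts) {
            md.update(part);
         }
         return md.digest();
      } catch (NoSuchAlgorithmException e) {
         throw new IllegalStateException("SHA-1 not available", e);
      }
   }

   /** token = SHA1(password) XOR SHA1(challenge + SHA1(SHA1(password))) */
   static byte[] token(final String password, final byte[] challenge) {
      final byte[] stage1 = sha1(password.getBytes(StandardCharsets.UTF_8));
      final byte[] stage2 = sha1(stage1);
      final byte[] mask = sha1(challenge, stage2);
      final byte[] result = new byte[stage1.length];
      for (int i = 0; i != result.length; i++) {
         result[i] = (byte) (stage1[i] ^ mask[i]);
      }
      return result;
   }

   public static void main(String[] args) {
      // Placeholder challenge: a server would send 20 random bytes
      final byte[] challenge = new byte[20];
      final StringBuilder hex = new StringBuilder();
      for (final byte b : token("XYZ", challenge)) {
         hex.append(String.format("%02x", b));
      }
      System.out.println(hex);
   }
}</programlisting>

            <para>Since the challenge changes with every connection, the
            token differs between sessions and never contains the string
            <code>XYZ</code> itself.</para>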

            <para>Regarding our (current) <productname
            xlink:href="http://www.mysql.com">MySQL</productname>
            implementation the impact of this attack type is thus somewhat
            limited but still severe: all data being transmitted between
            client and server may be disclosed. This typically includes
            sensitive data as well. Possible solutions:</para>

            <itemizedlist>
              <listitem>
                <para>Create an encrypted tunnel between client and server,
                e.g. <link
                xlink:href="http://www.debianadmin.com/howto-use-ssh-local-and-remote-port-forwarding.html">ssh
                port forwarding</link> or a <link
                xlink:href="http://de.wikipedia.org/wiki/Virtual_Private_Network">VPN</link>.</para>
              </listitem>

              <listitem>
                <para>Many database vendors <link
                xlink:href="http://dev.mysql.com/doc/refman/5.1/de/connector-j-reference-using-ssl.html">supply
                SSL</link> or similar <trademark
                xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                protocol encryption extensions. This requires additional
                configuration procedures like setting up server side
                certificates. Moreover, as with HTTP versus HTTPS, encryption
                generally adds some overhead to data traffic.</para>
              </listitem>
            </itemizedlist>
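            <para>The second option may be sketched as follows. The class
            name is hypothetical. The property names
            <code>useSSL</code> and <code>requireSSL</code> are taken from
            the Connector/J 5.1 documentation as we recall it. Other driver
            versions may expect different names, so the driver manual in use
            should be consulted:</para>

            <programlisting language="java">import java.util.Properties;

// Hypothetical sketch: requesting an encrypted JDBC connection
class SslConnectionSketch {

   /** Connection properties asking the driver for an encrypted channel. */
   static Properties sslProperties(final String user, final String password) {
      final Properties p = new Properties();
      p.setProperty("user", user);
      p.setProperty("password", password);
      // Assumption: property names as documented for Connector/J 5.1
      p.setProperty("useSSL", "true");
      p.setProperty("requireSSL", "true");
      return p;
   }

   public static void main(String[] args) {
      final Properties p = sslProperties("hdmuser", "XYZ");
      // With a driver on the classpath and a server offering SSL one might call
      // DriverManager.getConnection("jdbc:mysql://localhost:3306/hdm", p);
      System.out.println("useSSL=" + p.getProperty("useSSL"));
   }
}</programlisting>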

            <para>Of course this is only relevant if the transport layer is
            considered to be insecure. If both server and client reside within
            the same trusted infrastructure no action has to be taken. We also
            note that this kind of problem is not limited to <trademark
            xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>.
            In fact all protocols lacking encryption are subject to this type
            of attack.</para>
          </section>

          <section xml:id="sqlInjection">
            <title>SQL injection</title>

            <para>Before diving into technical details we shed some light on
            the possible impact of the common attack type described in this
            section. Our example is the well known Heartland Payment
            Systems data breach:</para>

            <figure xml:id="figHeartlandSecurityBreach">
              <title>Summary of possible SQL injection impact based on the
              Heartland security breach</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/heartland.fig"/>
                </imageobject>
              </mediaobject>
            </figure>

            <para>Why should we be concerned with SQL injection? The
            introduction of <xref linkend="bibClarke09"/> gives a compelling
            argument:</para>

            <blockquote>
              <para>Many people say they know what SQL injection is, but all
              they have heard about or experienced are trivial examples. SQL
              injection is one of the most devastating vulnerabilities to
              impact a business, as it can lead to exposure of all of the
              sensitive information stored in an application's database,
              including handy information such as usernames, passwords, names,
              addresses, phone numbers, and credit card details.</para>
            </blockquote>

            <para>Due to limited resources this lecture only deals with the
            trivial examples mentioned above. One way SQL injection attacks
            work is by inserting SQL code into fields designed for end user
            input:</para>

            <figure xml:id="figSqlInject">
              <title>SQL injection triggered by ordinary user input.</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/sqlinject.fig"/>
                </imageobject>
              </mediaobject>
            </figure>

            <qandaset role="exercise">
              <title>Attack from the dark side</title>

              <qandadiv>
                <qandaentry xml:id="sqlInjectDropTable">
                  <question>
                    <para>Use the application from <xref
                    linkend="exerciseGuiWriteTakeTwo"/> and <xref
                    linkend="figSqlInject"/> to launch a SQL injection attack.
                    We provide some hints:</para>

                    <orderedlist>
                      <listitem>
                        <para>The <productname
                        xlink:href="http://www.mysql.com">MySQL</productname>
                        <trademark
                        xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                        driver implementation already provides precautions to
                        hamper SQL injection attacks. In its default
                        configuration a sequence of SQL commands separated by
                        semicolons (<quote>;</quote>) will not be executed but
                        flagged as a SQL syntax error. Consider the following
                        example:</para>

                        <programlisting>INSERT INTO Person VALUES (...);DROP TABLE Person</programlisting>

                        <para>In order to execute such so-called multi
                        queries we explicitly have to enable a <productname
                        xlink:href="http://www.mysql.com">MySQL</productname>
                        property. This may be achieved by extending our
                        <trademark
                        xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                        URL:</para>

                        <programlisting>jdbc:mysql://localhost:3306/hdm?<emphasis
                            role="bold">allowMultiQueries=true</emphasis></programlisting>

                        <para>The <productname
                        xlink:href="http://www.mysql.com">MySQL</productname>
                        manual <link
                        xlink:href="http://dev.mysql.com/doc/refman/5.1/en/connector-j-reference-configuration-properties.html">contains</link>
                        a remark regarding this parameter:</para>

                        <remark>Notice that this has the potential for SQL
                        injection if using plain java.sql.Statements and your
                        code doesn't sanitize input correctly.</remark>

                        <para>In other words: You have been warned!</para>
                      </listitem>

                      <listitem>
                        <para>You may now use either of the two input fields
                        <quote>name</quote> or <quote>email</quote> to inject
                        arbitrary SQL code.</para>
                      </listitem>
                    </orderedlist>
                  </question>

                  <answer>
                    <para>We construct a string which, when injected, drops
                    our <code>Person</code> table:</para>

                    <programlisting>Jim', 'jim@c.com');DROP TABLE Person;INSERT INTO Person VALUES('Joe</programlisting>

                    <para>Entering this string into the name field effectively
                    destroys our <code>Person</code> relation. As the error
                    message shows, two <code>INSERT</code> statements are
                    separated by a <code>DROP TABLE</code> statement. So after
                    executing the first <code>INSERT</code> our database
                    server drops the whole table. Finally the second
                    <code>INSERT</code> statement fails, giving rise to an
                    error message no end user will ever understand:</para>

                    <figure xml:id="figSqlInjectDropPerson">
                      <title>Dropping the <code>Person</code> table by SQL
                      injection</title>

                      <mediaobject>
                        <imageobject>
                          <imagedata fileref="Ref/Screen/sqlInject.screen.png"/>
                        </imageobject>
                      </mediaobject>
                    </figure>

                    <para>According to the message text the table
                    <code>Person</code> gets dropped as expected. Thus the
                    subsequent (second) <code>INSERT</code> action is bound to
                    fail.</para>
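                    <para>The effect of the string concatenation in
                    <code>PersistenceHandler.add()</code> can be reproduced
                    without any database. The following sketch uses a
                    hypothetical class <classname>InjectionDemo</classname>
                    and an arbitrarily chosen value
                    <code>joe@d.org</code> for the email field:</para>

                    <programlisting language="java">// Hypothetical demo class, no database access involved
class InjectionDemo {

   /** Mirrors the string concatenation used in PersistenceHandler.add(). */
   static String buildInsert(final String name, final String email) {
      return "INSERT INTO Person VALUES('" + name + "', '" + email + "')";
   }

   public static void main(String[] args) {
      // Malicious content of the name field
      final String name =
            "Jim', 'jim@c.com');DROP TABLE Person;INSERT INTO Person VALUES('Joe";
      // The email value is an assumption chosen for illustration
      System.out.println(buildInsert(name, "joe@d.org"));
   }
}</programlisting>

                    <para>The resulting string consists of three statements:
                    an <code>INSERT</code>, the <code>DROP TABLE</code> and a
                    second <code>INSERT</code> which is completed by the
                    quote characters supplied by the application
                    itself.</para>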

                    <para>In practice this result may be avoided. The database
                    user will (hopefully!) not have sufficient permissions to
                    drop the whole table. Malicious modifications by
                    <code>INSERT</code>, <code>UPDATE</code> or
                    <code>DELETE</code> statements are still possible.</para>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>
          </section>

          <section xml:id="sanitizeUserInput">
            <title>Sanitizing user input</title>

            <para>There are at least two general ways to deal with the
            disastrous result of <xref linkend="sqlInjectDropTable"/>:</para>

            <itemizedlist>
              <listitem>
                <para>Prevent the database server from interpreting user input
                as SQL code at all. This is probably the best way and will be
                discussed in <xref linkend="sectPreparedStatements"/>.</para>
              </listitem>

              <listitem>
                <para>Let the application check and process user input.
                Dangerous user input may either be modified prior to being
                embedded in SQL statements or be rejected completely.</para>
              </listitem>
            </itemizedlist>

            <para>The first method is definitely superior in most cases. There
            are however cases where the restrictions it implies are too
            severe. We may for example choose dynamically which tables shall
            be accessed. In this case an SQL statement's structure rather than
            just its predicates is affected by user input. There are at least
            two standard procedures dealing with this problem:</para>

            <glosslist>
              <glossentry>
                <glossterm>Input Filtering</glossterm>

                <glossdef>
                  <para>In the simplest case we check a user's input by
                  regular expressions. An example is an input field in a login
                  window representing a system user name. Legal input may
                  consist of letters and digits only. Special characters,
                  whitespace etc. are typically prohibited. The input must
                  have a minimum length of one character. A maximum length may
                  be imposed as well. So we may choose the regular expression
                  <code>[A-Za-z0-9]+</code> to check user names for
                  validity.</para>
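
                  <para>A minimal sketch of such a check in <link
                  linkend="gloss_Java"><trademark>Java</trademark></link>. The
                  class name <classname>UserNameFilter</classname> and the
                  maximum length of 32 characters are assumptions made for
                  illustration only:</para>

                  <programlisting language="java"><![CDATA[import java.util.regex.Pattern;

public class UserNameFilter {

   // Assumption: letters and digits only, 1 to 32 characters.
   private static final Pattern USER_NAME = Pattern.compile("[A-Za-z0-9]{1,32}");

   /** @return true if the complete input is a legal user name. */
   public static boolean isValidUserName(final String input) {
      // matches() succeeds only if the pattern covers the whole input.
      return input != null && USER_NAME.matcher(input).matches();
   }

   public static void main(String[] args) {
      final String[] candidates = {"goik", "jim42", "", "jim'; DROP TABLE Person;"};
      for (final String candidate : candidates) {
         System.out.println("'" + candidate + "' valid: " + isValidUserName(candidate));
      }
   }
}]]></programlisting>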
                </glossdef>
              </glossentry>

              <glossentry>
                <glossterm><foreignphrase>Whitelisting</foreignphrase></glossterm>

                <glossdef>
                  <para>In many cases input fields only allow a restricted set
                  of values. Consider an input field for names of planets. An
                  application may keep a dictionary table to validate user
                  input:</para>

                  <informaltable border="1">
                    <col width="10%"/>

                    <col width="5%"/>

                    <tr>
                      <td>Mercury</td>

                      <td>1</td>
                    </tr>

                    <tr>
                      <td>Venus</td>

                      <td>2</td>
                    </tr>

                    <tr>
                      <td>Earth</td>

                      <td>3</td>
                    </tr>

                    <tr>
                      <td>...</td>

                      <td>...</td>
                    </tr>

                    <tr>
                      <td>Neptune</td>

                      <td>8</td>
                    </tr>

                    <tr>
                      <td><emphasis role="bold">Default:</emphasis></td>

                      <td><emphasis role="bold">0</emphasis></td>
                    </tr>
                  </informaltable>

                  <para>So if a user enters a valid planet name a
                  corresponding number representing this particular planet
                  will be sent to the database. If the user enters an invalid
                  string an error message may be raised.</para>

                  <para>In a GUI this is often better accomplished by
                  presenting a list of planets to choose from. In this case a
                  user has no chance to enter invalid or even malicious
                  input.</para>
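
                  <para>A dictionary lookup of this kind may be sketched as
                  follows. The class name
                  <classname>PlanetWhitelist</classname> is hypothetical, the
                  dictionary is abbreviated to three entries and
                  <code>Map.of</code> requires Java 9 or newer. Only the
                  number found in the dictionary, never the user's string, is
                  meant to be embedded into an SQL statement:</para>

                  <programlisting language="java"><![CDATA[import java.util.Map;
import java.util.OptionalInt;

public class PlanetWhitelist {

   // Dictionary of accepted values (abbreviated for this sketch).
   private static final Map<String, Integer> PLANETS =
         Map.of("Mercury", 1, "Venus", 2, "Earth", 3);

   /**
    * @return the planet's number, or an empty value if the input
    *         is not contained in the dictionary.
    */
   public static OptionalInt lookup(final String userInput) {
      final Integer id = userInput == null ? null : PLANETS.get(userInput.trim());
      return id == null ? OptionalInt.empty() : OptionalInt.of(id);
   }

   public static void main(String[] args) {
      // A whitelisted name yields its number.
      System.out.println(lookup("Venus"));
      // Anything else is rejected, no matter what it contains.
      System.out.println(lookup("Venus'; DROP TABLE Person; --"));
   }
}]]></programlisting>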
                </glossdef>
              </glossentry>
            </glosslist>

            <para>So we have an <quote>interceptor</quote> sitting between
            user input fields and SQL generating code:</para>

            <figure xml:id="figInputFiltering">
              <title>Validating user input prior to dynamically composing SQL
              statements.</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/filtering.fig"/>
                </imageobject>
              </mediaobject>
            </figure>

            <qandaset role="exercise">
              <title>Using regular expressions in <link
              linkend="gloss_Java"><trademark>Java</trademark></link></title>

              <qandadiv>
                <qandaentry>
                  <question>
                    <para>This exercise is a preparation for <xref
                    linkend="exercisefilterUserInput"/>. The aim is to deal
                    with regular expressions and to use them in <link
                    linkend="gloss_Java"><trademark>Java</trademark></link>.
                    If you are not yet familiar with regular expressions /
                    pattern matching you may want to read any of:</para>

                    <itemizedlist>
                      <listitem>
                        <para><link
                        xlink:href="http://www.aivosto.com/vbtips/regex.html">Regular
                        expressions - An introduction</link></para>
                      </listitem>

                      <listitem>
                        <para><link
                        xlink:href="http://www.codeproject.com/Articles/939/An-Introduction-to-Regular-Expressions">An
                        Introduction to Regular Expressions</link></para>
                      </listitem>

                      <listitem>
                        <para><link
                        xlink:href="http://www.regular-expressions.info/tutorial.html">Regular
                        Expression Tutorial</link></para>
                      </listitem>
                    </itemizedlist>

                    <para>Complete the implementation of the following
                    skeleton:</para>

                    <programlisting language="java">...
import java.util.regex.Matcher;
import java.util.regex.Pattern;
 
public static void main(String[] args) {
   final String [] wordList = new String [] {"Eric", "126653BBb", "_login","some text"};
   final String [] regexpList = new String[] {"[A-K].*", "[^0-9]+.*", "_[a-z]+", ""};
   
   for (final String word: wordList) {
      for (final String regexp: regexpList) {
         testMatch(word, regexp);
      }
   }
}

/**
 * Matching a given word by a regular expression. A log message is being
 * written to stdout.
 * 
 * Hint: The implementation is based on the explanation being given in the
 * introduction to {@link Pattern}
 * 
 * @param word This string will be matched by the subsequent argument.
 * @param regexp The regular expression tested to match the previous argument.
 * @return true if regexp matches word, false otherwise.
 */
public static boolean testMatch(final String word, final String regexp) {
.../* to be implemented by <emphasis role="bold">**YOU**</emphasis>   */
}</programlisting>

                    <para>As noted in the <link
                    linkend="gloss_Javadoc"><trademark>Javadoc</trademark></link>
                    comment above you may want to read the documentation of
                    class <classname>java.util.regex.Pattern</classname>. The
                    intended output of the above application is:</para>

                    <programlisting>The expression '[A-K].*' matches 'Eric'
The expression '[^0-9]+.*' ...
...</programlisting>
                  </question>

                  <answer>
                    <para>A possible implementation is given by
                    <classname>sda.regexp.RegexpPrimer</classname>.</para>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>

            <qandaset role="exercise">
              <title>Input validation by regular expressions</title>

              <qandadiv>
                <qandaentry xml:id="exercisefilterUserInput">
                  <question>
                    <para>The application of <xref
                    linkend="sqlInjectDropTable"/> proved to be vulnerable to
                    SQL injection. Sanitize the values of both user input
                    fields to prevent such behaviour.</para>

                    <itemizedlist>
                      <listitem>
                        <para>Find appropriate regular expressions to check
                        both username and email. Some hints:</para>

                        <glosslist>
                          <glossentry>
                            <glossterm>username</glossterm>

                            <glossdef>
                              <para>Regarding SQL injection the <quote>;</quote>
                              character is among the most critical. You may want
                              to exclude certain special characters. This does no
                              harm since their presence in a user's name is likely
                              to be a typo rather than meaningful input.</para>
                            </glossdef>
                          </glossentry>

                          <glossentry>
                            <glossterm>email</glossterm>

                            <glossdef>
                              <para>There are tons of <quote>ultimate</quote>
                              regular expressions available to check email
                              addresses. Remember that the present task is to
                              avoid SQL injection rather than to reject
                              <quote>wrong</quote> email addresses. So find a
                              reasonable one which may be too permissive regarding
                              RFC email syntax rules but is sufficient to secure
                              your application.</para>

                              <para>A concise definition of an email address's
                              syntax is given in <link
                              xlink:href="http://tools.ietf.org/html/rfc5322#section-3.4.1">RFC5322</link>.
                              Its implementation is beyond the scope of the
                              current lecture. Moreover it is questionable whether
                              email clients and mail transfer agents implement
                              strict RFC compliance.</para>
                            </glossdef>
                          </glossentry>
                        </glosslist>

                        <para>Both regular expressions must cover the whole
                        user input from the beginning to the end. This can be
                        achieved by using <code>^ ... $</code>.</para>
                      </listitem>

                      <listitem>
                        <para>The <link
                        linkend="gloss_Java"><trademark>Java</trademark></link>
                        standard class
                        <classname>javax.swing.InputVerifier</classname> may
                        help you validate user input.</para>
                      </listitem>

                      <listitem>
                        <para>The following screenshot may provide an idea for
                        GUI realization and user interaction in case of
                        errors. Of course the submit button's action should be
                        disabled in case of erroneous input. The user should
                        receive a helpful error message instead.</para>

                        <figure xml:id="figInsertValidate">
                          <title>Error message being presented to the
                          user.</title>

                          <mediaobject>
                            <imageobject>
                              <imagedata fileref="Ref/Screen/insertValidate.screen.png"/>
                            </imageobject>

                            <caption>
                              <para>In the current example the trailing
                              <quote>;</quote> within the E-Mail field is
                              invalid.</para>
                            </caption>
                          </mediaobject>
                        </figure>
                      </listitem>
                    </itemizedlist>
                  </question>

                  <answer>
                    <para>Extending
                    <classname>javax.swing.InputVerifier</classname> allows us
                    to build a generic class to filter user text input by
                    arbitrary regular expressions:</para>

                    <programlisting language="java">package sda.jdbc.intro.v1.sanitize;
...
public class RegexpVerifier extends InputVerifier {

   final Pattern syntaxPattern;
   final JLabel validationLabel;
   private boolean inputValid = false;
   private final String errMsg;   
...
   public RegexpVerifier (final String regex, final JLabel validationLabel, final String errMsg) {
      this.validationLabel = validationLabel;
      this.errMsg = errMsg;
      syntaxPattern = Pattern.compile(regex);
   }

   @Override
   public boolean verify(JComponent input) {
      if (input instanceof JTextField) {
         final String userInput =  ((JTextField) input).getText();
         if (syntaxPattern.matcher(userInput).find()) {
            validationLabel.setText("");
            inputValid = true;
         } else {
            validationLabel.setText(errMsg);
            inputValid = false;
         }
      }
      return inputValid;
   }
   public boolean inputIsValid () {
      return inputValid;
   }
}</programlisting>
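
                    <para>Note that <code>verify()</code> relies on
                    <code>find()</code> which succeeds as soon as any part of
                    the input matches. The anchors <code>^</code> and
                    <code>$</code> are therefore essential. The following
                    sketch illustrates the difference. The class name
                    <classname>AnchorDemo</classname> is hypothetical and not
                    part of the lecture's code base:</para>

                    <programlisting language="java"><![CDATA[import java.util.regex.Pattern;

public class AnchorDemo {

   /** Mirrors the check in verify(): find() searches for a matching substring. */
   public static boolean accepts(final String regex, final String userInput) {
      return Pattern.compile(regex).matcher(userInput).find();
   }

   public static void main(String[] args) {
      final String attack = "Jim'; DROP TABLE Person; --";

      // Without anchors the leading "Jim" suffices: prints true
      System.out.println(accepts("[^;'\"]+", attack));

      // With anchors the whole input gets checked: prints false
      System.out.println(accepts("^[^;'\"]+$", attack));

      // Ordinary names still pass: prints true
      System.out.println(accepts("^[^;'\"]+$", "Jim"));
   }
}]]></programlisting>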

                    <para>Instances of
                    <classname>sda.jdbc.intro.v1.sanitize.RegexpVerifier</classname>
                    <coref linkend="emailVerifier"/> <coref
                    linkend="nameVerifier"/> may now be used to validate our
                    two input data fields <coref linkend="setNameValidation"/>
                    <coref linkend="setEmailValidation"/>. We put emphasis on
                    the changes with respect to
                    <classname>sda.jdbc.intro.v1.InsertPerson</classname>:</para>

                    <programlisting language="java">package sda.jdbc.intro.v1.sanitize;
...
public class InsertPerson extends JFrame {
   
   final JTextField nameField = new JTextField(15);
   final JLabel    nameFieldValidationLabel <co xml:id="nameVerifier"/> = new JLabel();
   final RegexpVerifier nameFieldVerifier = new RegexpVerifier(
                           "^[^;'\"]+$",
                           nameFieldValidationLabel,
                           "No special characters");
   
   final JTextField emailField = new JTextField(20);
   final JLabel    emailFieldValidationLabel <co xml:id="emailVerifier"/> = new JLabel();
   final RegexpVerifier emailFieldVerifier =
         new RegexpVerifier("^[\\w\\-\\.\\_]+@[\\w\\-\\.]*[a-zA-Z]{2,4}$",
                           emailFieldValidationLabel,
                           "email not valid");
...
   public static void main(String[] args) throws SQLException {
      InsertPerson app = new InsertPerson();
      app.setVisible(true);
   }
   public InsertPerson (){
...      
      databaseFieldPanel.add(nameField);
      <emphasis role="bold">nameFieldValidationLabel.setForeground(Color.RED);
      databaseFieldPanel.add(nameFieldValidationLabel);
      nameField.setInputVerifier(nameFieldVerifier);</emphasis> <co
                        xml:id="setNameValidation"/>
      
      databaseFieldPanel.add(new JLabel("E-mail:"));
      databaseFieldPanel.add(emailField);
      <emphasis role="bold">databaseFieldPanel.add(emailFieldValidationLabel);
      emailFieldValidationLabel.setForeground(Color.RED);
      emailField.setInputVerifier(emailFieldVerifier);</emphasis> <co
                        xml:id="setEmailValidation"/>
      
      insertButton.addActionListener(new ActionListener() {
         @Override
         public void actionPerformed(ActionEvent e) {
            <emphasis role="bold">if (!nameFieldVerifier.inputIsValid() || !emailFieldVerifier.inputIsValid()) {
               JOptionPane.showMessageDialog(null, "Invalid input value(s)");
            }</emphasis> else {
...</programlisting>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>
          </section>

          <section xml:id="sectPreparedStatements">
            <title><classname>java.sql.PreparedStatement</classname>
            objects</title>

            <para>Sanitizing user input is an essential means to secure an
            application. The <trademark
            xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
            standard however provides a superior mechanism for protecting
            applications against SQL injection attacks. To understand it we
            first shed some light on our current way of sending SQL statements
            to a database server:</para>

            <figure xml:id="sqlTransport">
              <title>SQL statements in <link
              linkend="gloss_Java"><trademark>Java</trademark></link>
              applications get parsed at the database server</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/sqlTransport.fig"/>
                </imageobject>
              </mediaobject>
            </figure>

            <para>This architecture raises two questions:</para>

            <orderedlist>
              <listitem>
                <para>What happens in case identical SQL statements are
                executed repeatedly? This may happen inside a loop when
                thousands of records with identical structure are being sent
                to a database.</para>
              </listitem>

              <listitem>
                <para>Is this architecture adequate with respect to security
                concerns?</para>
              </listitem>
            </orderedlist>

            <para>The first question is related to performance: Repeatedly
            parsing statements which are identical except for the values
            contained within is a waste of resources. We consider the transfer
            of records between different databases:</para>

            <programlisting>INSERT INTO Person VALUES ('Jim', 'jim@q.org')
INSERT INTO Person VALUES ('Eve', 'eve@y.org')
INSERT INTO Person VALUES ('Pete', 'p@rr.com')
...</programlisting>

            <para>In this case it does not make sense to repeatedly parse
            structurally identical SQL statements. Using a single
            <code>INSERT</code> statement with multiple data records may not
            be an option when the number of records grows.</para>

            <para>The second question is related to our current security
            topic: The database server's interpreter may be so
            <quote>kind</quote> as to interpret an attacker's malicious code
            as well.</para>

            <para>Both topics are addressed by
            <classname>java.sql.PreparedStatement</classname> objects.
            Basically these objects allow for separating an SQL statement's
            structure from the parameter values contained within. The scenario
            given in <xref linkend="sqlTransport"/> may be implemented
            as:</para>

            <figure xml:id="sqlTransportPrepare">
              <title>Using <classname>java.sql.PreparedStatement</classname>
              objects.</title>

              <mediaobject>
                <imageobject>
                  <imagedata fileref="Ref/Fig/sqlTransportPrepare.fig"/>
                </imageobject>
              </mediaobject>
            </figure>

            <para>Prepared statements are an example of parameterized SQL
            statements which exist in various programming languages. When
            using <classname>java.sql.PreparedStatement</classname> instances
            we actually have three distinct phases:</para>

            <orderedlist>
              <listitem>
                <para xml:id="exerciseGuiWritePrepared">Creating an instance
                of <classname>java.sql.PreparedStatement</classname>. The SQL
                statement possibly containing placeholders gets
                parsed.</para>
              </listitem>

              <listitem>
                <para>Setting all placeholder values. This does not involve
                any further SQL syntax parsing.</para>
              </listitem>

              <listitem>
                <para>Executing the statement.</para>
              </listitem>
            </orderedlist>

            <para>Steps 2 and 3 may be repeated as often as desired without
            any re-parsing of SQL statements thus saving resources on the
            database server side.</para>
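
            <para>The three phases may be sketched as follows. The class and
            method names are hypothetical. The caller is assumed to supply an
            open connection to a database containing a two-column
            <code>Person</code> table:</para>

            <programlisting language="java"><![CDATA[import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class RepeatedInsertSketch {

   /**
    * Inserts all given {name, email} pairs using a single prepared statement.
    *
    * @return the total number of inserted records.
    */
   public static int insertAll(final Connection conn, final List<String[]> persons)
         throws SQLException {
      int inserted = 0;
      // Phase 1: the SQL text is handed over exactly once.
      try (PreparedStatement pStmt =
            conn.prepareStatement("INSERT INTO Person VALUES(?, ?)")) {
         for (final String[] person : persons) {
            // Phase 2: set the placeholder values, no SQL text involved.
            pStmt.setString(1, person[0]);
            pStmt.setString(2, person[1]);
            // Phase 3: execute using the current values.
            inserted += pStmt.executeUpdate();
         }
      } // The statement gets closed automatically here.
      return inserted;
   }
}]]></programlisting>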

            <para>Our introductory toy application <xref
            linkend="figJdbcSimpleWrite"/> may be rewritten using
            <classname>java.sql.PreparedStatement</classname> objects:</para>

            <programlisting language="java">package sda.jdbc.intro.v1;
...
public class SimpleInsert {
   
   public static void main(String[] args) throws SQLException {
      
      final Connection conn = DriverManager.getConnection (...
      
      // Step 2: Create a PreparedStatement instance
      final PreparedStatement pStmt = conn.prepareStatement(
                           "INSERT INTO Person VALUES(<emphasis role="bold">?, ?</emphasis>)");<co
                xml:id="listPrepCreate"/>
      
      // Step 3a: Fill in desired attribute values
      pStmt.setString(1, "Jim");<co xml:id="listPrepSet1"/>
      pStmt.setString(2, "jim@foo.org");<co xml:id="listPrepSet2"/>
      
      // Step 3b: Execute the desired INSERT
      final int updateCount = pStmt.executeUpdate();<co xml:id="listPrepExec"/>
      
      // Step 4: Give feedback to the enduser
      System.out.println("Successfully inserted " + updateCount + " dataset(s)"); 
   }
}</programlisting>

            <calloutlist>
              <callout arearefs="listPrepCreate">
                <para>An instance of
                <classname>java.sql.PreparedStatement</classname> is being
                created. Notice the two question marks representing two
                placeholders for string values to be set in the next
                step.</para>
              </callout>

              <callout arearefs="listPrepSet1 listPrepSet2">
                <para>Fill in the two placeholder values being defined at
                <coref linkend="listPrepCreate"/>.</para>

                <caution>
                  <para>While half the world of programming folks index a
                  list of n elements from 0 to n-1, <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  counts from 1 to n. Working with <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  would have been too easy otherwise.</para>
                </caution>
              </callout>

              <callout arearefs="listPrepExec">
                <para>Execute the beast! Notice the empty parameter list. No
                SQL is required since we already prepared it in <coref
                linkend="listPrepCreate"/>.</para>
              </callout>
            </calloutlist>

            <para>The problem of SQL injection disappears for all values
            being passed via placeholders of
            <classname>java.sql.PreparedStatement</classname> instances. An
            attacker may enter offending strings without doing any harm:</para>

            <programlisting>Jim', 'jim@c.com');DROP TABLE Person;INSERT INTO Person VALUES('Joe</programlisting>

            <para>The above string will be taken <quote>as is</quote> and thus
            simply becomes part of the database server's content.</para>
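
            <para>To see why, compare the SQL text reaching the database
            server's parser in both approaches. The following sketch needs no
            database at all. Class and method names are hypothetical, and
            <code>concatenatedSql()</code> merely imitates an application
            composing SQL from user input by string concatenation:</para>

            <programlisting language="java"><![CDATA[public class InjectionDemo {

   /** Vulnerable: user input becomes part of the SQL text itself. */
   public static String concatenatedSql(final String name, final String email) {
      return "INSERT INTO Person VALUES('" + name + "', '" + email + "')";
   }

   /** Safe: constant SQL text, values travel separately via setString(). */
   public static String preparedSql() {
      return "INSERT INTO Person VALUES(?, ?)";
   }

   public static void main(String[] args) {
      final String attack =
            "Jim', 'jim@c.com');DROP TABLE Person;INSERT INTO Person VALUES('Joe";

      // The parser receives three statements, one of them being DROP TABLE.
      System.out.println(concatenatedSql(attack, "joe@c.com"));

      // The parser receives exactly one INSERT regardless of user input.
      System.out.println(preparedSql());
   }
}]]></programlisting>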

            <qandaset role="exercise">
              <title>Prepared Statements to keep the barbarians at the
              gate</title>

              <qandadiv>
                <qandaentry xml:id="exerciseSqlInjectPrepare">
                  <question>
                    <para>In <xref linkend="sqlInjectDropTable"/> we found our
                    implementation in <xref
                    linkend="exerciseGuiWriteTakeTwo"/> to be vulnerable with
                    respect to SQL injection. Rather than sanitizing user
                    input you shall use
                    <classname>java.sql.PreparedStatement</classname> objects
                    to secure the application.</para>
                  </question>

                  <answer>
                    <para>Due to our separation of GUI and persistence
                    handling we only need to re-implement
                    <classname>sda.jdbc.intro.sqlinject.PersistenceHandler</classname>.
                    We have to replace
                    <classname>java.sql.Statement</classname> by
                    <classname>java.sql.PreparedStatement</classname>
                    instances. A possible implementation is
                    <classname>sda.jdbc.intro.v1.prepare.PersistenceHandler</classname>.
                    We may now safely enter offending strings like:</para>

                    <programlisting>Jim', 'jim@c.com');DROP TABLE Person;INSERT INTO Person VALUES('Joe</programlisting>

                    <para>This time the input value is taken <quote>as
                    is</quote> and yields the following error message:</para>

                    <informalfigure>
                      <mediaobject>
                        <imageobject>
                          <imagedata fileref="Ref/Screen/sqlInjectPrepare.screen.png"/>
                        </imageobject>
                      </mediaobject>
                    </informalfigure>

                    <para>The offending string exceeds the length of the
                    attribute <code>name</code> within the database table
                    <code>Person</code>. We may increase the attribute's
                    length to allow the <code>INSERT</code> operation:</para>

                    <programlisting>CREATE TABLE Person (
   name char(<emphasis role="bold">80</emphasis>) <emphasis role="bold">-- a little bit longer --</emphasis>
  ,email CHAR(20) UNIQUE
);</programlisting>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>

            <para>We might have followed the track of test-driven development.
            In that case we would have written tests before actually
            implementing our application. In the current lecture we will do
            this the other way round in the following exercise. The idea is to
            assure software quality when fixing bugs or extending an
            application.</para>

            <para>The subsequent exercise requires the <productname
            xlink:href="http://testng.org/doc/eclipse.html#eclipse-installation">TestNG</productname>
            plugin for Eclipse to be installed. This should already be the
            case both in the MI exercise classrooms and in the VirtualBox
            image provided at <uri
            xlink:href="ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/lubuntu.vdi">ftp://mirror.mi.hdm-stuttgart.de/ubuntu/VirtualBox/lubuntu.vdi</uri>.
            If you use a private Eclipse installation you may want to follow
            <xref linkend="testngInstall"/>.</para>

            <qandaset role="exercise">
              <title>Testing
              <classname>sda.jdbc.intro.v1.PersistenceHandler</classname>
              using <productname
              xlink:href="http://testng.org">TestNG</productname></title>

              <qandadiv>
                <qandaentry>
                  <question>
                    <para>Read <xref linkend="chapUnitTesting"/>. Then
                    test:</para>

                    <itemizedlist>
                      <listitem>
                        <para>Proper behaviour when opening and closing
                        connections.</para>
                      </listitem>

                      <listitem>
                        <para>Proper behaviour when inserting data.</para>
                      </listitem>

                      <listitem>
                        <para>Expected behaviour when entering duplicate
                        values violating integrity constraints. Look for error
                        messages as well.</para>
                      </listitem>
                    </itemizedlist>

                    <para>You may write code to initialize the database state
                    appropriately prior to starting tests.</para>
                  </question>

                  <answer>
                    <para>A possible <productname
                    xlink:href="http://testng.org">TestNG</productname> test
                    class is
                    <classname>sda.jdbc.intro.v1.prepare.PersistenceHandlerTest</classname>.</para>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>
          </section>
        </section>

        <section xml:id="jdbcRead">
          <title>Read Access</title>

          <para>So far we've sent records to a database server. Applications
          however need both directions: pushing data to a server and receiving
          data as well. The overall process looks like:</para>

          <figure xml:id="jdbcReadWrite">
            <title>Server / client object's life cycle</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/jdbcReadWrite.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>So far we've only covered the second (<code>UPDATE</code>)
          part of this picture. Reading objects from a database server into a
          client's (transient) address space requires a container object to
          hold the data in question. Though <link
          linkend="gloss_Java"><trademark>Java</trademark></link> offers
          standard container interfaces like
          <classname>java.util.List</classname> the <trademark
          xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
          standard defines its own interfaces like
          <classname>java.sql.ResultSet</classname>. Instances of
          <classname>java.sql.ResultSet</classname> will hold transient copies
          of (database) objects. The next figure outlines the basic
          approach:</para>

          <figure xml:id="figJdbcRead">
            <title>Reading data from a database server.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/jdbcread.fig" scale="65"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>We take an example. Suppose our database contains a table of
          our friends' nicknames and their respective birth dates:</para>

          <table border="1" xml:id="figRelationFriends">
            <caption>Names and birth dates of friends.</caption>

            <tr>
              <td><programlisting>CREATE TABLE Friends (
   id INTEGER NOT NULL PRIMARY KEY
  ,nickname char(10)
  ,birthdate DATE
);</programlisting></td>

              <td><programlisting>INSERT INTO Friends VALUES
   (1, 'Jim', '1991-10-10')
  ,(2, 'Eve', '2003-05-24')
  ,(3, 'Mick','2001-12-30')
  ;</programlisting></td>
            </tr>
          </table>

          <para>Following the outline in <xref linkend="figJdbcRead"/> we may
          access our data by:</para>

          <figure xml:id="listingJdbcRead">
            <title>Accessing relational data</title>

            <programlisting language="java">package sda.jdbc.intro;
...
public class SimpleRead {

   public static void main(String[] args) throws SQLException {
      
      // Step 1: Open a connection to the database server
      final Connection conn = DriverManager.getConnection (
            DbProps.getString("PersistenceHandler.jdbcUrl"),  
            DbProps.getString("PersistenceHandler.username"),
            DbProps.getString("PersistenceHandler.password"));
      
      // Step 2: Create a Statement instance
      final Statement stmt = conn.createStatement();
      
      <emphasis role="bold">// Step 3: Creating the client side JDBC container holding our data records</emphasis>
      <emphasis role="bold">final ResultSet data = stmt.executeQuery("SELECT * FROM Friends");</emphasis> <co
                linkends="listingJdbcRead-1" xml:id="listingJdbcRead-1-co"/>
      
      <emphasis role="bold">// Step 4: Dataset iteration
      while (data.next()) {</emphasis> <co linkends="listingJdbcRead-2"
                xml:id="listingJdbcRead-2-co"/>
         <emphasis role="bold">System.out.println(data.getInt("id")</emphasis> <co
                linkends="listingJdbcRead-3" xml:id="listingJdbcRead-3-co"/>
                   <emphasis role="bold">+ ", " + data.getString("nickname")</emphasis> <co
                linkends="listingJdbcRead-3" xml:id="listingJdbcRead-4-co"/>
                   <emphasis role="bold">+ ", " + data.getString("birthdate"));</emphasis> <co
                linkends="listingJdbcRead-3" xml:id="listingJdbcRead-5-co"/>
      }
   }
}</programlisting>
          </figure>

          <para>The marked code segments above show the differences with
          respect to our data insertion application
          <classname>sda.jdbc.intro.SimpleInsert</classname>. Some remarks are
          in order:</para>

          <calloutlist>
            <callout arearefs="listingJdbcRead-1-co"
                     xml:id="listingJdbcRead-1">
              <para>As mentioned in the introduction to this section, the
              <trademark
              xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
              standard comes with its own container interface
              <classname>java.sql.ResultSet</classname> rather than using
              <classname>java.util.List</classname> or similar.</para>
            </callout>

            <callout arearefs="listingJdbcRead-2-co"
                     xml:id="listingJdbcRead-2">
              <para>Calling <link
              xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#next()">next()</link>
              prior to actually accessing data on the client side is
              mandatory! The cursor is initially positioned before the first
              row. The first call to <link
              xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#next()">next()</link>
              moves it to the first row of our dataset and returns
              <code>true</code> unless the dataset is empty. Follow the link
              address and <emphasis role="bold">read</emphasis> the
              documentation.</para>
            </callout>

            <callout arearefs="listingJdbcRead-3-co listingJdbcRead-4-co listingJdbcRead-5-co"
                     xml:id="listingJdbcRead-3">
              <para>The access methods have to be chosen according to the
              attributes' database types. An overview of database/<link
              linkend="gloss_Java"><trademark>Java</trademark></link> type
              mappings is given in <uri
              xlink:href="http://docs.oracle.com/javase/1.3/docs/guide/jdbc/getstart/mapping.html">http://docs.oracle.com/javase/1.3/docs/guide/jdbc/getstart/mapping.html</uri>.</para>
            </callout>
          </calloutlist>

          <qandaset role="exercise">
            <title>Getter methods and type conversion</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>Apart from type mappings, the <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  access methods like <link
                  xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getString(int)">getString()</link>
                  may also be used for type conversion. Modify <xref
                  linkend="listingJdbcRead"/> as follows:</para>

                  <itemizedlist>
                    <listitem>
                      <para>Read the database attribute <code>id</code> using
                      <link
                      xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getString(java.lang.String)">getString(String)</link>.</para>
                    </listitem>

                    <listitem>
                      <para>Read the database attribute <code>nickname</code>
                      using <link
                      xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getInt(java.lang.String)">getInt(String)</link>.</para>
                    </listitem>
                  </itemizedlist>

                  <para>What do you observe?</para>
                </question>

                <answer>
                  <para>Modifying our iteration loop:</para>

                  <programlisting>// Step 4: Dataset iteration
while (data.next()) {
    System.out.println(data.<emphasis role="bold">getString</emphasis>("id") <co
                      linkends="jdbcReadWrongType-1"
                      xml:id="jdbcReadWrongType-1-co"/>
           + ", " + data.<emphasis role="bold">getInt</emphasis>("nickname") <co
                      linkends="jdbcReadWrongType-2"
                      xml:id="jdbcReadWrongType-2-co"/>
           + ", " + data.getString("birthdate"));
}</programlisting>

                  <para>We observe:</para>

                  <calloutlist>
                    <callout arearefs="jdbcReadWrongType-1-co"
                             xml:id="jdbcReadWrongType-1">
                      <para>Calling <link
                      xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getString(int)">getString()</link>
                      for a database attribute of type INTEGER does not cause
                      any trouble: The value gets silently converted to a
                      string value.</para>
                    </callout>

                    <callout arearefs="jdbcReadWrongType-2-co"
                             xml:id="jdbcReadWrongType-2">
                      <para>Calling <link
                      xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#getInt(java.lang.String)">getInt(String)</link>
                      for the database field of type CHAR yields an (expected)
                      Exception:</para>
                    </callout>
                  </calloutlist>

                  <programlisting>Exception in thread "main" java.sql.SQLException: Invalid value for getInt() - 'Jim'
  at com.mysql.jdbc.SQLError.createSQLException(SQLError.java:1073)
...</programlisting>

                  <para>We may however provide <quote>compatible</quote> data
                  records:</para>

                  <programlisting>DELETE FROM Friends;
INSERT INTO Friends VALUES (1, <emphasis role="bold">'31'</emphasis>, '1991-10-10');</programlisting>

                  <para>This time our application executes perfectly
                  well:</para>

                  <programlisting>1, 31, 1991-10-10</programlisting>

                  <para>Conclusion: The <trademark
                  xlink:href="http://electronics.zibb.com/trademark/jdbc/29545026">JDBC</trademark>
                  driver converts a string value to an integer in a way
                  similar to the <link
                  xlink:href="http://docs.oracle.com/javase/7/docs/api/java/lang/Integer.html#parseInt(java.lang.String)">parseInt(String)</link>
                  method.</para>
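
                  <para>The analogy can be tried without a database. The
                  following minimal sketch converts the two nickname values
                  used above. The class <classname>ParseIntAnalogy</classname>
                  and its method <code>tryParse()</code> are hypothetical
                  names introduced for illustration only. The actual
                  conversion rules of a particular JDBC driver may differ in
                  detail.</para>

                  <programlisting language="java">
```java
final class ParseIntAnalogy {

    /** Returns the parsed value, or null if the text is not a valid integer. */
    static Integer tryParse(final String text) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            // Plays the role of the SQLException raised by getInt()
            return null;
        }
    }

    public static void main(String[] args) {
        System.out.println(tryParse("31"));  // a "compatible" nickname value
        System.out.println(tryParse("Jim")); // not convertible to an integer
    }
}
```
</programlisting>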

                  <para>The next series of exercises aims at a more powerful
                  implementation of our person data insertion application in
                  <xref linkend="exerciseInsertLoginCredentials"/>.</para>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <qandaset role="exercise">
            <title>Handling NULL values.</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>The attribute <code>nickname</code> in our database
                  table <code>Friends</code> allows <code>NULL</code>
                  values:</para>

                  <programlisting>INSERT INTO Friends VALUES
   (1, 'Jim', '1991-10-10')
  ,(2, <emphasis role="bold">NULL</emphasis>, '2003-05-24')
  ,(3, 'Mick', '2001-12-30');</programlisting>

                  <para>Starting our current application yields:</para>

                  <programlisting>1, Jim, 1991-10-10
2, null, 2003-05-24
3, Mick, 2001-12-30</programlisting>

                  <para>This might be confused with a person having the
                  nickname <quote>null</quote>. Instead we would like to
                  have:</para>

                  <programlisting>1, Jim, 1991-10-10
2, -Name unknown- , 2003-05-24
3, Mick, 2001-12-30</programlisting>

                  <para>Extend the current code of
                  <classname>sda.jdbc.intro.SimpleRead</classname> to produce
                  the above result whenever a nickname is
                  <code>NULL</code>.</para>

                  <para>Hint: Read the documentation of <link
                  xlink:href="http://docs.oracle.com/javase/7/docs/api/java/sql/ResultSet.html#wasNull()">wasNull()</link>.</para>
                </question>

                <answer>
                  <para>A possible implementation is given in
                  <classname>sda.jdbc.intro.v1.SimpleRead</classname>.</para>
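
                  <para>The core idea may be sketched as follows. This is not
                  the code of the class named above: The class
                  <classname>NullAwareRead</classname> and its method names
                  are assumptions made for illustration. The sketch only
                  relies on <code>getString(String)</code>,
                  <code>getInt(String)</code> and <code>wasNull()</code> of
                  <classname>java.sql.ResultSet</classname>.</para>

                  <programlisting language="java">
```java
import java.sql.ResultSet;
import java.sql.SQLException;

final class NullAwareRead {

    static final String UNKNOWN = "-Name unknown-";

    /** Reads the nickname of the current row, replacing SQL NULL by a placeholder. */
    static String nicknameOrPlaceholder(final ResultSet data) throws SQLException {
        final String nickname = data.getString("nickname");
        // wasNull() refers to the column read by the most recent getter call
        return data.wasNull() ? UNKNOWN : nickname;
    }

    /** Formats the current row in the style of the SimpleRead output. */
    static String formatRow(final ResultSet data) throws SQLException {
        return data.getInt("id") + ", " + nicknameOrPlaceholder(data)
                + ", " + data.getString("birthdate");
    }
}
```
</programlisting>

                  <para>Inside the <code>while (data.next())</code> loop one
                  would then print <code>formatRow(data)</code>. Note that
                  <code>wasNull()</code> must be called directly after the
                  getter it refers to.</para>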
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <qandaset role="exercise">
            <title>A user authentication <quote>strategy</quote></title>

            <qandadiv>
              <qandaentry xml:id="exerciseInsecureAuth">
                <question>
                  <para>Our current application for entering
                  <code>Person</code> records lacks authentication: A user
                  simply connects to the database using credentials
                  hard-coded in a properties file. A programmer suggests
                  implementing authentication based on the following extension
                  of the <code>Person</code> table:</para>

                  <programlisting>CREATE TABLE Person (
   name CHAR(80) NOT NULL
  ,email CHAR(20) NOT NULL UNIQUE
  ,login CHAR(10) UNIQUE -- login names must be unique
  ,password CHAR(20)
);</programlisting>

                  <para>On clicking <quote>Connect</quote> a user may enter
                  a login name and password, <quote>fred</quote> and
                  <quote>12345678</quote> in the following example:</para>

                  <figure xml:id="figLogin">
                    <title>Login credentials for database connection</title>

                    <mediaobject>
                      <imageobject>
                        <imagedata fileref="Ref/Screen/login.screen.png"
                                   scale="90"/>
                      </imageobject>
                    </mediaobject>
                  </figure>

                  <para>Based on these input values the following SQL query is
                  executed by a <classname>java.sql.Statement</classname>
                  object:</para>

                  <programlisting>SELECT * FROM Person WHERE login='<emphasis
                      role="bold">fred</emphasis>' and password = '<emphasis
                      role="bold">12345678</emphasis>'</programlisting>

                  <para>Since the login attribute is UNIQUE we are sure to
                  receive either 0 or 1 dataset. Our programmer proposes to
                  grant login if the query returns at least one
                  dataset.</para>

                  <para>Discuss this implementation sketch with a colleague.
                  Do you think this is a sensible approach? <emphasis
                  role="bold">Write down</emphasis> your results.</para>
                </question>

                <answer>
                  <para>The approach is essentially unusable due to severe
                  security implications. Since it is based on
                  <classname>java.sql.Statement</classname> rather than on
                  <classname>java.sql.PreparedStatement</classname> objects it
                  is vulnerable to SQL injection attacks. A user may enter the
                  following password value in the GUI:</para>

                  <programlisting>sd' OR '1' = '1</programlisting>

                  <para>Based on the login name <quote>fred</quote> the
                  following SQL string is being crafted:</para>

                  <programlisting>SELECT * FROM Person WHERE login='fred' and password = 'sd' OR <emphasis
                      role="bold">'1' = '1'</emphasis>;</programlisting>

                  <para>Since <code>AND</code> takes precedence over
                  <code>OR</code>, the WHERE clause's last component is
                  evaluated on its own and is always true. Thus all records
                  of the <code>Person</code> relation are returned, permitting
                  login.</para>
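
                  <para>The difference between both statement types may be
                  sketched as follows. The class
                  <classname>LoginQuerySketch</classname> and its method names
                  are hypothetical. The first method merely reproduces the
                  string concatenation described above, the second one binds
                  user input as parameter values. The comparison against a
                  clear text password is kept only to mirror the query
                  above.</para>

                  <programlisting language="java">
```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;

final class LoginQuerySketch {

    /** Vulnerable: user input becomes part of the SQL text itself. */
    static String concatenatedQuery(final String login, final String password) {
        return "SELECT * FROM Person WHERE login='" + login
                + "' and password = '" + password + "'";
    }

    /** Safer: user input is bound as parameter values and never parsed as SQL. */
    static boolean credentialsMatch(final Connection conn, final String login,
            final String password) throws SQLException {
        final String sql = "SELECT * FROM Person WHERE login = ? AND password = ?";
        try (PreparedStatement stmt = conn.prepareStatement(sql)) {
            stmt.setString(1, login);
            stmt.setString(2, password);
            try (ResultSet data = stmt.executeQuery()) {
                return data.next(); // true if a matching record exists
            }
        }
    }

    public static void main(String[] args) {
        // Intended usage
        System.out.println(concatenatedQuery("fred", "12345678"));
        // Injected password value changing the query's structure
        System.out.println(concatenatedQuery("fred", "sd' OR '1' = '1"));
    }
}
```
</programlisting>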

                  <para>The implementation approach suffers from a second
                  deficiency: The passwords are stored in clear text. An
                  attacker gaining access to the <code>Person</code> table
                  immediately obtains the passwords of all users. This
                  problem can be mitigated by storing hash values of passwords
                  rather than the clear text values themselves.</para>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <qandaset role="exercise" xml:id="passwordHashes">
            <title>Passwords and hash values</title>

            <qandadiv>
              <qandaentry xml:id="exerciseHashTraining">
                <question>
                  <para>In exercise <xref linkend="exerciseInsecureAuth"/> we
                  discarded the idea of clear text passwords in favour of
                  password hashes. So-called salted hashes are superior to
                  plain hashes since they defeat rainbow table attacks. You
                  should read <uri
                  xlink:href="https://www.heckrothindustries.co.uk/articles/an-introduction-to-password-hashes">https://www.heckrothindustries.co.uk/articles/an-introduction-to-password-hashes</uri>
                  for an overview. The article contains further references at
                  the bottom of the page.</para>

                  <para>With respect to an implementation <uri
                  xlink:href="http://stackoverflow.com/questions/2860943/suggestions-for-library-to-hash-passwords-in-java#11038230">http://stackoverflow.com/questions/2860943/suggestions-for-library-to-hash-passwords-in-java</uri>
                  provides a simple example for:</para>

                  <itemizedlist>
                    <listitem>
                      <para>Creating a salted hash from a given password
                      string.</para>
                    </listitem>

                    <listitem>
                      <para>Verifying whether a hash string matches a given
                      clear text password.</para>
                    </listitem>
                  </itemizedlist>

                  <para>The example uses an external library. On <productname
                  xlink:href="http://www.ubuntu.com">Ubuntu</productname>
                  Linux this may be installed by issuing
                  <command>aptitude</command> <option>install</option>
                  <option>libcommons-codec-java</option>. After successful
                  installation the file
                  <filename>/usr/share/java/commons-codec-1.5.jar</filename>
                  may be appended to your <envar>CLASSPATH</envar>.</para>

                  <para>You may as well use <uri
                  xlink:href="http://crackstation.net/hashing-security.htm#javasourcecode">http://crackstation.net/hashing-security.htm#javasourcecode</uri>
                  as a starting point. This example works standalone without
                  needing an external library. Note: This example produces
                  different (incompatible) hash values.</para>

                  <para>Create a simple main() method to experiment with the
                  two class methods.</para>
                </question>

                <answer>
                  <para>Starting from <uri
                  xlink:href="http://stackoverflow.com/questions/2860943/suggestions-for-library-to-hash-passwords-in-java#11038230">http://stackoverflow.com/questions/2860943/suggestions-for-library-to-hash-passwords-in-java</uri>
                  we create a slightly modified class
                  <classname>sda.jdbc.intro.auth.HashProvider</classname>
                  offering both hash providing <coref
                  linkend="hashProviderMethod"/> and verifying <coref
                  linkend="hashVerifyMethod"/> methods:</para>

                  <programlisting language="java">package sda.jdbc.intro.auth;
...
public class HashProvider {
...
    /** Computes a salted PBKDF2 hash of given plaintext password
        suitable for storing in a database. */
    public static <emphasis role="bold">String getSaltedHash</emphasis> <co
                      xml:id="hashProviderMethod"/>(char [] password) {
        byte[] salt;
        try {
            salt = SecureRandom.getInstance("SHA1PRNG").generateSeed(saltLen);
            // store the salt with the password
            return Base64.encodeBase64String(salt) + "$" + hash(password, salt);
        } catch (NoSuchAlgorithmException e) {
            e.printStackTrace();
        }
        System.exit(1);
        return null;
    }

    /** Checks whether given plaintext password corresponds 
        to a stored salted hash of the password. */
    public static <emphasis role="bold">boolean check</emphasis> <co
                      xml:id="hashVerifyMethod"/>(char[] password, String stored){
        String[] saltAndPass = stored.split("\\$");
        if (saltAndPass.length != 2)
            return false;
        String hashOfInput = hash(password, Base64.decodeBase64(saltAndPass[0]));
        return hashOfInput.equals(saltAndPass[1]);
    }
...}</programlisting>

                  <para>We may test the two class methods
                  <methodname>sda.jdbc.intro.auth.HashProvider.getSaltedHash(char[])</methodname>
                  and
                  <methodname>sda.jdbc.intro.auth.HashProvider.check(char[],String)</methodname>
                  using a separate driver class. Notice the <quote>$</quote> sign
                  <coref linkend="saltPwhashSeparator"/> separating salt and
                  password hash:</para>

                  <programlisting language="java">package sda.jdbc.intro.auth;

public class TestHashProvider {

   public static void main(String [] args) throws Exception {
      final char [] clearText = {'s', 'e', 'c'};
      final String hash = <emphasis role="bold">HashProvider.getSaltedHash(clearText)</emphasis>;
      System.out.println("Hash:" + hash);
      if (HashProvider.check(clearText,                 <co
                      xml:id="saltPwhashSeparator"/>
            "<emphasis role="bold">HwX2DkuYiwp7xogm3AGndza8DKRVvCMntxRvCrCGFPw=</emphasis>$<emphasis
                      role="bold">6Ix11yHNB4uPZuF2IQYxVV/MYragJwTDE33OIFR9a24=</emphasis>")) {
         System.out.println("hash matches");
      } else {
         System.out.println("hash does not match"); ...</programlisting>
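
                  <para>For experiments without an external library the
                  <quote>salt$hash</quote> layout may also be reproduced with
                  JDK classes only. The following is a rough sketch under
                  stated assumptions and not the lecture's
                  <classname>HashProvider</classname>: The class name
                  <classname>SaltedHashSketch</classname>, the iteration
                  count, the salt length and the key length are chosen for
                  illustration. Its hash values are not compatible with those
                  shown above.</para>

                  <programlisting language="java">
```java
import java.security.GeneralSecurityException;
import java.security.MessageDigest;
import java.security.SecureRandom;
import java.util.Base64;
import javax.crypto.SecretKeyFactory;
import javax.crypto.spec.PBEKeySpec;

final class SaltedHashSketch {

    private static final int ITERATIONS = 20000; // assumed work factor
    private static final int SALT_BYTES = 32;    // assumed salt length
    private static final int KEY_BITS = 256;     // assumed hash length

    /** Creates a fresh random salt and returns "salt$hash", both Base64 encoded. */
    static String getSaltedHash(final char[] password) {
        final byte[] salt = new byte[SALT_BYTES];
        new SecureRandom().nextBytes(salt);
        final Base64.Encoder encoder = Base64.getEncoder();
        return encoder.encodeToString(salt) + "$"
                + encoder.encodeToString(hash(password, salt));
    }

    /** Recomputes the hash using the stored salt and compares both values. */
    static boolean check(final char[] password, final String stored) {
        final String[] saltAndHash = stored.split("\\$");
        if (saltAndHash.length != 2) {
            return false;
        }
        final Base64.Decoder decoder = Base64.getDecoder();
        final byte[] expected = decoder.decode(saltAndHash[1]);
        final byte[] actual = hash(password, decoder.decode(saltAndHash[0]));
        return MessageDigest.isEqual(expected, actual);
    }

    /** Derives a PBKDF2 key from password and salt. */
    private static byte[] hash(final char[] password, final byte[] salt) {
        final PBEKeySpec spec = new PBEKeySpec(password, salt, ITERATIONS, KEY_BITS);
        try {
            return SecretKeyFactory.getInstance("PBKDF2WithHmacSHA1")
                    .generateSecret(spec).getEncoded();
        } catch (GeneralSecurityException e) {
            throw new IllegalStateException("PBKDF2 not available", e);
        } finally {
            spec.clearPassword(); // wipe the spec's internal password copy
        }
    }

    public static void main(String[] args) {
        final String stored = getSaltedHash("sec".toCharArray());
        System.out.println("Hash:" + stored);
        System.out.println(check("sec".toCharArray(), stored)
                ? "hash matches" : "hash does not match");
    }
}
```
</programlisting>

                  <para>Since the salt is random, two calls with the same
                  password yield different strings. Both of them still verify
                  against that password.</para>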
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <qandaset role="exercise" xml:id="exercise_GuiEnterPersonAuth">
            <title>GUI authentication: The real McCoy</title>

            <qandadiv>
              <qandaentry xml:id="exerciseInsertLoginCredentials">
                <question>
                  <para>We now implement a refined version to enter
                  <code>Person</code> records based on the solutions of two
                  related exercises:</para>

                  <glosslist>
                    <glossentry>
                      <glossterm><xref
                      linkend="exercisefilterUserInput"/></glossterm>

                      <glossdef>
                        <para>Avoiding SQL injection by sanitizing user
                        input</para>
                      </glossdef>
                    </glossentry>

                    <glossentry>
                      <glossterm><xref
                      linkend="exerciseSqlInjectPrepare"/></glossterm>

                      <glossdef>
                        <para>Avoiding SQL injection by using
                        <classname>java.sql.PreparedStatement</classname>
                        objects.</para>
                      </glossdef>
                    </glossentry>
                  </glosslist>

                  <para>A better solution should combine both techniques.
                  Immunity against SQL injection is a basic requirement.
                  Checking an e-mail address for minimal conformance adds
                  further value.</para>

                  <para>In order to address authentication the relation Person
                  has to be extended appropriately. The GUI needs two
                  additional fields for login name and password as well. The
                  following video demonstrates the intended behaviour:</para>

                  <figure xml:id="videoConnectAuth">
                    <title>Intended usage behaviour for insertion of data
                    records.</title>

                    <mediaobject>
                      <videoobject>
                        <videodata fileref="Ref/Video/connectauth.mp4"/>
                      </videoobject>
                    </mediaobject>
                  </figure>

                  <para>Don't forget to use password hashes like those from
                  <xref linkend="exerciseHashTraining"/>. Due to their length
                  you may want to consider the data type
                  <code>TEXT</code>.</para>
                </question>

                <answer>
                  <para>In comparison to earlier versions it makes sense to
                  add some internal container structures. First we note that
                  each GUI input field requires:</para>

                  <itemizedlist>
                    <listitem>
                      <para>A label like <quote>Enter password</quote>.</para>
                    </listitem>

                    <listitem>
                      <para>A corresponding field object to hold user entered
                      input.</para>
                    </listitem>

                    <listitem>
                      <para>A validator checking for correctness of entered
                      data.</para>
                    </listitem>

                    <listitem>
                      <para>A label or text field for warning messages in case
                      of invalid user input.</para>
                    </listitem>
                  </itemizedlist>

                  <para>We start by grouping the label <coref
                  linkend="uiuLabel"/>, the input field's verifier <coref
                  linkend="uiuVerifier"/> and the error message label <coref
                  linkend="uiuErrmsg"/> in
                  <classname>sda.jdbc.intro.auth.UserInputUnit</classname>:</para>

                  <programlisting>package sda.jdbc.intro.auth;
...
public class UserInputUnit {
   
   final JLabel label; <co xml:id="uiuLabel"/>
   final InputVerifierNotify verifier; <co xml:id="uiuVerifier"/>
   final JLabel errorMessage; <co xml:id="uiuErrmsg"/>
   
   public UserInputUnit(final String guiText, final InputVerifierNotify verifier) {
      this.label = new JLabel(guiText);
      this.verifier = verifier;
      errorMessage = new JLabel();
   } ...</programlisting>

                  <para>The actual GUI text field is defined <coref
                  linkend="verfierGuiField"/> in class
                  <classname>sda.jdbc.intro.auth.InputVerifierNotify</classname>:</para>

                  <programlisting language="java">package sda.jdbc.intro.auth;
...
public abstract class InputVerifierNotify extends InputVerifier {

   protected final String errorMessage;
   public final JLabel validationLabel;
   public final JTextField field; <co xml:id="verfierGuiField"/>
   
   public InputVerifierNotify(final JTextField field, final String errorMessage) { ...</programlisting>

                  <para>We need two field verifier classes derived from
                  <classname>sda.jdbc.intro.auth.InputVerifierNotify</classname>:</para>

                  <glosslist>
                    <glossentry>
                      <glossterm><classname>sda.jdbc.intro.auth.RegexpVerifier</classname></glossterm>

                      <glossdef>
                        <para>This one is well known from earlier versions and
                        is used to validate text input fields by regular
                        expressions.</para>
                      </glossdef>
                    </glossentry>

                    <glossentry>
                      <glossterm><classname>sda.jdbc.intro.auth.EqualValueVerifier</classname></glossterm>

                      <glossdef>
                        <para>This verifier class is responsible for checking
                        that our two password fields have identical
                        values.</para>
                      </glossdef>
                    </glossentry>
                  </glosslist>

                  <para>All these components get assembled in
                  <classname>sda.jdbc.intro.auth.InsertPerson</classname>. We
                  remark some important points:</para>

                  <programlisting>package sda.jdbc.intro.auth;
...
public class InsertPerson extends JFrame {
... // GUI attributes for user input
   final UserInputUnit name = <co linkends="listingInsertUserAuth-1"
                      xml:id="listingInsertUserAuth-1-co"/>
         new UserInputUnit(
               "Name",
               new RegexpVerifier(new JTextField(15), "^[^;'\"]+$", "No special characters allowed"));

   // We need a reference to the password field to avoid
   // casting from JTextField later.
   private final JPasswordField passwordField = new JPasswordField(10); <co
                      linkends="listingInsertUserAuth-2"
                      xml:id="listingInsertUserAuth-2-co"/>
   private final UserInputUnit password =
         new UserInputUnit(
               "Password",
               new RegexpVerifier(passwordField, "^.{6,20}$", "length from 6 to 20 characters"));
...
   private final UserInputUnit passwordRepeat =
         new UserInputUnit(
               "repeat pass.",
               new EqualValueVerifier <co linkends="listingInsertUserAuth-3"
                      xml:id="listingInsertUserAuth-3-co"/> (new JPasswordField(10), passwordField, "Passwords do not match"));
   
   private final UserInputUnit [] userInputUnits = <co
                      linkends="listingInsertUserAuth-4"
                      xml:id="listingInsertUserAuth-4-co"/>
      {name, email, login, password, passwordRepeat};
...   
   private void userLoginDialog() {...}
...
   public InsertPerson (){
...
      databaseFieldPanel.setLayout(new GridLayout(0, 3)); //Third column for validation label
      add(databaseFieldPanel);

      for (UserInputUnit unit: userInputUnits) { <co
                      linkends="listingInsertUserAuth-5"
                      xml:id="listingInsertUserAuth-5-co"/>
         databaseFieldPanel.add(unit.label);
         databaseFieldPanel.add(unit.verifier.field);
         databaseFieldPanel.add(unit.verifier.validationLabel);
      } 
      insertButton.addActionListener(new ActionListener() {
         @Override public void actionPerformed(ActionEvent e) {
            if (inputValuesAllValid()) {
               if (persistenceHandler.add( <co
                      linkends="listingInsertUserAuth-6"
                      xml:id="listingInsertUserAuth-6-co"/>
                     name.getText(),
                     email.getText(),
                     login.getText(),
                     passwordField.getPassword())) {
                  clearMask();
...}
   private void clearMask() {  <co linkends="listingInsertUserAuth-7"
                      xml:id="listingInsertUserAuth-7-co"/>
      for (UserInputUnit unit: userInputUnits) {
         unit.verifier.field.setText("");
         unit.verifier.clear();
      }
   }
   private boolean inputValuesAllValid() {<co
                      linkends="listingInsertUserAuth-8"
                      xml:id="listingInsertUserAuth-8-co"/>
      for (UserInputUnit unit: userInputUnits) {
         if (!unit.verifier.verify(unit.verifier.field)){
            return false;
         }
      }
      return true;
   }   
}</programlisting>

                  <calloutlist>
                    <callout arearefs="listingInsertUserAuth-1-co"
                             xml:id="listingInsertUserAuth-1">
                      <para>All GUI-related components for entering a user's
                      name.</para>
                    </callout>

                    <callout arearefs="listingInsertUserAuth-2-co"
                             xml:id="listingInsertUserAuth-2">
                      <para>Password fields need special treatment:
                      <code>getText()</code> is superseded by
                      <code>getPassword()</code>. In order to avoid casts from
                      <classname>javax.swing.JTextField</classname> to
                      <classname>javax.swing.JPasswordField</classname> we
                      simply keep an extra reference.</para>
                    </callout>

                    <callout arearefs="listingInsertUserAuth-3-co"
                             xml:id="listingInsertUserAuth-3">
                      <para>In order to check both password fields for
                      identical values we need a different validator
                      <classname>sda.jdbc.intro.auth.EqualValueVerifier</classname>
                      expecting both password fields in its
                      constructor.</para>
                    </callout>

                    <callout arearefs="listingInsertUserAuth-4-co"
                             xml:id="listingInsertUserAuth-4">
                      <para>All five user input elements are grouped in an
                      array. This allows for iterations like in <coref
                      linkend="listingInsertUserAuth-7-co"/> or <coref
                      linkend="listingInsertUserAuth-8-co"/>.</para>
                    </callout>

                    <callout arearefs="listingInsertUserAuth-5-co"
                             xml:id="listingInsertUserAuth-5">
                      <para>Adding all GUI elements to the base pane in a
                      loop.</para>
                    </callout>

                    <callout arearefs="listingInsertUserAuth-6-co"
                             xml:id="listingInsertUserAuth-6">
                      <para>Passing the user-entered values to the persistence
                      handler.</para>
                    </callout>

                    <callout arearefs="listingInsertUserAuth-7-co"
                             xml:id="listingInsertUserAuth-7">
                      <para>Whenever a dataset has been successfully sent to
                      the database we clear our GUI so that another record
                      can be entered.</para>
                    </callout>

                    <callout arearefs="listingInsertUserAuth-8-co"
                             xml:id="listingInsertUserAuth-8">
                      <para>Thanks to our grouping, aggregating the validation
                      states of the individual GUI input fields becomes
                      easy.</para>
                    </callout>
                  </calloutlist>
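
                  <para>The two regular expressions passed to
                  <classname>RegexpVerifier</classname> in the listing above
                  can be tried in isolation. The following sketch assumes the
                  verifier requires the complete input to match the pattern.
                  The class <classname>FieldPatternSketch</classname> and its
                  method <code>isValid()</code> are hypothetical names and not
                  part of the application.</para>

                  <programlisting language="java">
```java
import java.util.regex.Pattern;

final class FieldPatternSketch {

    // Patterns copied from the InsertPerson listing
    static final Pattern NAME = Pattern.compile("^[^;'\"]+$");
    static final Pattern PASSWORD = Pattern.compile("^.{6,20}$");

    /** Returns true if the complete input matches the given pattern. */
    static boolean isValid(final Pattern pattern, final String input) {
        return pattern.matcher(input).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValid(NAME, "Jim Smith"));    // no special characters
        System.out.println(isValid(NAME, "Jim'; --"));     // contains ' and ;
        System.out.println(isValid(PASSWORD, "12345"));    // too short
        System.out.println(isValid(PASSWORD, "12345678")); // length within 6 to 20
    }
}
```
</programlisting>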
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <qandaset role="exercise">
            <title>Architectural security considerations</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>In <xref linkend="exercise_GuiEnterPersonAuth"/> we
                  achieved end user credential protection. How about the
                  overall application security? Provide improvement proposals
                  if appropriate.</para>
                </question>

                <answer>
                  <para>Connecting the client to our database server solely
                  depends on credentials <coref
                  linkend="databaseUserHdmPassword"/> being stored in a
                  properties file
                  <filename>database.properties</filename>:</para>

                  <programlisting>PersistenceHandler.jdbcUrl=jdbc:mysql://localhost:3306/hdm
PersistenceHandler.username=hdmuser <co xml:id="databaseUserHdmUsername"/>
PersistenceHandler.password=<emphasis role="bold">XYZ</emphasis> <co
                      xml:id="databaseUserHdmPassword"/></programlisting>

                  <para>This properties file is user accessible and contains
                  the password in clear text. Any application connecting to
                  the database server using this account has all permissions
                  granted to <code>hdmuser</code> <coref
                  linkend="databaseUserHdmUsername"/>. In order for our
                  application to work correctly the set of granted permissions
                  must at least include inserting datasets. Thus new users,
                  e.g. <code>smith</code>, may be inserted together with their
                  credentials. Afterwards the original application can be
                  started by logging in as <code>smith</code>.</para>

                  <para>Conclusion: The current application architecture is
                  seriously flawed with respect to security.</para>

                  <para>Rather than using a common database account
                  <code>hdmuser</code> we may configure per-user accounts on
                  the database server having individual user credentials. This
                  way user credentials are no longer stored in our
                  <code>Person</code> table but are being managed by the
                  database server's user management and privilege facilities.
                  This completely avoids storing credentials on the client
                  side.</para>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>
      </section>
    </chapter>

    <chapter xml:id="chapUnitTesting">
      <title>Unit testing with <productname
      xlink:href="http://testng.org">TestNG</productname></title>

      <para>This chapter presents a very short introduction to the basic usage
      of unit testing. We start with a simple stack implementation:</para>

      <programlisting language="java">package sda.unittesting;

public class MyStack {
   int [] data = new int[5];  
   int numElements = 0;
   
   public void push(final int n) {
      data[numElements] = n;
      numElements++;
   }
   public int pop() {
      numElements--;
      return data[numElements];
   }
   public int top() {
      return data[numElements - 1];
   }
   public boolean empty() {
      return 0 == numElements;
   }
}</programlisting>

      <para>Readers familiar with stacks will immediately notice a
      deficiency in the above code: This stack is actually bounded. It only
      allows us to store a maximum of five integer values.</para>

      <para>The following implementation allows us to functionally test our
      <classname>sda.unittesting.MyStack</classname> implementation with
      respect to the usual stack behaviour:</para>

      <programlisting language="java" linenumbering="numbered">package sda.unittesting;

public class MyStackFuncTest {

   private static void assertTrue(boolean status) {
      if (!status) {
         throw new RuntimeException("Assert failed");
      }
   }
   public static void main(String[] args) {
      final MyStack stack = new MyStack();
      // Test 1: A new MyStack instance should not contain any elements.
      assertTrue(stack.empty());

      // Test 2: Adding and removal
      stack.push(4);
      assertTrue (!stack.empty());
      assertTrue (4 == stack.top());
      assertTrue (4 == stack.pop());
      assertTrue (stack.empty());

      // Test 3: Trying to add more than five values
      stack.push(1);stack.push(2);stack.push(3);stack.push(4);
      stack.push(5);
      stack.push(6);
      assertTrue(6 == stack.pop());
   }
}</programlisting>

      <para>Execution yields a runtime exception which is due to the attempted
      insert operation <code>stack.push(6)</code>:</para>

      <programlisting>Exception in thread "main" java.lang.ArrayIndexOutOfBoundsException: 5
 at sda.unittesting.MyStack.push(MyStack.java:8)
 at sda.unittesting.MyStackFuncTest.main(MyStackFuncTest.java:20)</programlisting>

      <para>The execution result is easy to understand since our
      <classname>sda.unittesting.MyStack</classname> implementation only
      allows storing five values.</para>

      <para>Our testing application is fine so far. It does however lack some
      features:</para>

      <itemizedlist>
        <listitem>
          <para>Automatic initialization before starting tests and
          finalization at the end.</para>
        </listitem>

        <listitem>
          <para>Our test is monolithic: We used comments to document different
          tests. This knowledge is implicit and thus invisible to testing
          frameworks. Test results (failure/success) cannot be assigned to
          test 1, test 2 for example.</para>
        </listitem>

        <listitem>
          <para>Aggregation and visualization of test results</para>
        </listitem>

        <listitem>
          <para>Dependencies between individual tests</para>
        </listitem>

        <listitem>
          <para>Ability to enable and disable tests according to a project's
          maturity level. In our example test 3 might be disabled till an
          unbounded implementation gets completed.</para>
        </listitem>
      </itemizedlist>

      <para>Testing frameworks like <productname
      xlink:href="http://junit.org">JUnit</productname> or <productname
      xlink:href="http://testng.org">TestNG</productname> provide means for
      efficient and flexible test organization. Using <productname
      xlink:href="http://testng.org">TestNG</productname> our current test
      application including only test 1 and test 2 reads:</para>

      <programlisting language="java">package sda.unittesting;

import org.testng.annotations.Test;

public class MyStackTestSimple {

   final MyStack stack = new MyStack();
   
  @Test
  public void empty() {
    assert(stack.empty());
  }
  @Test
  public void pushPopEmpty() {
    assert (stack.empty());
    stack.push(4);
    assert (!stack.empty());
    assert (4 == stack.top());
    assert (4 == stack.pop());
    assert (stack.empty());
  }
}</programlisting>

      <para>We notice the absence of a <function>main()</function> method. Our
      testing framework uses the above code for test definitions. In contrast
      to our homebrew solution the individual tests are now defined in a
      machine readable fashion. This allows for sophisticated statistics.
      Executing inside <productname
      xlink:href="http://testng.org">TestNG</productname> produces the
      following results:</para>

      <programlisting>PASSED: empty
PASSED: pushPopEmpty

===============================================
    Default test
    Tests run: 2, Failures: 0, Skips: 0
===============================================


===============================================
Default suite
Total tests run: 2, Failures: 0, Skips: 0
===============================================</programlisting>

      <para>Both tests run successfully. So why did we omit test 3 which is
      bound to fail? We now add it to the test suite:</para>

      <programlisting language="java">package sda.unittesting;
...
public class MyStackTestSimple1 {
...
  @Test
  public void empty() {
    assert(stack.empty());
...
  
  @Test
  public void push6() {
     stack.push(1);
     stack.push(2);
     stack.push(3);
     stack.push(4);
     stack.push(5);
     stack.push(6);
     assert (6 == stack.pop());
  } ...</programlisting>

      <para>As expected test 3 fails. But the result shows test 2 failing as
      well:</para>

      <programlisting>PASSED: empty
FAILED: push6
java.lang.ArrayIndexOutOfBoundsException: 5
	at sda.unittesting.MyStack.push(MyStack.java:8)
	at sda.unittesting.MyStackTestSimple1.push6(MyStackTestSimple1.java:30)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	...

FAILED: pushPopEmpty
java.lang.AssertionError
	at sda.unittesting.MyStackTestSimple1.pushPopEmpty(MyStackTestSimple1.java:15)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	...

===============================================
    Default test
    Tests run: 3, Failures: 2, Skips: 0
===============================================</programlisting>

      <para>This unexpected result is due to the execution order of the three
      individual tests. Within our class
      <classname>sda.unittesting.MyStackTestSimple1</classname> the three
      tests appear in the sequence test 1, test 2 and test 3. This however is
      just the order of source code. The testing framework will not infer any
      order and thus execute our three tests in <emphasis
      role="bold">arbitrary</emphasis> order. The execution log shows the
      actual order:</para>

      <orderedlist>
        <listitem>
          <para>Test <quote><code>empty</code></quote></para>
        </listitem>

        <listitem>
          <para>Test <quote><code>push6</code></quote></para>
        </listitem>

        <listitem>
          <para>Test <quote><code>pushPopEmpty</code></quote></para>
        </listitem>
      </orderedlist>

      <para>So the test executed second, <quote><code>push6</code></quote>,
      raises an exception and leaves the stack filled with the maximum
      possible five elements. Thus the stack is not empty and the
      <quote><code>pushPopEmpty</code></quote> test fails as well.</para>

      <para>If we want to avoid this type of error we may:</para>

      <itemizedlist>
        <listitem>
          <para>Declare tests within separate (test class) definitions</para>
        </listitem>

        <listitem>
          <para>Define dependencies like test X can only be executed after
          test Y.</para>
        </listitem>
      </itemizedlist>
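
      <para>The first option works because tests that do not share a stack
      instance cannot influence each other. The following framework-free
      sketch takes this idea one step further by letting each test create its
      own instance. The class <classname>FreshFixtureDemo</classname> and its
      helper methods <function>check()</function> and
      <function>run()</function> are hypothetical names chosen for
      illustration. The nested <classname>MyStack</classname> is a compact
      copy of the bounded stack shown at the beginning of this chapter:</para>

      <programlisting language="java">// Framework-free sketch; all names except MyStack are hypothetical.
class FreshFixtureDemo {

   // Compact copy of the bounded stack from the beginning of this chapter.
   static class MyStack {
      int[] data = new int[5];
      int numElements = 0;

      void push(final int n) { data[numElements] = n; numElements++; }
      int pop() { numElements--; return data[numElements]; }
      int top() { return data[numElements - 1]; }
      boolean empty() { return 0 == numElements; }
   }

   static void check(final boolean status) {
      if (!status) {
         throw new RuntimeException("Assert failed");
      }
   }

   // Each test creates its own stack, so no state leaks between tests.
   static void empty() {
      final MyStack stack = new MyStack();
      check(stack.empty());
   }

   static void pushPopEmpty() {
      final MyStack stack = new MyStack();
      stack.push(4);
      check(!stack.empty());
      check(4 == stack.top());
      check(4 == stack.pop());
      check(stack.empty());
   }

   static void push6() {
      final MyStack stack = new MyStack();
      stack.push(1); stack.push(2); stack.push(3);
      stack.push(4); stack.push(5);
      stack.push(6); // exceeds the capacity of five values
      check(6 == stack.pop());
   }

   // Runs a single test, reports its outcome and returns true on success.
   static boolean run(final String name, final Runnable test) {
      try {
         test.run();
         System.out.println("PASSED: " + name);
         return true;
      } catch (final RuntimeException e) {
         System.out.println("FAILED: " + name + " (" + e + ")");
         return false;
      }
   }

   public static void main(final String[] args) {
      // Deliberately the order that caused two failures before.
      run("empty", FreshFixtureDemo::empty);
      run("push6", FreshFixtureDemo::push6);
      run("pushPopEmpty", FreshFixtureDemo::pushPopEmpty);
   }
}</programlisting>

      <para>In this sketch <code>push6</code> still fails because of the
      bounded array, but <code>pushPopEmpty</code> passes regardless of the
      execution order since it no longer sees the stack filled by
      <code>push6</code>.</para>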

      <para>The <productname
      xlink:href="http://testng.org">TestNG</productname> framework offers a
      feature which allows the definition of test groups and dependencies
      between them. We use this feature to refine our test definition:</para>

      <programlisting language="java">package sda.unittesting;
...
public class MyStackTest {
  ...
  @Test (<emphasis role="bold">groups = "basic"</emphasis>)
  public void empty() {
    assert(stack.empty());
  }
  @Test (<emphasis role="bold">groups = "basic"</emphasis>)
  public void pushPopEmpty() {
    ...
  }
  
  @Test (<emphasis role="bold">dependsOnGroups = "basic"</emphasis>)
  public void push6() {
     ...  
  }</programlisting>

      <para>The first two tests will now belong to the same test group
      <quote>basic</quote>. The <emphasis role="bold"><code>dependsOnGroups =
      "basic"</code></emphasis> declaration will guarantee that our
      <code>push6</code> test will be launched as the last one. So we get the
      expected result:</para>

      <programlisting>PASSED: empty
PASSED: pushPopEmpty
FAILED: push6
java.lang.ArrayIndexOutOfBoundsException: 5
	at sda.unittesting.MyStack.push(MyStack.java:8)
	at sda.unittesting.MyStackTest.push6(MyStackTest.java:30)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
...


===============================================
    Default test
    Tests run: 3, Failures: 1, Skips: 0
===============================================</programlisting>

      <para>In fact the order between the first two tests might be critical as
      well. The <quote><code>pushPopEmpty</code></quote> test leaves our stack
      in an empty state. If this were not the case, executing
      <quote><code>pushPopEmpty</code></quote> before
      <quote><code>empty</code></quote> would cause an error as well.</para>

      <para>Programming <abbrev
      xlink:href="http://en.wikipedia.org/wiki/Integrated_development_environment">IDE</abbrev>s
      like eclipse provide elements for test result visualization. Our last
      test gets summarized as:</para>

      <screenshot>
        <info>
          <title><productname
          xlink:href="http://testng.org">TestNG</productname> result
          presentation in eclipse</title>
        </info>

        <mediaobject>
          <imageobject>
            <imagedata fileref="Ref/Screen/eclipseTestngResult.screen.png"
                       scale="75"/>
          </imageobject>
        </mediaobject>
      </screenshot>

      <para>We can drill down from a result of type failure to its occurrence
      within the corresponding code.</para>
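
      <para>The failing <code>push6</code> test can only pass once the stack
      grows on demand. A minimal sketch of such an unbounded variant might
      read as follows. The class name <classname>GrowableStack</classname>,
      the capacity doubling strategy and the exceptions thrown on an empty
      stack are assumptions made for illustration and not part of the original
      example:</para>

      <programlisting language="java">// Hypothetical unbounded variant of MyStack (name and growth strategy are assumptions).
class GrowableStack {
   private int[] data = new int[5];
   private int numElements = 0;

   public void push(final int n) {
      if (numElements == data.length) {
         // Array is full: allocate twice the capacity and copy the existing values.
         final int[] larger = new int[2 * data.length];
         System.arraycopy(data, 0, larger, 0, numElements);
         data = larger;
      }
      data[numElements] = n;
      numElements++;
   }
   public int pop() {
      if (empty()) {
         throw new IllegalStateException("pop() on empty stack");
      }
      numElements--;
      return data[numElements];
   }
   public int top() {
      if (empty()) {
         throw new IllegalStateException("top() on empty stack");
      }
      return data[numElements - 1];
   }
   public boolean empty() {
      return 0 == numElements;
   }
}</programlisting>

      <para>With this variant six subsequent <code>push()</code> calls succeed
      and the following <code>pop()</code> returns the value pushed
      last.</para>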
    </chapter>

    <chapter xml:id="fo">
      <title>Generating printed output</title>

      <titleabbrev>Print</titleabbrev>

      <section xml:id="foIntro">
        <title>Online and print versions</title>

        <titleabbrev>online / print</titleabbrev>

        <para>We already learned how to transform XML documents into HTML by
        means of an <abbrev
        xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> style sheet
        processor. In principle we may create printed output by using an HTML
        browser's print function. However the result will not meet reasonable
        typographical standards. A list of commonly required features for
        printed output includes:</para>

        <variablelist>
          <varlistentry>
            <term>Line breaks</term>

            <listitem>
              <para>Text paragraphs have to be divided into lines. To achieve
              best results the processor must implement the hyphenation rules
              of the language in question in order to automatically hyphenate
              long words. This is especially important for text columns of
              limited width as appearing in newspapers.</para>
            </listitem>
          </varlistentry>

          <varlistentry>
            <term>Page breaks</term>

            <listitem>
              <para>Since printed pages are limited in height the content has
              to be broken into pages. This may be difficult to
              achieve:</para>

              <itemizedlist>
                <listitem>
                  <para>Large images being indivisible may have to be deferred
                  to the following page leaving large amounts of empty
                  space.</para>
                </listitem>

                <listitem>
                  <para>Long tables may have to be subdivided into smaller
                  blocks. Thus it may be required to define sets of additional
                  footers like <quote>to be continued on the next page</quote>
                  and additional table headers containing column descriptions
                  on subsequent pages.</para>
                </listitem>
              </itemizedlist>
            </listitem>
          </varlistentry>

          <varlistentry>
            <term>Page references</term>

            <listitem>
              <para>Document internal references via <link
              xlink:href="http://www.w3.org/TR/xml#id">ID</link> / <link
              xlink:href="http://www.w3.org/TR/xml#idref">IDREF</link> pairs
              may be represented as page references like <quote>see page
              32</quote>.</para>
            </listitem>
          </varlistentry>

          <varlistentry>
            <term>Left and right pages</term>

            <listitem>
              <para>Books usually have a different layout for
              <quote>left</quote> and <quote>right</quote> pages. Page numbers
              usually appear on the left side of a <quote>left</quote> page
              and vice versa.</para>

              <para>Very often the head of each page contains additional
              information e.g. a chapter's name on each <quote>left</quote>
              page head and the actual section's name on each
              <quote>right</quote> page's head.</para>

              <para>In addition chapters usually start on a
              <quote>right</quote> page. Sometimes a chapter's starting page
              has special layout features e.g. a missing description in the
              page's head which will only be given on subsequent pages.</para>
            </listitem>
          </varlistentry>

          <varlistentry>
            <term>Footnotes</term>

            <listitem>
              <para>Footnotes have to be numbered on a per page basis and have
              to appear on the current page.</para>
            </listitem>
          </varlistentry>
        </variablelist>
      </section>

      <section xml:id="foStart">
        <title>A simple <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        document</title>

        <titleabbrev>Simple <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev></titleabbrev>

        <para>A renderer for printed output from XML content also needs
        instructions on how to format the different elements. A common way to
        define these formatting properties is by using the
        <emphasis>Formatting Objects</emphasis> (<abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>)
        standard. <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        documents may be compared to HTML. An HTML document has to be rendered
        by a piece of software called a browser in order to be viewed as an
        image. Likewise <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        documents have to be rendered by a piece of software called a
        formatting objects processor which typically yields PostScript or PDF
        output. As a starting point we take a simple example:</para>

        <figure xml:id="foHelloWorld">
          <title>The most simple <abbrev
          xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
          document</title>

          <programlisting>&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"&gt;

  &lt;fo:layout-master-set&gt;
    &lt;!-- Define a simple page layout --&gt;
    &lt;fo:simple-page-master master-name="simplePageLayout"
      page-width="60mm"   page-height="100mm"&gt;
      &lt;fo:region-body/&gt;
    &lt;/fo:simple-page-master&gt;
  &lt;/fo:layout-master-set&gt;
  &lt;!-- Print a set of pages using the previously defined layout --&gt;
  &lt;fo:page-sequence master-reference="simplePageLayout"&gt;
    &lt;fo:flow flow-name="xsl-region-body"&gt;
      <emphasis role="bold">&lt;fo:block&gt;Hello, World ...&lt;/fo:block&gt;</emphasis>
    &lt;/fo:flow&gt;
  &lt;/fo:page-sequence&gt;
&lt;/fo:root&gt;</programlisting>
        </figure>

        <para>PDF generation is initiated by executing a <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        processor. At the MI department the script <code>fo2pdf</code> invokes
        <orgname>RenderX</orgname>'s <productname
        xlink:href="http://www.renderx.com">xep</productname>
        processor:</para>

        <programlisting>fo2pdf -fo hello.fo -pdf hello.pdf</programlisting>

        <para>This creates a PDF file which may be printed or previewed by
        e.g. <productname
        xlink:href="http://www.adobe.com">Adobe</productname>'s acrobat reader
        or evince under Linux. For a list of command line options see
        <productname xlink:href="http://www.renderx.com/reference.html">xep's
        documentation</productname>.</para>
      </section>

      <section xml:id="layoutParam">
        <title>Page layout</title>

        <para>The result of our <quote>Hello, World ...</quote> code is
        not very impressive. In order to develop more elaborate examples we
        have to understand the underlying layout model being defined in a
        <link
        xlink:href="http://www.w3.org/TR/xsl/#fo_simple-page-master">fo:simple-page-master</link>
        element. First of all <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        allows us to subdivide a physical page into different regions:</para>

        <figure xml:id="foRegionList">
          <title>Regions being defined in a page.</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/regions.fig"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>The most important area in this model is denoted by <link
        xlink:href="http://www.w3.org/TR/xsl/#fo_region-body">fo:region-body</link>.
        Other regions like <link
        xlink:href="http://www.w3.org/TR/xsl/#fo_region-before">fo:region-before</link>
        are typically used as containers for meta information such as chapter
        headings and page numbering. We take a closer look at the <link
        xlink:href="http://www.w3.org/TR/xsl/#fo_region-body">fo:region-body</link>
        area and supply an example of parameterization:</para>

        <figure xml:id="foParamRegBody">
          <title>A complete <abbrev
          xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
          document parameterizing a physical page and its <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_region-body">fo:region-body</link>.</title>

          <programlisting>&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"
  font-size="6pt"&gt;

  &lt;fo:layout-master-set&gt; <co xml:id="programlisting_fobodyreg_masterset"/>
    &lt;fo:simple-page-master master-name="<emphasis role="bold">simplePageLayout</emphasis>" <co
              xml:id="programlisting_fobodyreg_simplepagelayout"/>
      page-width  = "50mm" page-height   = "80mm"
      margin-top  = "5mm"  margin-bottom = "20mm" 
      margin-left = "5mm"  margin-right  = "10mm"&gt;

      &lt;fo:region-body <co xml:id="programlisting_fobodyreg_regionbody"/>
        margin-top  = "10mm" margin-bottom = "5mm"
        margin-left = "10mm" margin-right  = "5mm"/&gt;
    &lt;/fo:simple-page-master&gt;
  &lt;/fo:layout-master-set&gt;

  &lt;fo:page-sequence master-reference="<emphasis role="bold">simplePageLayout</emphasis>"&gt; <co
              xml:id="programlisting_fobodyreg_pagesequence"/>
    &lt;fo:flow flow-name="xsl-region-body"&gt; <co
              xml:id="programlisting_fobodyreg_flow"/>
      &lt;fo:block space-after="2mm"&gt;Dumb text .. dumb text.&lt;/fo:block&gt; <co
              xml:id="programlisting_fobodyreg_block"/>
      &lt;fo:block space-after="2mm"&gt;Dumb text .. dumb text.&lt;/fo:block&gt; <coref
              linkend="programlisting_fobodyreg_block"/>
      &lt;fo:block space-after="2mm"&gt;Dumb text .. dumb text.&lt;/fo:block&gt; <coref
              linkend="programlisting_fobodyreg_block"/>
      &lt;fo:block space-after="2mm"&gt;Dumb text .. dumb text.&lt;/fo:block&gt; <coref
              linkend="programlisting_fobodyreg_block"/>
    &lt;/fo:flow&gt;
  &lt;/fo:page-sequence&gt;
&lt;/fo:root&gt;</programlisting>
        </figure>

        <calloutlist>
          <callout arearefs="programlisting_fobodyreg_masterset">
            <para>As the name suggests multiple layout definitions can appear
            here. In this example only one layout is defined.</para>
          </callout>

          <callout arearefs="programlisting_fobodyreg_simplepagelayout">
            <para>Each layout definition carries a key attribute
            <code>master-name</code> being unique with respect to all layouts
            defined in <emphasis>the</emphasis> <tag
            class="starttag">fo:layout-master-set</tag>. We may thus call it a
            <emphasis>primary key</emphasis> attribute. The current layout
            definition's key has the value <code>simplePageLayout</code>. The
            length specifications appearing here are visualized in <xref
            linkend="paramRegBodyVisul"/> and correspond to the white
            rectangle.</para>
          </callout>

          <callout arearefs="programlisting_fobodyreg_regionbody">
            <para>Each layout definition <emphasis>must</emphasis> have a
            region body being the region in which the document's main text
            flow will appear. A layout definition <emphasis>may</emphasis>
            also define top, bottom and side regions as we will see <link
            linkend="paramHeadFoot">later</link>. The body region is shown
            with pink background in <xref
            linkend="paramRegBodyVisul"/>.</para>
          </callout>

          <callout arearefs="programlisting_fobodyreg_pagesequence">
            <para>A <abbrev
            xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
            document may have multiple page sequences, for example one per
            chapter of a book. Each page sequence <emphasis>must</emphasis>
            reference an <emphasis>existing</emphasis> layout definition via
            its <code>master-reference</code> attribute. So we may regard this
            attribute as a foreign key targeting the set of all defined layout
            definitions.</para>
          </callout>

          <callout arearefs="programlisting_fobodyreg_flow">
            <para>A flow allows us to define in which region output shall
            appear. The current example contains only one layout which in
            turn defines only one region being able to receive text output,
            namely the body region.</para>
          </callout>

          <callout arearefs="programlisting_fobodyreg_block">
            <para>A <tag class="starttag">fo:block</tag> element may be
            compared to a paragraph element <tag class="starttag">p</tag> in
            HTML. The attribute <link
            xlink:href="http://www.w3.org/TR/xsl/#space-after">space-after</link>="2mm"
            adds a space of two mm after each <link
            xlink:href="http://www.w3.org/TR/xsl/#fo_block">fo:block</link>
            container.</para>
          </callout>
        </calloutlist>

        <para>The result looks like:</para>

        <figure xml:id="paramRegBodyVisul">
          <title>Parameterizing page- and region view port. All length
          dimensions are in mm.</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/overlay.fig"/>
            </imageobject>
          </mediaobject>
        </figure>
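
        <para>The lengths in <xref linkend="foParamRegBody"/> determine the
        size of the body region by plain subtraction. The following sketch
        works this out, assuming that borders and padding are absent and that
        the region margins are measured from the edges of the area left over
        by the page margins. The class <classname>RegionBodySize</classname>
        and its method <function>remaining()</function> are hypothetical names
        used for illustration only:</para>

        <programlisting language="java">// Sketch with hypothetical names; all lengths in mm, borders and padding ignored.
class RegionBodySize {

   // Length remaining after removing the margins at both ends.
   static int remaining(final int total, final int marginStart, final int marginEnd) {
      return total - marginStart - marginEnd;
   }

   public static void main(final String[] args) {
      // fo:simple-page-master: page size reduced by the page margins.
      final int pageAreaWidth = remaining(50, 5, 10);
      final int pageAreaHeight = remaining(80, 5, 20);

      // fo:region-body: page area reduced by the region margins.
      final int bodyWidth = remaining(pageAreaWidth, 10, 5);
      final int bodyHeight = remaining(pageAreaHeight, 10, 5);

      System.out.println("page area:   " + pageAreaWidth + " x " + pageAreaHeight);
      System.out.println("region body: " + bodyWidth + " x " + bodyHeight);
   }
}</programlisting>

        <para>Under these assumptions the page margins leave an area of 35 mm
        by 55 mm, of which the body region occupies 20 mm by 40 mm.</para>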
      </section>

      <section xml:id="headFoot">
        <title>Headers and footers</title>

        <titleabbrev>Header/footer</titleabbrev>

        <para>Referring to <xref linkend="foRegionList"/> we now want to add
        fixed headers and footers frequently being used for page numbers. In a
        textbook each page might have the actual chapter's name in its header.
        This name should not change as long as the text below <link
        xlink:href="http://www.w3.org/TR/xsl/#fo_region-body">fo:region-body</link>
        still belongs to the same chapter. In <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        this is achieved by:</para>

        <itemizedlist>
          <listitem>
            <para>Encapsulating each chapter's content in a <link
            xlink:href="http://www.w3.org/TR/xsl/#fo_page-sequence">fo:page-sequence</link>
            of its own.</para>
          </listitem>

          <listitem>
            <para>Defining the desired header text below <link
            xlink:href="http://www.w3.org/TR/xsl/#fo_static-content">fo:static-content</link>
            in the area defined by <link
            xlink:href="http://www.w3.org/TR/xsl/#fo_region-before">fo:region-before</link>.</para>
          </listitem>
        </itemizedlist>

        <para>The term <link
        xlink:href="http://www.w3.org/TR/xsl/#fo_static-content">fo:static-content</link>
        refers to the fact that the content is constant (static) within the
        given page sequence. The new version reads:</para>

        <figure xml:id="paramHeadFoot">
          <title>Parameterizing header and footer.</title>

          <programlisting>&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"
  font-size="6pt"&gt;
  
  &lt;fo:layout-master-set&gt;
    &lt;fo:simple-page-master master-name="simplePageLayout"
      page-width  = "50mm" page-height   = "80mm"
      margin-top  = "5mm"  margin-bottom = "20mm" 
      margin-left = "5mm"  margin-right  = "10mm"&gt;
      
      &lt;fo:region-body margin-top  = "10mm" margin-bottom = "5mm" <co
              xml:id="programlisting_head_foot_bodydef"/>
                      margin-left = "10mm" margin-right  = "5mm"/&gt;
      
      &lt;fo:region-before extent="5mm"/&gt; <co
              xml:id="programlisting_head_foot_beforedef"/>
      &lt;fo:region-after  extent="5mm"/&gt; <co
              xml:id="programlisting_head_foot_afterdef"/>
      
    &lt;/fo:simple-page-master&gt;
  &lt;/fo:layout-master-set&gt;
  
  &lt;fo:page-sequence master-reference="simplePageLayout"&gt;

    &lt;fo:static-content flow-name="xsl-region-before"&gt; <co
              xml:id="programlisting_head_foot_beforeflow"/>
      &lt;fo:block 
        font-weight="bold" 
        font-size="8pt"&gt;Headertext&lt;/fo:block&gt;
    &lt;/fo:static-content&gt;
    
    &lt;fo:static-content flow-name="xsl-region-after"&gt; <co
              xml:id="programlisting_head_foot_afterflow"/>
      &lt;fo:block&gt;
        &lt;fo:page-number/&gt;
      &lt;/fo:block&gt;
    &lt;/fo:static-content&gt;
    
    &lt;fo:flow flow-name="xsl-region-body"&gt;
      &lt;fo:block space-after="8mm"&gt;Dumb text .. dumb text.&lt;/fo:block&gt;
      &lt;fo:block space-after="8mm"&gt;Dumb text .. dumb text.&lt;/fo:block&gt;
      &lt;fo:block space-after="8mm"&gt;More text .. more text.&lt;/fo:block&gt;
      &lt;fo:block space-after="8mm"&gt;More text .. more text.&lt;/fo:block&gt;
      &lt;fo:block space-after="8mm"&gt;More text .. more text.&lt;/fo:block&gt;
    &lt;/fo:flow&gt;
  &lt;/fo:page-sequence&gt;
&lt;/fo:root&gt;</programlisting>
        </figure>

        <calloutlist>
          <callout arearefs="programlisting_head_foot_bodydef">
            <para>Defining the body region.</para>
          </callout>

          <callout arearefs="programlisting_head_foot_beforedef programlisting_head_foot_afterdef">
            <para>Defining two regions at the top and bottom of each page. The
            <code>extent</code> attribute denotes the height of these regions.
            <emphasis>Caveat</emphasis>: The attribute <code>extent</code>'s
            value gets subtracted from the <code>margin-top</code> or
            <code>margin-bottom</code> value being defined in the
            corresponding <tag class="starttag">fo:region-body</tag> element.
            So if we consider for example the <tag>fo:region-before</tag> we
            have to obey:</para>

            <para>extent &lt;= margin-top</para>

            <para>Otherwise we may not even see any output.</para>
          </callout>

          <callout arearefs="programlisting_head_foot_beforeflow">
            <para>An <code>fo:static-content</code> element denotes text
            portions which are decoupled from the <quote>usual</quote> text
            flow. For example, as a book's chapter advances over multiple
            pages we expect the chapter's constant title to appear on top of
            each page. In the current example the static string
            <code>Headertext</code> will appear at the top of each page for
            the whole <tag class="starttag">fo:page-sequence</tag> in which it
            is defined. Notice the <code>flow-name="xsl-region-before"</code>
            reference to the region being defined in <coref
            linkend="programlisting_head_foot_beforedef"/>.</para>
          </callout>

          <callout arearefs="programlisting_head_foot_afterflow">
            <para>We do the same here for the page's footer. Instead of static
            text we output <tag>fo:page-number</tag> yielding the current
            page's number.</para>

            <para>This time <code>flow-name="xsl-region-after"</code>
            references the region definition in <coref
            linkend="programlisting_head_foot_afterdef"/>. Actually the
            attribute <code>flow-name</code> is restricted to the following
            five values corresponding to all possible region definitions
            within a layout:</para>

            <informaltable>
              <?dbhtml table-width="50%" ?>

              <?dbfo table-width="50%" ?>

              <tgroup cols="2">
                <colspec align="left" colwidth="1*"/>

                <colspec align="left" colwidth="1*"/>

                <tbody>
                  <row>
                    <entry><tag class="starttag">fo:region-body</tag></entry>

                    <entry>xsl-region-body</entry>
                  </row>

                  <row>
                    <entry><tag
                    class="starttag">fo:region-before</tag></entry>

                    <entry>xsl-region-before</entry>
                  </row>

                  <row>
                    <entry><tag class="starttag">fo:region-after</tag></entry>

                    <entry>xsl-region-after</entry>
                  </row>

                  <row>
                    <entry><tag class="starttag">fo:region-start</tag></entry>

                    <entry>xsl-region-start</entry>
                  </row>

                  <row>
                    <entry><tag class="starttag">fo:region-end</tag></entry>

                    <entry>xsl-region-end</entry>
                  </row>
                </tbody>
              </tgroup>
            </informaltable>
          </callout>
        </calloutlist>

        <para>This results in two pages with page numbers 1 and 2:</para>

        <mediaobject>
          <imageobject>
            <imagedata fileref="Ref/Fig/headfoot.fig"/>
          </imageobject>
        </mediaobject>
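
        <para>The caveat regarding <code>extent</code> can be illustrated by a
        minimal layout sketch. The master name and all lengths below are
        illustrative assumptions rather than values taken from the example
        above. The point is merely that each <code>extent</code> does not
        exceed the corresponding margin of <tag
        class="starttag">fo:region-body</tag>:</para>

        <programlisting>&lt;fo:simple-page-master master-name="sketchLayout"
  page-width="60mm" page-height="100mm"&gt;
  &lt;!-- The body leaves 10mm at the top and 8mm at the bottom --&gt;
  &lt;fo:region-body margin-top="10mm" margin-bottom="8mm"/&gt;
  &lt;!-- extent (10mm) &lt;= margin-top (10mm): the header fits --&gt;
  &lt;fo:region-before extent="10mm"/&gt;
  &lt;!-- extent (6mm) &lt;= margin-bottom (8mm): the footer fits --&gt;
  &lt;fo:region-after extent="6mm"/&gt;
&lt;/fo:simple-page-master&gt;</programlisting>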

        <para>The freely available chapter of <xref linkend="bibHarold04"/>
        contains additional information on extended <link
        xlink:href="http://www.cafeconleche.org/books/bible2/chapters/ch18.html#d1e2250">layout
        definitions</link>. The <orgname
        xlink:href="http://w3.org">W3C</orgname> as the holder of the FO
        standard defines the elements <link
        xlink:href="http://www.w3.org/TR/xsl/#fo_layout-master-set">fo:layout-master-set</link>,
        <link
        xlink:href="http://www.w3.org/TR/xsl/#fo_simple-page-master">fo:simple-page-master</link>
        and <link
        xlink:href="http://www.w3.org/TR/xsl/#fo_page-sequence">fo:page-sequence</link>.</para>
      </section>

      <section xml:id="foContainer">
        <title>Important Objects</title>

        <section xml:id="fo_block">
          <title><code>fo:block</code></title>

          <para>The FO standard borrows a lot from the CSS standard. Most
          formatting objects may have <link
          xlink:href="http://www.w3.org/TR/xsl/#section-N19349-Description-of-Property-Groups">CSS-like
          properties</link> with similar semantics; some properties have
          been added. We take an <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_block">fo:block</link>
          container as an example:</para>

          <figure xml:id="blockInline">
            <title>A <link
            xlink:href="http://www.w3.org/TR/xsl/#fo_block">fo:block</link>
            with a <link
            xlink:href="http://www.w3.org/TR/xsl/#fo_inline">fo:inline</link>
            descendant.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/blockprop.fo.pdf"/>
              </imageobject>
            </mediaobject>

            <programlisting>...
&lt;fo:block font-weight='bold'
  border-bottom-style='dashed'
  border-style='solid'
  border='1mm'&gt;A lot of attributes and  &lt;fo:inline background-color='black'
    color='white'&gt;inverted&lt;/fo:inline&gt; text.&lt;/fo:block&gt; ...</programlisting>
          </figure>

          <para>The <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_inline">fo:inline</link>
          descendant serves as a means to change the <quote>current</quote>
          property set. In HTML/CSS this may be achieved by using the
          <code>SPAN</code> tag:</para>

          <programlisting>&lt;!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"&gt;
&lt;html&gt;
  &lt;head&gt;
    &lt;title&gt;Blocks/spans and CSS&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;h1&gt;Blocks/spans and CSS&lt;/h1&gt;
    &lt;p style="font-weight:  bold;   border: 1mm;
              border-style: solid;  border-bottom-style: dashed;"
     &gt;A lot of attributes and 
      &lt;span style="color: white;background-color: black;"
         &gt;inverted&lt;/span&gt; text.&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;</programlisting>

          <para>Though the CSS properties are encapsulated in a
          <code>style</code> attribute, we find a one-to-one correspondence
          between FO and CSS in this case. The HTML rendering works as
          expected:<mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Screen/mozparaspancss.screen.png"/>
              </imageobject>
            </mediaobject></para>
        </section>

        <section xml:id="fo_list">
          <title>Lists</title>

          <para>The simplest type of list is the unlabeled (itemized) list as
          expressed by the <code>UL</code>/<code>LI</code> tags in HTML.
          FO allows a much more detailed parametrization regarding indents and
          distances between labels and item content. Relevant elements are
          <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_list-block">fo:list-block</link>,
          <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_list-item">fo:list-item</link>,
          <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_list-item-label">fo:list-item-label</link>
          and <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_list-item-body">fo:list-item-body</link>.
          The drawback is a more complex setup for <quote>default</quote>
          lists:</para>

          <figure xml:id="listItemize">
            <title>An itemized list and result.</title>

            <programlisting>...
&lt;fo:list-block
  provisional-distance-between-starts="2mm"&gt;
  &lt;fo:list-item&gt;
    &lt;fo:list-item-label end-indent="label-end()"&gt;
      &lt;fo:block&gt;&amp;#8226;&lt;/fo:block&gt;
    &lt;/fo:list-item-label&gt;
    &lt;fo:list-item-body start-indent="body-start()"&gt;
      &lt;fo:block&gt;Flowers&lt;/fo:block&gt;
    &lt;/fo:list-item-body&gt;
  &lt;/fo:list-item&gt;
  
  &lt;fo:list-item&gt;
    &lt;fo:list-item-label end-indent="label-end()"&gt;
      &lt;fo:block&gt;&amp;#8226;&lt;/fo:block&gt;
    &lt;/fo:list-item-label&gt;
    &lt;fo:list-item-body start-indent="body-start()"&gt;
      &lt;fo:block&gt;Animals&lt;/fo:block&gt;
    &lt;/fo:list-item-body&gt;
  &lt;/fo:list-item&gt;
&lt;/fo:list-block&gt; ...</programlisting>

            <mediaobject>
              <imageobject>
                <imagedata align="left" fileref="Ref/Fig/itemize.fo.pdf"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>The result looks somewhat primitive in relation to the amount
          of source code it requires. The power of these constructs shows
          when formatting nested lists of possibly different types, such as
          enumerations or definition lists, under the requirement of
          typographical excellence. More complex examples are presented in the
          <link
          xlink:href="http://www.cafeconleche.org/books/bible2/chapters/ch18.html#d1e4979">freely
          available chapter</link> of <xref linkend="bibHarold04"/>.</para>
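
          <para>As a hedged illustration of a different list type, an
          enumeration may be sketched by replacing the bullet character with
          explicit numbers. The labels, the item texts and the value of
          <code>provisional-distance-between-starts</code> are assumptions
          chosen for this sketch:</para>

          <programlisting>...
&lt;fo:list-block
  provisional-distance-between-starts="6mm"&gt;
  &lt;fo:list-item&gt;
    &lt;!-- The label holds the item number instead of a bullet --&gt;
    &lt;fo:list-item-label end-indent="label-end()"&gt;
      &lt;fo:block&gt;1.&lt;/fo:block&gt;
    &lt;/fo:list-item-label&gt;
    &lt;fo:list-item-body start-indent="body-start()"&gt;
      &lt;fo:block&gt;Flowers&lt;/fo:block&gt;
    &lt;/fo:list-item-body&gt;
  &lt;/fo:list-item&gt;

  &lt;fo:list-item&gt;
    &lt;fo:list-item-label end-indent="label-end()"&gt;
      &lt;fo:block&gt;2.&lt;/fo:block&gt;
    &lt;/fo:list-item-label&gt;
    &lt;fo:list-item-body start-indent="body-start()"&gt;
      &lt;fo:block&gt;Animals&lt;/fo:block&gt;
    &lt;/fo:list-item-body&gt;
  &lt;/fo:list-item&gt;
&lt;/fo:list-block&gt; ...</programlisting>

          <para>The larger distance between the starts of label and body
          leaves room for the wider numeric labels.</para>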
        </section>

        <section xml:id="leaderRule">
          <title>Leaders and rules</title>

          <titleabbrev>Leaders/rules</titleabbrev>

          <para>Sometimes adjustable horizontal space between two neighbouring
          objects has to be filled, e.g. in a book's table of contents. The
          <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_leader">fo:leader</link>
          serves this purpose:</para>

          <figure xml:id="leaderToc">
            <title>Two simulated entries in a table of contents.</title>

            <programlisting>...
&lt;fo:block text-align-last='justify'&gt;Valid
  XML&lt;fo:leader leader-pattern="dots"/&gt;
page 7&lt;/fo:block&gt;

&lt;fo:block text-align-last='justify'&gt;XSL
&lt;fo:leader leader-pattern='dots'/&gt;
page 42&lt;/fo:block&gt; ...</programlisting>

            <mediaobject>
              <imageobject>
                <imagedata align="left" fileref="Ref/Fig/leader.fo.pdf"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>The attribute value <link
          xlink:href="http://www.w3.org/TR/xsl/#text-align-last">text-align-last</link>
          = <code>'justify'</code> forces the <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_block">fo:block</link> to
          extend to the available width of the current <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_region-body">fo:region-body</link>
          area. The <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_leader">fo:leader</link>
          inserts the necessary amount of content of the type specified by
          <link
          xlink:href="http://www.w3.org/TR/xsl/#leader-pattern">leader-pattern</link>
          to fill the gap between its neighbouring components. This
          principle can be extended to multiple objects:</para>

          <figure xml:id="leaderMulti">
            <title>Four entries separated by equal amounts of dotted
            space.</title>

            <programlisting>&lt;fo:block text-align-last='justify'&gt;A&lt;fo:leader
leader-pattern="dots"/&gt;B&lt;fo:leader
leader-pattern="dots"/&gt;C&lt;fo:leader leader-pattern="dots"/&gt;D&lt;/fo:block&gt;</programlisting>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/leadermulti.fo.pdf"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>A <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_leader">fo:leader</link>
          may also be used to draw horizontal lines to separate objects. In
          this case there are no neighbouring components within the
          <quote>current</quote> line in which the <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_leader">fo:leader</link>
          appears. This is frequently used to draw a border between
          <code>xsl-region-body</code> and <code>xsl-region-before</code>
          and/or <code>xsl-region-after</code>:</para>

          <figure xml:id="leaderSeparate">
            <title>A horizontal line separator between header and body of a
            page.</title>

            <programlisting>...
&lt;fo:page-sequence master-reference="simplePageLayout"&gt;
  &lt;fo:static-content flow-name="xsl-region-before"&gt;
    &lt;fo:block text-align-last='justify'&gt;FO&lt;fo:leader/&gt;page 5&lt;/fo:block&gt;
    &lt;fo:block text-align-last='justify'&gt;
      &lt;fo:leader leader-pattern="rule" leader-length="100%"/&gt;
    &lt;/fo:block&gt;
  &lt;/fo:static-content&gt;
  &lt;fo:flow flow-name="xsl-region-body"&gt;
    &lt;fo:block&gt;Some body text ...&lt;/fo:block&gt;
  &lt;/fo:flow&gt;
&lt;/fo:page-sequence&gt;...</programlisting>

            <mediaobject>
              <imageobject>
                <imagedata align="left" fileref="Ref/Fig/separate.fo.pdf"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>Note the empty leader <code>&lt;</code> <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_leader">fo:leader</link>
          <code>/&gt;</code> between the <quote><code>FO</code></quote> and
          the <quote>page 5</quote> text nodes. It inserts horizontal
          whitespace which pushes the page number to the header's right edge.
          This is in accordance with the <link
          xlink:href="http://www.w3.org/TR/xsl/#leader-pattern">leader-pattern</link>
          attribute's default value <code>space</code>.</para>
        </section>

        <section xml:id="pageNumbering">
          <title>Page numbers</title>

          <para>We already saw an example of page numbering via <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_page-number">fo:page-number</link>
          in <xref linkend="paramHeadFoot"/>. Sometimes a different style for
          page numbering is desired. The default page numbering style may be
          changed by means of the <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_page-sequence">fo:page-sequence</link>
          element's attribute <link
          xlink:href="http://www.w3.org/TR/xsl/#format">format</link>. For a
          closer explanation the <link
          xlink:href="http://www.w3.org/TR/2007/REC-xslt20-20070123/#convert">W3C
          XSLT standards documentation</link> may be consulted:</para>

          <figure xml:id="pageNumberingRoman">
            <title>Roman style page numbers.</title>

            <programlisting>...
&lt;fo:page-sequence format="i"
  master-reference="simplePageLayout"&gt;
  &lt;fo:static-content
    flow-name="xsl-region-after"&gt;
    &lt;fo:block text-align-last='justify'&gt;
      &lt;fo:leader leader-pattern="rule"
        leader-length="100%"/&gt;
    &lt;/fo:block&gt;
    &lt;fo:block font-weight="bold"&gt;
      &lt;fo:page-number/&gt;
    &lt;/fo:block&gt;
  &lt;/fo:static-content&gt;

  &lt;fo:flow flow-name="xsl-region-body"&gt;
    &lt;fo:block&gt;Some text...&lt;/fo:block&gt;
    &lt;fo:block&gt;More text, more text, 
      more text.&lt;/fo:block&gt;
    &lt;fo:block&gt;More text, more text,
       more text.&lt;/fo:block&gt;
    &lt;fo:block&gt;Enough text.&lt;/fo:block&gt;
  &lt;/fo:flow&gt;
&lt;/fo:page-sequence&gt; ...</programlisting>

            <mediaobject>
              <imageobject>
                <imagedata align="left" fileref="Ref/Fig/pageStack.fig"/>
              </imageobject>
            </mediaobject>
          </figure>
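
          <para>The following sketch combines two page sequences with
          different numbering styles, as is common for a preface followed by
          the main text. It assumes the <code>simplePageLayout</code> master
          of the previous examples; the block texts are placeholders. The
          attribute <code>initial-page-number</code> restarts the count for
          the second sequence:</para>

          <programlisting>...
&lt;!-- Preface: lower case roman numerals i, ii, iii, ... --&gt;
&lt;fo:page-sequence format="i"
  master-reference="simplePageLayout"&gt;
  &lt;fo:static-content flow-name="xsl-region-after"&gt;
    &lt;fo:block&gt;&lt;fo:page-number/&gt;&lt;/fo:block&gt;
  &lt;/fo:static-content&gt;
  &lt;fo:flow flow-name="xsl-region-body"&gt;
    &lt;fo:block&gt;Preface text ...&lt;/fo:block&gt;
  &lt;/fo:flow&gt;
&lt;/fo:page-sequence&gt;

&lt;!-- Main text: arabic numerals, restarting at 1 --&gt;
&lt;fo:page-sequence format="1" initial-page-number="1"
  master-reference="simplePageLayout"&gt;
  &lt;fo:static-content flow-name="xsl-region-after"&gt;
    &lt;fo:block&gt;&lt;fo:page-number/&gt;&lt;/fo:block&gt;
  &lt;/fo:static-content&gt;
  &lt;fo:flow flow-name="xsl-region-body"&gt;
    &lt;fo:block&gt;Main text ...&lt;/fo:block&gt;
  &lt;/fo:flow&gt;
&lt;/fo:page-sequence&gt; ...</programlisting>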
        </section>

        <section xml:id="foMarker">
          <title>Marker</title>

          <para>Running headers as found in dictionaries show the first and
          last entry of the current page. The <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_marker">fo:marker</link>
          element attaches content to a position in the flow under a class
          name given by <code>marker-class-name</code>. An <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_retrieve-marker">fo:retrieve-marker</link>
          inside <code>fo:static-content</code> fetches the marker content
          having a matching <code>retrieve-class-name</code>. The attribute
          <code>retrieve-position</code> selects which marker of the current
          page is used:</para>

          <figure xml:id="dictionary">
            <title>A dictionary with running page headers.</title>

            <programlisting>...
&lt;fo:page-sequence
  master-reference="simplePageLayout"&gt;
  &lt;fo:static-content flow-name="xsl-region-before"&gt;
    &lt;fo:block font-weight="bold"&gt;
      &lt;fo:retrieve-marker retrieve-class-name="alpha" 
       retrieve-position="first-starting-within-page"
       /&gt;-&lt;fo:retrieve-marker
        retrieve-position="last-starting-within-page"
        retrieve-class-name="alpha"/&gt;
    &lt;/fo:block&gt;
    &lt;fo:block text-align-last='justify'&gt;
      &lt;fo:leader leader-pattern="rule" leader-length="100%"/&gt;&lt;/fo:block&gt;
  &lt;/fo:static-content&gt;

  &lt;fo:flow flow-name="xsl-region-body"&gt;
    &lt;fo:block&gt;
      &lt;fo:marker marker-class-name="alpha"&gt;A
    &lt;/fo:marker&gt;Ant&lt;/fo:block&gt;
    &lt;fo:block&gt;
      &lt;fo:marker marker-class-name="alpha"&gt;B
    &lt;/fo:marker&gt;Bug&lt;/fo:block&gt;
    &lt;fo:block&gt;
      &lt;fo:marker marker-class-name="alpha"&gt;L
    &lt;/fo:marker&gt;Lion&lt;/fo:block&gt;
    &lt;fo:block&gt;
      &lt;fo:marker marker-class-name="alpha"&gt;N
    &lt;/fo:marker&gt;Nose&lt;/fo:block&gt;
    &lt;fo:block&gt;
      &lt;fo:marker marker-class-name="alpha"&gt;P
    &lt;/fo:marker&gt;Peg&lt;/fo:block&gt;
  &lt;/fo:flow&gt;
&lt;/fo:page-sequence&gt; ...</programlisting>

            <mediaobject>
              <imageobject>
                <imagedata align="left" fileref="Ref/Fig/dictionaryStack.fig"/>
              </imageobject>
            </mediaobject>
          </figure>
        </section>

        <section xml:id="foIntRef">
          <title>Internal references</title>

          <titleabbrev>References</titleabbrev>

          <para>Regarding printed documents we may define two categories of
          document internal references:</para>

          <variablelist>
            <varlistentry>
              <term><emphasis>Page number references</emphasis></term>

              <listitem>
                <para>This is the <quote>classical</quote> type of a reference
                e.g. in books. An author refers the reader to a distant
                location by writing <quote>... see further explanation in
                section 4.5 on page 234</quote>. A book's table of contents
                assigning page numbers to topics is another example. This way
                the implementation of a reference relies solely on the
                features a printed document offers.</para>
              </listitem>
            </varlistentry>

            <varlistentry>
              <term><emphasis>Hypertext references</emphasis></term>

              <listitem>
                <para>This way of implementing references utilizes features of
                (online) viewers for printable documents. For example PDF
                viewers like <productname
                xlink:href="http://www.adobe.com">Adobe's Acrobat
                Reader</productname> or the evince application are able to
                follow hypertext links in a fashion known from HTML browsers.
                This viewer feature is based on hypertext capabilities
                defined in Adobe's PDF de facto standard.</para>
              </listitem>
            </varlistentry>
          </variablelist>

          <para>Of course the second type of references is limited to people
          who use an online viewer application instead of reading a document
          from physical paper.</para>

          <para>We now show the implementation of <abbrev
          xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
          based page references. As already discussed for <link
          xlink:href="http://www.w3.org/TR/xml#id">ID</link> / <link
          xlink:href="http://www.w3.org/TR/xml#idref">IDREF</link> pairs we
          need a link destination (anchor) and a link source. The <abbrev
          xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
          standard uses the same anchor implementation as in XML for <link
          xlink:href="http://www.w3.org/TR/xml#id">ID</link> typed attributes:
          <abbrev
          xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
          objects <emphasis>may</emphasis> have an attribute <link
          xlink:href="http://www.w3.org/TR/xsl/#id">id</link> with a document
          wide unique value. The <abbrev
          xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
          element <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_page-number-citation">fo:page-number-citation</link>
          is used to actually create a page reference via its attribute <link
          xlink:href="http://www.w3.org/TR/xsl/#ref-id">ref-id</link>:</para>

          <figure xml:id="refJavaXml">
            <title>Two blocks with mutual page references.</title>

            <programlisting>...
  &lt;fo:flow flow-name='xsl-region-body'&gt;
    &lt;fo:block id='xml'&gt;Java section see page
      &lt;fo:page-number-citation ref-id='java'/&gt;.
    &lt;/fo:block&gt;

    &lt;fo:block id='java'&gt;XML section see page
      &lt;fo:page-number-citation ref-id='xml'/&gt;.
    &lt;/fo:block&gt;
  &lt;/fo:flow&gt; ...</programlisting>

            <mediaobject>
              <imageobject>
                <imagedata align="left" fileref="Ref/Fig/pagerefStack.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>NB: Be careful defining <link
          xlink:href="http://www.w3.org/TR/xsl/#id">id</link> attributes for
          objects being descendants of <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_static-content">fo:static-content</link>
          nodes. Such objects typically appear on multiple pages and are
          therefore not unique anchors. A reference carrying such an id value
          thus actually refers to n &gt;= 1 objects on n different pages.
          Typically a user agent will choose the first object of this set when
          following the link. So in effect the parent <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_page-sequence">fo:page-sequence</link>
          is chosen as the effective link target.</para>

          <para>The element <link
          xlink:href="http://www.w3.org/TR/xsl/#fo_basic-link">fo:basic-link</link>
          creates PDF hypertext links. We extend the previous example:</para>

          <figure xml:id="refJavaXmlHyper">
            <title>Two blocks with mutual page- and hypertext
            references.</title>

            <programlisting>&lt;fo:flow flow-name='xsl-region-body'&gt;
  &lt;fo:block id='xml'&gt;Java section see
    &lt;fo:basic-link color="blue"
      internal-destination="java"&gt;page &lt;fo:page-number-citation
      ref-id='java'/&gt;.&lt;/fo:basic-link&gt;&lt;/fo:block&gt;

  &lt;fo:block id='java'&gt;XML section see
    &lt;fo:basic-link color="blue"
      internal-destination="xml"&gt;page &lt;fo:page-number-citation
      ref-id='xml'/&gt;.&lt;/fo:basic-link&gt;&lt;/fo:block&gt;
&lt;/fo:flow&gt;</programlisting>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/pagerefhyperStack.fig"/>
              </imageobject>
            </mediaobject>
          </figure>
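
          <para>Page references, hypertext links and leaders may be combined.
          As a sketch, a table of contents entry for the block carrying
          <code>id='java'</code> could look as follows. The entry text
          <quote>Java section</quote> is an assumption made for this
          illustration:</para>

          <programlisting>...
&lt;fo:block text-align-last='justify'&gt;
  &lt;!-- The whole entry is a clickable link to the block with id='java' --&gt;
  &lt;fo:basic-link internal-destination="java"&gt;Java section&lt;fo:leader
    leader-pattern="dots"/&gt;&lt;fo:page-number-citation
    ref-id='java'/&gt;&lt;/fo:basic-link&gt;
&lt;/fo:block&gt; ...</programlisting>

          <para>Readers of a printed copy still get the page number, whereas
          users of a PDF viewer may follow the link directly.</para>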
        </section>

        <section xml:id="pdfBookmarks">
          <title>PDF bookmarks</title>

          <titleabbrev>Bookmarks</titleabbrev>

          <para>The PDF specification allows defining so-called bookmarks
          offering explorer-like navigation:</para>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Screen/pdfbookmarks.screen.png"/>
            </imageobject>
          </mediaobject>

          <para>PDF bookmarks are <link
          xlink:href="http://www.w3.org/TR/2006/REC-xsl11-20061205/#d0e14206">part
          of the XSL-FO 1.1</link> standard. Some <abbrev
          xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
          processors still use proprietary extensions for bookmark creation
          dating back to the older <abbrev
          xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
          1.0 standard. For details of bookmark extensions by
          <orgname>RenderX</orgname>'s processor see <link
          xlink:href="http://www.renderx.com/tutorial.html#PDF_Bookmarks">XEP's
          documentation</link>.</para>
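
          <para>A minimal sketch of the standard XSL-FO 1.1 bookmark elements
          is given below. It assumes two blocks carrying the values
          <code>xml</code> and <code>java</code> in their <code>id</code>
          attributes, as in the page reference examples; the bookmark titles
          are made up for illustration. Whether bookmarks actually show up
          depends on the FO processor's support for version 1.1:</para>

          <programlisting>&lt;fo:root xmlns:fo="http://www.w3.org/1999/XSL/Format"&gt;
  &lt;fo:layout-master-set&gt; ... &lt;/fo:layout-master-set&gt;

  &lt;!-- The bookmark tree precedes the first fo:page-sequence --&gt;
  &lt;fo:bookmark-tree&gt;
    &lt;fo:bookmark internal-destination="xml"&gt;
      &lt;fo:bookmark-title&gt;XML section&lt;/fo:bookmark-title&gt;
    &lt;/fo:bookmark&gt;
    &lt;fo:bookmark internal-destination="java"&gt;
      &lt;fo:bookmark-title&gt;Java section&lt;/fo:bookmark-title&gt;
    &lt;/fo:bookmark&gt;
  &lt;/fo:bookmark-tree&gt;

  &lt;fo:page-sequence master-reference="simplePageLayout"&gt;
    ...
  &lt;/fo:page-sequence&gt;
&lt;/fo:root&gt;</programlisting>

          <para>Nesting <code>fo:bookmark</code> elements inside each other
          yields a hierarchical bookmark outline.</para>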
        </section>
      </section>

      <section xml:id="xml2fo">
        <title>Constructing <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        from XML documents</title>

        <titleabbrev><abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        from XML</titleabbrev>

        <para>So far we have learnt some basic <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        elements. As with HTML we typically generate FO code from other
        sources rather than crafting it by hand. The general picture
        is:</para>

        <figure xml:id="htmlFoProduction">
          <title>Different target formats from common source.</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/crossmedia.fig" scale="65"/>
            </imageobject>

            <caption>
              <para>We may generate both online and printed documentation from
              a common source. This requires style sheets for the desired
              destination formats in question.</para>
            </caption>
          </mediaobject>
        </figure>

        <para>We discussed the <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        standard as an input format for printable output production by a
        renderer. In this way an <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        document is similar to HTML being a format to be rendered by a web
        browser for visual (screen oriented) output production. The
        transformation from an XML source (e.g. a memo document) to <abbrev
        xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
        is still missing. As for HTML we may use <abbrev
        xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> as a
        transformation means. The following stylesheet extracts the sender's
        surname from a memo document instance:</para>

        <figure xml:id="memo2fosurname">
          <title>Generating a sender's surname for printing.</title>

          <programlisting>&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;xsl:stylesheet version="1.0"
  xmlns:fo="http://www.w3.org/1999/XSL/Format" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;

  &lt;xsl:output method="xml" indent="yes"/&gt;

  &lt;xsl:template match="/"&gt;
    &lt;fo:root&gt;
      &lt;fo:layout-master-set&gt;
        &lt;fo:simple-page-master master-name="simplePageLayout"
          page-width="294mm" page-height="210mm" margin="5mm"&gt;
          &lt;fo:region-body margin="15mm"/&gt;
        &lt;/fo:simple-page-master&gt;
      &lt;/fo:layout-master-set&gt;
      &lt;fo:page-sequence master-reference="simplePageLayout"&gt;
        &lt;fo:flow flow-name="xsl-region-body"&gt;
          &lt;fo:block font-size="20pt"&gt;
            &lt;xsl:text&gt;Sender:&lt;/xsl:text&gt;
            &lt;fo:inline font-weight='bold'&gt;
              &lt;xsl:value-of select="memo/from/surname"/&gt;
            &lt;/fo:inline&gt;
          &lt;/fo:block&gt;
        &lt;/fo:flow&gt;
      &lt;/fo:page-sequence&gt;
    &lt;/fo:root&gt;
  &lt;/xsl:template&gt;
&lt;/xsl:stylesheet&gt;</programlisting>
        </figure>

        <para>A suitable XML document instance reads:</para>

        <figure xml:id="memoMessage">
          <title>A <code>memo</code> document instance.</title>

          <programlisting>&lt;?xml version="1.0" ?&gt;
&lt;!DOCTYPE memo SYSTEM "memo.dtd"&gt;
&lt;memo&gt;
  &lt;from&gt;
    &lt;name&gt;Martin&lt;/name&gt;
    &lt;surname&gt;Goik&lt;/surname&gt;
  &lt;/from&gt;
  &lt;to&gt;
    &lt;name&gt;Adam&lt;/name&gt;
    &lt;surname&gt;Hacker&lt;/surname&gt;
  &lt;/to&gt;
  &lt;to&gt;
    &lt;name&gt;Eve&lt;/name&gt;
    &lt;surname&gt;Intruder&lt;/surname&gt;
  &lt;/to&gt;
  &lt;date year="2005" month="1" day="6"/&gt;
  &lt;subject&gt;Firewall problems&lt;/subject&gt;
  &lt;content&gt;
    &lt;para&gt;Thanks for your excellent work.&lt;/para&gt;
    &lt;para&gt;Our firewall is definitely broken!&lt;/para&gt;
  &lt;/content&gt;
&lt;/memo&gt;</programlisting>
        </figure>

        <para>Some remarks:</para>

        <orderedlist>
          <listitem>
            <para>The <link
            xlink:href="http://www.w3.org/TR/2007/REC-xslt20-20070123/#element-stylesheet">xsl:stylesheet</link>
            element contains a namespace definition for the target FO
            document's namespace, namely:</para>

            <programlisting>xmlns:fo="http://www.w3.org/1999/XSL/Format"</programlisting>

            <para>This is required to use elements like <link
            xlink:href="http://www.w3.org/TR/xsl/#fo_block">fo:block</link>
            belonging to the FO namespace.</para>
          </listitem>

          <listitem>
            <para>The option value <code>indent="yes"</code> in <link
            xlink:href="http://www.w3.org/TR/2007/REC-xslt20-20070123/#element-output">xsl:output</link>
            is usually set to <code>"no"</code> in a production environment
            to avoid whitespace related problems.</para>
          </listitem>

          <listitem>
            <para>The generation of a print format like PDF is actually a
            two-step process. To generate <filename>message.pdf</filename>
            from <filename>message.xml</filename> by a stylesheet
            <filename>memo2fo.xsl</filename> we need the following
            calls:</para>

            <variablelist>
              <varlistentry>
                <term><emphasis>XML document instance to FO</emphasis></term>

                <listitem>
                  <programlisting>xml2xml message.xml memo2fo.xsl -o message.fo</programlisting>
                </listitem>
              </varlistentry>

              <varlistentry>
                <term><emphasis>FO to PDF</emphasis></term>

                <listitem>
                  <programlisting>fo2pdf -fo message.fo -pdf message.pdf</programlisting>
                </listitem>
              </varlistentry>
            </variablelist>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/xml2fo2pdf.fig"/>
              </imageobject>
            </mediaobject>

            <para>If debugging the intermediate <abbrev
            xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
            file is not required, both steps may be combined into a single
            call:</para>

            <programlisting>fo2pdf -xml message.xml -xsl memo2fo.xsl -pdf message.pdf</programlisting>
          </listitem>
        </orderedlist>
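
        <para>The same approach extends to repeated elements. As a sketch,
        the <code>fo:flow</code> part of the stylesheet above might be
        replaced by the following fragment in order to list all recipients
        of a memo. The label text and the comma separator are assumptions
        made for illustration:</para>

        <programlisting>&lt;fo:flow flow-name="xsl-region-body"&gt;
  &lt;fo:block font-size="20pt"&gt;
    &lt;xsl:text&gt;Recipients: &lt;/xsl:text&gt;
    &lt;!-- Visit each &lt;to&gt; child of the &lt;memo&gt; root element --&gt;
    &lt;xsl:for-each select="memo/to"&gt;
      &lt;fo:inline font-weight='bold'&gt;
        &lt;xsl:value-of select="surname"/&gt;
      &lt;/fo:inline&gt;
      &lt;!-- Separate entries by a comma except after the last one --&gt;
      &lt;xsl:if test="position() != last()"&gt;
        &lt;xsl:text&gt;, &lt;/xsl:text&gt;
      &lt;/xsl:if&gt;
    &lt;/xsl:for-each&gt;
  &lt;/fo:block&gt;
&lt;/fo:flow&gt;</programlisting>

        <para>For the memo instance shown above this should list the surnames
        of both recipients.</para>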
      </section>

      <section xml:id="foCatalog">
        <title>Formatting a catalog</title>

        <titleabbrev>A catalog</titleabbrev>

        <para>We now take the <link linkend="climbingCatalog">climbing catalog
        example</link> with prices being added and incrementally create a
        series of PDF versions improving from one version to another.</para>

        <qandaset role="exercise">
          <title>A first PDF version of the catalog</title>

          <qandadiv>
            <qandaentry xml:id="idCatalogStart">
              <question>
                <para>Write an <abbrev
                xlink:href="http://www.w3.org/Style/XSL">XSL</abbrev> script
                to generate a starting version <filename
                xlink:href="Ref/src/Dom/climbenriched.start.pdf">climbenriched.start.pdf</filename>.</para>
              </question>

              <answer>
                <programlisting>&lt;?xml version="1.0" encoding="utf-8"?&gt;
&lt;xsl:stylesheet version="1.0"
  xmlns:fo="http://www.w3.org/1999/XSL/Format" 
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform"&gt;

  &lt;xsl:output method="xml" indent="yes"/&gt;

  &lt;xsl:template match="/"&gt;
    &lt;fo:root font-size="10pt"&gt;
      &lt;fo:layout-master-set&gt;
        &lt;fo:simple-page-master master-name="productPage"
          page-width="80mm" page-height="110mm" margin="5mm"&gt;
          &lt;fo:region-body margin="15mm"/&gt;
          &lt;fo:region-before extent="10mm"/&gt;
        &lt;/fo:simple-page-master&gt;
      &lt;/fo:layout-master-set&gt;
      &lt;xsl:apply-templates select="catalog/product" /&gt;
    &lt;/fo:root&gt;
  &lt;/xsl:template&gt;

  &lt;xsl:template match="product"&gt;
    &lt;fo:page-sequence master-reference="productPage"&gt;
      &lt;fo:static-content flow-name="xsl-region-before"&gt;
        &lt;fo:block font-weight="bold"&gt;
          &lt;xsl:value-of select="title"/&gt;
        &lt;/fo:block&gt;
      &lt;/fo:static-content&gt;
      &lt;fo:flow flow-name="xsl-region-body"&gt;
        &lt;xsl:apply-templates select="description/para"/&gt;
        
        &lt;fo:block&gt;Price:&lt;xsl:value-of select="@price"/&gt;&lt;/fo:block&gt;
        &lt;fo:block&gt;Order no:&lt;xsl:value-of select="@id"/&gt;&lt;/fo:block&gt;
      &lt;/fo:flow&gt;
    &lt;/fo:page-sequence&gt;
  &lt;/xsl:template&gt;

  &lt;xsl:template match="para"&gt;
    &lt;fo:block space-after="10px"&gt;
      &lt;xsl:value-of select="."/&gt;
    &lt;/fo:block&gt;
  &lt;/xsl:template&gt;

&lt;/xsl:stylesheet&gt;</programlisting>
              </answer>
            </qandaentry>

            <qandaentry xml:id="idCatalogProduct">
              <question>
                <label>Header, page numbers and table formatting</label>

                <para>Extend <xref linkend="idCatalogStart"/> by adding page
                numbers. The order number and prices shall be formatted as
                tables. Add a horizontal rule to each page's header. The
                result should look like <filename
                xlink:href="Ref/src/Dom/climbenriched.product.pdf">climbenriched.product.pdf</filename>.</para>
              </question>

              <answer>
                <para>Solution see <filename
                xlink:href="Ref/src/Dom/catalog2fo.product.xsl">catalog2fo.product.xsl</filename>.</para>
              </answer>
            </qandaentry>

            <qandaentry xml:id="idCatalogToc">
              <question>
                <label>A table of contents.</label>

                <para>Each product description's page number shall appear in a
                table of contents together with the product's
                <code>title</code> as in <filename
                xlink:href="Ref/src/Dom/climbenriched.toc.pdf">climbenriched.toc.pdf</filename>.</para>
              </question>

              <answer>
                <para>Solution see <filename
                xlink:href="Ref/src/Dom/catalog2fo.toc.xsl">catalog2fo.toc.xsl</filename>.</para>
              </answer>
            </qandaentry>

            <qandaentry xml:id="idCatalogToclink">
              <question>
                <label>A table of contents with hypertext links.</label>

                <para>The table of contents' entries may offer hypertext
                features to supporting browsers as in <filename
                xlink:href="Ref/src/Dom/climbenriched.toclink.pdf">climbenriched.toclink.pdf</filename>.
                In addition include the document's <tag
                class="starttag">introduction</tag>.</para>
              </question>

              <answer>
                <para>Solution see <filename
                xlink:href="Ref/src/Dom/catalog2fo.toclink.xsl">catalog2fo.toclink.xsl</filename>.</para>
              </answer>
            </qandaentry>

            <qandaentry xml:id="idCatalogFinal">
              <question>
                <label>A final version.</label>

                <para>Add the following features:</para>

                <orderedlist>
                  <listitem>
                    <para>Number the table of contents starting with page i,
                    ii, iii, iv and so on. Start the product descriptions with
                    page 1. On each page's footer a text <quote>page xx of
                    yy</quote> shall be displayed. This requires the
                    definition of an anchor <code>id</code> on the <abbrev
                    xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
                    document's last page.</para>
                  </listitem>

                  <listitem>
                    <para>Add PDF bookmarks by using <orgname>XEP</orgname>'s
                    <abbrev
                    xlink:href="http://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section">FO</abbrev>
                    extensions. This requires the namespace declaration
                    <code>xmlns:rx="http://www.renderx.com/XSL/Extensions"</code>
                    in the XSLT script's header.</para>
                  </listitem>
                </orderedlist>

                <para>The result may look like <filename
                xlink:href="Ref/src/Dom/climbenriched.final.pdf">climbenriched.final.pdf</filename>.
                N.B.: It may take some effort to achieve this result. This
                effort is left to the <emphasis>interested</emphasis>
                participants.</para>
              </question>

              <answer>
                <para>Solution see <filename
                xlink:href="Ref/src/Dom/catalog2fo.toclink.xsl">catalog2fo.toclink.xsl</filename>.</para>
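
                <para>The page numbering part may be approached along the
                following lines. This is only a sketch under assumptions: It
                is not taken from the solution file, the id value
                <code>lastPage</code> is a freely chosen name, and the page
                master is assumed to declare an additional
                <code>fo:region-after</code> for the footer. The header from
                the starting version is omitted for brevity. The table of
                contents' page sequence would carry the attribute
                <code>format="i"</code> to obtain roman numerals:</para>

                <programlisting>&lt;xsl:template match="product"&gt;
  &lt;fo:page-sequence master-reference="productPage"&gt;
    &lt;!-- Restart numbering with page 1 at the first product only --&gt;
    &lt;xsl:if test="position() = 1"&gt;
      &lt;xsl:attribute name="initial-page-number"&gt;1&lt;/xsl:attribute&gt;
    &lt;/xsl:if&gt;
    &lt;fo:static-content flow-name="xsl-region-after"&gt;
      &lt;fo:block text-align="center"&gt;
        &lt;xsl:text&gt;page &lt;/xsl:text&gt;
        &lt;fo:page-number/&gt;
        &lt;xsl:text&gt; of &lt;/xsl:text&gt;
        &lt;!-- Resolves to the number of the page holding the anchor below --&gt;
        &lt;fo:page-number-citation ref-id="lastPage"/&gt;
      &lt;/fo:block&gt;
    &lt;/fo:static-content&gt;
    &lt;fo:flow flow-name="xsl-region-body"&gt;
      &lt;xsl:apply-templates select="description/para"/&gt;
      &lt;!-- Empty anchor block ending up on the document's last page --&gt;
      &lt;xsl:if test="position() = last()"&gt;
        &lt;fo:block id="lastPage"/&gt;
      &lt;/xsl:if&gt;
    &lt;/fo:flow&gt;
  &lt;/fo:page-sequence&gt;
&lt;/xsl:template&gt;</programlisting>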
              </answer>
            </qandaentry>
          </qandadiv>
        </qandaset>
      </section>
    </chapter>

    <chapter xml:id="chapter_entities">
      <title>Entities</title>

      <para>Entities target the <emphasis>physical</emphasis> structure of
      <abbrev
      xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
      and document instances. Both <abbrev
      xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
      and XML document instances may be <emphasis>physically</emphasis>
      composed of smaller pieces:</para>

      <itemizedlist>
        <listitem>
          <para><abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
          often reuse standard components. For example many <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
          adopted the HTML table model. Entities offer an elegant way to
          include such building blocks into other <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s.</para>
        </listitem>

        <listitem>
          <para>A book may <emphasis>logically</emphasis> consist of 10
          chapters. We may use entities to represent a book by a single master
          document plus 10 separate XML documents representing each
          chapter.</para>
        </listitem>
      </itemizedlist>

      <para>In correspondence with these two examples we first note that two
      different types of entities exist:</para>

      <glosslist>
        <glossentry>
          <glossterm>Parameter entities</glossterm>

          <glossdef>
            <para>May only be used within <abbrev
            xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
            but not in document instances.</para>
          </glossdef>
        </glossentry>

        <glossentry>
          <glossterm>General entities</glossterm>

          <glossdef>
            <para>May be used both in <abbrev
            xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
            and in document instances.</para>
          </glossdef>
        </glossentry>
      </glosslist>

      <para>Both types of entities exist in two flavors,
      <quote>internal</quote> and <quote>external</quote>, depending on whether
      they are defined within a document itself or in an external document
      being referenced.</para>
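
      <para>The four resulting combinations may be summarized by the
      following sketch. It merely collects declarations that are discussed
      in detail in the remainder of this chapter and is not meant to be a
      complete DTD:</para>

      <programlisting>&lt;!-- internal parameter entity: replacement text given literally --&gt;
&lt;!ENTITY % url "CDATA" &gt;

&lt;!-- external parameter entity: replacement text read from a file --&gt;
&lt;!ENTITY % figure.mod SYSTEM "figure.mod" &gt;

&lt;!-- internal general entity --&gt;
&lt;!ENTITY mytitle "A first start" &gt;

&lt;!-- external general entity --&gt;
&lt;!ENTITY copyrightnotice SYSTEM "ftp://internal.com/copyright.xml" &gt;</programlisting>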

      <section xml:id="section_parameterentity">
        <title xml:id="section_parameterentities">Parameter entities</title>

        <para>We consider the following DTD:</para>

        <figure xml:id="figure_nonmodular_doc">
          <title>A DTD <filename>doc.dtd</filename> describing document
          instances consisting of paragraphs and figures</title>

          <programlisting>&lt;!ELEMENT doc     (para|figure)* <co
              xml:id="programlisting_figure1_doc"/>&gt;
&lt;!ELEMENT para    (#PCDATA) &gt;

&lt;!ELEMENT figure  (caption, image) <co
              xml:id="programlisting_figure1_figure"/>&gt;
&lt;!ELEMENT caption (#PCDATA) <co xml:id="programlisting_figure1_caption"/>&gt;
&lt;!ELEMENT image   EMPTY &gt;
&lt;!ATTLIST image
          src     CDATA #REQUIRED <co
              xml:id="programlisting_figure1_image_src"/>&gt;</programlisting>
        </figure>

        <calloutlist>
          <callout arearefs="programlisting_figure1_doc">
            <para>A document consists of an arbitrary sequence of paragraphs
            and figures.</para>
          </callout>

          <callout arearefs="programlisting_figure1_figure">
            <para>A figure has a caption describing the image's content and an
            <tag class="starttag">image</tag> node. The formatting expectation
            may be defined as an image with a caption being placed
            below.</para>
          </callout>

          <callout arearefs="programlisting_figure1_caption">
            <para>A textual description of the corresponding image.</para>
          </callout>

          <callout arearefs="programlisting_figure1_image_src">
            <para>The attribute <tag class="attribute">src</tag> contains a
            URI pointing to the image data.</para>
          </callout>
        </calloutlist>

        <para>An <filename>example.xml</filename> document instance looks
        like:</para>

        <programlisting>&lt;!DOCTYPE doc SYSTEM "doc.dtd"&gt;
&lt;doc&gt;
    &lt;para&gt;A paragraph&lt;/para&gt;
    &lt;figure&gt;
        &lt;caption&gt;A nice image&lt;/caption&gt;
        &lt;image src="image.png"/&gt;
    &lt;/figure&gt;
&lt;/doc&gt;</programlisting>

        <para>In a <quote>real</quote> DTD a <tag class="element">figure</tag>
        element will have more complexity. An author of a different DTD
        describing a fashion catalog may want to reuse the <tag
        class="element">figure</tag> element as a component. This may be
        achieved by moving all <tag class="element">figure</tag> related
        definitions into a separate file
        <filename>figure.mod</filename>:</para>

        <figure xml:id="figureEntityDef">
          <title>The <tag class="element">figure</tag> element implemented in
          an independent DTD module <filename>figure.mod</filename></title>

          <programlisting>&lt;!ELEMENT figure  (caption, image) &gt;
&lt;!ELEMENT caption (#PCDATA) &gt;
&lt;!ELEMENT image   EMPTY &gt;
&lt;!ATTLIST image src CDATA #REQUIRED &gt;</programlisting>
        </figure>

        <para>Now we may include this module in a master DTD:</para>

        <figure xml:id="figure_doc_master">
          <title>The master DTD which includes the <code>figure.mod</code>
          module</title>

          <programlisting>&lt;!ENTITY % <co xml:id="figure_doc_master_pentity"/>figure.mod <co
              xml:id="figure_doc_master_identifier"/>SYSTEM <co
              xml:id="figure_doc_master_keyword_system"/>"figure.mod" <co
              xml:id="figure_doc_master_entity_filename"/>&gt;

%figure.mod; <co xml:id="figure_doc_master_include"/>

&lt;!ELEMENT doc (para|figure)* &gt;
&lt;!ELEMENT para (#PCDATA) &gt;</programlisting>

          <calloutlist>
            <callout arearefs="figure_doc_master_pentity">
              <para>The percent sign <quote>%</quote> defines the following
              identifier to be a <emphasis>parameter</emphasis> entity.
              Without this character it would define a <link
              linkend="section_generalentities">general</link> entity.</para>
            </callout>

            <callout arearefs="figure_doc_master_identifier">
              <para>The entity to be defined will be represented by the local
              identifier <code>figure.mod</code>.</para>
            </callout>

            <callout arearefs="figure_doc_master_keyword_system">
              <para>The <code>SYSTEM</code> keyword states that the following
              content is a reference to an <emphasis>external</emphasis>
              object.</para>
            </callout>

            <callout arearefs="figure_doc_master_entity_filename">
              <para><filename>figure.mod</filename> is just the filename of a
              DTD module containing all definitions of the <tag
              class="element">figure</tag> element.</para>
            </callout>

            <callout arearefs="figure_doc_master_include">
              <para>The parameter entity <code>figure.mod</code> represents
              the declarations contained in the external file. Declaring the
              entity is not sufficient: We have to
              <emphasis>include</emphasis> these declarations into the current
              DTD by referencing the entity. In C/C++ the term
              <code>%figure.mod;</code> would read <code>#include
              "figure.mod"</code>.</para>
            </callout>
          </calloutlist>
        </figure>

        <para>This file functions as a complete replacement for the
        non-modular DTD presented at the <link
        linkend="figure_nonmodular_doc">beginning</link>. This way
        <filename>figure.mod</filename> acts as a <quote>building
        block</quote> that may be reused in other <abbrev
        xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s
        as well. We note that using an entity in an XML DTD is a two-step
        process:</para>

        <itemizedlist>
          <listitem>
            <para>Declaration of an entity.</para>
          </listitem>

          <listitem>
            <para><quote>Use</quote> of a declared entity.</para>
          </listitem>
        </itemizedlist>

        <para>Many programming languages combine these two steps into one.
        Examples are:</para>

        <glosslist>
          <glossentry>
            <glossterm>C/C++:</glossterm>

            <glossdef>
              <para><code>#include "stdio.h"</code></para>
            </glossdef>
          </glossentry>

          <glossentry>
            <glossterm><link
            linkend="gloss_Java"><trademark>Java</trademark></link>:</glossterm>

            <glossdef>
              <para><code>import de.hdm_stuttgart.xml.*;</code></para>
            </glossdef>
          </glossentry>
        </glosslist>

        <para>On the other hand there are similarities concerning the way
        entities are handled. If we take C/C++ as an example we observe the
        following situation: A compiler reads a <quote>master</quote> file and
        includes (possibly recursively) sets of other files. This part of the
        compilation process is carried out by a separate program called the
        preprocessor, which may be invoked independently. As an example we
        take a <quote>master</quote> file <filename>main.c</filename> written
        in the programming language C:</para>

        <programlisting language="c">/* no #include &lt;stdio.h&gt; for simplicity */
#include "maximum.h"

void main(char **args){
  printf("The maximum of %d and %d is %d", 3, 5, <emphasis role="bold">max(3,5)</emphasis>);
}</programlisting>

        <para>The referenced file <filename>maximum.h</filename> being
        included contains a single line defining the macro
        <code>max(...)</code> appearing in the <code>printf</code>
        statement:</para>

        <programlisting language="c">#define <emphasis role="bold">max(a, b)</emphasis> ( (a)&gt;(b) ? (a) : (b) )</programlisting>

        <para>Despite some warning messages we may compile and execute
        <code>main.c</code>:</para>

        <programlisting><computeroutput>[goik@mupter ~]$ cc -o main main.c
... warnings omitted ...
[goik@mupter ~]$ ./main
The maximum of 3 and 5 is 5</computeroutput></programlisting>

        <para>Now we may also execute the C preprocessor separately:</para>

        <programlisting>[goik@mupter ~]$ cpp -P main.c
void main(char **args){
  printf("The maximum of %d and %d is %d", 3, 5, <emphasis role="bold">( (3)&gt;(5) ? (3) : (5) )</emphasis>);
}</programlisting>

        <para>We observe that the preprocessor has resolved the dependency
        of <filename>main.c</filename> on <filename>maximum.h</filename> by
        replacing the macro call <code>max(3,5)</code> inline with <code>(
        (3)&gt;(5) ? (3) : (5) )</code>. This output is then read by the
        <quote>real</quote> compiler to create an executable binary file
        <code>main</code>.</para>

        <figure xml:id="cppCompilerTwoStep">
          <title>Two processing steps building an executable from a C
          file</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/cpp.fig"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>An XML parser validating a document will do the same, both
        regarding the document instance itself and any entities which have to
        be resolved. The first step before any real parsing is executed by the
        <emphasis>entity resolver</emphasis>, which can be compared to the C
        preprocessor. We reconsider our figure DTD example:</para>

        <figure xml:id="entityResolv">
          <title>The entity resolving process. The dashed arrows show
          <code>SYSTEM</code> references to external entities.</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/entityresolve.fig"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>The actual XML validating parser will examine the output
        <filename>resolve.xml</filename> from the entity resolver.</para>

        <para>As we noted in the introduction to this chapter entities may
        also be of type internal. This means they are defined within a
        document itself rather than residing in an external object. We
        consider the following example:</para>

        <programlisting>&lt;!ENTITY % <emphasis role="bold">url</emphasis> "CDATA" <co
            xml:id="programlisting_internparam_urlent"/>&gt;

&lt;!ELEMENT doc (para|figure)* &gt;
&lt;!ELEMENT para (#PCDATA) &gt;

&lt;!ELEMENT figure  (caption, image) &gt;
&lt;!ELEMENT caption (#PCDATA) &gt;
&lt;!ELEMENT image   EMPTY &gt;
&lt;!ATTLIST image src %<emphasis role="bold">url</emphasis>;<co
            xml:id="programlisting_internparam_urluse"/> #REQUIRED &gt;</programlisting>

        <calloutlist>
          <callout arearefs="programlisting_internparam_urlent">
            <para>An internal parameter entity <tag
            class="paramentity">url</tag> is defined. Since the
            <code>SYSTEM</code> keyword is absent the definition is taken
            <quote>as is</quote>.</para>
          </callout>

          <callout arearefs="programlisting_internparam_urluse">
            <para>The internal entity <tag class="paramentity">url</tag> is
            used. The entity resolver will replace this term by the string
            <code>CDATA</code>.</para>
          </callout>
        </calloutlist>

        <para>From a practical point of view we might argue that the given
        code does not make sense. Actually the entity <tag
        class="paramentity">url</tag> does a kind of <quote>copy/paste</quote>
        action. There seems to be no benefit since the parser still sees the
        attribute type <code>CDATA</code> and will thus still accept invalid
        <link xlink:href="http://www.w3.org/Addressing">URLs</link> like
        <code>http://c:\mydir\</code>.</para>

        <para>The actual gain is readability: In a DTD attributes of
        <emphasis>desired</emphasis> type <link
        xlink:href="http://www.w3.org/Addressing">URL</link> appear
        frequently. In the scope of DTDs there is no appropriate data type
        describing the <link
        xlink:href="http://www.ietf.org/rfc/rfc1738.txt">formal rules</link> a
        <link xlink:href="http://www.w3.org/Addressing">URL</link> has to
        obey. But at least the reader will notice the
        <emphasis>intention</emphasis> that the attribute <tag
        class="attribute">src</tag> of the element <tag
        class="element">image</tag> shall contain a <link
        xlink:href="http://www.w3.org/Addressing">URL</link>.</para>
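
        <para>The benefit becomes more apparent as soon as several attributes
        share the same intention. The following sketch assumes a hypothetical
        <tag class="element">link</tag> element with an attribute <tag
        class="attribute">href</tag>, neither of which is part of our
        DTD:</para>

        <programlisting>&lt;!ENTITY % url "CDATA" &gt;

&lt;!ATTLIST image src  %url; #REQUIRED &gt;

&lt;!-- hypothetical element, for illustration only --&gt;
&lt;!ELEMENT link (#PCDATA) &gt;
&lt;!ATTLIST link href %url; #IMPLIED &gt;</programlisting>

        <para>From a parser's point of view both attributes are still of type
        <code>CDATA</code>. A reader of the DTD however immediately recognizes
        them as being intended to hold a <link
        xlink:href="http://www.w3.org/Addressing">URL</link>.</para>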

        <para>In the next example we want to extend our
        <filename>book.dtd</filename> by allowing simplified HTML
        tables:</para>

        <table border="1" xml:id="example_table_col_rowspan">
          <caption>A table caption</caption>

          <?target dbhtml table-width="50%"?>

          <?target dbfo table-width="50%"?>

          <tr>
            <td rowspan="2">A cell spanning two rows</td>

            <td>a single cell</td>
          </tr>

          <tr>
            <td>another single cell</td>
          </tr>

          <tr>
            <td colspan="2">A cell spanning two columns</td>
          </tr>
        </table>

        <qandaset role="exercise">
          <title>book.dtd and tables</title>

          <qandadiv>
            <qandaentry xml:id="example_docbook_v5">
              <question>
                <para>The <link linkend="example_table_col_rowspan">example
                table</link> presented before may be defined by the following
                code snippet:</para>

                <programlisting>...
&lt;table border="1" <co xml:id="programlisting_table_col_rowspan_attborder"/> &gt;
  &lt;caption&gt;A table caption&lt;/caption&gt;
    &lt;tr&gt;
      &lt;td rowspan="2" <co
                    xml:id="programlisting_table_col_rowspan_attrowspan"/>&gt;A cell spanning two rows&lt;/td&gt;
      &lt;td&gt;a single cell&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td&gt;another single cell&lt;/td&gt;
    &lt;/tr&gt;
    &lt;tr&gt;
      &lt;td colspan="2" <co
                    xml:id="programlisting_table_col_rowspan_attcolspan"/>&gt;A cell spanning two columns&lt;/td&gt;
    &lt;/tr&gt;
&lt;/table&gt;
...</programlisting>

                <calloutlist>
                  <callout arearefs="programlisting_table_col_rowspan_attborder">
                    <para>We want a table with borders. In an HTML rendered
                    version the number indicates the line width in pixels. In
                    this example we expect a line width of one pixel.</para>
                  </callout>

                  <callout arearefs="programlisting_table_col_rowspan_attrowspan">
                    <para>The cell will span two rows.</para>
                  </callout>

                  <callout arearefs="programlisting_table_col_rowspan_attcolspan">
                    <para>The cell will span two columns.</para>
                  </callout>
                </calloutlist>

                <para>Define a DTD table module <filename>table.mod</filename>
                and include it into the <filename>book.dtd</filename> via an
                external parameter entity.</para>
              </question>

              <answer>
                <para>The table model definitions in
                <filename>table.mod</filename> read:</para>

                <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!ELEMENT table   (caption, tr+)&gt;
&lt;!ATTLIST table 
                  border NMTOKEN #IMPLIED &gt;
&lt;!ELEMENT caption (#PCDATA) &gt;
&lt;!ELEMENT tr      (td+) &gt;
&lt;!ELEMENT td      (#PCDATA) &gt;
&lt;!ATTLIST td 
                  colspan NMTOKEN #IMPLIED
                  rowspan NMTOKEN #IMPLIED &gt;</programlisting>

                <para>This may be included into our
                <filename>book.dtd</filename> via:</para>

                <programlisting>&lt;!ENTITY % table.mod SYSTEM "table.mod" &gt;
%table.mod;

&lt;!ELEMENT book     (title, chapter+)&gt;
...</programlisting>

                <para>The complete source code is available <link
                xlink:href="Ref/src/Dtd/book/v5/book.dtd">here</link>. A
                document instance reads:</para>

                <programlisting>&lt;!DOCTYPE book SYSTEM "book.dtd"&gt;
&lt;book lang="en"&gt;
  &lt;title&gt;Introduction to Java&lt;/title&gt;
  &lt;chapter id="introJava"&gt;
    &lt;title&gt;Introduction&lt;/title&gt;
    &lt;para id="notUsed"&gt;Documentation on &lt;link linkend="introJava"&gt;types&lt;/link&gt;&lt;/para&gt;
    &lt;table border="1"&gt;
      &lt;caption&gt;A table caption&lt;/caption&gt;
      &lt;tr&gt;
        &lt;td rowspan="2"&gt;A cell spanning two columns&lt;/td&gt;
        &lt;td&gt;a single cell&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td&gt;another single cell&lt;/td&gt;
      &lt;/tr&gt;
      &lt;tr&gt;
        &lt;td colspan="2"&gt;A cell spanning two rows&lt;/td&gt;
      &lt;/tr&gt;
    &lt;/table&gt;
  &lt;/chapter&gt;
&lt;/book&gt;</programlisting>
              </answer>
            </qandaentry>
          </qandadiv>
        </qandaset>
      </section>

      <section xml:id="section_generalentities">
        <title>General entities</title>

        <para>Parameter entities are limited to appear only within the scope
        of <abbrev
        xlink:href="http://en.wikipedia.org/wiki/Document_Type_Declaration">DTD</abbrev>s.
        They must not appear in document instances. This motivates the
        introduction of general entities. We start with an example of a
        copyright notice:</para>

        <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;para&gt;All rights, including copyright are owned or
controlled for these purposes by the company.&lt;/para&gt;

&lt;para&gt;For further information, see Section Two of the Member Agreement.&lt;/para&gt;</programlisting>

        <para>We notice that this code is not even well-formed XML: It has
        two <tag class="element">para</tag> nodes at top level.</para>

        <para>We assume that the company in question produces a great number
        of documents. These two paragraphs shall be kept at a centralized
        location to be included into all publications. For this purpose the
        document shall be accessible from
        <filename>ftp://internal.com/copyright.xml</filename> in the company's
        intranet. Starting with our previously introduced
        <code>doc.dtd</code> we may embed and use this copyright
        document:</para>

        <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE doc SYSTEM "doc.dtd" [ <co
            xml:id="programlisting_copyright_internal"/>
  &lt;!ENTITY copyrightnotice <co xml:id="programlisting_copyright_entitydef"/> SYSTEM "ftp://internal.com/copyright.xml"&gt;
]<co xml:id="programlisting_copyright_endsubset"/>&gt;
&lt;doc&gt;
    &lt;para&gt;A paragraph&lt;/para&gt;
    &lt;figure&gt;
        &lt;caption&gt;A nice image&lt;/caption&gt;
        &lt;image src="image.png"/&gt;
    &lt;/figure&gt;
    &amp;copyrightnotice; <co xml:id="programlisting_copyright_entityuse"/>
&lt;/doc&gt;</programlisting>

        <calloutlist>
          <callout arearefs="programlisting_copyright_internal">
            <para>The left bracket <quote>[</quote> marks the beginning of the
            document's <emphasis>internal DTD subset</emphasis>.</para>
          </callout>

          <callout arearefs="programlisting_copyright_entitydef">
            <para>An external general entity <tag
            class="genentity">copyrightnotice</tag> is declared. The <link
            xlink:href="http://www.w3.org/Addressing">URL</link> following the
            <code>SYSTEM</code> keyword defines a reference to the external
            definitions.</para>
          </callout>

          <callout arearefs="programlisting_copyright_endsubset">
            <para>Internal subset definitions end here.</para>
          </callout>

          <callout arearefs="programlisting_copyright_entityuse">
            <para>The entity <tag class="genentity">copyrightnotice</tag> is
            used. The entity resolver will expand it to the actual content of
            <filename>ftp://internal.com/copyright.xml</filename>.</para>
          </callout>
        </calloutlist>

        <para>The careful reader will have already guessed that from an XML
        processing application's viewpoint this is equivalent to:</para>

        <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE doc SYSTEM "doc.dtd"&gt;
&lt;doc&gt;
    &lt;para&gt;A paragraph&lt;/para&gt;
    &lt;figure&gt;
        &lt;caption&gt;A nice image&lt;/caption&gt;
        &lt;image src="image.png"/&gt;
    &lt;/figure&gt;
    &lt;para&gt;All rights, including copyright are owned or
controlled for these purposes by the company.&lt;/para&gt;

    &lt;para&gt;For further information, see Section Two of the Member Agreement.&lt;/para&gt;
&lt;/doc&gt;</programlisting>

        <para>We now have to clarify the term <quote>internal subset</quote>
        in the context of DTDs and start with:</para>

        <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE doc SYSTEM "doc.dtd" [ 
  &lt;!ENTITY copyrightnotice  SYSTEM "ftp://internal.com/copyright.xml"&gt;
]&gt;...</programlisting>

        <para>The XML standard allows markup declarations to appear both in
        <filename>doc.dtd</filename> itself and within the range being
        delimited by the brackets <code>[...]</code>. Markup declarations
        appearing in <filename>doc.dtd</filename> belong to the so-called
        <emphasis>external subset</emphasis> reflecting the fact that they
        reside outside the <quote>current</quote> document instance. Any
        markup declarations appearing within <code>[ ... ]</code> are
        considered to belong to the document instance's <emphasis>internal
        subset</emphasis>. We are now able to review some of our introductory
        XML examples: Our <tag class="element">memo</tag> document instance
        from <xref linkend="dtd_and_document"/> has no external subset at all.
        The markup declarations are completely defined in the internal subset
        of the document instance. As stated earlier this only makes
        sense for development or demonstration purposes.</para>

        <para>Under some circumstances the internal subset may even be used to
        extend content model or attribute definitions of the underlying DTD,
        thus leading to non-portable document instances. This is possible
        if the DTD provides <quote>hooks</quote> intended to be used as entry
        points for extensions.</para>
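
        <para>A sketch of such a hook, assuming a modified version of
        <filename>doc.dtd</filename>: The parameter entity name
        <code>local.block</code> and the <tag class="element">note</tag>
        element are hypothetical and only serve as an illustration. The DTD
        declares an empty parameter entity and references it within a content
        model:</para>

        <programlisting>&lt;!-- hook: empty by default, so the content model remains (para|figure)* --&gt;
&lt;!ENTITY % local.block "" &gt;
&lt;!ELEMENT doc (para|figure %local.block;)* &gt;</programlisting>

        <para>The internal subset is read before the external subset and the
        first declaration of an entity is binding. A document instance may
        thus override the hook and supply the additional element:</para>

        <programlisting>&lt;!DOCTYPE doc SYSTEM "doc.dtd" [
  &lt;!-- takes precedence over the empty declaration in doc.dtd --&gt;
  &lt;!ENTITY % local.block "|note" &gt;
  &lt;!ELEMENT note (#PCDATA) &gt;
]&gt;
&lt;doc&gt;
    &lt;para&gt;A paragraph&lt;/para&gt;
    &lt;note&gt;Valid only in combination with this internal subset&lt;/note&gt;
&lt;/doc&gt;</programlisting>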

        <para>In the above example we might as well have declared the entity
        <tag class="genentity">copyrightnotice</tag> in the external subset,
        i.e. within <filename>doc.dtd</filename>. We conclude this section by
        showing a meaningful use case for an internal general entity:</para>

        <qandaset role="exercise">
          <title>Avoiding title duplication</title>

          <qandadiv>
            <qandaentry xml:id="example_xhtml_duplicate_title">
              <question>
                <para>We recall the sample XHTML document given in <xref
                linkend="figure_xhtmlbase"/>. The <tag
                class="starttag">title</tag> and the <tag
                class="starttag">h1</tag> node both contain the same content
                <quote>A first start</quote>. Use an entity to define this
                content to be used at the two different positions.</para>
              </question>

              <answer>
                <para>We define an entity being used at the two locations in
                question:</para>

                <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE html SYSTEM "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"[
&lt;!ENTITY mytitle "A first start" <co
                    xml:id="programlisting_xhtml_duplicate_title_entity"/>&gt;
]&gt;
&lt;html xmlns="http://www.w3.org/1999/xhtml"&gt;
  &lt;head&gt;&lt;title&gt;&amp;mytitle;<co
                    xml:id="programlisting_xhtml_duplicate_title_entity_first"/>&lt;/title&gt;&lt;/head&gt;
  &lt;body&gt;
    &lt;h1&gt;&amp;mytitle;<co
                    xml:id="programlisting_xhtml_duplicate_title_entity_second"/>&lt;/h1&gt;
    &lt;p&gt;This is a very simple document&lt;/p&gt;
  &lt;/body&gt;
&lt;/html&gt;</programlisting>

                <calloutlist>
                  <callout arearefs="programlisting_xhtml_duplicate_title_entity">
                    <para>Definition of an internal general entity <tag
                    class="genentity">mytitle</tag>.</para>
                  </callout>

                  <callout arearefs="programlisting_xhtml_duplicate_title_entity_first">
                    <para>First usage.</para>
                  </callout>

                  <callout arearefs="programlisting_xhtml_duplicate_title_entity_second">
                    <para>Second usage.</para>
                  </callout>
                </calloutlist>
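
                <para>The effect of the entity definition may be checked
                programmatically. The following listing is a minimal sketch
                under assumptions and not part of the original exercise: the
                class name <classname>EntityExpansionDemo</classname> and the
                helper method <code>textOf</code> are hypothetical, and the
                document has been simplified to an internal subset without
                the XHTML namespace and without the external DTD, so that
                parsing does not depend on network access. It uses the
                standard DOM parser, which by default replaces entity
                references by their replacement text:</para>

                <programlisting language="java">import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

public class EntityExpansionDemo {

  // Simplified variant of the document above: internal subset only,
  // no XHTML namespace and no external DTD (assumptions of this sketch).
  static final String XML =
      "&lt;!DOCTYPE html [\n"
    + "  &lt;!ENTITY mytitle \"A first start\"&gt;\n"
    + "]&gt;\n"
    + "&lt;html&gt;\n"
    + "  &lt;head&gt;&lt;title&gt;&amp;mytitle;&lt;/title&gt;&lt;/head&gt;\n"
    + "  &lt;body&gt;&lt;h1&gt;&amp;mytitle;&lt;/h1&gt;&lt;/body&gt;\n"
    + "&lt;/html&gt;";

  // Parses the document and returns the text content of the first
  // element having the given tag name.
  static String textOf(String tagName) throws Exception {
    DocumentBuilder builder =
        DocumentBuilderFactory.newInstance().newDocumentBuilder();
    Document document = builder.parse(new InputSource(new StringReader(XML)));
    return document.getElementsByTagName(tagName).item(0).getTextContent();
  }

  public static void main(String[] args) throws Exception {
    System.out.println("title: " + textOf("title"));
    System.out.println("h1:    " + textOf("h1"));
  }
}</programlisting>

                <para>Both elements should yield the same text since each
                reference is replaced by the entity's single
                definition.</para>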
              </answer>
            </qandaentry>

            <qandaentry xml:id="example_chapter_entities">
              <question>
                <label>Dividing a book.dtd document instance into
                chapters.</label>

                <para>General entities may be used to physically split
                documents into smaller parts. Create a <tag
                class="starttag">book</tag> document instance
                <filename>master.xml</filename> with two chapters. Define an
                <code>IDREF</code> reference from the second to the first
                chapter. Now create two XML files
                <filename>chap1.xml</filename> and
                <filename>chap2.xml</filename> and move the content of the two
                chapters from <filename>master.xml</filename> into these
                files. Then include them into the master document as external
                general entities. What happens with the reference from the
                second to the first chapter?</para>
              </question>

              <answer>
                <para>Our master document reads:</para>

                <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE book SYSTEM "book.dtd"[
  &lt;!ENTITY chap1 SYSTEM "chap1.xml"&gt;
  &lt;!ENTITY chap2 SYSTEM "chap2.xml"&gt;
]&gt;
&lt;book&gt;
  &lt;title&gt;Master document example&lt;/title&gt;
  &amp;chap1;
  &amp;chap2;
&lt;/book&gt;</programlisting>

                <para>The first general entity <filename>chap1.xml</filename>
                contains:</para>

                <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;chapter id="firstChapter"&gt;
  &lt;title&gt;This is the first chapter&lt;/title&gt;
  &lt;para&gt;We add some text here.&lt;/para&gt;
&lt;/chapter&gt;</programlisting>

                <para>Notice that the <tag class="starttag">chapter</tag> node
                contains an attribute <tag class="attribute">id</tag> with
                value <tag class="attvalue">firstChapter</tag>. The second
                file <filename>chap2.xml</filename> reads:</para>

                <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;chapter&gt;
  &lt;title&gt;This is the second chapter&lt;/title&gt;
  &lt;para&gt;This is a &lt;link linkend="firstChapter"&gt;reference&lt;/link&gt;.&lt;/para&gt;
&lt;/chapter&gt;</programlisting>

                <para>The paragraph contains an <code>IDREF</code> based
                reference to the first chapter, which is defined in a
                different general entity. The master document is nevertheless
                a valid XML file with respect to our
                <filename>book.dtd</filename> grammar. We expect this result
                since entities are only a means to
                <emphasis>physically</emphasis> divide an XML file into smaller
                <quote>chunks</quote> without changing the logical structure
                at all.</para>
              </answer>
            </qandaentry>
          </qandadiv>
        </qandaset>
      </section>

      <section xml:id="section_notation">
        <title>Notations and unparsed entities</title>

        <para>An unparsed entity is conceptually part of an XML document but
        its content is not parsed by the XML parser. Images are a common
        example of unparsed entities. The simplest way is to reference images
        external to the XML document by attributes:</para>

        <programlisting>&lt;graphic image="printer.gif"/&gt;</programlisting>

        <para>Many editors simply use this method which apparently suffers
        from some deficiencies:</para>
      </section>
    </chapter>

    <appendix>
      <title>W3C production rules</title>

      <productionset>
        <title>Characters</title>

        <production xml:id="w3RecXml_NT-Letter">
          <lhs>Letter</lhs>

          <rhs><nonterminal def="#w3RecXml_NT-BaseChar">BaseChar</nonterminal>
          | <nonterminal
          def="#w3RecXml_NT-Ideographic">Ideographic</nonterminal></rhs>
        </production>

        <production xml:id="w3RecXml_NT-BaseChar">
          <lhs>BaseChar</lhs>

          <rhs>[#x0041-#x005A] | [#x0061-#x007A] | [#x00C0-#x00D6]
          | [#x00D8-#x00F6] | [#x00F8-#x00FF] | [#x0100-#x0131]
          | [#x0134-#x013E] |...(values omitted here, see W3C
          documentation)</rhs>
        </production>

        <production xml:id="w3RecXml_NT-Ideographic">
          <lhs>Ideographic</lhs>

          <rhs>[#x4E00-#x9FA5] | #x3007 | [#x3021-#x3029]</rhs>
        </production>

        <production xml:id="w3RecXml_NT-CombiningChar">
          <lhs>CombiningChar</lhs>

          <rhs>[#x0300-#x0345] | ...(values omitted here)</rhs>
        </production>

        <production xml:id="w3RecXml_NT-Digit">
          <lhs>Digit</lhs>

          <rhs>[#x0030-#x0039] | [#x0660-#x0669] | [#x06F0-#x06F9]
          | [#x0966-#x096F] | [#x09E6-#x09EF] | [#x0A66-#x0A6F]
          | [#x0AE6-#x0AEF] | [#x0B66-#x0B6F] | [#x0BE7-#x0BEF]
          | [#x0C66-#x0C6F] | [#x0CE6-#x0CEF] | [#x0D66-#x0D6F]
          | [#x0E50-#x0E59] | [#x0ED0-#x0ED9] | [#x0F20-#x0F29]</rhs>
        </production>

        <production xml:id="w3RecXml_NT-Extender">
          <lhs>Extender</lhs>

          <rhs>#x00B7 | #x02D0 | #x02D1 | #x0387 | #x0640 | #x0E46 | #x0EC6
          | #x3005 | [#x3031-#x3035] | [#x309D-#x309E] | [#x30FC-#x30FE]</rhs>
        </production>
      </productionset>
    </appendix>
  </part>

  <part xml:id="persistenceStrategies">
    <title annotations="ws/eclipse/HibIntro/target/classes">Persistence
    strategies and application development</title>

    <chapter xml:id="orm">
      <title>Object Relational Mapping</title>

      <remark>Mapping tools should be used only by someone familiar with
      relational technology. O-R mapping is not meant to save developers from
      understanding mapping problems or to hide them altogether. It is meant
      for those who have an understanding of the issues and know what they
      need, but who don't want to have to write thousands of lines of code to
      deal with a problem that has already been solved <xref
      linkend="bibKeith09"/>.</remark>

      <section xml:id="configureEclipseMaven">
        <title>Configuring a Maven based Eclipse <link
        linkend="gloss_Java"><trademark>Java</trademark></link> project with
        Hibernate</title>

        <para>We will use Maven for several purposes:</para>

        <figure xml:id="fig_reasonsUsingMaven">
          <title>Reasons for using Maven</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/mavenIntro.fig" scale="65"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>The following figure explains the problem of managing transitive
        dependencies in projects:</para>

        <figure xml:id="fig_transitiveDependencies">
          <title>Transitive dependencies</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/transitiveDep.fig" scale="65"/>
            </imageobject>
          </mediaobject>
        </figure>

        <section xml:id="sect_mavenConfigEclipseProject">
          <title>Create a Maven based project in Eclipse</title>

          <para>The following section requires the Eclipse Maven plugin to be
          installed. This may be accomplished by installing the <productname
          xlink:href="http://www.jboss.org/tools">JBoss Tools</productname>
          via <guimenu>Help</guimenu> <guisubmenu>Eclipse
          Marketplace</guisubmenu>, which will install the Maven plugin as a
          dependency.</para>

          <orderedlist>
            <listitem>
              <para>We start Eclipse and choose the <quote>new project</quote>
              wizard.</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/1.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>
            </listitem>

            <listitem>
              <para>Filtering <quote>maven</quote> yields our desired project
              type.</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/2.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>
            </listitem>

            <listitem>
              <para>Just accept the defaults.</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/3.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>
            </listitem>

            <listitem>
              <para>We select <quote>maven-archetype-quickstart</quote> to
              choose a plain <link
              linkend="gloss_Java"><trademark>Java</trademark></link>
              project.</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/4.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>
            </listitem>

            <listitem>
              <para>The chosen Artifact Id will become our project's name.</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/5.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>
            </listitem>

            <listitem>
              <para>We end up with a <link
              linkend="gloss_Java"><trademark>Java</trademark></link> project
              already enabled for <productname
              xlink:href="http://www.junit.org">JUnit</productname>
              testing.</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/6.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>
            </listitem>
          </orderedlist>

          <para>But wait: We are about to work with (MySQL) databases. Thus
          we need at least a <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
          driver. Maven assists us if we define an appropriate dependency, as
          we will see in the following section.</para>
        </section>

        <section xml:id="sect_mavenAddMysqlJdbcConnector">
          <title>Adding a <productname
          xlink:href="http://www.mysql.com">MySQL</productname>
          <trademark>JDBC</trademark> driver</title>

          <para>We might just download a <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
          implementation jar file like
          <filename>mysql-connector-java-5.1.16.jar</filename> manually and
          add it to our Eclipse environment. If we want to share our project
          with other people or work on it on different workstations, this jar
          file must be available on each system we are working with.</para>

          <para>One solution might be to integrate it into our project
          completely (e.g. in a <filename>lib</filename> folder) and put the
          whole project under version control (<productname
          xlink:href="http://git-scm.com/">git</productname>, <productname
          xlink:href="http://subversion.apache.org">svn</productname>). On the
          other hand this just bloats our project with external (library)
          dependencies.</para>

          <para>Maven helps us to easily manage external dependencies. The
          idea is to keep them in centralized repositories for download and
          add meta information like a package name, a package group name and a
          version number for retrieval:</para>

          <orderedlist>
            <listitem>
              <para>Searching for <quote>mysql</quote> in a Maven repository
              yields the <link
              linkend="gloss_Java"><trademark>Java</trademark></link>
              <trademark
              xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
              connector:</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/mysql1.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>
            </listitem>

            <listitem>
              <para>We choose the most recent version:</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/mysql2.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>
            </listitem>

            <listitem>
              <para>Again we copy the dependency snippet ...</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/mysql3.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>
            </listitem>

            <listitem>
              <para>... and add it to our <filename>pom.xml</filename> file's
              dependency section:</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/mysql4.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>
            </listitem>

            <listitem>
              <para>Did we actually succeed? Right-clicking on our project
              <guimenu>Build path</guimenu> <guisubmenu>Configure Build
              Path</guisubmenu> and choosing the
              <guisubmenu>Libraries</guisubmenu> tab we see our
              <envar>CLASSPATH</envar> being extended:</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Screen/CreateMaven/mysql5.png"
                               scale="80"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>

              <para>Notice the location of the <productname
              xlink:href="http://www.mysql.com">MySQL</productname> jar below
              the <filename>.m2</filename> Maven folder in the user's home
              directory. If we share our project, this location will differ
              on other systems, e.g.
              <filename>c:\users\foo\.m2\...</filename>, due to different
              system default paths.</para>
            </listitem>
          </orderedlist>
        </section>

        <section xml:id="sect_mavenAddHibernate">
          <title>Adding Hibernate dependencies</title>

          <para>Our goal is to start using Hibernate for a console based
          project. Searching the Maven repository for hibernate-core provides
          a suitable artifact:</para>

          <programlisting>&lt;dependency&gt;
  &lt;groupId&gt;org.hibernate&lt;/groupId&gt;
  &lt;artifactId&gt;hibernate-core&lt;/artifactId&gt;
  &lt;version&gt;4.1.9.Final&lt;/version&gt;
&lt;/dependency&gt;</programlisting>
        </section>

        <section xml:id="sect_createHibernateConfiguration">
          <title>Creating a Hibernate configuration</title>

          <para>Hibernate is intended to provide persistence services saving
          transient <link
          linkend="gloss_Java"><trademark>Java</trademark></link> instances to
          a database. For this purpose Hibernate needs:</para>

          <itemizedlist>
            <listitem>
              <para>The type of database (Oracle, DB2, MySQL, ...)</para>
            </listitem>

            <listitem>
              <para>The JDBC driver class name</para>
            </listitem>

            <listitem>
              <para>JDBC connection parameters</para>

              <itemizedlist>
                <listitem>
                  <para>Server name</para>
                </listitem>

                <listitem>
                  <para>port</para>
                </listitem>

                <listitem>
                  <para>user</para>
                </listitem>

                <listitem>
                  <para>password</para>
                </listitem>
              </itemizedlist>
            </listitem>

            <listitem>
              <para>A list of classes to be mapped</para>
            </listitem>

            <listitem>
              <para>Parameters defining the log level, whether generated SQL
              code shall be logged etc.</para>
            </listitem>
          </itemizedlist>
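
          <para>Server name and port from the list above typically end up in
          a single JDBC URL together with the database name, whereas user
          and password are supplied as separate settings. The following
          sketch illustrates this for MySQL. The class name
          <classname>JdbcSettingsDemo</classname> and the helper method
          <code>mysqlUrl</code> are hypothetical. The property names and
          sample values are those used in <xref
          linkend="hibernateConfigurationFile"/>:</para>

          <programlisting language="java">import java.util.Properties;

public class JdbcSettingsDemo {

  // Combines server name, port and database name into a MySQL JDBC URL.
  static String mysqlUrl(String server, int port, String database) {
    return "jdbc:mysql://" + server + ":" + port + "/" + database;
  }

  public static void main(String[] args) {
    // Sample values only: replace them by your own server's settings.
    final Properties settings = new Properties();
    settings.setProperty("hibernate.connection.url",
        mysqlUrl("localhost", 3306, "hdm"));
    settings.setProperty("hibernate.connection.username", "hdmuser");
    settings.setProperty("hibernate.connection.password", "XYZ");

    System.out.println(settings.getProperty("hibernate.connection.url"));
  }
}</programlisting>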

          <para>Hibernate offers an XML based configuration syntax. We show a
          toy example of a <filename>hibernate.cfg.xml</filename>
          configuration file mapping just one class
          <classname>hibintro.v1.model.User</classname> to a MySQL database
          server:</para>

          <figure xml:id="hibernateConfigurationFile">
            <title>A basic Hibernate configuration file
            <filename>hibernate.cfg.xml</filename>.</title>

            <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE hibernate-configuration 
  PUBLIC "-//Hibernate/Hibernate Configuration DTD 3.0//EN"
         "http://hibernate.sourceforge.net/hibernate-configuration-3.0.dtd"&gt;
&lt;hibernate-configuration&gt;
 &lt;session-factory &gt;
  &lt;property name="hibernate.connection.driver_class"&gt;com.mysql.jdbc.Driver&lt;/property&gt;
  &lt;property name="hibernate.connection.password"&gt;XYZ&lt;/property&gt;
  &lt;property name="hibernate.connection.url"&gt;jdbc:mysql://localhost:3306/hdm&lt;/property&gt;
  &lt;property name="hibernate.connection.username"&gt;hdmuser&lt;/property&gt;
  &lt;property name="hibernate.dialect"&gt;org.hibernate.dialect.MySQL5InnoDBDialect&lt;/property&gt;
  &lt;property name="hibernate.show_sql"&gt;true&lt;/property&gt;
  &lt;property name="hibernate.format_sql"&gt;true&lt;/property&gt;
  &lt;property name="hibernate.hbm2ddl.auto"&gt;update&lt;/property&gt;
  
  &lt;mapping class="hibintro.v1.model.User"/&gt;
 &lt;/session-factory&gt;
&lt;/hibernate-configuration&gt;</programlisting>
          </figure>

          <para>This file may be edited with a simple text editor. The
          <productname xlink:href="http://www.jboss.org/tools">JBoss
          Tools</productname> Eclipse plugin provides a configuration editor
          simplifying this task. It may be installed on top of Eclipse <link
          xlink:href="http://www.jboss.org/tools/download">in several
          ways</link>. The following video shows some of its features.</para>

          <mediaobject>
            <videoobject>
              <videodata fileref="Ref/Video/hibernateConfig.mp4"/>
            </videoobject>
          </mediaobject>
        </section>
      </section>

      <section xml:id="sect_hibernateBasics">
        <title>A round trip working with objects</title>

        <para>Hibernate may be regarded as a persistence provider to <link
        linkend="gloss_JPA"><abbrev>JPA</abbrev></link>:</para>

        <figure xml:id="jpaPersistProvider">
          <title><link linkend="gloss_JPA"><abbrev>JPA</abbrev></link>
          persistence provider</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/persistProvider.fig"/>
            </imageobject>
          </mediaobject>
        </figure>

        <para>Having configured Hibernate we may now start working with <link
        linkend="gloss_Java"><trademark>Java</trademark></link> objects. To do
        so we need an appropriate session object to run transactions. Starting
        from the Hibernate documentation we code the following helper
        method:</para>

        <programlisting>package hibintro.util;

import org.hibernate.SessionFactory;
import org.hibernate.cfg.Configuration;
import org.hibernate.service.ServiceRegistryBuilder;

public class HibernateUtil {

  /**
   * @param hibernateConfigFileName Name of the configuration file, e.g. &lt;code&gt;hibernate.cfg.xml&lt;/code&gt;.
   * @return Session factory instance to be used for actual session creation by caller.
   */
  public static SessionFactory createSessionFactory(final String hibernateConfigFileName) {
    Configuration configuration = new Configuration();
    configuration.configure(hibernateConfigFileName);
    ServiceRegistryBuilder serviceRegistryBuilder = new ServiceRegistryBuilder().applySettings(configuration
        .getProperties());
    return configuration
        .buildSessionFactory(serviceRegistryBuilder.buildServiceRegistry());
  }
}</programlisting>

        <para>The following class
        <classname>hibintro.v1.model.User</classname> will be used as a
        starting example to be mapped to a database. Notice the
        <classname>javax.persistence.Entity</classname> <link
        xlink:href="http://docs.oracle.com/javase/tutorial/java/javaOO/annotations.html">annotation</link>
        <coref linkend="entityAnnotation"/>:</para>

        <figure xml:id="mappingUserInstances">
          <title>Mapping <classname>hibintro.v1.model.User</classname>
          instances to a database.</title>

          <programlisting>package hibintro.v1.model;

...

<emphasis role="bold">@Entity</emphasis> <co xml:id="entityAnnotation"/>
public class User {

  <emphasis role="bold">// The user's unique login name e.g. "goik"</emphasis>
  String uid;
  public String getUid() {return uid;}
  public void setUid(String uid) {this.uid = uid;}

  <emphasis role="bold">// The user's common name e.g. "Martin Goik"</emphasis>
  String cname;
  public String getCname() {return cname;}
  public void setCname(String cname) {this.cname = cname;}

  <emphasis role="bold">// Hibernate requires a default constructor</emphasis>
  public User() {}

  public User(String uid, String cname) {
    super();
    this.uid = uid;
    this.cname = cname;
  }
}</programlisting>
        </figure>

        <para>With respect to <xref linkend="hibernateConfigurationFile"/> we
        notice our class <classname>hibintro.v1.model.User</classname> being
        referenced:</para>

        <programlisting>&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;!DOCTYPE hibernate-configuration 
...  
  &lt;mapping class="<emphasis role="bold">hibintro.v1.model.User</emphasis>"/&gt; 
 &lt;/session-factory&gt;
&lt;/hibernate-configuration&gt;</programlisting>

        <para>This line tells Hibernate to actually map
        <classname>hibintro.v1.model.User</classname> to a (MySQL)
        database.</para>

        <section xml:id="persistingObjects">
          <title>Persisting objects</title>

          <para>Persisting transient objects may be achieved in various ways.
          In <xref linkend="jdbcIntro"/> we introduced the <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
          <abbrev xlink:href="http://en.wikipedia.org/wiki/Api">API</abbrev>
          connecting <link
          linkend="gloss_Java"><trademark>Java</trademark></link> applications
          and relational database systems. We stored and retrieved object
          values.</para>

          <para>In larger projects these tasks become increasingly
          tedious. It is thus desirable to automate them while still
          using <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
          as a low-level transport layer. This is shown in <xref
          linkend="jdbcFourTier"/>. That figure already mentions Hibernate as
          a possible persistence service provider.</para>

          <para>The following sections start with a single class
          <classname>hibintro.v1.model.User</classname>:</para>

          <figure xml:id="fig_BasicUser">
            <title>A basic <code>User</code> class.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/classUser.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>Object relational mapping (ORM) denotes the process of mapping
          instances of classes to relational table data. In our current
          example we may draw a simple implementation sketch:</para>

          <figure xml:id="mappingProperties2attributes">
            <title>Mapping properties to attributes.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/mapUser.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>This is far too simplistic. What about integrity
          constraints?</para>

          <figure xml:id="mappingIntegrityConstraints">
            <title>Annotating integrity constraints</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/mapUserIntegrity.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>We start with the following
          <classname>hibintro.v1.model.User</classname> class lacking
          integrity constraints completely:</para>

          <programlisting>package hibintro.v1.model;
@Entity
public class User {
  String uid;
  public String getUid() {return uid;}
  public void setUid(String uid) {this.uid = uid;}

  String cname;
  public String getCname() {return cname;}
  public void setCname(String cname) {this.cname = cname;}

  /**
   * Hibernate/JPA require a default constructor. It has to be implemented
   * explicitly if any non-default constructor has been defined.
   */
  public User() {}

  /**
   * @param uid See {@link #getUid()}.
   * @param cname See {@link #getCname()}.
   */
  public User(String uid, String cname) {
    this.uid = uid;
    this.cname = cname;
  }
}</programlisting>
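
          <para>The need for a default constructor can be illustrated without
          Hibernate. Persistence frameworks typically create instances
          reflectively and without arguments before populating them with
          database values. The following sketch only mimics this step using
          plain <link
          linkend="gloss_Java"><trademark>Java</trademark></link>
          reflection. It is not Hibernate's actual implementation, and all
          class and method names are hypothetical:</para>

          <programlisting language="java">public class DefaultConstructorDemo {

  // Hypothetical stand-in for an entity class offering a default constructor.
  static class WithDefault {
    String uid;
    public WithDefault() {}
    public WithDefault(String uid) {this.uid = uid;}
  }

  // Hypothetical class offering a non-default constructor only.
  static class WithoutDefault {
    String uid;
    public WithoutDefault(String uid) {this.uid = uid;}
  }

  // Tries to create an instance reflectively without passing any arguments.
  static boolean canInstantiate(Class&lt;?&gt; clazz) {
    try {
      clazz.getDeclaredConstructor().newInstance();
      return true;
    } catch (ReflectiveOperationException e) {
      return false; // e.g. NoSuchMethodException: no default constructor
    }
  }

  public static void main(String[] args) {
    System.out.println("WithDefault:    " + canInstantiate(WithDefault.class));
    System.out.println("WithoutDefault: " + canInstantiate(WithoutDefault.class));
  }
}</programlisting>

          <para>Only the first class can be instantiated this way. Defining a
          non-default constructor suppresses the compiler generated default
          constructor, which is why the latter must be declared
          explicitly.</para>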

          <para>Persisting objects with Hibernate requires a
          <classname>org.hibernate.Session</classname> instance <coref
          linkend="sessionInstance"/>. The actual save operation takes place
          between the start <coref linkend="startTransaction"/> and the
          commit <coref linkend="commitTransaction"/> of a transaction
          derived from that session:</para>

          <programlisting>package hibintro.v1.run;

...
public class PersistSingleUser {

  public static void main(String[] args) {
    final <classname>org.hibernate.Session</classname> session <co
              xml:id="sessionInstance"/>= HibernateUtil.createSessionFactory("hibernate.cfg.xml").openSession();

    final <classname>org.hibernate.Transaction</classname> transaction = session.beginTransaction();<co
              xml:id="startTransaction"/>

    final <classname>hibintro.v1.model.User</classname> u = new User("goik", "Martin Goik");
    session.save(u);

    transaction.commit(); <co xml:id="commitTransaction"/>
  }
}</programlisting>

          <para>Executing the above code yields a runtime exception:</para>

          <programlisting>Exception in thread "main" java.lang.ExceptionInInitializerError
   at myhibernate.intro.run.PersistUser.main(PersistUser.java:14)
Caused by: org.hibernate.AnnotationException: <emphasis role="bold">No identifier specified for entity: myhibernate.intro.model.User</emphasis>
...
   at myhibernate.intro.util.HibernateUtil.buildConfiguration(HibernateUtil.java:17)
   at myhibernate.intro.util.HibernateUtil.&lt;clinit&gt;(HibernateUtil.java:9)</programlisting>

          <para>This runtime error is a little bit cryptic. The missing
          <quote>identifier</quote> refers to the absence of a primary key
          definition already mentioned in <xref
          linkend="mappingIntegrityConstraints"/>. We define a key by
          annotating the <code>uid</code> property with a
          <classname>javax.persistence.Id</classname> annotation <coref
          linkend="primaryKeyDefinition"/>:</para>

          <programlisting>package hibintro.v1.model;

import javax.persistence.Entity;
<emphasis role="bold">import javax.persistence.Id;</emphasis>

...
@Entity public class User {...
  <emphasis role="bold">@Id</emphasis> <co xml:id="primaryKeyDefinition"/>
  public String getUid() {
    return uid;
  } ...</programlisting>

          <para>The careful reader will have noticed that we have annotated
          the getter method rather than the field <code>uid</code> itself.
          Hibernate / <link linkend="gloss_JPA"><abbrev>JPA</abbrev></link>
          can work both ways. Annotating the getter makes the persistence
          provider access an object's state through its getter and setter
          methods rather than through its fields. This allows for additional
          processing inside these methods, e.g. logging for debugging
          purposes.</para>

          <para>This time we are successful. Since we enabled the logging of
          SQL statements in <xref linkend="hibernateConfigurationFile"/>
          Hibernate shows us the corresponding <code>INSERT</code>
          statement:</para>

          <programlisting>Hibernate: 
    insert 
    into
        User
        (cname, uid) 
    values
        (?, ?)</programlisting>

          <para>Notice the <code>(?, ?)</code> part of our log: This indicates
          the internal usage of <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
          <classname>java.sql.PreparedStatement</classname> instances.
          Hibernate generates the following <code>CREATE TABLE</code>
          statement:</para>

          <figure xml:id="fig_createTableV1User">
            <title>Database schema mapping instances of
            <classname>hibintro.v1.model.User</classname>.</title>

            <programlisting>CREATE TABLE User (
  uid VARCHAR(255) NOT NULL PRIMARY KEY,
  cname VARCHAR(255)
) </programlisting>
          </figure>
        </section>

        <section xml:id="loadingObjectsByPrimaryKey">
          <title>Loading objects by primary key</title>

          <para>Having persisted a single
          <classname>hibintro.v1.model.User</classname> instance by means of
          <classname>hibintro.v1.run.PersistSingleUser</classname> we may now
          load the database object. The easiest way is based on both the
          requested object's type <coref linkend="specLoadType"/> and its
          primary key value <coref linkend="specLoadPrimaryKey"/>:</para>

          <figure xml:id="loadByClassAndPrimaryKey">
            <title>Loading a single object by a primary key value.</title>

            <programlisting>package hibintro.v1.run;
...
public class RetrieveSingleUser {
...
    final Transaction transaction = session.beginTransaction();

    final User u = (User) session.load(<emphasis role="bold">User.class</emphasis> <co
                xml:id="specLoadType"/>, "<emphasis role="bold">goik</emphasis>" <co
                xml:id="specLoadPrimaryKey"/>);
    if (null == u ) {
      System.out.println("No such user 'goik'");
    } else {
      System.out.println("Found user '" + u.getCname() + "'");
    }
    transaction.commit();...</programlisting>
          </figure>

          <para>This retrieves the expected result. Buried in other log
          messages we find the following SQL <quote>background</quote>
          statement:</para>

          <programlisting>...
INFO: HHH000232: Schema update complete
Hibernate: 
    <emphasis role="bold">select
        user0_.uid as uid0_0_,
        user0_.cname as cname0_0_ 
    from
        User user0_ 
    where
        user0_.uid=?</emphasis>

Found user 'Martin Goik'</programlisting>

          <qandaset role="exercise">
            <title>Choosing the correct method</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>Actually the code in <xref
                  linkend="loadByClassAndPrimaryKey"/> is not quite correct.
                  Execute it with a non-existing primary key value, e.g.
                  <quote>goik2</quote>. What do you observe? Can you explain
                  that behaviour?</para>

                  <para>Read the documentation of the
                  <classname>org.hibernate.Session</classname>.<code>load()</code>
                  method and correct the code snippet.</para>
                </question>

                <answer>
                  <para>If there is no corresponding database object we
                  receive an
                  <classname>org.hibernate.ObjectNotFoundException</classname>
                  <coref linkend="loadUserObjectNotFoundException"/>:</para>

                  <programlisting>Hibernate: 
    select
        user0_.uid as uid0_0_,
        user0_.cname as cname0_0_ 
    from
        User user0_ 
    where
        user0_.uid=?
Exception in thread "main" org.hibernate.ObjectNotFoundException: <co
                      xml:id="loadUserObjectNotFoundException"/>No row with the given identifier exists: [hibintro.v1.model.User#goik2]
...
  at org.hibernate.proxy.pojo.javassist.JavassistLazyInitializer.invoke(JavassistLazyInitializer.java:185)
  at hibintro.v1.model.User_$$_javassist_0.getCname(User_$$_javassist_0.java)
  at hibintro.v1.run.RetrieveSingleUser.main(<emphasis role="bold">RetrieveSingleUser.java:35</emphasis>)<co
                      xml:id="exceptionOnGetCname"/>
</programlisting>

                  <para>According to <coref linkend="exceptionOnGetCname"/> the
                  exception is triggered by the <code>getCname()</code> call
                  rather than by <code>load()</code> itself. The documentation
                  of <code>load()</code> tells us that the method may return a
                  proxy object, implemented by byte code instrumentation,
                  which defers database access until one of its methods is
                  called. The returned reference is thus never
                  <code>null</code> and the test in <xref
                  linkend="loadByClassAndPrimaryKey"/> cannot succeed. If no
                  matching database object exists, calling the proxy instance
                  yields an
                  <classname>org.hibernate.ObjectNotFoundException</classname>.</para>
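
                  <para>The deferred failure can be imitated in plain Java
                  without Hibernate. The following sketch is an analogy only:
                  The names <code>UserView</code>, <code>loadLazily()</code>,
                  <code>getEagerly()</code> and the in-memory
                  <code>TABLE</code> map are hypothetical, and
                  <classname>java.lang.reflect.Proxy</classname> merely stands
                  in for Hibernate's generated proxy classes. Creating the
                  proxy always succeeds; the lookup happens on the first
                  method call:</para>

                  <programlisting language="java"><![CDATA[
```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.HashMap;
import java.util.Map;

public class LazyProxySketch {

  /** Hypothetical read-only view of a user; not part of hibintro. */
  public interface UserView {
    String getCname();
  }

  /** Stand-in for the database table: uid -> cname. */
  static final Map<String, String> TABLE = new HashMap<>();
  static {
    TABLE.put("goik", "Martin Goik");
  }

  /** Imitates load(): never returns null and does not touch TABLE yet. */
  public static UserView loadLazily(final String uid) {
    final InvocationHandler handler = (proxy, method, methodArgs) -> {
      // The lookup is deferred until a method is invoked on the proxy.
      final String cname = TABLE.get(uid);
      if (null == cname) {
        throw new IllegalStateException("No row with the given identifier exists: " + uid);
      }
      return cname;
    };
    return (UserView) Proxy.newProxyInstance(
        UserView.class.getClassLoader(), new Class<?>[] {UserView.class}, handler);
  }

  /** Imitates get(): looks up immediately and returns null if absent. */
  public static UserView getEagerly(final String uid) {
    final String cname = TABLE.get(uid);
    return null == cname ? null : () -> cname;
  }

  public static void main(String[] args) {
    final UserView missing = loadLazily("goik2");
    System.out.println("proxy is null: " + (null == missing));
    try {
      missing.getCname();
    } catch (IllegalStateException e) {
      System.out.println("failure on first access: " + e.getMessage());
    }
    System.out.println("eager lookup is null: " + (null == getEagerly("goik2")));
  }
}
```
]]></programlisting>

                  <para>In this sketch the <code>null</code> test after
                  <code>loadLazily()</code> is as pointless as the one
                  following <code>load()</code>, whereas
                  <code>getEagerly()</code> can be checked for
                  <code>null</code> safely.</para>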

                  <para>The documentation also tells us to use the
                  corresponding
                  <methodname>org.hibernate.Session.get(Class,Serializable)</methodname>
                  method which actually returns <code>null</code> in case a
                  primary key value does not exist:</para>

                  <programlisting>... final User u = (User) session.get(User.class, "goik2");
    if (null == u ) {
      System.out.println("No such user having key value 'goik2'");
...</programlisting>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="loadingObjectsByQuery">
          <title>Loading objects by queries</title>

          <para>Often we are interested in a (sub)set of results. We populate
          our database with additional
          <classname>hibintro.v1.model.User</classname> instances:</para>

          <programlisting>package hibintro.v1.run;
...
public class PersistUsers {
  ...
    final Transaction transaction = session.beginTransaction();

    final User users[] = {new User("wings", "Fred Wings"),
        new User("eve", "Eve Briggs")} ;
    for (final User u : users ) {session.save(u);}

    transaction.commit(); ...</programlisting>

          <para>Now we'd like to retrieve these objects. Hibernate offers the
          <emphasis role="bold">H</emphasis>ibernate <emphasis
          role="bold">Q</emphasis>uery <emphasis
          role="bold">L</emphasis>anguage (<abbrev
          xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html">HQL</abbrev>)
          for object queries. As we will see <abbrev
          xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html">HQL</abbrev>
          extends <acronym
          xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym> with
          respect to polymorphic queries. The current example does not use
          inheritance leaving us with a simple <abbrev
          xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html">HQL</abbrev>
          query <coref linkend="hqlFromUser"/> in
          <classname>hibintro.v1.run.RetrieveAll</classname>:</para>

          <figure xml:id="retrieveAllUserByHql">
            <title>Retrieving <classname>hibintro.v1.model.User</classname>
            instances by <abbrev
            xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html">HQL</abbrev>.</title>

            <programlisting>package hibintro.v1.run;
   ...
public class RetrieveAll {
   ...
    final Query searchUsers = session.createQuery("<emphasis role="bold">from User</emphasis>");<co
                xml:id="hqlFromUser"/>
    final List&lt;User&gt; users = (List&lt;User&gt;) searchUsers.list();
    for (final User u: users) {
      System.out.println("uid=" + u.getUid() + ", " + u.getCname());
    }</programlisting>
          </figure>

          <para>Being used to <acronym
          xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym> we notice
          the absence of a <code>SELECT</code> clause in <coref
          linkend="hqlFromUser"/>. The rationale is a focus on objects rather
          than on attribute sets. Thus our <abbrev
          xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html">HQL</abbrev>
          query returns a list of <classname>hibintro.v1.model.User</classname>
          instances:</para>

          <programlisting>uid=eve, Eve Briggs
uid=goik, Martin Goik
uid=wings, Fred Wings</programlisting>

          <qandaset role="exercise">
            <title><abbrev
            xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html">HQL</abbrev>
            and <acronym
            xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym>.</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>We may also retrieve attributes rather than objects.
                  For this purpose our query resembles standard <acronym
                  xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym> <coref
                  linkend="hqlWithSelect"/>:</para>

                  <programlisting>final Query searchUsers = session.createQuery("<emphasis
                      role="bold">select uid, cname from User</emphasis>" <co
                      xml:id="hqlWithSelect"/>);
final Object queryResult <co xml:id="queryResultFromSelect"/>= searchUsers.list();</programlisting>

                  <para>Use the <methodname>Class.getSimpleName()</methodname>
                  reflection method to analyze the structure of the
                  <code>queryResult</code> <coref
                  linkend="queryResultFromSelect"/> instance step by step.
                  This guides you in finding suitable casts. Then add code
                  similar to <xref linkend="retrieveAllUserByHql"/> in order
                  to write each user's attribute values to standard
                  output.</para>
                </question>

                <answer>
                  <para>A possible implementation reads:</para>

                  <programlisting>package hibintro.v1.run;
...
public class GetUsersAsAttributes {
...
    final Query searchUsers = session.createQuery("<emphasis role="bold">select uid, cname from User</emphasis>");

    final Object queryResult = searchUsers.list();
    System.out.println("queryResult type:" + queryResult.getClass().getSimpleName()); <co
                      xml:id="typeOfHqlResult"/>
    @SuppressWarnings("unchecked") // the unchecked cast happens in the next line
    final List&lt;Object&gt; usersAttributes = (List&lt;Object&gt;) queryResult;
    for (final Object o: usersAttributes) {
      System.out.println("result set element type:" + o.getClass().getSimpleName()); <co
                      xml:id="typeOfEmbeddedObjects"/>
      final Object attributes[] = (Object []) o;
      for (Object attribute: attributes) {
        System.out.println("attribute value:" + attribute);
      }
    }...</programlisting>

                  <para>Actually the two lines <coref
                  linkend="typeOfHqlResult"/> and <coref
                  linkend="typeOfEmbeddedObjects"/> are only needed during the
                  development process to discover the result set's object
                  structure.</para>
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>

          <para>The careful reader may already expect <abbrev
          xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html">HQL</abbrev>
          to offer additional features, namely predicate based queries.
          Following <classname>hibintro.v1.run.SelectUser</classname> we may
          restrict our result set by an <acronym
          xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym> style
          <code>WHERE</code> clause:</para>

          <programlisting>    final List&lt;User&gt; users = (List&lt;User&gt;) session.createQuery(
            "<emphasis role="bold">from User u where u.cname like '%e%'</emphasis>").list();
    for (final User u: users) {
      System.out.println("Found user '" + u.getCname() + "'");
    }</programlisting>

          <para>This time we receive a true subset of
          <classname>hibintro.v1.model.User</classname> instances:</para>

          <programlisting>Found user 'Eve Briggs'
Found user 'Fred Wings'</programlisting>
        </section>

        <section xml:id="criteriaBasedQueries">
          <title>Criteria based queries</title>

          <para>Selecting objects by <abbrev
          xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html">HQL</abbrev>
          queries technically means parsing <abbrev
          xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html">HQL</abbrev>
          and transforming it into some sort of abstract syntax tree. We may
          instead create corresponding structures directly by using
          <trademark>Hibernate</trademark>'s criteria API.</para>
        </section>
      </section>

      <section xml:id="mappingSingleClasses">
        <title>Mapping single entities and database tables</title>

        <section xml:id="transientProperties">
          <title>Transient properties</title>

          <para>We take a closer look at <xref
          linkend="mappingUserInstances"/> assuming that instances of
          <classname>hibintro.v1.model.User</classname> need an additional
          <emphasis role="bold">GUI related</emphasis> property
          <code>selected</code> <coref linkend="propertyIsSelected"/>:</para>

          <programlisting>package hibintro.v2;


@Entity public class User {
...
  boolean <emphasis role="bold">selected</emphasis> <co
              xml:id="propertyIsSelected"/> = false;
  
  public boolean isSelected() {
   return selected;
  }
  public void setSelected(boolean selected) {
     this.selected = selected;
  }
  ...
}</programlisting>

          <para>Hibernate produces the following <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Data_definition_language">DDL</abbrev>
          statements containing an attribute <code>selected</code> <coref
          linkend="attributeSelected"/>:</para>

          <programlisting>CREATE TABLE User (
  uid VARCHAR(255) NOT NULL PRIMARY KEY,
  cname VARCHAR(255),
  <emphasis role="bold">selected</emphasis> <co xml:id="attributeSelected"/> BIT NOT NULL
) </programlisting>

          <para>If we just annotate a Java class with a
          <classname>javax.persistence.Entity</classname> annotation all
          properties of the class in question will be mapped. The Hibernate
          framework of course cannot distinguish between transient and
          persistent properties on its own. If we want a property to be
          transient we have to add a
          <classname>javax.persistence.Transient</classname> annotation to the
          corresponding getter method:</para>

          <informaltable border="1">
            <colgroup width="6%"/>

            <colgroup width="94%"/>

            <tr>
              <td valign="top"><emphasis role="bold">Java</emphasis></td>

              <td valign="top"><programlisting>package hibintro.v3;
...
@Entity
public class User {
...
  boolean selected = false;
  @Transient <co xml:id="transientAnnotation"/> public boolean isSelected() {
   return selected;
  }
  public void setSelected(boolean selected) {
     this.selected = selected;
  }...</programlisting></td>
            </tr>

            <tr>
              <td valign="top"><emphasis role="bold">SQL</emphasis></td>

              <td><programlisting>CREATE TABLE User ( 
  uid VARCHAR(255) NOT NULL PRIMARY KEY,
  cname VARCHAR(255)
) </programlisting></td>
            </tr>
          </informaltable>

          <para>The <classname>javax.persistence.Transient</classname>
          annotation inhibits the mapping of our property
          <code>selected</code>.</para>

          <caution>
            <para>When loading a <classname>hibintro.v3.User</classname>
            instance from a database the transient property's value is
            entirely determined by the class itself, i.e. by its field
            initializer and default constructor, not by the database.</para>
          </caution>
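
          <para>The effect can be imitated without a database. The following
          plain Java sketch is an assumption-laden illustration rather than
          Hibernate code: The class <code>TransientSketch</code>, its
          in-memory <code>TABLE</code> map and the <code>save()</code> /
          <code>load()</code> helpers are hypothetical stand-ins which only
          copy the persistent state:</para>

          <programlisting language="java"><![CDATA[
```java
import java.util.HashMap;
import java.util.Map;

public class TransientSketch {

  /** Simplified, annotation-free stand-in for hibintro.v3.User. */
  public static class User {
    private String uid;
    private String cname;
    private boolean selected = false; // transient: never stored in TABLE

    public User() {}
    public User(String uid, String cname) { this.uid = uid; this.cname = cname; }

    public String getUid() { return uid; }
    public void setUid(String uid) { this.uid = uid; }
    public String getCname() { return cname; }
    public void setCname(String cname) { this.cname = cname; }
    public boolean isSelected() { return selected; }
    public void setSelected(boolean selected) { this.selected = selected; }
  }

  /** Hypothetical "table" holding the persistent column cname, keyed by uid. */
  static final Map<String, String> TABLE = new HashMap<>();

  /** Stores persistent state only; the selected flag is deliberately ignored. */
  public static void save(final User u) {
    TABLE.put(u.getUid(), u.getCname());
  }

  /** Re-creates an instance: the initializer alone decides the selected flag. */
  public static User load(final String uid) {
    final User u = new User();
    u.setUid(uid);
    u.setCname(TABLE.get(uid));
    return u;
  }

  public static void main(String[] args) {
    final User original = new User("goik", "Martin Goik");
    original.setSelected(true);
    save(original);

    final User reloaded = load("goik");
    System.out.println("original.selected=" + original.isSelected());
    System.out.println("reloaded.selected=" + reloaded.isSelected());
  }
}
```
]]></programlisting>

          <para>The reloaded instance reports <code>selected=false</code>
          although the original had been selected before saving.</para>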
        </section>

        <section xml:id="sect_mappingNullValues">
          <title>Properties and NULL values</title>

          <para>In <xref linkend="fig_createTableV1User"/> the primary key
          <code>uid</code> property's value must not be <code>NULL</code>.
          This is an immediate consequence of the
          <classname>javax.persistence.Id</classname> annotation and the fact
          that databases don't allow <code>NULL</code> values for primary key
          attributes.</para>

          <para>The <code>cname</code> property however may be
          <code>null</code>. Sometimes we want to ensure that the
          corresponding database attribute is always set, at least to an
          empty string value. This can be achieved by adding a
          <classname>javax.persistence.Column</classname><code>(nullable =
          false)</code> annotation:</para>

          <informaltable border="1">
            <colgroup width="6%"/>

            <colgroup width="94%"/>

            <tr>
              <td valign="top"><emphasis role="bold">Java</emphasis></td>

              <td valign="top"><programlisting>package hibintro.v4;

...
@Entity public class User {

  String cname;
  <emphasis role="bold">@Column(nullable = false)</emphasis> public String getCname() {
    return cname;
  }
...</programlisting></td>
            </tr>

            <tr>
              <td valign="top"><emphasis role="bold">SQL</emphasis></td>

              <td><programlisting>CREATE TABLE User (
  uid VARCHAR(255) NOT NULL PRIMARY KEY,
  cname VARCHAR(255) <emphasis role="bold">NOT NULL</emphasis> <co
                    xml:id="cnameDatabaseNotNull"/>
)</programlisting></td>
            </tr>
          </informaltable>

          <para>This results in a corresponding database constraint <coref
          linkend="cnameDatabaseNotNull"/>. Attempting to store instances with
          null values now fails:</para>

          <informaltable border="1">
            <colgroup width="6%"/>

            <colgroup width="94%"/>

            <tr>
              <td valign="top"><emphasis role="bold">Java</emphasis></td>

              <td valign="top"><programlisting>package hibintro.v4;
...
public class PersistSingleUser {

      final Transaction transaction = session.beginTransaction();
      {
         final User u = new User("goik", null);
         session.save(u);
      }
      transaction.commit(); ...</programlisting></td>
            </tr>

            <tr>
              <td valign="top"><emphasis role="bold">Log</emphasis></td>

              <td><programlisting>Hibernate: 
    insert 
    into
        User
        (cname, uid) 
    values
        (?, ?)
...
WARN: SQL Error: 1048, SQLState: 23000
Feb 13, 2013 9:38:32 PM org.hibernate.engine.jdbc.spi.SqlExceptionHelper logExceptions
ERROR: Column 'cname' cannot be null
Exception in thread "main" org.hibernate.exception.ConstraintViolationException: Column 'cname' cannot be null
...
<emphasis role="bold">Caused by: com.mysql.jdbc.exceptions.jdbc4.MySQLIntegrityConstraintViolationException: Column 'cname' cannot be null</emphasis>
...</programlisting></td>
            </tr>
          </informaltable>

          <para>The exception is thrown by the <trademark
          xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc">JDBC</trademark>
          driver as the result of a database constraint violation, not by the
          Hibernate framework itself prior to attempting the insert.</para>
        </section>

        <section xml:id="mappingKeys">
          <title>Defining keys</title>

          <para>Frequently we need more than just a primary key. Starting from
          <classname>hibintro.v4.User</classname> we may want to add a
          property <code>uidNumber</code>. This is a common requirement: On
          UNIX-like operating systems for example each user has both a unique
          login name (like <quote>goik</quote>) and a unique numerical value
          (like <quote>123</quote>). We choose our primary key to be numeric
          <coref linkend="uidNumberIsPrimaryKey"/> and the login name to
          become a second candidate key <coref
          linkend="uidIsUnique"/>:</para>

          <programlisting>package hibintro.v5;
...
@Entity
@Table(uniqueConstraints={@UniqueConstraint(columnNames={"uid"})}) <co
              xml:id="uidIsUnique"/>
public class User {

  int uidNumber;
  @Id <co xml:id="uidNumberIsPrimaryKey"/> public int getUidNumber() {
    return uidNumber;
  }
  public void setUidNumber(int uidNumber) {
    this.uidNumber = uidNumber;
  }

  String uid;
  @Column(nullable=false) public String getUid() {
    return uid;
  }
  public void setUid(String uid) {
    this.uid = uid;
  }
...</programlisting>

          <para>Notice the slight difference: The property <code>uid</code>
          needs an explicit
          <code>@</code><code><classname>javax.persistence.Column</classname>(nullable=false)</code>
          annotation to become a candidate key. This is
          <emphasis>not</emphasis> automatically inferred from the
          <classname>javax.persistence.UniqueConstraint</classname> definition
          <coref linkend="uidIsUnique"/>. In contrast the property
          <code>uidNumber</code> is not referenced by the preceding
          <classname>javax.persistence.Table</classname> annotation but
          annotated by <classname>javax.persistence.Id</classname>. Hence
          <code>nullable=false</code> is implied and need not be
          specified.</para>

          <para>This is in accordance with <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Data_definition_language">DDL</abbrev>:
          Attributes composing a primary key must not allow <code>NULL</code>
          values but attributes only appearing in UNIQUE declarations may
          become <code>NULL</code>.</para>

          <para>The <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Data_definition_language">DDL</abbrev>
          reads:</para>

          <programlisting>CREATE TABLE User (
  uidNumber INT NOT NULL PRIMARY KEY,
  cname VARCHAR(255) NOT NULL,
  uid VARCHAR(255) NOT NULL UNIQUE
)</programlisting>
        </section>

        <section xml:id="sect_ComposedKeys">
          <title>Composed keys</title>

          <para>Composed candidate keys are sometimes referred to as business
          keys. The underlying logic defines which objects are considered to
          be identical based on their values.</para>

          <para>As an example, we consider a company having several
          departments. Regarding projects the following business rules shall
          apply:</para>

          <figure xml:id="projectBusinessRules">
            <title>Business rules for projects</title>

            <orderedlist>
              <listitem>
                <para>Each department must have a unique name.</para>
              </listitem>

              <listitem>
                <para>A project's name must be unique within the set of all
                projects belonging to the same department.</para>
              </listitem>

              <listitem>
                <para>A project must be assigned to exactly one
                department.</para>
              </listitem>
            </orderedlist>
          </figure>

          <para>Right now we defer considerations of the n:1 relationship
          between projects and departments to a later chapter. Instead we
          focus on project instances and represent departments solely by
          their integer id values which will later become foreign keys.</para>

          <para>In addition each project receives a unique integer id value as
          well. This is in accordance with the <quote>best practice</quote>
          rule of defining a <link
          xlink:href="http://en.wikipedia.org/wiki/Surrogate_key">surrogate
          key</link> <coref linkend="projectPrimaryKeyDefinition"/> to be used
          as (primary) object identifier. This immutable key will then become
          the target in foreign key definitions:</para>

          <informaltable border="1">
            <colgroup width="6%"/>

            <colgroup width="94%"/>

            <tr>
              <td valign="top"><emphasis role="bold">Java</emphasis></td>

              <td valign="top"><programlisting>package hibintro.v6;
...
@Entity
@Table(uniqueConstraints={@UniqueConstraint(columnNames={"name", "department"})}) <co
                    xml:id="projectBusinessKey"/>
public class Project {
  int id;
  @Id <co xml:id="projectPrimaryKeyDefinition"/> public int getId() {return id;}
  protected void setId(int id) {this.id = id;}

  String name;
  @Column(nullable=false) public String getName() {return name;}
  public void setName(String name) {this.name = name;}

  int department;
  @Column(nullable=false)
  public int getDepartment() {return department;}
  public void setDepartment(int department) {this.department = department;}
...</programlisting></td>
            </tr>

            <tr>
              <td valign="top"><emphasis role="bold">SQL</emphasis></td>

              <td><programlisting>CREATE TABLE Project (
  id int(11) NOT NULL PRIMARY KEY <coref linkend="projectPrimaryKeyDefinition"/>,
  department int(11) NOT NULL,
  name varchar(255) NOT NULL,
  UNIQUE KEY name (name,department) <coref linkend="projectBusinessKey"/>
)</programlisting></td>
            </tr>
          </informaltable>

          <calloutlist>
            <callout arearefs="projectPrimaryKeyDefinition">
              <para>Defining the surrogate primary key.</para>
            </callout>

            <callout arearefs="projectBusinessKey">
              <para>Defining a business key composed of a project's
              <code>name</code> and <code>department</code> number. This
              implements our second business rule in <xref
              linkend="projectBusinessRules"/>.</para>
            </callout>
          </calloutlist>
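
          <para>On the Java side the same notion of value based identity is
          commonly expressed by <code>equals()</code> and
          <code>hashCode()</code> methods comparing the business key rather
          than the surrogate key. The following is a minimal sketch under
          stated assumptions: The annotation-free <code>Project</code> class
          and its constructor are hypothetical simplifications and are not
          taken from <classname>hibintro.v6.Project</classname>:</para>

          <programlisting language="java"><![CDATA[
```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

public class BusinessKeySketch {

  /** Simplified, annotation-free stand-in for a project entity. */
  public static class Project {
    private final int id;          // surrogate key, ignored by equals()
    private final String name;     // business key, part 1
    private final int department;  // business key, part 2

    public Project(int id, String name, int department) {
      this.id = id;
      this.name = Objects.requireNonNull(name);
      this.department = department;
    }

    public int getId() { return id; }

    /** Two projects are considered identical if name and department match. */
    @Override public boolean equals(Object other) {
      if (this == other) { return true; }
      if (!(other instanceof Project)) { return false; }
      final Project p = (Project) other;
      return department == p.department && name.equals(p.name);
    }

    @Override public int hashCode() {
      return Objects.hash(name, department);
    }
  }

  public static void main(String[] args) {
    final Set<Project> projects = new HashSet<>();
    System.out.println(projects.add(new Project(1, "Website", 10)));
    System.out.println(projects.add(new Project(2, "Website", 10))); // same business key
    System.out.println(projects.add(new Project(3, "Website", 20))); // other department
  }
}
```
]]></programlisting>

          <para>The second <code>add()</code> call is rejected by the set
          since both instances share name and department, mirroring what the
          <code>UNIQUE</code> constraint enforces on database level.</para>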

          <qandaset role="exercise">
            <title><link linkend="gloss_JPA"><abbrev>JPA</abbrev></link>
            requirements.</title>

            <qandadiv>
              <qandaentry>
                <question>
                  <para>The setter <methodname
                  annotations="nojavadoc">setId(int)</methodname> in
                  <classname>hibintro.v6.Project</classname> has protected
                  access. Explain this choice.</para>
                </question>

                <answer>
                  <para>From an application developer's point of view the
                  setter should be absent: The <code>id</code> property is
                  immutable and should not be accessed at all.</para>

                  <para>When loading an instance from a database a persistence
                  provider however has to set its value. Hibernate uses the
                  reflection API to override the restriction imposed by the
                  <code>protected</code> modifier. So why not declare it
                  <code>private</code>? Doing so may cause our IDE to flag a
                  warning about an unused private method.</para>

                  <para>So choosing <code>protected</code> is a compromise: An
                  application developer cannot modify the property (unless
                  deriving a class or writing code in the same package) and
                  our persistence provider can still set its value to the
                  database's primary key attribute value.</para>
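
                  <para>How reflection bypasses an access modifier can be
                  sketched in plain Java. This illustrates the general
                  mechanism and is not Hibernate's actual implementation; the
                  class <code>ProtectedSetterSketch</code> and its helper
                  <code>instantiateWithId()</code> are hypothetical:</para>

                  <programlisting language="java"><![CDATA[
```java
import java.lang.reflect.Method;

public class ProtectedSetterSketch {

  /** Simplified, annotation-free stand-in for hibintro.v6.Project. */
  public static class Project {
    private int id;
    public int getId() { return id; }
    protected void setId(int id) { this.id = id; }
  }

  /** Sets the id by reflection, roughly as a persistence provider might do. */
  public static Project instantiateWithId(final int id) {
    try {
      final Project p = new Project();
      final Method setter = Project.class.getDeclaredMethod("setId", int.class);
      // Lifts the access check; required when the caller resides in another package.
      setter.setAccessible(true);
      setter.invoke(p, id);
      return p;
    } catch (ReflectiveOperationException e) {
      throw new IllegalStateException("Unable to set id reflectively", e);
    }
  }

  public static void main(String[] args) {
    System.out.println("id=" + instantiateWithId(42).getId());
  }
}
```
]]></programlisting>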
                </answer>
              </qandaentry>
            </qandadiv>
          </qandaset>
        </section>

        <section xml:id="nonUniqueIndexes">
          <title>Indexes (non-unique)</title>

          <para>From the viewpoint of software modelling non-unique indexes
          are not part of the business logic but refer to database
          optimization. Consequently <link
          linkend="gloss_JPA"><abbrev>JPA</abbrev></link> 2.0 has no support
          for non-unique indexes; a standard
          <classname>javax.persistence.Index</classname> annotation only
          arrived with version 2.1.</para>

          <para>On the other hand performance matters. Hibernate and other
          persistence providers offer vendor specific <link
          linkend="gloss_JPA"><abbrev>JPA</abbrev></link> extensions. We may
          find it useful to access <classname>hibintro.v5.User</classname>
          instances having a specific <code>cname</code> quickly. This can be
          achieved by adding a Hibernate specific
          <code>org.hibernate.annotations.</code><classname>org.hibernate.annotations.Table</classname>
          index generating annotation <coref
          linkend="hibernateExtensionIndex"/> which works on top of <link
          linkend="gloss_JPA"><abbrev>JPA</abbrev></link>'s
          <code>javax.persistence.</code><classname>javax.persistence.Table</classname>:</para>

          <informaltable border="1">
            <colgroup width="6%"/>

            <colgroup width="94%"/>

            <tr>
              <td valign="top"><emphasis role="bold">Java</emphasis></td>

              <td valign="top"><programlisting>package hibintro.v7;
...
@Entity
@Table(uniqueConstraints={@UniqueConstraint(columnNames={"uid"})})
<emphasis role="bold">@org.hibernate.annotations.Table(</emphasis> <co
                    xml:id="hibernateExtensionIndex"/>
             <emphasis role="bold">appliesTo="User",
             indexes = {@Index(name = "findCname", columnNames = {"cname"})})</emphasis>
public class User {
...
  String cname;
  @Column(nullable = false) public String getCname() { return cname;}
  public void setCname(String cname) {this.cname = cname;}
...</programlisting></td>
            </tr>

            <tr>
              <td valign="top"><emphasis role="bold">SQL</emphasis></td>

              <td><programlisting>CREATE TABLE User (
  uidNumber INT NOT NULL PRIMARY KEY,
  cname VARCHAR(255) NOT NULL,
  uid VARCHAR(255) NOT NULL UNIQUE
);

CREATE INDEX findCname ON User (cname ASC);</programlisting></td>
            </tr>
          </informaltable>
        </section>

        <section xml:id="sect_RenameTablesAndAttributes">
          <title>Renaming tables and attributes</title>

          <para>So far we assumed that we map classes to database tables
          having identical names: A <link
          linkend="gloss_Java"><trademark>Java</trademark></link> class
          <code>User</code> is being mapped to a relational table with
          identical name <code>User</code>. Sometimes a renaming is desired.
          We may for example want to access a legacy database by a newly
          implemented <link
          linkend="gloss_Java"><trademark>Java</trademark></link> application.
          Choosing meaningful names may conflict with decisions being taken
          when the original database design took place.</para>

          <para>In the following example we change the database table's name
          from its default <code>User</code> to <code>Person</code> <coref
          linkend="renameUserToPerson"/>. The properties
          <code>uidNumber</code> and <code>cname</code> are mapped to
          attribute names <code>numericUid</code> <coref
          linkend="renameUidNumberToNumericUid"/> and <code>fullName</code>
          <coref linkend="renameCnameToFullName"/> respectively:</para>

          <informaltable border="1">
            <colgroup width="6%"/>

            <colgroup width="94%"/>

            <tr>
              <td valign="top"><emphasis role="bold">Java</emphasis></td>

              <td valign="top"><programlisting>package hibintro.v8;
...
@Entity
@Table(name="Person") <co xml:id="renameUserToPerson"/>
public class User {

  int uidNumber;
  @Id
  @Column(name="numericUid") <co xml:id="renameUidNumberToNumericUid"/>
  public int getUidNumber() {return uidNumber;}
  public void setUidNumber(int uidNumber) {this.uidNumber = uidNumber;}

  String uid;
  @Column(nullable=false)
  public String getUid() {return uid;}
  public void setUid(String uid) {this.uid = uid;}

  String cname;
  @Column(nullable = false, name="fullName") <co
                    xml:id="renameCnameToFullName"/>
  public String getCname() {return cname;}
  public void setCname(String cname) {this.cname = cname;}
...</programlisting></td>
            </tr>

            <tr>
              <td valign="top"><emphasis role="bold">SQL</emphasis></td>

              <td><programlisting>CREATE TABLE Person <coref
                    linkend="renameUserToPerson"/> (
  numericUid <coref linkend="renameUidNumberToNumericUid"/> int(11) NOT NULL PRIMARY KEY,
  fullName <coref linkend="renameCnameToFullName"/> varchar(255) NOT NULL,
  uid varchar(255) NOT NULL
)</programlisting></td>
            </tr>
          </informaltable>
        </section>

        <section xml:id="sectChangeDefaultTypeMapping">
          <title>Changing the default type mapping</title>

          <para>Sometimes we are interested in changing <link
          linkend="gloss_JPA"><abbrev>JPA</abbrev></link>'s default type
          mapping strategy. For example <trademark
          xlink:href="http://www.mysql.com/about/legal/trademark.html">Mysql</trademark>
          versions prior to 5.0 lack an appropriate type representing boolean
          values. It was therefore quite common to map boolean properties to
          <code>CHAR(1)</code> with possible values being <code>'Y'</code> and
          <code>'N'</code>. Hibernate will map boolean values to
          <code>tinyint(1)</code>. Supporting older software may thus require
          tweaking the standard mapping.</para>

          <para>Unfortunately <link
          linkend="gloss_JPA"><abbrev>JPA</abbrev></link> itself does not
          offer any interface for this purpose. The persistence provider may
          offer a solution though. Hibernate for example allows remapping
          types <coref linkend="remapBooleanChar"/>. We assume our
          <classname>hibintro.v9.User</classname> class to have a
          <code>boolean</code> property <code>active</code>:</para>

          <informaltable border="1">
            <colgroup width="6%"/>

            <colgroup width="94%"/>

            <tr>
              <td valign="top"><emphasis role="bold">Java</emphasis></td>

              <td valign="top"><programlisting>package hibintro.v9;
...
public class User {
...
   public void setCname(String cname) {this.cname = cname;}

   boolean active = false;
   @Type(type="yes_no") <co xml:id="remapBooleanChar"/>
   public boolean isActive() {return active;}
   public void setActive(boolean active) {this.active = active;}
}</programlisting></td>
            </tr>

            <tr>
              <td valign="top"><emphasis role="bold">Sql</emphasis></td>

              <td><programlisting>CREATE TABLE User (
  uidNumber int(11) NOT NULL PRIMARY KEY,
  active char(1) NOT NULL,
  cname varchar(255) DEFAULT NULL,
  uid varchar(255) NOT NULL
)</programlisting></td>
            </tr>
          </informaltable>
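
          <para>The effect of the <code>yes_no</code> mapping can be pictured
          as a pair of conversions between the <code>boolean</code> property
          and the <code>char(1)</code> attribute. The following plain
          <link linkend="gloss_Java"><trademark>Java</trademark></link>
          sketch only illustrates this idea. It is not Hibernate's
          implementation, and the class and method names are hypothetical:</para>

          <programlisting language="java">public class YesNoMapping {

  // Java value to database value: true becomes "Y", false becomes "N".
  public static String toDatabase(boolean active) {
    return active ? "Y" : "N";
  }

  // Database value to Java value. This sketch rejects anything but "Y" and "N".
  public static boolean fromDatabase(String column) {
    if ("Y".equals(column)) {
      return true;
    } else if ("N".equals(column)) {
      return false;
    }
    throw new IllegalArgumentException("Unexpected value: " + column);
  }

  public static void main(String[] args) {
    System.out.println(toDatabase(true));   // prints Y
    System.out.println(fromDatabase("N"));  // prints false
  }
}</programlisting>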

          <para>Readers interested in more sophisticated strategies like
          mapping user defined data types to database types are advised to
          read the <link
          xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch05.html#mapping-types">manual
          section on Hibernate types</link>.</para>
        </section>
      </section>

      <section xml:id="inheritance">
        <title>Inheritance</title>

        <para>Mapping inheritance hierarchies to relational databases means
        bridging the gap between <link
        xlink:href="http://en.wikipedia.org/wiki/Object-relational_impedance_mismatch">object
        oriented and relational models</link>. We start with a slightly
        modified example from <xref linkend="Bauer05"/>:</para>

        <figure xml:id="fig_BillingDetails">
          <title>Modelling payment.</title>

          <mediaobject>
            <imageobject>
              <imagedata fileref="Ref/Fig/billing.fig"/>
            </imageobject>

            <caption>
              <para>Simplified Billing details example derived from <xref
              linkend="Bauer05"/>. Notice
              <classname>inherit.v1.BillingDetails</classname> being an
              abstract parent class of two concrete classes
              <classname>inherit.v1.CreditCard</classname> and
              <classname>inherit.v1.BankAccount</classname>. The attribute
              <code>number</code> applies both to bank account and credit card
              payments.</para>
            </caption>
          </mediaobject>
        </figure>

        <para>Since the relational model lacks inheritance completely we have
        to implement a corresponding database schema ourselves. We
        subsequently explore three main approaches, each having its own
        advantages and disadvantages.</para>

        <section xml:id="sect_InheritTablePerClassHierarchie">
          <title>Single table per class hierarchy</title>

          <para>This approach may be considered the simplest: We just create
          one database table for storing instances of arbitrary classes
          belonging to the inheritance hierarchy in question:</para>

          <figure xml:id="fig_TablePerClassHierarchyData">
            <title>A single relation mapping.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/billingData.fig"/>
              </imageobject>

              <caption>
                <para>Fitting both
                <classname>inherit.v1.CreditCard</classname> and
                <classname>inherit.v1.BankAccount</classname> instances into a
                single relation.</para>
              </caption>
            </mediaobject>
          </figure>

          <para>The relation may be created by the following <abbrev
          xlink:href="http://en.wikipedia.org/wiki/Data_definition_language">DDL</abbrev>:</para>

          <figure xml:id="fig_TablePerClassHierarchyMapping">
            <title>Mapping the inheritance hierarchy.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/billingSql.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>We take a closer look at the mapping and the generated
          relation:</para>

          <informaltable border="1">
            <colgroup width="6%"/>

            <colgroup width="94%"/>

            <tr>
              <td valign="top"><emphasis role="bold">Java</emphasis></td>

              <td valign="top"><programlisting>package inherit.v1;
         ...
@Entity
@Inheritance(strategy=InheritanceType.<emphasis role="bold">SINGLE_TABLE</emphasis>) <co
                    linkends="billingMapSingleTableCallout"
                    xml:id="billingMapSingleTable"/>
@DiscriminatorColumn(name="dataType", discriminatorType=DiscriminatorType.STRING) <co
                    linkends="billingMapSingleTableDiscriminatorCallout"
                    xml:id="billingMapSingleTableDiscriminator"/>
abstract class BillingDetails {
   @Id @GeneratedValue <co linkends="billingMapSingleTableIdGeneratedCallout"
                    xml:id="billingMapSingleTableIdGenerated"/> public Long getId() ...
   @Column(nullable = false, length = 32) public final String getNumber() ...
   @Temporal(TemporalType.TIMESTAMP)
   @Column(nullable = false) public Date getCreated() ...</programlisting><programlisting>package inherit.v1;
  ...
@Entity
@DiscriminatorValue(value = "Credit card" <co
                    xml:id="billingMapSingleTableDiscriminatorCredit"/>)
public class CreditCard extends BillingDetails {
  ... //Nothing JPA related happens here</programlisting><programlisting>package inherit.v1;
  ...
@Entity
@DiscriminatorValue(value = "Bank account" <co
                    xml:id="billingMapSingleTableDiscriminatorBank"/>)
public class BankAccount extends BillingDetails {
  ... //Nothing JPA related happens here</programlisting></td>
            </tr>

            <tr>
              <td valign="top"><emphasis role="bold">Sql</emphasis></td>

              <td><programlisting continuation="continues">CREATE TABLE BillingDetails <co
                    linkends="billingMapSingleTableCallout"
                    xml:id="BillingDetailsGeneratedRelationName"/> (
  dataType varchar(31) NOT NULL,
  id bigint(20) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  number varchar(255) NOT NULL, <co
                    linkends="billingMapSingleTableBaseNotNull"
                    xml:id="billingMapSingleTableCalloutNumberNotNull"/>
  created datetime NOT NULL, <co linkends="billingMapSingleTableBaseNotNull"
                    xml:id="billingMapSingleTableCalloutCreatedNotNull"/>
  cardType int(11) DEFAULT NULL, <co
                    linkends="billingMapSingleTableDerivedNull"
                    xml:id="billingMapSingleTableCardTypeNull"/>
  expiration datetime DEFAULT NULL, <co
                    linkends="billingMapSingleTableDerivedNull"
                    xml:id="billingMapSingleTableExpirationNull"/>
  bankName varchar(255) DEFAULT NULL, <co
                    linkends="billingMapSingleTableDerivedNull"
                    xml:id="billingMapSingleTableBankNameNull"/>
  swiftcode varchar(255) DEFAULT NULL <co
                    linkends="billingMapSingleTableDerivedNull"
                    xml:id="billingMapSingleTableSwiftCodeNull"/>
)</programlisting></td>
            </tr>
          </informaltable>

          <calloutlist>
            <callout arearefs="billingMapSingleTable"
                     xml:id="billingMapSingleTableCallout">
              <para>All classes of the inheritance hierarchy will be mapped to
              a single table. Unless stated otherwise the <link
              linkend="gloss_JPA"><abbrev>JPA</abbrev></link> provider will
              choose the root class' name (<code>BillingDetails</code>) as
              default value for the generated relation's name <coref
              linkend="BillingDetailsGeneratedRelationName"/>.</para>
            </callout>

            <callout arearefs="billingMapSingleTableDiscriminator"
                     xml:id="billingMapSingleTableDiscriminatorCallout">
              <para>The <link linkend="gloss_JPA"><abbrev>JPA</abbrev></link>
              provider needs a column to distinguish the different types of
              database objects. We have chosen the values of the discriminator
              attribute <code>dataType</code> to be simple strings. Due to the
              definitions in <coref
              linkend="billingMapSingleTableDiscriminatorCredit"/> and <coref
              linkend="billingMapSingleTableDiscriminatorBank"/> the type of a
              database object is identified by one of the following two
              values:</para>

              <itemizedlist>
                <listitem>
                  <para><code>Credit card</code>: object will be mapped to
                  <classname>inherit.v1.CreditCard</classname>.</para>
                </listitem>

                <listitem>
                  <para><code>Bank account</code>: object will be mapped to
                  <classname>inherit.v1.BankAccount</classname>.</para>
                </listitem>
              </itemizedlist>

              <para>In a production system the
              <classname>javax.persistence.DiscriminatorType</classname>
              setting will typically favour
              <classname>javax.persistence.DiscriminatorType</classname><code>.INTEGER</code>
              over
              <classname>javax.persistence.DiscriminatorType</classname><code>.STRING</code>
              unless the application in question has to deal with a legacy
              database schema.</para>
            </callout>

            <callout arearefs="billingMapSingleTableIdGenerated"
                     xml:id="billingMapSingleTableIdGeneratedCallout">
              <para>This one is unrelated to inheritance: Our primary key
              values will be auto generated by the database server e.g. by
              <code>SEQUENCE</code> or <code>IDENTITY</code> mechanisms if
              available.</para>
            </callout>

            <callout arearefs="billingMapSingleTableCalloutNumberNotNull billingMapSingleTableCalloutCreatedNotNull"
                     xml:id="billingMapSingleTableBaseNotNull">
              <para>Only the base class' attributes may exclude
              <code>NULL</code> values.</para>
            </callout>

            <callout arearefs="billingMapSingleTableCardTypeNull billingMapSingleTableExpirationNull billingMapSingleTableBankNameNull billingMapSingleTableSwiftCodeNull"
                     xml:id="billingMapSingleTableDerivedNull">
              <para>All derived classes' attributes must allow
              <code>NULL</code> values.</para>
            </callout>
          </calloutlist>
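
          <para>The role of the discriminator attribute may be sketched in
          plain <link linkend="gloss_Java"><trademark>Java</trademark></link>:
          Given a dataset's <code>dataType</code> value, a persistence
          provider has to decide which class to instantiate. The class
          <code>DiscriminatorSketch</code> and its method
          <code>classFor</code> are hypothetical and merely illustrate the
          lookup. They are not part of any <link
          linkend="gloss_JPA"><abbrev>JPA</abbrev></link> interface:</para>

          <programlisting language="java">public class DiscriminatorSketch {

  // Resolve a discriminator value to the name of the mapped class.
  // The two values correspond to the @DiscriminatorValue settings above.
  public static String classFor(String dataType) {
    switch (dataType) {
      case "Credit card":
        return "inherit.v1.CreditCard";
      case "Bank account":
        return "inherit.v1.BankAccount";
      default:
        throw new IllegalArgumentException("Unknown discriminator: " + dataType);
    }
  }

  public static void main(String[] args) {
    System.out.println(classFor("Credit card"));   // prints inherit.v1.CreditCard
    System.out.println(classFor("Bank account"));  // prints inherit.v1.BankAccount
  }
}</programlisting>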

          <para>We may now insert instances of
          <classname>inherit.v1.BankAccount</classname> or
          <classname>inherit.v1.CreditCard</classname>:</para>

          <figure xml:id="insertCreditBank">
            <title>Inserting payment information</title>

            <programlisting>package inherit.v1;
...
public class Persist {
   public static void main(String[] args) throws ParseException {
... final Transaction transaction = session.beginTransaction();
      {
         final CreditCard creditCard = new CreditCard("4412 8334 4512 9416", 1, "05/18/15");
         session.save(creditCard);
         
         final BankAccount bankAccount = new BankAccount("1107 2 31", "Lehman Brothers", "BARCGB22");
         session.save(bankAccount);
      }
      transaction.commit(); ...</programlisting>
          </figure>

          <section xml:id="sect_InheritTablePerClassHierarchieLoad">
            <title>Database object retrieval</title>

            <para>As in <xref linkend="retrieveAllUserByHql"/> objects stored
            by <xref linkend="insertCreditBank"/> may be queried using
            <abbrev
            xlink:href="http://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html">HQL</abbrev>.</para>

            <informaltable border="1">
              <colgroup width="6%"/>

              <colgroup width="94%"/>

              <tr>
                <td valign="top"><emphasis role="bold">Java</emphasis></td>

                <td valign="top"><programlisting>package inherit.v1;
  ...
public class RetrieveCredit {
  public static void main(String[] args) {
   ...
   final Transaction transaction = session.beginTransaction();

    final Query searchCreditPayments = session.createQuery("<emphasis
                      role="bold">from inherit.v1.CreditCard</emphasis>"); <co
                      xml:id="hqlQueryCreditCard"/>
    final List&lt;CreditCard&gt; creditCardList = (List&lt;CreditCard&gt;) searchCreditPayments.list();
    for (final CreditCard c: creditCardList) {
      System.out.println(c);
    } ...</programlisting></td>
              </tr>

              <tr>
                <td valign="top"><emphasis role="bold">Sql</emphasis></td>

                <td><programlisting continuation="continues">INFO: HHH000232: Schema update complete
Hibernate: 
    select
        creditcard0_.id as id0_,
        creditcard0_.created as created0_,
        creditcard0_.number as number0_,
        creditcard0_.cardType as cardType0_,
        creditcard0_.expiration as expiration0_ 
    from
        BillingDetails creditcard0_ 
    where
        creditcard0_.<emphasis role="bold">dataType</emphasis> <co
                      xml:id="hqlQueryCreditCard_dataType"/>='<emphasis
                      role="bold">Credit card</emphasis>'

<emphasis role="bold">CreditCard: number=4412 8334 4512 9416, created 2013-02-19 13:09:22.0,
            cardType=1, expiration=2015-05-18 00:00:00.0</emphasis> <co
                      xml:id="hqlQueryCreditCardResultSet"/></programlisting></td>
              </tr>
            </informaltable>

            <para>Some remarks: Our query asks for instances of
            <classname>inherit.v1.CreditCard</classname> <coref
            linkend="hqlQueryCreditCard"/>. This gets implemented as an
            <acronym
            xlink:href="http://en.wikipedia.org/wiki/Sql">SQL</acronym>
            <code>SELECT</code> choosing datasets whose discriminator
            attribute <code>dataType</code> <coref
            linkend="hqlQueryCreditCard_dataType"/> equals <quote><code>Credit
            card</code></quote>. The current result set contains just one
            element <coref linkend="hqlQueryCreditCardResultSet"/> in
            accordance with <xref linkend="insertCreditBank"/>.</para>

            <para>Retrieving both <classname>inherit.v1.CreditCard</classname>
            and <classname>inherit.v1.BankAccount</classname> instances is
            accomplished by querying for the common base class
            <classname>inherit.v1.BillingDetails</classname>:</para>

            <informaltable border="1">
              <colgroup width="6%"/>

              <colgroup width="94%"/>

              <tr>
                <td valign="top"><emphasis role="bold">Java</emphasis></td>

                <td valign="top"><programlisting>package inherit.v1;
  ...
public class RetrieveAll {
  ...
    final Query searchBilling = session.createQuery("from <emphasis
                      role="bold">inherit.v1.BillingDetails</emphasis>");
    @SuppressWarnings("unchecked")
    final List&lt;BillingDetails&gt; billingDetailsList = (List&lt;BillingDetails&gt;) searchBilling.list();
    for (final BillingDetails c: billingDetailsList) {
      System.out.println(c);
    } ...</programlisting></td>
              </tr>

              <tr>
                <td valign="top"><emphasis role="bold">Sql</emphasis></td>

                <td><programlisting continuation="continues">INFO: HHH000232: Schema update complete
Hibernate: 
    select
        billingdet0_.id as id0_,
        ...
        billingdet0_.dataType as dataType0_ 
    from
        BillingDetails billingdet0_

CreditCard: number=4412 8334 4512 9416, created 2013-02-19 13:09:22.0, <co
                      xml:id="resultSetHeterogeneous"/>
            cardType=1, expiration=2015-05-18 00:00:00.0
BankAccount: number=1107 2 31, created 2013-02-19 13:09:22.0,
             bankName=Lehman Brothers, swiftcode=BARCGB22</programlisting></td>
              </tr>
            </informaltable>

            <para>This is the first example of a polymorphic query yielding a
            heterogeneous result set <coref
            linkend="resultSetHeterogeneous"/>.</para>
          </section>

          <section xml:id="sect_InheritTablePerClassHierarchieNullProblem">
            <title>Null values</title>

            <para>Our current mapping strategy limits our means to specify
            data integrity constraints. It is no longer possible to disallow
            <code>null</code> values for properties belonging to derived
            classes. We might want to disallow <code>null</code> values in the
            <code>bankName</code> property. Hibernate will generate a
            corresponding database attribute <coref
            linkend="require_bankNameNotNullDb"/>:</para>

            <informaltable border="1">
              <colgroup width="6%"/>

              <colgroup width="94%"/>

              <tr>
                <td valign="top"><emphasis role="bold">Java</emphasis></td>

                <td valign="top"><programlisting>package inherit.v2;
...
@Entity @DiscriminatorValue(value = "Bank account")
public class BankAccount extends BillingDetails {
   String bankName;
   @Column(<emphasis role="bold">nullable=false</emphasis>) <co
                      xml:id="require_bankNameNotNull"/>
   public String getBankName() {return bankName;} ...</programlisting></td>
              </tr>

              <tr>
                <td valign="top"><emphasis role="bold">Sql</emphasis></td>

                <td><programlisting continuation="continues">CREATE TABLE BillingDetails (
  id bigint(20) NOT NULL AUTO_INCREMENT PRIMARY KEY,
  bankName varchar(255) <emphasis role="bold">NOT NULL</emphasis>, <co
                      xml:id="require_bankNameNotNullDb"/>
...</programlisting></td>
              </tr>
            </informaltable>

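            <para>Looks good? Unfortunately the attempt to persist our two
            objects <coref linkend="saveBankAccount"/> yields a runtime
            exception <coref linkend="saveBankAccountException"/>. According
            to the log the failing <code>INSERT</code> belongs to the credit
            card: It supplies no value for the <code>bankName</code>
            attribute which is now declared <code>NOT NULL</code>:</para>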

            <informaltable border="1">
              <colgroup width="6%"/>

              <colgroup width="94%"/>

              <tr>
                <td valign="top"><emphasis role="bold">Java</emphasis></td>

                <td valign="top"><programlisting>package inherit.v2;
...
public class Persist {
...
  final CreditCard creditCard = new CreditCard("4412 8334 4512 9416", 1, "05/18/15");
  session.save(creditCard);
         
  final BankAccount bankAccount = new BankAccount("1107 2 31", "Lehman Brothers", "BARCGB22");
  session.save(bankAccount) <co xml:id="saveBankAccount"/>; ...</programlisting></td>
              </tr>

              <tr>
                <td valign="top"><emphasis role="bold">Sql</emphasis></td>

                <td><programlisting continuation="continues">...
Feb 19, 2013 10:28:00 AM org.hibernate.tool.hbm2ddl.SchemaUpdate execute
INFO: HHH000232: Schema update complete
Hibernate: 
    insert 
    into
        BillingDetails
        (created, number, cardType, expiration, dataType) 
    values
        (?, ?, ?, ?, 'Credit card')
Feb 19, 2013 10:28:00 AM org.hibernate.engine.jdbc.spi.SqlExceptionHelper logExceptions
WARN: SQL Error: 1364, SQLState: HY000
Feb 19, 2013 10:28:00 AM org.hibernate.engine.jdbc.spi.SqlExceptionHelper logExceptions
<emphasis role="bold">ERROR: Field 'bankName' doesn't have a default value 
Exception in thread "main" org.hibernate.exception.GenericJDBCException: 
   Field 'bankName' doesn't have a default value</emphasis> <co
                      xml:id="saveBankAccountException"/>
...
  at inherit.v2.Persist.main(Persist.java:28)
Caused by: java.sql.SQLException: Field 'bankName' doesn't have a default value</programlisting></td>
              </tr>
            </informaltable>

            <para>Conclusion: A table per class hierarchy mapping does not
            allow specifying <code>NOT NULL</code> constraints for properties
            of derived classes.</para>
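
            <para>Since the database cannot enforce such a constraint in this
            mapping, one possible workaround is checking the value on
            application level. The following plain <link
            linkend="gloss_Java"><trademark>Java</trademark></link> sketch
            rejects a missing bank name in the constructor. The class
            <code>CheckedBankAccount</code> is hypothetical,
            <link linkend="gloss_JPA"><abbrev>JPA</abbrev></link> annotations
            and the inheritance relationship are omitted:</para>

            <programlisting language="java">import java.util.Objects;

public class CheckedBankAccount {

  private final String number;
  private final String bankName;
  private final String swiftcode;

  public CheckedBankAccount(String number, String bankName, String swiftcode) {
    this.number = Objects.requireNonNull(number, "number");
    // The database column must stay nullable, so we check here instead.
    this.bankName = Objects.requireNonNull(bankName, "bankName");
    this.swiftcode = swiftcode;
  }

  public String getNumber() {return number;}
  public String getBankName() {return bankName;}
  public String getSwiftcode() {return swiftcode;}

  public static void main(String[] args) {
    try {
      new CheckedBankAccount("1107 2 31", null, "BARCGB22");
    } catch (NullPointerException e) {
      System.out.println("Rejected: " + e.getMessage());  // prints Rejected: bankName
    }
  }
}</programlisting>

            <para>Such a check protects only objects created by this
            application. Other database clients may still store
            <code>null</code> values.</para>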

            <qandaset role="exercise">
              <title>Mapping figures</title>

              <qandadiv>
                <qandaentry>
                  <question>
                    <para>Map the following model to a database:</para>

                    <figure xml:id="modelFigureInheritance">
                      <title>Figure subclasses</title>

                      <mediaobject>
                        <imageobject>
                          <imagedata fileref="Ref/Fig/figureInherit.fig"/>
                        </imageobject>
                      </mediaobject>
                    </figure>

                    <para>The two properties <code>xCenter</code> and
                    <code>yCenter</code> in the abstract base class
                    <code>Figure</code> represent the coordinates of the
                    concrete figure's center of gravity. In a drawing
                    application this would be considered the placement of the
                    respective object.</para>

                    <para>The abstract method <code>getArea()</code> is meant
                    to be implemented without interfering with your database
                    mapping. Choose an integer discriminator. Test your
                    application by storing and loading objects.</para>
                  </question>

                  <answer>
                    <para>The main difference compared to the current
                    <classname>inherit.v1.BillingDetails</classname> example
                    is the <classname>javax.persistence.Transient</classname>
                    annotation of the <code>area</code> property in
                    <classname>inherit.v3.Figure</classname>,
                    <classname>inherit.v3.Circle</classname> and
                    <classname>inherit.v3.Rectangle</classname>. The storage
                    and retrieval applications
                    <classname>inherit.v3.Persist</classname>,
                    <classname>inherit.v3.RetrieveRectangles</classname> and
                    <classname>inherit.v3.RetrieveAll</classname> are
                    straightforward.</para>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>
          </section>
        </section>

        <section xml:id="joinedSubclass">
          <title>Joined subclasses</title>

          <para>The basic idea is to generate a normalized schema implementing
          inheritance relationships by foreign keys:</para>

          <figure xml:id="joindSubclassMapping">
            <title>Joined subclass mapping.</title>

            <mediaobject>
              <imageobject>
                <imagedata fileref="Ref/Fig/billingMapJoined.fig"/>
              </imageobject>
            </mediaobject>
          </figure>

          <para>The inheritance strategy of joined subclasses <coref
          linkend="strategyJoinedSubclass"/> is defined in the abstract base
          class
          <classname>inherit.joined.v1.BillingDetails</classname>:</para>

          <programlisting>package inherit.joined.v1;
...
@Entity
@Inheritance(strategy=InheritanceType.JOINED) <co
              xml:id="strategyJoinedSubclass"/>
public abstract class BillingDetails { ... }</programlisting>

          <para>The derived classes need to provide an implementation hint in
          order to identify the required foreign key <coref
          linkend="referenceParenntClass"/> to the parent class
          <classname>inherit.joined.v1.BillingDetails</classname>:</para>

          <programlisting>package inherit.joined.v1;
...
@Entity
@PrimaryKeyJoinColumn(name="parent" <co xml:id="referenceParenntClass"/>, referencedColumnName="id")
public class CreditCard extends BillingDetails {

   int cardType;
   @Column(nullable=false) <co xml:id="tpcNotNullCardType"/>
   public int getCardType() {return cardType;}
   public void setCardType(int cardType) {this.cardType = cardType;}
   
   Date expiration;
   @Column(nullable=false) <co xml:id="tpcNotNullexpiration"/>
   public Date getExpiration() {return expiration;}
   public void setExpiration(Date expiration) {this.expiration = expiration;}
...
}</programlisting>

          <para>Notice the ability to exclude null values in <coref
          linkend="tpcNotNullCardType"/> and <coref
          linkend="tpcNotNullexpiration"/>.</para>

          <section xml:id="joinedSubclassRetrieve">
            <title>Retrieving Objects</title>

            <para>On the database server side object retrieval becomes a more
            expensive operation: A query for instances of the root class
            <classname>inherit.joined.v1.BillingDetails</classname> <coref
            linkend="joinedQueryBillingDetails"/> of our inheritance hierarchy
            results in joining all three tables <code>BillingDetails</code>
            <coref linkend="joinFromBillingDetails"/>,
            <code>BankAccount</code> <coref linkend="joinFromBankAccount"/>
            and <code>CreditCard</code> <coref
            linkend="joinFromCreditCard"/>:</para>

            <informaltable border="1">
              <colgroup width="6%"/>

              <colgroup width="94%"/>

              <tr>
                <td valign="top"><emphasis role="bold">Java</emphasis></td>

                <td valign="top"><programlisting>package inherit.joined.v1;
...
public class RetrieveAll {
...
    final Query searchBilling = session.createQuery("<emphasis role="bold">from inherit.joined.v1.BillingDetails</emphasis>" <co
                      xml:id="joinedQueryBillingDetails"/>);
...</programlisting></td>
              </tr>

              <tr>
                <td valign="top"><emphasis role="bold">Sql</emphasis></td>

                <td><programlisting continuation="continues">Hibernate: 
    select
        billingdet0_.id as id0_,
        billingdet0_.created as created0_,
        billingdet0_.number as number0_,
        billingdet0_1_.bankName as bankName1_,
        billingdet0_1_.swiftcode as swiftcode1_,
        billingdet0_2_.cardType as cardType2_,
        billingdet0_2_.expiration as expiration2_,
        case 
            when billingdet0_1_.id is not null then 1 
            when billingdet0_2_.id is not null then 2 
            when billingdet0_.id is not null then 0 
        end as clazz_ 
    from
        <emphasis role="bold">BillingDetails</emphasis> billingdet0_ <co
                      xml:id="joinFromBillingDetails"/>
    left outer join
        <emphasis role="bold">BankAccount</emphasis> billingdet0_1_ <co
                      xml:id="joinFromBankAccount"/>
            on billingdet0_.id=billingdet0_1_.id
    left outer join
        <emphasis role="bold">CreditCard</emphasis> billingdet0_2_ <co
                      xml:id="joinFromCreditCard"/>
            on billingdet0_.id=billingdet0_2_.id </programlisting></td>
              </tr>
            </informaltable>

            <qandaset role="exercise">
              <title><link linkend="gloss_JPA"><abbrev>JPA</abbrev></link>
              constraints and database integrity.</title>

              <qandadiv>
                <qandaentry>
                  <question>
                    <para>Explain all integrity constraints of the Hibernate
                    generated schema. Does it implement the correct
                    constraints on database level corresponding to the
                    inheritance related <link
                    linkend="gloss_Java"><trademark>Java</trademark></link>
                    objects? Conversely: Are there possible database states
                    which do not correspond to the domain model's object
                    constraints?</para>
                  </question>

                  <answer>
                    <para>We take a look at the database schema:</para>

                    <programlisting>CREATE TABLE BillingDetails (
  id bigint(20) NOT NULL AUTO_INCREMENT PRIMARY KEY <co
                        linkends="inheritJoinSqlJava-1"
                        xml:id="inheritJoinSqlJava-1-co"/>,
  created datetime NOT NULL,
  number varchar(32) NOT NULL
);
CREATE TABLE CreditCard (
  id bigint(20) NOT NULL  PRIMARY KEY <co linkends="inheritJoinSqlJava-2"
                        xml:id="inheritJoinSqlJava-2-co"/> REFERENCES <co
                        linkends="inheritJoinSqlJava-3"
                        xml:id="inheritJoinSqlJava-3-co"/> BillingDetails,
  cardType int(11) NOT NULL,
  expiration datetime NOT NULL
);
CREATE TABLE BankAccount (
  id bigint(20) NOT NULL PRIMARY KEY <co linkends="inheritJoinSqlJava-4"
                        xml:id="inheritJoinSqlJava-4-co"/> REFERENCES <co
                        linkends="inheritJoinSqlJava-4"
                        xml:id="inheritJoinSqlJava-5-co"/> BillingDetails,
  bankName varchar(255) NOT NULL,
  swiftcode varchar(255) NOT NULL
);</programlisting>

                    <calloutlist>
                      <callout arearefs="inheritJoinSqlJava-1-co"
                               xml:id="inheritJoinSqlJava-1">
                        <para>The table implementing the root class
                        <classname>inherit.joined.v1.BillingDetails</classname>
                        of the inheritance hierarchy will be referenced both
                        by <code>CreditCard</code> and
                        <code>BankAccount</code> datasets and thus requires a
                        key to become addressable. Moreover the corresponding
                        <classname>inherit.joined.v1.BillingDetails</classname>
                        class requires this attribute to be the primary key
                        anyway.</para>
                      </callout>

                      <callout arearefs="inheritJoinSqlJava-2-co"
                               xml:id="inheritJoinSqlJava-2">
                        <para>Each <code>CreditCard</code> specific set of
                        attributes belongs to exactly one
                        <code>BillingDetails</code> instance and hence the id
                        within our table <code>CreditCard</code> must be
                        unique.</para>
                      </callout>

                      <callout arearefs="inheritJoinSqlJava-3-co"
                               xml:id="inheritJoinSqlJava-3">
                        <para>As stated in <coref
                        linkend="inheritJoinSqlJava-2-co"/> each
                        <code>CreditCard</code> dataset must refer to its
                        parent <code>BillingDetails</code> instance.</para>
                      </callout>

                      <callout arearefs="inheritJoinSqlJava-4-co inheritJoinSqlJava-5-co"
                               xml:id="inheritJoinSqlJava-4">
                        <para>These constraints likewise describe <coref
                        linkend="inheritJoinSqlJava-2-co"/> and <coref
                        linkend="inheritJoinSqlJava-3-co"/> for
                        <code>BankAccount</code> datasets.</para>
                      </callout>
                    </calloutlist>

                    <para>The <code>NOT NULL</code> constraints reflect the
                    non-nullable properties of the corresponding <link
                    linkend="gloss_Java"><trademark>Java</trademark></link>
                    classes.</para>

                    <para>The mapping does not cover one important integrity
                    constraint of our domain model: the base class
                    <classname>inherit.joined.v1.BillingDetails</classname> is
                    abstract. Thus each entry in the
                    <code>BillingDetails</code> table must belong either to a
                    <classname>inherit.joined.v1.CreditCard</classname> or to a
                    <classname>inherit.joined.v1.BankAccount</classname>
                    instance. The above database schema however permits
                    datasets in the <code>BillingDetails</code> table that
                    are referenced by neither a <code>BankAccount</code> nor
                    a <code>CreditCard</code> dataset.</para>

                    <para>So the current database schema actually corresponds
                    to a domain model having a <emphasis
                    role="bold">concrete</emphasis> base class
                    <code>BillingDetails</code>.</para>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>

            <qandaset role="exercise">
              <title>Implementing figures by joined subclasses</title>

              <qandadiv>
                <qandaentry>
                  <question>
                    <para>Implement the model given in <xref
                    linkend="modelFigureInheritance"/> by joined
                    subclasses.</para>
                  </question>

                  <answer>
                    <para>See
                    <classname>inherit.joined.v2.Figure</classname>.</para>
                  </answer>
                </qandaentry>
              </qandadiv>
            </qandaset>
          </section>
        </section>

        <section xml:id="inheritTablePerConcrete">
          <title>Table per concrete class</title>

          <para>Not covered here.</para>
        </section>
      </section>

      <section xml:id="mappingRelatedEntities">
        <title>Mapping related entities</title>

        <section xml:id="primaryKeyRevisit">
          <title>Primary keys revisited</title>

          <para>Following <xref linkend="Bauer05"/> (p. 88) we list important
          <quote>best practice</quote> properties of primary keys that go
          beyond their purely relational requirements:</para>

          <itemizedlist>
            <listitem>
              <para>A primary key's values never change.</para>
            </listitem>

            <listitem>
              <para>Primary key values should not have a business
              meaning.</para>
            </listitem>

            <listitem>
              <para>Primary keys should be chosen to have proper indexing
              support with respect to the database product in question.</para>
            </listitem>
          </itemizedlist>

          <para>With respect to persistence we distinguish three different
          concepts of an object's identity:</para>

          <glosslist>
            <glossentry>
              <glossterm>Java Object identity</glossterm>

              <glossdef>
                <para>The operator <code>==</code> checks whether two
                references point to the same object in memory.</para>
              </glossdef>
            </glossentry>

            <glossentry>
              <glossterm>Java Object equality</glossterm>

              <glossdef>
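                <para>The method
                <methodname>Object.equals(Object)</methodname> checks whether
                two possibly distinct objects are to be considered equal.
                Classes may override it to define equality in terms of their
                attribute values.</para>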
              </glossdef>
            </glossentry>

            <glossentry>
              <glossterm>Database identity</glossterm>

              <glossdef>
                <para>Two datasets (tuples) are identical if all primary key
                attributes have the same value.</para>

                <para>In other words: Two distinct database objects differ at
                least in one primary key attribute.</para>
              </glossdef>
            </glossentry>
          </glosslist>
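
          <para>The three notions above can be contrasted in a minimal plain
          <link linkend="gloss_Java"><trademark>Java</trademark></link>
          sketch. The class <code>Account</code>, its attributes and the
          helper <code>sameDatabaseIdentity</code> are hypothetical names
          chosen for illustration only. The attribute <code>id</code> merely
          stands in for a primary key value; no database is involved:</para>

          <programlisting language="java">import java.util.Objects;

// Hypothetical class for illustration; not part of the lecture's code base.
class Account {
  final long id;        // stands in for a primary key value
  final String owner;   // ordinary attribute

  Account(long id, String owner) {
    this.id = id;
    this.owner = owner;
  }

  // Java object equality, here defined by comparing all attribute values.
  @Override public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof Account)) return false;
    Account other = (Account) o;
    if (id != other.id) return false;
    return Objects.equals(owner, other.owner);
  }

  @Override public int hashCode() {
    return Objects.hash(id, owner);
  }

  // Database identity: only the primary key value matters.
  static boolean sameDatabaseIdentity(Account a, Account b) {
    return a.id == b.id;
  }
}

class IdentityDemo {
  public static void main(String[] args) {
    Account a = new Account(1, "Smith");
    Account b = new Account(1, "Smith");       // second object, same values
    Account c = new Account(1, "Smith-Jones"); // same key, changed attribute

    System.out.println(a == b);                             // false
    System.out.println(a.equals(b));                        // true
    System.out.println(a.equals(c));                        // false
    System.out.println(Account.sameDatabaseIdentity(a, c)); // true
  }
}</programlisting>

          <para>In this sketch <code>a</code> and <code>c</code> would
          represent the same dataset although they are neither identical nor
          equal as Java objects.</para>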

          <section xml:id="objectEqualityByPrimaryKey">
            <title>Defining object equality by primary key</title>

            <para>Since JPA entities require a</para>
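
            <para>One possible way of defining equality by primary key is
            sketched below in plain <link
            linkend="gloss_Java"><trademark>Java</trademark></link>. The
            class <code>Customer</code> and its attributes are hypothetical.
            The sketch assumes that the key is assigned at construction time
            and never changes; an entity receiving its key only when being
            persisted would need additional care with respect to
            <methodname>hashCode()</methodname>:</para>

            <programlisting language="java">import java.util.Objects;

// Hypothetical entity. In a JPA setting the class would carry the
// @Entity annotation and the id attribute the @Id annotation.
class Customer {
  private final Long id;  // primary key, null for objects not yet persisted
  private String name;    // ordinary attribute, may change over time

  Customer(Long id, String name) {
    this.id = id;
    this.name = name;
  }

  void setName(String name) {
    this.name = name;
  }

  // Two objects are equal if they carry the same primary key value.
  @Override public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof Customer)) return false;
    Customer other = (Customer) o;
    // An object without a key is only equal to itself.
    if (id == null) return false;
    return id.equals(other.id);
  }

  // Must be consistent with equals(): derived from the key only.
  @Override public int hashCode() {
    return Objects.hashCode(id);
  }
}

class EqualityDemo {
  public static void main(String[] args) {
    Customer first = new Customer(42L, "Smith");
    Customer second = new Customer(42L, "Smith");
    second.setName("Smith-Jones"); // same dataset, attribute changed

    System.out.println(first.equals(second));                  // true
    System.out.println(first.hashCode() == second.hashCode()); // true
  }
}</programlisting>

            <para>Attributes other than the key do not take part in the
            comparison, so two in-memory copies of the same dataset remain
            equal even if one of them has been modified.</para>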
          </section>
        </section>

        <section xml:id="entityValueTypes">
          <title>Entity and value types</title>

          <para>From the viewpoint of <link linkend="gloss_ORM">ORM</link> we
          distinguish two types of database objects:</para>

          <glosslist>
            <glossentry>
              <glossterm>Entity type</glossterm>

              <glossdef>
                <para>Objects of this type do have their own database identity
                and may exist independently of other (database)
                entities.</para>
              </glossdef>
            </glossentry>

            <glossentry>
              <glossterm>Value type</glossterm>

              <glossdef>
                <para>An object of value type has no database identity. It
                appears in a database as a component of its parent entity,
                and its lifecycle depends completely on that parent.</para>
              </glossdef>
            </glossentry>
          </glosslist>
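
          <para>The distinction can be illustrated by a plain <link
          linkend="gloss_Java"><trademark>Java</trademark></link> sketch. The
          classes <code>Person</code> and <code>Address</code> are
          hypothetical, and the persistence annotations are only mentioned in
          comments so that the example runs without a JPA provider:</para>

          <programlisting language="java">import java.util.Objects;

// Value type: no key of its own. With JPA such a class would typically
// be annotated @Embeddable.
class Address {
  final String street;
  final String city;

  Address(String street, String city) {
    this.street = street;
    this.city = city;
  }

  // Value types are compared by their attribute values.
  @Override public boolean equals(Object o) {
    if (this == o) return true;
    if (!(o instanceof Address)) return false;
    Address other = (Address) o;
    if (!Objects.equals(street, other.street)) return false;
    return Objects.equals(city, other.city);
  }

  @Override public int hashCode() {
    return Objects.hash(street, city);
  }
}

// Entity type: owns a primary key. With JPA the class would typically be
// annotated @Entity, the id attribute @Id and the address attribute @Embedded.
class Person {
  final long id;    // database identity
  final String name;
  Address address;  // component of this entity, no independent existence

  Person(long id, String name, Address address) {
    this.id = id;
    this.name = name;
    this.address = address;
  }
}

class ValueTypeDemo {
  public static void main(String[] args) {
    Person alice = new Person(1, "Alice", new Address("Main St 1", "Stuttgart"));
    Person bob = new Person(2, "Bob", new Address("Main St 1", "Stuttgart"));

    // Two entities with different keys may hold equal values.
    System.out.println(alice.id == bob.id);                 // false
    System.out.println(alice.address.equals(bob.address));  // true
  }
}</programlisting>

          <para>In this sketch an <code>Address</code> can only be reached
          through its owning <code>Person</code>, which mirrors the
          dependent lifecycle of a value type.</para>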
        </section>

        <section xml:id="sect_MappingEmbeddedClass">
          <title>Mapping a single embedded class</title>

          <para/>
        </section>
      </section>

      <section xml:id="sect_hibernateValidation">
        <title>Hibernate validation</title>

        <para/>
      </section>
    </chapter>
  </part>

  <xi:include href="bibliography.xml" xpointer="element(/1)"/>
</book>