<?xml version="1.0" encoding="UTF-8"?>
<book version="5.1" xmlns="http://docbook.org/ns/docbook"
      xmlns:xlink="http://www.w3.org/1999/xlink"
      xmlns:xi="http://www.w3.org/2001/XInclude"
      xmlns:svg="http://www.w3.org/2000/svg"
      xmlns:m="http://www.w3.org/1998/Math/MathML"
      xmlns:html="http://www.w3.org/1999/xhtml"
      xmlns:db="http://docbook.org/ns/docbook">
  <info>
    <title>Lecture notes of Martin Goik</title>

    <author>
      <personname><firstname>Martin</firstname>
      <surname>Goik</surname></personname>

      <affiliation>
        <orgname>https://www.hdm-stuttgart.de/mi</orgname>
      </affiliation>
    </author>

    <legalnotice>
      <para>Source code available at <uri
      xlink:href="https://gitlab.mi.hdm-stuttgart.de/goik/GoikLectures.git">https://gitlab.mi.hdm-stuttgart.de/goik/GoikLectures.git</uri></para>
    </legalnotice>
  </info>

  <part xml:id="sda1">
    <title>Structured Data and Applications 1</title>

    <chapter xml:id="basicXmlDataModels">
      <title>Basic XML Schema</title>

      <para>We would like to support a company's organization. We start by
      modeling employees. Each employee shall have the following
      attributes:</para>

      <figure xml:id="fig_ModelEmployee">
        <title>Modelling employees</title>

        <itemizedlist>
          <listitem>
            <para>Given name / surname, mandatory</para>
          </listitem>

          <listitem>
            <para>birth date</para>
          </listitem>

          <listitem>
            <para>sex (male/female), mandatory</para>
          </listitem>

          <listitem>
            <para>email</para>
          </listitem>

          <listitem>
            <para>phone</para>
          </listitem>
        </itemizedlist>

        <para>In addition we want each dataset to receive a unique identity
        attribute <code>id</code> <coref linkend="co_uniqueId"/> in order to
        distinguish between employees having identical names by coincidence.
        Regarding relational databases we may start with the following
        implementation:</para>
      </figure>

      <figure xml:id="fig_SqlDdlEmployee">
        <title>Relational database implementation of employees</title>

        <programlisting language="sql">CREATE TABLE EMPLOYEE
(
	ID BIGINT <co xml:id="co_uniqueId"/> NOT NULL PRIMARY KEY,
	GIVENNAME CHAR(20) NOT NULL,
	SURNAME CHAR(20) NOT NULL,
	BIRTHDATE DATE,
	SEX CHAR NOT NULL,
	EMAIL CHAR(20),
	PHONE CHAR(20)
);</programlisting>
      </figure>

      <qandaset defaultlabel="qanda"
                xml:id="sda1QandaImplicitSchemaConstraints">
        <title>Implicit model parameters</title>

        <qandadiv>
          <qandaentry xml:id="quandaentry_companyAddionIntegrity">
            <question>
              <para><xref linkend="fig_SqlDdlEmployee"/> shows an
              implementation of <xref linkend="fig_ModelEmployee"/>. However,
              it contains additional integrity constraints of two different
              categories that are not stated in <xref
              linkend="fig_ModelEmployee"/>. Which ones?</para>
            </question>

            <answer>
              <itemizedlist>
                <listitem>
                  <para>Each attribute is typed. For example <code
                  language="sql">BIRTHDATE DATE</code> declares an attribute
                  <code language="sql">BIRTHDATE</code> of SQL type
                  <code>DATE</code>.</para>
                </listitem>

                <listitem>
                  <para>An attribute may or may not allow for <code
                  language="sql">NULL</code> values.</para>
                </listitem>
              </itemizedlist>
            </answer>
          </qandaentry>
        </qandadiv>
      </qandaset>

      <para>We may as well represent our current physical <link
      linkend="gloss_SQL">SQL</link> model graphically. The annotation
      <quote>(<abbrev>NN</abbrev>)</quote> denotes <quote><code>not
      nullable</code></quote>:</para>

      <figure xml:id="fig_GraphModelEmployee">
        <title>Graphical representation of employees' model.</title>

        <mediaobject>
          <imageobject>
            <imagedata fileref="Ref/Fig/employee.png"/>
          </imageobject>
        </mediaobject>
      </figure>

      <para>We now turn to the <link linkend="gloss_XML">XML</link>
      representation of data. Completely disregarding integrity constraints,
      we may use a so-called <emphasis>well-formed</emphasis> XML file
      representation:</para>

      <figure xml:id="fig_EmployeeWellformed">
        <title>Representing an employee's data</title>

        <programlisting language="xml">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;employee&gt;
    &lt;id&gt;21&lt;/id&gt;
    &lt;givenName&gt;Bob&lt;/givenName&gt;
    &lt;surname&gt;Hope&lt;/surname&gt;
    &lt;birthday&gt;1982-07-22&lt;/birthday&gt;
    &lt;sex&gt;m&lt;/sex&gt;
    &lt;email&gt;hope@exploitation.com&lt;/email&gt;
    &lt;phone&gt;1123-33244&lt;/phone&gt;
&lt;/employee&gt;</programlisting>
      </figure>

      <para><link linkend="gloss_XML">XML</link> allows for multiple employees
      as well:</para>

      <programlisting language="xml">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;employeeList&gt;
    &lt;employee&gt;
        &lt;id&gt;21&lt;/id&gt;
        &lt;givenName&gt;Bob&lt;/givenName&gt;
        &lt;surname&gt;Hope&lt;/surname&gt;
        &lt;birthday&gt;1982-07-22&lt;/birthday&gt;
        &lt;sex&gt;m&lt;/sex&gt;
        &lt;email&gt;hope@exploitation.com&lt;/email&gt;
        &lt;phone&gt;1123-33244&lt;/phone&gt;
    &lt;/employee&gt;
    &lt;employee&gt;
        &lt;id&gt;22&lt;/id&gt;
        &lt;givenName&gt;Gill&lt;/givenName&gt;
        &lt;surname&gt;Evans&lt;/surname&gt;
        &lt;birthday&gt;1973-01-15&lt;/birthday&gt;
        &lt;sex&gt;f&lt;/sex&gt;
        &lt;email&gt;gill@flashmaster.com&lt;/email&gt;
        &lt;phone&gt;9771-43421&lt;/phone&gt;
    &lt;/employee&gt;
&lt;/employeeList&gt;</programlisting>

      <qandaset defaultlabel="qanda" xml:id="sda1QandaWellFormed">
        <title>Well-formed XML data questions</title>

        <qandadiv>
          <qandaentry xml:id="quandaentry_prologueRequired">
            <question>
              <para>Is the prologue <emphasis role="bold"><code>&lt;?xml
              version="1.0" encoding="UTF-8"?&gt;</code></emphasis> mandatory
              for a file to be a well-formed XML document? Find the
              corresponding passage in the <link
              xlink:href="https://www.w3.org/TR/2008/REC-xml-20081126">XML 1.0
              standard reference document</link> and explain your
              answer.</para>
            </question>

            <answer>
              <para>The <link
              xlink:href="https://www.w3.org/TR/2008/REC-xml-20081126/#sec-prolog-dtd">Prolog
              and Document Type Declaration</link> section defines the
              <emphasis role="bold"><code>&lt;?xml version="1.0" ...
              ?&gt;</code></emphasis> header as being optional (<emphasis
              role="bold">should</emphasis>). In addition the <link
              linkend="gloss_EBNF">EBNF</link> grammar
              contains:<programlisting language="ebnf">[22]  prolog   ::=   XMLDecl<emphasis
                    role="bold">?</emphasis> <co xml:id="co_XmlDeclOptional"/>...</programlisting>The
              <quote>?</quote> <coref linkend="co_XmlDeclOptional"/> tells us
              that the header in question may appear once or may be
              omitted.</para>
            </answer>
          </qandaentry>

          <qandaentry xml:id="quandaentry_prologueVersionExplain">
            <question>
              <para>Explain the meaning of the <emphasis role="bold"><code
              language="xml">version="1.0"</code></emphasis> prologue
              attribute.</para>
            </question>

            <answer>
              <para>The version attribute allows for an evolving XML standard.
              This way XML parsers are able to parse XML documents adhering to
              different XML standard versions.</para>

              <para>For example, a parser encountering a document adhering to
              the (unsuccessful) <link
              xlink:href="https://www.w3.org/TR/xml11/#sec-xml11">XML
              1.1</link> standard learns from the prologue which rule set to
              apply.</para>
            </answer>
          </qandaentry>

          <qandaentry xml:id="quandaentry_prologueEncodingExplain">
            <question>
              <para>Explain the meaning of the <emphasis role="bold"><code
              language="xml">encoding="UTF-8"</code></emphasis> prologue
              attribute. What about values other than
              <code>UTF-8</code>?</para>
            </question>

            <answer>
              <para>The <code>encoding</code> attribute declares the character
              encoding used to store the document. XML allows for choosing
              other encodings like <code
              language="xml">encoding="US-ASCII"</code> or <code
              language="xml">encoding="ISO-8859-1"</code> for western European
              countries. So the XML standard supports
              <trademark>Unicode</trademark> but may use different encodings
              as well.</para>
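
              <para>As a minimal sketch, assume a hypothetical file that is
              actually saved using ISO-8859-1 rather than UTF-8. The
              declaration then names this encoding so that a parser can decode
              the umlauts correctly. The names are invented sample
              data:</para>

              <programlisting language="xml">&lt;?xml version="1.0" encoding="ISO-8859-1"?&gt;
&lt;!-- hypothetical sample data; the file itself must be stored as ISO-8859-1 --&gt;
&lt;employee&gt;
    &lt;givenName&gt;Jürgen&lt;/givenName&gt;
    &lt;surname&gt;Müller&lt;/surname&gt;
&lt;/employee&gt;</programlisting>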
            </answer>
          </qandaentry>
        </qandadiv>
      </qandaset>

      <qandaset defaultlabel="qanda" xml:id="sda1QandaMediaStore">
        <title>Building up a media store software</title>

        <qandadiv>
          <qandaentry xml:id="quandaentry_mediaformatSQL">
            <question xml:id="quandaentry_mediaFormats">
              <para>We do a step by step development of a media store in
              parallel to the company example. This is an ongoing set of
              exercises.</para>

              <para>We start representing media formats like
              <abbrev>Mp3</abbrev>, <abbrev>Flac</abbrev>,
              <abbrev>Avi</abbrev> and so on. True media objects like album
              tracks will later reference a given media format. So our format
              descriptions are independent from the actual media
              content:</para>

              <informalfigure>
                <mediaobject>
                  <imageobject>
                    <imagedata fileref="Ref/Fig/mediaFormat.png"/>
                  </imageobject>
                </mediaobject>
              </informalfigure>

              <para>Implement the given entity as a relational table and
              create a well-formed sample XML dataset. Explain the rationale
              behind the seemingly redundant definition of a second <code
              language="sql">UNIQUE</code> attribute <code>CODE</code> on top
              of the primary key attribute <code>ID</code>.</para>
            </question>

            <answer>
              <para>The following table may represent different formats and
              codecs:</para>

              <programlisting language="sql">CREATE TABLE MEDIAFORMAT (
	ID BIGINT NOT NULL AUTO_INCREMENT PRIMARY KEY,
	NAME CHAR(10) NOT NULL UNIQUE COMMENT 'mp3,flac,avi,...',
	CODE INT NOT NULL UNIQUE COMMENT 'unique number representing a mediaformat'
);</programlisting>

              <para>Having a second attribute <code>CODE</code> allows us to
              keep primary key values immutable. Thus foreign key definitions
              referencing <code>Mediaformat</code> datasets remain stable if
              we change <code>CODE</code> attribute values. This way we avoid
              <code language="sql">ON UPDATE CASCADE</code> clauses, which may
              give rise to performance problems.</para>

              <para>XML sample data may be represented as:</para>

              <programlisting language="xml">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;formatlist&gt;
    &lt;format&gt;
        &lt;id&gt;1&lt;/id&gt;
        &lt;name&gt;MP3&lt;/name&gt;
        &lt;code&gt;12&lt;/code&gt;
    &lt;/format&gt;
    &lt;format&gt;
        &lt;id&gt;2&lt;/id&gt;
        &lt;name&gt;FLAC&lt;/name&gt;
        &lt;code&gt;7&lt;/code&gt;
    &lt;/format&gt;
&lt;/formatlist&gt;</programlisting>
            </answer>
          </qandaentry>
        </qandadiv>
      </qandaset>

      <section xml:id="xmlDataTypes">
        <title>XML data typing</title>

        <para>So far our database representation limits attribute values to
        domains by imposing type restrictions. We recall <xref
        linkend="fig_SqlDdlEmployee"/>. For example the following
        <code>UPDATE</code> statement will fail:</para>

        <programlisting language="sql"><emphasis role="bold">-- Subsequent UPDATE yields "Incorrect integer value: 'thousand' for column 'ID'" --</emphasis>
UPDATE EMPLOYEE
SET id = 'thousand'
WHERE ID=1</programlisting>

        <para>On the contrary, XML data that is merely well-formed imposes no
        such restrictions at all:</para>

        <programlisting language="xml">&lt;employee&gt;
    &lt;id&gt;<emphasis role="bold">Thousand</emphasis>&lt;/id&gt;
    &lt;givenName&gt;Bob&lt;/givenName&gt;
&lt;/employee&gt;</programlisting>

        <para>So apparently well-formedness alone does not provide a
        declarative specification of data integrity constraints. For this
        purpose the following three standards are on offer:</para>

        <itemizedlist>
          <listitem>
            <para><link linkend="gloss_DTD">DTD</link> (old)</para>
          </listitem>

          <listitem>
            <para><link linkend="gloss_RelaxNG">RelaxNG</link> (document
            oriented)</para>
          </listitem>

          <listitem>
            <para><link linkend="gloss_XmlSchema">XML Schema</link> (data
            oriented)</para>
          </listitem>
        </itemizedlist>

        <para>In this lecture only <link linkend="gloss_XmlSchema">XML
        Schema</link> is being described. Regarding a database schema like
        <xref linkend="fig_SqlDdlEmployee"/> the <link
        linkend="gloss_XmlSchema">XML Schema</link> standard offers a
        counterpart namely an <emphasis>XML Schema Definition</emphasis>
        (<filename>.xsd</filename>). We consider the following file
        <filename>employee.xsd</filename>:</para>

        <figure xml:id="fig_XsdEmployee">
          <title>A schema describing an employee's data structure</title>

          <programlisting language="xml">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" &gt;

    &lt;xs:element name="employee"&gt;
        &lt;xs:complexType&gt;
            &lt;xs:sequence&gt;
                &lt;xs:element name="id" type="xs:unsignedInt" <co
              xml:id="co_IdNonNegative"/> /&gt;
                &lt;xs:element name="givenName" type="xs:string"/&gt;
                &lt;xs:element name="surname" type="xs:string"/&gt;
                &lt;xs:element name="birthday" type="xs:string"/&gt;
                &lt;xs:element name="sex" type="xs:string"/&gt;
                &lt;xs:element name="email" type="xs:string"/&gt;
                &lt;xs:element name="phone" type="xs:string"/&gt;
            &lt;/xs:sequence&gt;
        &lt;/xs:complexType&gt;
    &lt;/xs:element&gt;

&lt;/xs:schema&gt;</programlisting>
        </figure>

        <para>Notice the subtle difference with respect to the id attribute
        <coref linkend="co_IdNonNegative"/>: We expect this attribute to have
        non-negative values. To ensure this integrity constraint the <link
        linkend="gloss_XSD">XSD</link> standard data type
        <code>xs:unsignedInt</code> has been chosen in favour of
        <code>xs:int</code>.</para>

        <para>This XML schema definition acts like <code>CREATE TABLE</code>
        statements with respect to relational databases. It adds a grammar to
        our XML data thereby enabling validation:</para>

        <programlisting language="xml">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="<emphasis role="bold">employee.xsd</emphasis>"&gt;
    &lt;id&gt;21&lt;/id&gt;
    &lt;givenName&gt;Bob&lt;/givenName&gt;
    &lt;surname&gt;Hope&lt;/surname&gt;
    &lt;birthday&gt;1982-07-22&lt;/birthday&gt;
    &lt;sex&gt;m&lt;/sex&gt;
    &lt;email&gt;hope@exploitation.com&lt;/email&gt;
    &lt;phone&gt;1123-33244&lt;/phone&gt;
&lt;/employee&gt;</programlisting>
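
        <para>As a hedged counterexample, consider the following hypothetical
        instance. It is well-formed, but a validating parser should reject it
        since <code>-21</code> lies outside the value space of
        <code>xs:unsignedInt</code>:</para>

        <programlisting language="xml">&lt;?xml version="1.0" encoding="UTF-8"?&gt;
&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="employee.xsd"&gt;
    &lt;!-- hypothetical invalid value: negative numbers are not allowed --&gt;
    &lt;id&gt;-21&lt;/id&gt;
    &lt;givenName&gt;Bob&lt;/givenName&gt;
    &lt;surname&gt;Hope&lt;/surname&gt;
    &lt;birthday&gt;1982-07-22&lt;/birthday&gt;
    &lt;sex&gt;m&lt;/sex&gt;
    &lt;email&gt;hope@exploitation.com&lt;/email&gt;
    &lt;phone&gt;1123-33244&lt;/phone&gt;
&lt;/employee&gt;</programlisting>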

        <para>Having just one employee isn't very satisfying. We choose one of
        several possibilities to allow for a sequence of employees:</para>

        <figure xml:id="fig_ListOfEmployees">
          <title>A list of employees</title>

          <programlisting language="xml">&lt;xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" &gt;

    &lt;xs:element name="employee"&gt; <co linkends="schemaElementReference-1"
              xml:id="schemaElementReference-1-co"/>
        &lt;xs:complexType&gt;
            &lt;xs:sequence&gt;
                &lt;xs:element name="id" type="xs:unsignedInt"/&gt;
                &lt;xs:element name="givenName" type="xs:string"/&gt;
                &lt;xs:element name="surname" type="xs:string"/&gt;
                &lt;xs:element name="birthday" type="xs:string"/&gt;
                &lt;xs:element name="sex" type="xs:string"/&gt;
                &lt;xs:element name="email" type="xs:string"/&gt;
                &lt;xs:element name="phone" type="xs:string"/&gt;
            &lt;/xs:sequence&gt;
        &lt;/xs:complexType&gt;
    &lt;/xs:element&gt;

    &lt;xs:element name="employeeList"&gt;
        &lt;xs:complexType&gt;
            &lt;xs:sequence&gt;
                &lt;xs:element ref="employee" <co
              linkends="schemaElementReference-2"
              xml:id="schemaElementReference-2-co"/>
                    minOccurs="0" <co linkends="schemaElementReference-3"
              xml:id="schemaElementReference-3-co"/>
                    maxOccurs="unbounded"/&gt; <co
              linkends="schemaElementReference-3"
              xml:id="schemaElementReference-4-co"/>
            &lt;/xs:sequence&gt;
        &lt;/xs:complexType&gt;
    &lt;/xs:element&gt;

&lt;/xs:schema&gt;</programlisting>
        </figure>

        <calloutlist>
          <callout arearefs="schemaElementReference-1-co"
                   xml:id="schemaElementReference-1">
            <para>Defining an employee's data structure like in <xref
            linkend="fig_XsdEmployee"/>.</para>
          </callout>

          <callout arearefs="schemaElementReference-2-co"
                   xml:id="schemaElementReference-2">
            <para>An element is being defined by referencing the definition in
            <coref linkend="schemaElementReference-1-co"/>.</para>
          </callout>

          <callout arearefs="schemaElementReference-3-co schemaElementReference-4-co"
                   xml:id="schemaElementReference-3">
            <para>We may have an arbitrary number of employee datasets
            including none at all.</para>
          </callout>
        </calloutlist>

        <para>This allows for an arbitrary number of employee data
        sets:</para>

        <programlisting language="xml">&lt;employeeList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="employee.xsd"&gt;

    &lt;employee&gt;
        &lt;id&gt;21&lt;/id&gt;
        &lt;givenName&gt;Bob&lt;/givenName&gt;
        &lt;surname&gt;Hope&lt;/surname&gt;
        &lt;birthday&gt;1982-07-22&lt;/birthday&gt;
        &lt;sex&gt;m&lt;/sex&gt;
        &lt;email&gt;hope@exploitation.com&lt;/email&gt;
        &lt;phone&gt;1123-33244&lt;/phone&gt;
    &lt;/employee&gt;

    &lt;employee&gt;
        &lt;id&gt;22&lt;/id&gt;
        &lt;givenName&gt;Gill&lt;/givenName&gt;
        &lt;surname&gt;Evans&lt;/surname&gt;
        &lt;birthday&gt;1973-01-15&lt;/birthday&gt;
        &lt;sex&gt;f&lt;/sex&gt;
        &lt;email&gt;gill@flashmaster.com&lt;/email&gt;
        &lt;phone&gt;9771-43421&lt;/phone&gt;
    &lt;/employee&gt;

&lt;/employeeList&gt;</programlisting>
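
        <para>The occurrence attributes are not limited to element references.
        As a sketch, assuming we want to mirror the nullable columns of <xref
        linkend="fig_SqlDdlEmployee"/>, the optional child elements of
        <tag class="starttag">employee</tag> might be declared with
        <code>minOccurs="0"</code>. This variant is an illustration and not
        part of <filename>employee.xsd</filename> as shown above:</para>

        <programlisting language="xml">&lt;xs:element name="employee"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            &lt;xs:element name="id" type="xs:unsignedInt"/&gt;
            &lt;xs:element name="givenName" type="xs:string"/&gt;
            &lt;xs:element name="surname" type="xs:string"/&gt;
            &lt;!-- assumption: optional, like the nullable BIRTHDATE column --&gt;
            &lt;xs:element name="birthday" type="xs:string" minOccurs="0"/&gt;
            &lt;xs:element name="sex" type="xs:string"/&gt;
            &lt;!-- assumption: optional, like the nullable EMAIL and PHONE columns --&gt;
            &lt;xs:element name="email" type="xs:string" minOccurs="0"/&gt;
            &lt;xs:element name="phone" type="xs:string" minOccurs="0"/&gt;
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>

        <para>Since <code>maxOccurs</code> defaults to 1, each of these three
        elements may then appear at most once or be omitted.</para>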

        <qandaset defaultlabel="qanda" xml:id="sda1QandaMediaFormatGrammar">
          <title>Adding a grammar to <tag
          class="starttag">mediaFormat</tag>.</title>

          <qandadiv>
            <qandaentry xml:id="quandaentry_Track">
              <question>
                <para>Consider the following example of an album consisting of
                tracks:</para>

                <table border="1" width="70%" xml:id="table_AlbumExample">
                  <caption>An example album</caption>

                  <colgroup width="10%"/>

                  <colgroup width="76%"/>

                  <colgroup width="14%"/>

                  <tr>
                    <th>Album:</th>

                    <td colspan="2">...Nothing Like the Sun</td>
                  </tr>

                  <tr>
                    <th>Artist:</th>

                    <td colspan="2">Sting</td>
                  </tr>

                  <tr>
                    <th colspan="3"/>
                  </tr>

                  <tr>
                    <th>Track number</th>

                    <th>Track title</th>

                    <th>Track time</th>
                  </tr>

                  <tr>
                    <td>1</td>

                    <td>The Lazarus Heart</td>

                    <td>4:34</td>
                  </tr>

                  <tr>
                    <td>2</td>

                    <td>Be Still My Beating Heart</td>

                    <td>5:32</td>
                  </tr>

                  <tr>
                    <td>3</td>

                    <td>Englishman in New York</td>

                    <td>4:25</td>
                  </tr>

                  <tr>
                    <th>...</th>

                    <th>...</th>

                    <th>...</th>
                  </tr>
                </table>

                <para>In the current exercise we'll only consider the tracks
                themselves:</para>

                <figure xml:id="fig_SingleMusicTrack">
                  <title>A single music track</title>

                  <mediaobject>
                    <imageobject>
                      <imagedata fileref="Ref/Fig/track.png"/>
                    </imageobject>
                  </mediaobject>
                </figure>

                <para>Though all tracks of a given album will typically share
                a common media format such as MP3, our model allows each
                track to have its own encoding. Create a well-formed XML file
                containing the three tracks from <xref
                linkend="table_AlbumExample"/>.</para>
              </question>

              <answer>
                <para>We omit the XML header for brevity in accordance with
                the XML 1.0 specification:</para>

                <programlisting language="xml">&lt;trackList&gt;
    &lt;track&gt;
        &lt;id&gt;231&lt;/id&gt;
        &lt;album&gt;...Nothing Like the Sun&lt;/album&gt;
        &lt;number&gt;1&lt;/number&gt;
        &lt;name&gt;The Lazarus Heart&lt;/name&gt;
        &lt;duration&gt;4:34&lt;/duration&gt;
        &lt;mediaformat&gt;MP3&lt;/mediaformat&gt;
    &lt;/track&gt;
    &lt;track&gt;
        &lt;id&gt;232&lt;/id&gt;
        &lt;album&gt;...Nothing Like the Sun&lt;/album&gt;
        &lt;number&gt;2&lt;/number&gt;
        &lt;name&gt;Be Still My Beating Heart&lt;/name&gt;
        &lt;duration&gt;5:32&lt;/duration&gt;
        &lt;mediaformat&gt;MP3&lt;/mediaformat&gt;
    &lt;/track&gt;
    &lt;track&gt;
        &lt;id&gt;233&lt;/id&gt;
        &lt;album&gt;...Nothing Like the Sun&lt;/album&gt;
        &lt;number&gt;3&lt;/number&gt;
        &lt;name&gt;Englishman in New York&lt;/name&gt;
        &lt;duration&gt;4:25&lt;/duration&gt;
        &lt;mediaformat&gt;MP3&lt;/mediaformat&gt;
    &lt;/track&gt;
&lt;/trackList&gt;</programlisting>
              </answer>
            </qandaentry>

            <qandaentry xml:id="quandaentry_TrackXmlSchema">
              <question>
                <para>Reconsider the current track model to create a suitable
                XML schema <filename>track.xsd</filename> allowing for an
                arbitrary number of tracks.</para>

                <para>Hints: Regarding a track's time data type look for the
                heading <quote><emphasis role="bold">Duration Data
                Type</emphasis></quote> in <link
                xlink:href="http://www.w3schools.com/xml/schema_dtypes_date.asp">http://www.w3schools.com/xml/schema_dtypes_date.asp</link>
                and consider the corresponding reference definition in <link
                xlink:href="https://www.w3.org/TR/xmlschema11-2/#duration">http://www.w3.org/TR/xmlschema11-2</link>.</para>
              </question>

              <answer>
                <para>A single track may be modeled by
                <filename>track.xsd</filename>:</para>

                <programlisting language="xml">&lt;xs:element name="track"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            &lt;xs:element name="id" type="xs:unsignedInt"/&gt;
            &lt;xs:element name="album" type="xs:string"/&gt;
            &lt;xs:element name="number" type="xs:unsignedInt"/&gt;
            &lt;xs:element name="name" type="xs:string"/&gt;
            <emphasis role="bold">&lt;xs:element name="duration" type="xs:duration"/&gt;</emphasis> <co
                    xml:id="co_DurationExample"/>
            &lt;xs:element name="mediaformat" type="xs:string"/&gt;
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>

                <para>The type <code>xs:duration</code> of the element
                duration <coref linkend="co_DurationExample"/> defines a
                precise syntax <coref linkend="co_DurationExampleValue"/> for
                describing time periods:</para>

                <programlisting continuation="continues" language="xml">&lt;track xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="track.xsd"&gt;
    &lt;id&gt;231&lt;/id&gt;
    &lt;album&gt;...Nothing Like the Sun&lt;/album&gt;
    &lt;number&gt;1&lt;/number&gt;
    &lt;name&gt;The Lazarus Heart&lt;/name&gt;
    <emphasis role="bold">&lt;duration&gt;PT4M34S&lt;/duration&gt;</emphasis> <co
                    xml:id="co_DurationExampleValue"/>
    &lt;mediaformat&gt;MP3&lt;/mediaformat&gt;
&lt;/track&gt;</programlisting>
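
                <para>To make the lexical form more tangible, here is a small
                sketch of hypothetical <tag class="starttag">duration</tag>
                values. The leading <code>P</code> starts a period and
                <code>T</code> separates the date part from the time
                part:</para>

                <programlisting language="xml">&lt;!-- hypothetical sample values; only the lexical forms matter --&gt;
&lt;duration&gt;PT4M34S&lt;/duration&gt;  &lt;!-- 4 minutes, 34 seconds --&gt;
&lt;duration&gt;PT1H2M&lt;/duration&gt;   &lt;!-- 1 hour, 2 minutes --&gt;
&lt;duration&gt;P1DT12H&lt;/duration&gt;  &lt;!-- 1 day, 12 hours --&gt;
&lt;duration&gt;4:34&lt;/duration&gt;     &lt;!-- not a valid xs:duration value --&gt;</programlisting>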

                <para>Extending the current model to a set of tracks is
                straightforward, cf. <xref
                linkend="fig_ListOfEmployees"/>:</para>

                <programlisting language="xml">&lt;xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified"&gt;

    &lt;xs:element name="trackList"&gt;
        &lt;xs:complexType&gt;
            &lt;xs:sequence&gt;
                &lt;xs:element ref="track"
                minOccurs="0"
                maxOccurs="unbounded"/&gt;
            &lt;/xs:sequence&gt;
        &lt;/xs:complexType&gt;
    &lt;/xs:element&gt;

    &lt;xs:element name="track"&gt;
        &lt;xs:complexType&gt;
            &lt;xs:sequence&gt;
                &lt;xs:element name="id" type="xs:unsignedInt"/&gt;
                &lt;xs:element name="album" type="xs:string"/&gt;
                &lt;xs:element name="number" type="xs:unsignedInt"/&gt;
                &lt;xs:element name="name" type="xs:string"/&gt;
                &lt;xs:element name="duration" type="xs:duration"/&gt;
                &lt;xs:element name="mediaformat" type="xs:string"/&gt;
            &lt;/xs:sequence&gt;
        &lt;/xs:complexType&gt;
    &lt;/xs:element&gt;

&lt;/xs:schema&gt;</programlisting>

                <para>This allows for an arbitrary number of tracks:</para>

                <figure xml:id="fig_TwoTrackBasic">
                  <title>Two tracks belonging to the same album.</title>

                  <programlisting language="xml">&lt;trackList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="trackList.xsd"&gt;
    &lt;track&gt;
        &lt;id&gt;231&lt;/id&gt;
        &lt;album&gt;...Nothing Like the Sun&lt;/album&gt;
        &lt;number&gt;1&lt;/number&gt;
        &lt;name&gt;The Lazarus Heart&lt;/name&gt;
        &lt;duration&gt;PT4M34S&lt;/duration&gt;
        &lt;mediaformat&gt;mp3&lt;/mediaformat&gt;
    &lt;/track&gt;
    &lt;track&gt;
        &lt;id&gt;232&lt;/id&gt;
        &lt;album&gt;...Nothing Like the Sun&lt;/album&gt;
        &lt;number&gt;2&lt;/number&gt;
        &lt;name&gt;Be Still My Beating Heart&lt;/name&gt;
        &lt;duration&gt;PT5M32S&lt;/duration&gt;
        &lt;mediaformat&gt;flac&lt;/mediaformat&gt;
    &lt;/track&gt;

    &lt;!-- maybe more to come ... --&gt;
&lt;/trackList&gt;</programlisting>
                </figure>
              </answer>
            </qandaentry>
          </qandadiv>
        </qandaset>
      </section>

      <section xml:id="restrictDataTypes">
        <title>User defined data types</title>

        <para>Some SQL and XML Schema data types are quite appropriate.
        Correct <code>BIRTHDATE</code> values are well described by the
        <link linkend="gloss_SQL">SQL</link> data type
        <code>DATE</code>.</para>

        <para>We may however want to add further restrictions on existing
        types to enhance data integrity. The attribute <code>SEX</code> for
        example may be restricted to allow only male/'m' and female/'f'
        values. <link linkend="gloss_SQL">SQL</link> enables us to do so by
        adding a <code>CHECK</code> constraint <coref
        linkend="co_CheckSex"/>:</para>

        <programlisting language="sql">CREATE TABLE EMPLOYEE
(
...
  SEX CHAR NOT NULL,
...
  <emphasis role="bold">CHECK(SEX in ('m', 'f'))</emphasis> <co
            xml:id="co_CheckSex"/>
);</programlisting>

        <para>This constraint effectively restricts the domain of arbitrary
        character values to a set of just two values {'m', 'f'}.</para>

        <para>XML Schema provides similar means to enforce data integrity.
        Consider for example a rule requiring id attribute values to be at
        least 200. This rule may be implemented by restricting the domain of
        unsigned integer values <coref linkend="co_RestrictInt"/> to numbers
        starting from and including 200 <coref
        linkend="co_RestrictInt200"/>:</para>

        <programlisting language="xml">&lt;xs:element name="employee"&gt;
    &lt;xs:annotation&gt;
        <emphasis role="bold">&lt;xs:documentation&gt;restricting possible values of id&lt;/xs:documentation&gt;</emphasis> <co
            xml:id="co_CommentOnElement"/>
    &lt;/xs:annotation&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            &lt;xs:element name="id"&gt;
                &lt;xs:simpleType&gt;
                    &lt;xs:restriction base="xs:unsignedInt"&gt; <co
            xml:id="co_RestrictInt"/>
                        <emphasis role="bold">&lt;xs:minInclusive value="200"/&gt;</emphasis> <co
            xml:id="co_RestrictInt200"/>
                    &lt;/xs:restriction&gt;
                &lt;/xs:simpleType&gt;
            &lt;/xs:element&gt;
            &lt;xs:element name="givenName" type="xs:string"/&gt;
           ...
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>

        <para>Notice the ability of XML Schema to allow structured comments
        <coref linkend="co_CommentOnElement"/>. The following document
        instance violates the rule in question:</para>

        <programlisting language="xml">&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="employee.xsd"&gt;
    &lt;id&gt;21&lt;/id&gt;
    &lt;givenName&gt;Bob&lt;/givenName&gt;
   ...
&lt;/employee&gt;</programlisting>

        <para>A validating parser thus yields a corresponding error
        message:</para>

        <screen>cvc-minInclusive-valid: Value '21' is not facet-valid with respect to minInclusive '200'
  for type <emphasis role="bold">'#AnonType_idemployee'</emphasis> <co
            xml:id="co_ErrAnonTypeEmployee"/></screen>

        <qandaset defaultlabel="qanda" xml:id="sda1QandamMessageErrorType">
          <title>The type error message in detail</title>

          <qandadiv>
            <qandaentry xml:id="quandaentry_AnonTypeMsg">
              <question>
                <para>What does the somewhat strange value <emphasis
                role="bold">#AnonType_idemployee</emphasis> in the error
                message indicate? Hint: The parser chooses this value since
                something optional is missing.</para>
              </question>

              <answer>
                <para>To better understand the error message, note that we have
                defined a new type, namely the subset of all
                <code>xs:unsignedInt</code> values starting from 200. This
                data type has not been given a name like
                <code>integerLargerThan200</code>. Consequently we call this
                an anonymous type.</para>

                <para>The parser internally assigns the synthetic name
                <quote><code>#AnonType_idemployee</code></quote> appearing in
                the error message.</para>
              </answer>
            </qandaentry>
          </qandadiv>
        </qandaset>

        <para>Though we have implemented the desired restriction, two issues
        remain:</para>

        <itemizedlist>
          <listitem>
            <para>Our new data type <quote>integer values starting from
            200</quote> might be used to restrict other elements as well. For
            example <tag class="starttag">id</tag> elements typically appear
            once per employee and likewise once per instance of other
            entities. We want to reuse data type definitions rather than
            resorting to <quote>copy/paste</quote>. The following code
            illustrates the problem of duplicated type definitions:</para>

            <figure xml:id="fig_AnonDuplicateTypeDef">
              <title>Two identical and thus redundant anonymous type
              definitions</title>

              <programlisting language="xml">...
<emphasis role="bold">&lt;xs:element name="employee"&gt;</emphasis>
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            &lt;xs:element name="id"&gt;
                <emphasis role="bold">&lt;xs:simpleType&gt;
                    &lt;xs:restriction base="xs:unsignedInt"&gt;
                        &lt;xs:minInclusive value="200"/&gt;
                    &lt;/xs:restriction&gt;
                &lt;/xs:simpleType&gt;</emphasis>
            &lt;/xs:element&gt;
            &lt;xs:element name="givenName" type="xs:string"/&gt;
             ...
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;

<emphasis role="bold">&lt;xs:element name="department"&gt;</emphasis>
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            &lt;xs:element name="id"&gt;
                <emphasis role="bold">&lt;!-- Duplicate type definition --&gt;</emphasis>
                <emphasis role="bold">&lt;xs:simpleType&gt;
                    &lt;xs:restriction base="xs:unsignedInt"&gt;
                        &lt;xs:minInclusive value="200"/&gt;
                    &lt;/xs:restriction&gt;
                &lt;/xs:simpleType&gt;</emphasis>
            &lt;/xs:element&gt;
            ...</programlisting>
            </figure>
          </listitem>

          <listitem>
            <para>User-defined data types should be named rather than
            anonymous: a meaningful name makes both the schema and parser
            error messages easier to understand.</para>
          </listitem>
        </itemizedlist>

        <para>The XML Schema standard supports named type definitions. The
        redundancy in <xref linkend="fig_AnonDuplicateTypeDef"/> may thus be
        avoided:</para>

        <figure xml:id="fig_CentralizedIdDefinition">
          <title>Defining a separate non-anonymous identity data type</title>

          <programlisting language="xml"><emphasis role="bold">&lt;xs:simpleType name="NumericIdentity"&gt;</emphasis> <emphasis
              role="bold">&lt;!-- User defined type NumericIdentity --&gt;</emphasis>
    &lt;xs:restriction base="xs:unsignedInt"&gt;
        &lt;xs:minInclusive value="200"/&gt;
    &lt;/xs:restriction&gt;
&lt;/xs:simpleType&gt;

&lt;xs:element name="employee"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            <emphasis role="bold">&lt;!-- type referral to previously defined type --&gt;
            &lt;xs:element name="id" type="NumericIdentity"/&gt;</emphasis>
            &lt;xs:element name="givenName" type="xs:string"/&gt;
            &lt;xs:element name="surname" type="xs:string"/&gt;
            &lt;xs:element name="birthday" type="xs:string"/&gt;
            &lt;xs:element name="sex" type="xs:string"/&gt;
            &lt;xs:element name="email" type="xs:string"/&gt;
            &lt;xs:element name="phone" type="xs:string"/&gt;
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;

<emphasis role="bold">&lt;xs:element name="department"&gt;</emphasis>
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            <emphasis role="bold">&lt;!-- type referral to previously defined type NumericIdentity --&gt;
            &lt;xs:element name="id" type="NumericIdentity"/&gt;</emphasis>
            ...</programlisting>
        </figure>

        <qandaset defaultlabel="qanda" xml:id="sda1QandaEncFormats">
          <title>Predefined set of track encoding formats</title>

          <qandadiv>
            <qandaentry xml:id="quandaentry_XmlRestrictCodecs">
              <question>
                <para>Your current <filename><link
                linkend="fig_TwoTrackBasic">track.xsd</link></filename> schema
                allows for arbitrary encoding values like <tag
                class="starttag">mediaformat</tag>mp3<tag
                class="endtag">mediaformat</tag>. Restrict its values to the
                set {<code>mp3</code>, <code>flac</code>,
                <code>mp4</code>}.</para>

                <para>Hint: Media encoding names disallow spaces and special
                characters. In a nutshell allowed values syntactically
                resemble typical naming conventions for tokens in programming
                languages. So a token type may be restricted to the above set.
                The corresponding chapter in <xref linkend="bib_Walmsley02"/>
                dealing with data types provides a similar example.</para>
              </question>

              <answer>
                <para>We may define a new type named
                <code>TrackEncoding</code> permitting just these three
                values:</para>

                <programlisting language="xml"><emphasis role="bold">&lt;xs:simpleType name="TrackEncoding"&gt;</emphasis>
    &lt;xs:restriction base="xs:token"&gt;
        <emphasis role="bold">&lt;xs:enumeration value="mp3"/&gt;
        &lt;xs:enumeration value="mp4"/&gt;
        &lt;xs:enumeration value="flac"/&gt;</emphasis>
    &lt;/xs:restriction&gt;
&lt;/xs:simpleType&gt;</programlisting>

                <para>This new type replaces our current
                <code>xs:string</code> data type:</para>

                <programlisting language="xml">&lt;xs:element name="track"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
           ...
            &lt;xs:element name="duration" type="xs:duration"/&gt;
            <emphasis role="bold">&lt;xs:element name="mediaformat" type="TrackEncoding"/&gt;</emphasis>
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>
              </answer>
            </qandaentry>
          </qandadiv>
        </qandaset>
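
        <para>The same technique covers the two-valued domain {'m', 'f'}
        mentioned above. The following is a minimal sketch rather than part
        of the schema developed so far: the type name <code>Sex</code> is an
        assumption chosen for illustration.</para>

        <programlisting language="xml">&lt;!-- Hypothetical named type: only the values m and f are valid --&gt;
&lt;xs:simpleType name="Sex"&gt;
    &lt;xs:restriction base="xs:token"&gt;
        &lt;xs:enumeration value="m"/&gt;
        &lt;xs:enumeration value="f"/&gt;
    &lt;/xs:restriction&gt;
&lt;/xs:simpleType&gt;

&lt;xs:element name="employee"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            ...
            &lt;!-- Replaces the unrestricted xs:string declaration --&gt;
            &lt;xs:element name="sex" type="Sex"/&gt;
            ...
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>

        <para>With such a declaration an instance containing <tag
        class="starttag">sex</tag>x<tag class="endtag">sex</tag> should be
        rejected by a validating parser.</para>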
      </section>

      <section xml:id="sect_OptionsAndChoices">
        <title>Optional content and choices</title>

        <para>Elements may be optional. We allow an element <tag
        class="starttag">middleInitial</tag> in employee instances:</para>

        <programlisting language="xml">&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="employee.xsd"&gt;
    &lt;id&gt;21&lt;/id&gt;
    &lt;givenName&gt;John&lt;/givenName&gt;
    <emphasis role="bold">&lt;middleInitial&gt;Fitzgerald&lt;/middleInitial&gt;</emphasis>
    &lt;surname&gt;Kennedy&lt;/surname&gt;
    ...
&lt;/employee&gt;</programlisting>

        <para>Since not every employee has a middle initial this element has
        to be optional. This may be achieved by defining
        <code>minOccurs="0"</code> and <code>maxOccurs="1"</code>, the latter
        being the default value and thus strictly speaking redundant:</para>

        <programlisting language="xml">&lt;xs:element name="employee"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            &lt;xs:element name="id" type="xs:unsignedInt"/&gt;
            &lt;xs:element name="givenName" type="xs:string"/&gt;
            <emphasis role="bold">&lt;xs:element name="middleInitial" type="xs:string" minOccurs="0" maxOccurs="1"/&gt;</emphasis>
            &lt;xs:element name="surname" type="xs:string"/&gt;
            ...
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>
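
        <para>As a quick plausibility check, both of the following fragments
        should be accepted by the above schema. The employee data is invented
        for illustration:</para>

        <programlisting language="xml">&lt;!-- Optional element present --&gt;
&lt;employee&gt;
    &lt;id&gt;21&lt;/id&gt;
    &lt;givenName&gt;John&lt;/givenName&gt;
    &lt;middleInitial&gt;Fitzgerald&lt;/middleInitial&gt;
    &lt;surname&gt;Kennedy&lt;/surname&gt;
    ...
&lt;/employee&gt;

&lt;!-- Optional element omitted --&gt;
&lt;employee&gt;
    &lt;id&gt;22&lt;/id&gt;
    &lt;givenName&gt;Laura&lt;/givenName&gt;
    &lt;surname&gt;Anderson&lt;/surname&gt;
    ...
&lt;/employee&gt;</programlisting>

        <para>A second <tag class="starttag">middleInitial</tag> element
        within the same employee would however violate
        <code>maxOccurs="1"</code>.</para>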

        <qandaset defaultlabel="qanda"
                  xml:id="sda1QandaMutualExclusiveElemens">
          <title>Mutually exclusive elements</title>

          <qandadiv>
            <qandaentry xml:id="quandaentry_SchemaChoice">
              <question>
                <para>XML Schema does allow us to define an exclusive choice
                between a given set of alternatives. Consider the following
                example:</para>

                <programlisting language="xml">&lt;xs:element name="employee"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            &lt;xs:element name="id" type="xs:unsignedInt"/&gt;
            &lt;xs:element name="givenName" type="xs:string"/&gt;
            &lt;xs:element name="middleInitial" type="xs:string" minOccurs="0" maxOccurs="1"/&gt;
            &lt;xs:element name="surname" type="xs:string"/&gt;
            &lt;xs:element name="birthday" type="xs:string"/&gt;
            &lt;xs:element name="sex" type="xs:string"/&gt;
            &lt;xs:element name="email" type="RfcEmail" minOccurs="0" maxOccurs="1"/&gt;
            &lt;xs:element name="phone" type="xs:string" minOccurs="0" maxOccurs="1"/&gt;
            &lt;xs:element name="skype" type="xs:string" minOccurs="0" maxOccurs="1"/&gt;
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>

                <para>This XML schema allows for eight (why?) different
                classes with respect to the occurrence of <tag
                class="starttag">email</tag>, <tag
                class="starttag">phone</tag> or <tag
                class="starttag">skype</tag> id. We show possible
                examples:</para>

                <glosslist>
                  <glossentry>
                    <glossterm>An employee having all three address attributes
                    <coref linkend="prog_PowerAllEmail"/>, <coref
                    linkend="prog_PowerAllPhone"/> and <coref
                    linkend="prog_PowerAllSkype"/>:</glossterm>

                    <glossdef>
                      <programlisting language="xml">&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="employee.xsd"&gt;
    &lt;id&gt;25&lt;/id&gt;
    &lt;givenName&gt;Laura&lt;/givenName&gt;
    &lt;surname&gt;Anderson&lt;/surname&gt;
    &lt;birthday&gt;1956-41-02&lt;/birthday&gt;
    &lt;sex&gt;f&lt;/sex&gt;
    &lt;email&gt;anderson@corporate.com&lt;/email&gt; <co
                          xml:id="prog_PowerAllEmail"/>
    &lt;phone&gt;223-324323&lt;/phone&gt; <co xml:id="prog_PowerAllPhone"/>
    &lt;skype&gt;l.anderson&lt;/skype&gt; <co xml:id="prog_PowerAllSkype"/>
&lt;/employee&gt;</programlisting>
                    </glossdef>
                  </glossentry>

                  <glossentry>
                    <glossterm>An employee having no address attribute at
                    all:</glossterm>

                    <glossdef>
                      <programlisting language="xml">&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    ...
    &lt;surname&gt;Anderson&lt;/surname&gt;
    &lt;birthday&gt;1956-41-02&lt;/birthday&gt;
    &lt;sex&gt;f&lt;/sex&gt;
&lt;/employee&gt;</programlisting>
                    </glossdef>
                  </glossentry>

                  <glossentry>
                    <glossterm>An employee having <tag
                    class="starttag">email</tag> <coref
                    linkend="prog_PowerEmailSkype_Email"/> and <tag
                    class="starttag">skype</tag> id <coref
                    linkend="prog_PowerEmailSkype_Skype"/> but no <tag
                    class="starttag">phone</tag> number:</glossterm>

                    <glossdef>
                      <programlisting language="xml">&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    ...
    &lt;surname&gt;Anderson&lt;/surname&gt;
    &lt;birthday&gt;1956-41-02&lt;/birthday&gt;
    &lt;sex&gt;f&lt;/sex&gt;
    &lt;email&gt;anderson@corporate.com&lt;/email&gt; <co
                          xml:id="prog_PowerEmailSkype_Email"/>
    &lt;skype&gt;l.anderson&lt;/skype&gt; <co
                          xml:id="prog_PowerEmailSkype_Skype"/>
&lt;/employee&gt;</programlisting>
                    </glossdef>
                  </glossentry>
                </glosslist>

                <para>We want to resolve this ambiguity by restricting employee
                data according to the following rule:</para>

                <itemizedlist>
                  <listitem>
                    <para>Each employee shall have exactly one address, being
                    either <tag class="starttag">email</tag>, <tag
                    class="starttag">phone</tag> or <tag
                    class="starttag">skype</tag> id.</para>
                  </listitem>
                </itemizedlist>

                <para>Create an appropriate XML Schema. <emphasis
                role="bold">Hint:</emphasis> Read the documentation of the XML
                Schema element <emphasis role="bold"><tag
                class="starttag">xs:choice</tag></emphasis>.</para>
              </question>

              <answer>
                <para>An <tag class="starttag">xs:choice</tag> element
                implements the rule in question:</para>

                <programlisting language="xml">&lt;xs:element name="employee"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
           ...
            &lt;xs:choice&gt;
                &lt;xs:element name="email" type="RfcEmail"/&gt;
                &lt;xs:element name="phone" type="xs:string"/&gt;
                &lt;xs:element name="skype" type="xs:string"/&gt;
            &lt;/xs:choice&gt;
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>

                <para>Now the following example will fail with the error
                message <emphasis role="bold"><quote>Invalid content was found
                starting with element 'phone'. No child element is expected at
                this point</quote></emphasis>:</para>

                <programlisting language="xml">&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="addressChoice.xsd"&gt;
    &lt;id&gt;25&lt;/id&gt;
    &lt;givenName&gt;Laura&lt;/givenName&gt;
    &lt;surname&gt;Anderson&lt;/surname&gt;
    &lt;birthday&gt;1956-41-02&lt;/birthday&gt;
    &lt;sex&gt;f&lt;/sex&gt;
    &lt;email&gt;anderson@corporate.com&lt;/email&gt;
    &lt;phone&gt;223-324323&lt;/phone&gt;
    &lt;skype&gt;l.anderson&lt;/skype&gt;
&lt;/employee&gt;</programlisting>
              </answer>
            </qandaentry>
          </qandadiv>
        </qandaset>
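
        <para>For contrast, an instance carrying exactly one of the three
        alternatives should validate against a schema based on <tag
        class="starttag">xs:choice</tag>. The sketch below reuses the sample
        data from the exercise and assumes the given email value to conform
        to the <code>RfcEmail</code> type:</para>

        <programlisting language="xml">&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="addressChoice.xsd"&gt;
    &lt;id&gt;25&lt;/id&gt;
    &lt;givenName&gt;Laura&lt;/givenName&gt;
    &lt;surname&gt;Anderson&lt;/surname&gt;
    ...
    &lt;!-- Exactly one alternative of the xs:choice group --&gt;
    &lt;email&gt;anderson@corporate.com&lt;/email&gt;
&lt;/employee&gt;</programlisting>

        <para>If employees without any address were to be permitted as well,
        adding <code>minOccurs="0"</code> to the <tag
        class="starttag">xs:choice</tag> element itself would be one way of
        achieving this.</para>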
      </section>

      <section xml:id="xmlAttributes">
        <title>XML Attributes</title>

        <para>So far we used sub elements to represent data in XML. It is
        possible to use attributes instead. The following example is
        equivalent to <xref linkend="fig_EmployeeWellformed"/>:</para>

        <programlisting language="xml">&lt;employee id = "21" givenName = "Bob" surname = "Hope"
    birthday = "1982-07-22" sex = "m"
    email = "hope@exploitation.com" phone= "1123-33244"/&gt;</programlisting>

        <para>XML Schema allows for imposing data integrity constraints on
        attributes in much the same way as on elements. Thus the element
        declarations in <xref linkend="fig_XsdEmployee"/> may be replaced by
        <tag class="emptytag">xs:attribute</tag> definitions:</para>

        <programlisting language="xml">&lt;xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" &gt;

    &lt;xs:element name="employee"&gt;
        &lt;xs:complexType&gt;
            &lt;xs:attribute name="id" type="xs:unsignedInt"/&gt;
            &lt;xs:attribute name="givenName" type="xs:string"/&gt;
            &lt;xs:attribute name="surname" type="xs:string"/&gt;
            &lt;xs:attribute name="birthday" type="xs:string"/&gt;
            &lt;xs:attribute name="sex" type="xs:string"/&gt;
            &lt;xs:attribute name="email" type="xs:string"/&gt;
            &lt;xs:attribute name="phone" type="xs:string"/&gt;
        &lt;/xs:complexType&gt;
    &lt;/xs:element&gt;

&lt;/xs:schema&gt;</programlisting>

        <para>We may reference the above schema when constructing a valid XML
        instance:</para>

        <programlisting language="xml">&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:noNamespaceSchemaLocation="employeeAttrib.xsd"
    id = "21"
    givenName = "Bob"
    surname = "Hope"
    birthday = "1982-07-22"
    sex = "m"
    email = "hope@exploitation.com"
    phone = "1123-33244"
    /&gt;</programlisting>
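
        <para>User defined simple types apply to attributes as well. The
        following sketch assumes the <code>NumericIdentity</code> type from
        <xref linkend="fig_CentralizedIdDefinition"/> to be part of the same
        schema. Combining the two is an assumption made here for
        illustration:</para>

        <programlisting language="xml">&lt;xs:simpleType name="NumericIdentity"&gt;
    &lt;xs:restriction base="xs:unsignedInt"&gt;
        &lt;xs:minInclusive value="200"/&gt;
    &lt;/xs:restriction&gt;
&lt;/xs:simpleType&gt;

&lt;xs:element name="employee"&gt;
    &lt;xs:complexType&gt;
        &lt;!-- Attribute types are always simple types, either built in or user defined --&gt;
        &lt;xs:attribute name="id" type="NumericIdentity"/&gt;
        &lt;xs:attribute name="givenName" type="xs:string"/&gt;
        ...
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>

        <para>Under this variant the value <code>id = "21"</code> would be
        rejected since it is smaller than 200.</para>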

        <para>So you might get the impression that <tag
        class="starttag">xs:element</tag> and <tag
        class="starttag">xs:attribute</tag> are equivalent. Attributes
        however are subject to limitations that do not apply to
        elements:</para>

        <orderedlist>
          <listitem>
            <para>An XML attribute may appear at most once per element whereas
            <tag class="starttag">xs:element</tag> declarations may allow (via
            <code>maxOccurs</code>) the repeated appearance of a given
            element:</para>

            <programlisting language="xml">&lt;employee&gt;
   ...
    &lt;email&gt;gill@flashmaster.com&lt;/email&gt;
    &lt;email&gt;gilly@private.org&lt;/email&gt;
    &lt;phone&gt;9771-43421&lt;/phone&gt;
&lt;/employee&gt;</programlisting>

            <para>The following <quote>equivalent</quote> with duplicate
            attribute names is of course wrong (the document is not even
            well-formed), as it would also be in a <trademark
            linkend="gloss_Java">Java</trademark> class declaration:</para>

            <programlisting language="xml">&lt;employee xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:noNamespaceSchemaLocation="employeeAttrib.xsd"
    ...
    email = "gill@flashmaster.com"
    email = "gilly@private.org"
    phone ="9771-43421"
    /&gt;</programlisting>
          </listitem>

          <listitem>
            <para>During schema evolution elements sometimes require a more
            complex structure. Consider an element representing phone
            numbers:</para>

            <programlisting language="xml">&lt;employee&gt;
   ...
    &lt;phone&gt;9771-43421&lt;/phone&gt;
&lt;/employee&gt;</programlisting>

            <para>The transformation into an <tag
            class="starttag">xs:attribute</tag> declaration is
            straightforward. A future business requirement may however demand
            a separate representation of area codes:</para>

            <programlisting language="xml">&lt;employee&gt;
   ...
  &lt;phone&gt;
     &lt;area&gt;9771&lt;/area&gt;
     &lt;local&gt;43421&lt;/local&gt;
  &lt;/phone&gt;
&lt;/employee&gt;</programlisting>

            <para>Migrating <tag class="starttag">xs:attribute</tag> based
            data and dependent applications to such a structure may be more
            difficult than having based the schema on <tag
            class="starttag">xs:element</tag> in the first place.</para>

            <remark>Rule of thumb: Whenever a value might require an inner
            structure later on, prefer &lt;xs:element&gt; over
            &lt;xs:attribute&gt;.</remark>
          </listitem>
        </orderedlist>
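
        <para>Both element-only capabilities from the list above may be
        sketched in a single declaration. The occurrence constraints and the
        <code>xs:string</code> types chosen for <tag
        class="starttag">area</tag> and <tag class="starttag">local</tag> are
        assumptions made for illustration:</para>

        <programlisting language="xml">&lt;xs:element name="employee"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            ...
            &lt;!-- Repeated appearance: any number of email elements --&gt;
            &lt;xs:element name="email" type="xs:string" minOccurs="0" maxOccurs="unbounded"/&gt;

            &lt;!-- Inner structure: phone number split into area and local part --&gt;
            &lt;xs:element name="phone" minOccurs="0"&gt;
                &lt;xs:complexType&gt;
                    &lt;xs:sequence&gt;
                        &lt;xs:element name="area" type="xs:string"/&gt;
                        &lt;xs:element name="local" type="xs:string"/&gt;
                    &lt;/xs:sequence&gt;
                &lt;/xs:complexType&gt;
            &lt;/xs:element&gt;
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>

        <para>Neither of the two declarations has a counterpart based on <tag
        class="starttag">xs:attribute</tag>.</para>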

        <para><tag class="starttag">xs:attribute</tag> declarations offer an
        optional attribute <code>use</code>:</para>

        <programlisting language="xml">&lt;xs:element name="employee"&gt;
   &lt;xs:complexType&gt;
   ...
      &lt;xs:attribute name="email" type="xs:string" use="required"/&gt;
      &lt;xs:attribute name="phone" type="xs:string" use="optional" default="12212-0"/&gt;
   &lt;/xs:complexType&gt;
&lt;/xs:element&gt;</programlisting>
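
        <para>The effect of a <code>default</code> value shows up in
        instances omitting the attribute in question. The following two
        fragments are a sketch with invented values:</para>

        <programlisting language="xml">&lt;!-- phone is absent: a schema-aware validating parser behaves as if phone="12212-0" had been given --&gt;
&lt;employee ... email="gill@flashmaster.com"/&gt;

&lt;!-- phone is present: the explicit value takes precedence over the default --&gt;
&lt;employee ... email="gill@flashmaster.com" phone="9771-43421"/&gt;</programlisting>

        <para>Omitting the <code>email</code> attribute on the other hand
        renders an instance invalid due to
        <code>use="required"</code>.</para>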

        <qandaset defaultlabel="qanda" xml:id="sda1QandaAttDefaultChoices">
          <title>Attribute default choices</title>

          <qandadiv>
            <qandaentry xml:id="qanda_RequiredDefault">
              <question>
                <para>How does a <code>default = "..."</code> definition
                coexist with <code>use="required"</code>?</para>
              </question>

              <answer>
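                <para>It does not: XML Schema requires <code>use</code> to be
                <code>optional</code> whenever a <code>default = "..."</code>
                value is present. Combining <code>default</code> with
                <code>use="required"</code> thus renders the schema itself
                invalid. This is consistent, since a required attribute always
                carries an explicit value leaving no room for a
                default.</para>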
              </answer>
            </qandaentry>

            <qandaentry xml:id="qand_IdElementOrAttribute">
              <question>
                <para>Which is the more natural choice for implementing
                id-like values such as the one in <xref
                linkend="fig_XsdEmployee"/>? Would you prefer <tag
                class="starttag">xs:element</tag> or <tag
                class="starttag">xs:attribute</tag>? Give reasons to explain
                your choice.</para>
              </question>

              <answer>
                <para>Database id values are simply numbers. Many relational
                databases offer IDENTITY declarations to define at most one
                column per table to become a data identity attribute,
                typically the primary key. So there is no inner structure,
                there never will be one, and each dataset has exactly one such
                value. Thus <tag class="starttag">xs:attribute</tag> in
                conjunction with <code>use="required"</code> is a natural
                choice.</para>
              </answer>
            </qandaentry>
          </qandadiv>
        </qandaset>
      </section>

      <section xml:id="sec_Keys">
        <title>Unique values</title>

        <para>So far we know about types to ensure data integrity regarding
        individual datasets. This however still permits all kinds of
        oddities:</para>

        <figure xml:id="fig_DuplicateEmployeeValue">
          <title>Duplicate value <code>id=21</code> for different
          employees</title>

          <programlisting language="xml">&lt;employeeList xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:noNamespaceSchemaLocation="employee.xsd"&gt;

    &lt;employee&gt;
        &lt;id&gt;21&lt;/id&gt;
        &lt;givenName&gt;Bob&lt;/givenName&gt;
        ...
    &lt;/employee&gt;

    &lt;employee&gt;
        &lt;id&gt;21&lt;/id&gt; <emphasis role="bold">&lt;!-- Oops: id=21 again? --&gt;</emphasis>
        &lt;givenName&gt;Gill&lt;/givenName&gt;
        ...
    &lt;/employee&gt; ...</programlisting>
        </figure>

        <para>In analogy to <xref linkend="fig_SqlDdlEmployee"/> we would like
        to add the equivalent of a primary key to <xref
        linkend="fig_ListOfEmployees"/>. XML Schema offers <tag
        class="starttag">xs:key</tag> and <tag
        class="starttag">xs:unique</tag> declarations for this purpose:</para>

        <programlisting language="xml">&lt;xs:element name="employee"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            &lt;xs:element name="id" type="xs:unsignedInt"/&gt;
            &lt;xs:element name="givenName" type="xs:string"/&gt;
            ...
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
&lt;/xs:element&gt;

&lt;xs:element name="employeeList"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            &lt;xs:element ref="employee" minOccurs="0" maxOccurs="unbounded"/&gt;
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;
    <emphasis role="bold">&lt;xs:key name="employeeId"&gt;</emphasis> <co
            xml:id="co_Keydef"/>
        <emphasis role="bold">&lt;xs:selector xpath="employee"/&gt;
        &lt;xs:field xpath="id"/&gt;
    &lt;/xs:key&gt;</emphasis>
&lt;/xs:element&gt;
</programlisting>
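
        <para>An <tag class="starttag">xs:unique</tag> declaration looks
        quite similar. The following sketch is an assumption going beyond the
        schema shown above: it supposes each <tag
        class="starttag">employee</tag> to contain at most one optional <tag
        class="starttag">email</tag> child, and the constraint name
        <code>employeeEmail</code> is chosen for illustration only:</para>

        <programlisting language="xml">&lt;xs:element name="employeeList"&gt;
    &lt;xs:complexType&gt;
        &lt;xs:sequence&gt;
            &lt;xs:element ref="employee" minOccurs="0" maxOccurs="unbounded"/&gt;
        &lt;/xs:sequence&gt;
    &lt;/xs:complexType&gt;

    &lt;!-- Key: every employee must have an id and all id values must differ --&gt;
    &lt;xs:key name="employeeId"&gt;
        &lt;xs:selector xpath="employee"/&gt;
        &lt;xs:field xpath="id"/&gt;
    &lt;/xs:key&gt;

    &lt;!-- Hypothetical: email values must differ, employees lacking an email are not checked --&gt;
    &lt;xs:unique name="employeeEmail"&gt;
        &lt;xs:selector xpath="employee"/&gt;
        &lt;xs:field xpath="email"/&gt;
    &lt;/xs:unique&gt;
&lt;/xs:element&gt;</programlisting>

        <para>The distinction resembles the one between a primary key and a
        unique constraint on a nullable column in a relational
        database.</para>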

        <para/>
      </section>
    </chapter>

    <chapter xml:id="basicXpathQuery">
      <title>Basic XPath query elements</title>

      <para>The <link linkend="gloss_XPath">XPath</link> <link
      linkend="gloss_W3C">W3C</link> standard acts on XML document instances
      like <link linkend="gloss_SQL">SQL</link> acts on relational data. We
      consider an example:</para>
    </chapter>
  </part>

  <appendix>
    <title>Glossary</title>

    <para/>

    <glossary>
      <glossentry xml:id="gloss_API">
        <glossterm><abbrev xlink:href="https://en.wikipedia.org/wiki/Api"
        xml:id="abbr_api">API</abbrev></glossterm>

        <glossdef>
          <para>Application programming interface</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_SqlDdl">
        <glossterm><abbrev
        xlink:href="https://en.wikipedia.org/wiki/Data_definition_language"
        xml:id="abbr_Ddl">DDL</abbrev> <link
        linkend="gloss_SQL">(SQL)</link></glossterm>

        <glossdef>
          <para>Data definition language. The subset of <link
          linkend="gloss_SQL">SQL</link> dealing with the creation of tables,
          views etc.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_DOM">
        <glossterm><acronym xlink:href="https://www.w3.org/DOM"
        xml:id="abbr_Dom">DOM</acronym></glossterm>

        <glossdef>
          <para>The <link linkend="gloss_W3C">W3C</link> <link
          xlink:href="https://www.w3.org/DOM">Document Object Model</link>
          standard</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_DTD">
        <glossterm><abbrev
        xlink:href="https://en.wikipedia.org/wiki/Document_Type_Declaration"
        xml:id="abbr_Dtd">DTD</abbrev></glossterm>

        <glossdef>
          <para>Document Type Definition. A standard older than <link
          linkend="gloss_RelaxNG">RelaxNG</link> and <link
          linkend="gloss_RelaxNG">XML schema</link> for defining an XML
          document's grammar.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_EBNF">
        <glossterm><abbrev>EBNF</abbrev></glossterm>

        <glossdef>
          <para>Extended Backus-Naur form.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_ftp">
        <glossterm><abbrev
        xlink:href="https://en.wikipedia.org/wiki/File_Transfer_Protocol"
        xml:id="abbr_Ftp">ftp</abbrev></glossterm>

        <glossdef>
          <para>File Transfer Protocol</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_FO">
        <glossterm><abbrev
        xlink:href="https://www.w3.org/TR/2001/REC-xsl-20011015/slice6.html#fo-section"
        xml:id="abbr_Fo">FO</abbrev></glossterm>

        <glossdef>
          <para>The Formatting Objects Standard for printable output
          generation</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_HDM">
        <glossterm><orgname xlink:href="https://www.hdm-stuttgart.de"
        xml:id="org_Hdm">Hdm</orgname></glossterm>

        <glossdef>
          <para xml:lang="de">Hochschule der Medien.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_Hql">
        <glossterm><abbrev
        xlink:href="https://docs.jboss.org/hibernate/orm/4.1/manual/en-US/html/ch16.html"
        xml:id="abbr_Hql">HQL</abbrev></glossterm>

        <glossdef>
          <para>The <link
          xlink:href="https://docs.jboss.org/hibernate/orm/3.3/reference/en/html/queryhql.html">Hibernate
          Query Language</link>.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_http">
        <glossterm><abbrev xlink:href="https://www.w3.org/Protocols"
        xml:id="abbr_Http">http</abbrev></glossterm>

        <glossdef>
          <para>The Hypertext Transfer Protocol</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_IDE">
        <glossterm><abbrev
        xlink:href="https://en.wikipedia.org/wiki/Integrated_development_environment"
        xml:id="abbr_Ide">IDE</abbrev></glossterm>

        <glossdef>
          <para>Integrated Development Environment</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_J2EE">
        <glossterm><trademark
        xlink:href="http://www.oracle.com/technetwork/java/javaee"
        xml:id="tm_J2ee">J2EE</trademark></glossterm>

        <glossdef>
          <para>Java Platform, Enterprise Edition</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_Java">
        <glossterm><trademark
        xlink:href="http://www.oracle.com/us/legal/third-party-trademarks/index.html">Java</trademark></glossterm>

        <glossdef>
          <para>General purpose programming language with support for object
          oriented concepts.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_Javadoc">
        <glossterm><trademark
        xlink:href="https://docs.oracle.com/javase/1.5.0/docs/guide/javadoc">Javadoc</trademark></glossterm>

        <glossdef>
          <para>Extracting documentation embedded in <link
          linkend="gloss_Java"><trademark>Java</trademark></link> source
          code.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_JDBC">
        <glossterm><trademark
        xlink:href="http://www.oracle.com/technetwork/java/javase/jdbc"
        xml:id="tm_Jdbc">JDBC</trademark></glossterm>

        <glossdef>
          <para>Java Database Connectivity.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_JDK">
        <glossterm><trademark
        xlink:href="http://www.oracle.com/technetwork/java/javase"
        xml:id="tm_Jdk">JDK</trademark></glossterm>

        <glossdef>
          <para>Java Development Kit.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_JPA">
        <glossterm><abbrev
        xlink:href="http://www.javaworld.com/javaworld/jw-01-2008/jw-01-jpa1.html"
        xml:id="abbr_Jpa">JPA</abbrev></glossterm>

        <glossdef>
          <para><link
          xlink:href="http://www.javaworld.com/javaworld/jw-01-2008/jw-01-jpa1.html">Java
          Persistence API</link></para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_JRE">
        <glossterm><trademark
        xlink:href="http://www.oracle.com/technetwork/java/javase"
        xml:id="tm_Jre">JRE</trademark></glossterm>

        <glossdef>
          <para>Java Runtime Environment</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_MathML">
        <glossterm><abbrev>MathML</abbrev></glossterm>

        <glossdef>
          <para><link xlink:href="https://www.w3.org/Math">Mathematical Markup
          Language</link></para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_MIB">
        <glossterm><orgname xlink:href="https://www.mi.hdm-stuttgart.de"
        xml:id="org_Mib">MIB</orgname></glossterm>

        <glossdef>
          <para xml:lang="de">Bachelor Studiengang Medieninformatik</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_Mysql">
        <glossterm><trademark
        xlink:href="https://www.mysql.com/about/legal/trademark.html"
        xml:id="tm_Mysql">MySQL</trademark></glossterm>

        <glossdef>
          <para>Open source relational database management system owned by
          Oracle.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_MP3">
        <glossterm><abbrev>MP3</abbrev></glossterm>

        <glossdef>
          <para>MPEG-1 Audio Layer III: a lossy audio coding format.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_ORM">
        <glossterm><abbrev>ORM</abbrev></glossterm>

        <glossdef>
          <para>Object-relational mapping: mapping the objects of an
          object-oriented language to the tables of a relational
          database.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_PHP">
        <glossterm><abbrev
        xlink:href="https://secure.php.net">PHP</abbrev></glossterm>

        <glossdef>
          <para>PHP: Hypertext Preprocessor, a server-side scripting
          language.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_RelaxNG">
        <glossterm><acronym
        xlink:href="http://relaxng.org">RelaxNG</acronym></glossterm>

        <glossdef>
          <para>An <link
          xlink:href="http://standards.iso.org/ittf/PubliclyAvailableStandards/c037605_ISO_IEC_19757-2_2003(E).zip">ISO</link>
          standard for defining the grammar of XML documents, used primarily
          in document-oriented applications.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_SAX">
        <glossterm><acronym
        xlink:href="http://www.saxproject.org">SAX</acronym></glossterm>

        <glossdef>
          <para><link xlink:href="http://www.saxproject.org">Simple API for
          XML</link>.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_SQL">
        <glossterm><acronym
        xlink:href="https://en.wikipedia.org/wiki/Sql">SQL</acronym></glossterm>

        <glossdef>
          <para><link xlink:href="https://en.wikipedia.org/wiki/SQL">Structured
          Query Language</link>.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_SVG">
        <glossterm><abbrev>SVG</abbrev></glossterm>

        <glossdef>
          <para><link xlink:href="https://www.w3.org/Graphics/SVG">Scalable
          Vector Graphics</link>.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_TCP">
        <glossterm><acronym
        xlink:href="https://en.wikipedia.org/wiki/Transmission_Control_Protocol"
        xml:id="abbr_Tcp">TCP</acronym></glossterm>

        <glossdef>
          <para>Transmission Control Protocol</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_URL">
        <glossterm><abbrev xlink:href="https://www.ietf.org/rfc/rfc1738.txt"
        xml:id="abbr_Url">URL</abbrev></glossterm>

        <glossdef>
          <para>Uniform Resource Locator</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_W3C">
        <glossterm><orgname
        xlink:href="http://www.w3.org">W3C</orgname></glossterm>

        <glossdef>
          <para>World Wide Web Consortium</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_XHTML">
        <glossterm><abbrev>XHTML</abbrev></glossterm>

        <glossdef>
          <para>HTML reformulated as an <link linkend="gloss_XML">XML</link>
          application; see the <link
          xlink:href="https://www.w3.org/TR/xhtml11">standard</link>.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_XML">
        <glossterm><abbrev
        xlink:href="https://www.w3.org/XML">XML</abbrev></glossterm>

        <glossdef>
          <para>The <link xlink:href="https://www.w3.org/XML">Extensible Markup
          Language</link>.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_XmlSchema">
        <glossterm>XML Schema</glossterm>

        <glossdef>
          <para>A W3C standard for defining grammars of XML documents. It
          offers a rich set of features for data modeling.</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_XPath">
        <glossterm><acronym xlink:href="https://www.w3.org/TR/xpath"
        xml:id="abbr_Xpath">XPath</acronym></glossterm>

        <glossdef>
          <para>XML Path Language</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_XSD">
        <glossterm><abbrev
        xlink:href="https://www.w3.org/XML/Schema">XSD</abbrev></glossterm>

        <glossdef>
          <para>XML Schema Definition Language</para>
        </glossdef>
      </glossentry>

      <glossentry xml:id="gloss_XSL">
        <glossterm><abbrev xlink:href="https://www.w3.org/Style/XSL"
        xml:id="abbr_Xsl">XSL</abbrev></glossterm>

        <glossdef>
          <para>Extensible Stylesheet Language</para>
        </glossdef>
      </glossentry>
    </glossary>
  </appendix>

  <xi:include href="../Common/bibliography.xml" xpointer="element(/1)"/>
</book>