diff --git a/Doc/sda1.xml b/Doc/sda1.xml new file mode 100755 index 0000000000000000000000000000000000000000..a0b2492357fc825aaf17e0066a12bb0a0a406c0a --- /dev/null +++ b/Doc/sda1.xml @@ -0,0 +1,419 @@ +<?xml version="1.0" encoding="UTF-8"?> +<book version="5.0" xmlns="http://docbook.org/ns/docbook" + xmlns:xlink="http://www.w3.org/1999/xlink" + xmlns:xi="http://www.w3.org/2001/XInclude" + xmlns:svg="http://www.w3.org/2000/svg" + xmlns:m="http://www.w3.org/1998/Math/MathML" + xmlns:html="http://www.w3.org/1999/xhtml" + xmlns:db="http://docbook.org/ns/docbook"> + <info> + <title xml:id="sda1"/> + + <author> + <personname><firstname/><surname/></personname> + + <affiliation> + <orgname/> + </affiliation> + </author> + + <pubdate/> + </info> + + <chapter xml:id="company"> + <title>Data models</title> + + <para>Wanted: A data model to represent entities and their + relationships</para> + + <section xml:id="companyRepresent"> + <title>Representing a company</title> + + <para>Entities to be represented:</para> + + <itemizedlist> + <listitem> + <para>Employees</para> + </listitem> + + <listitem> + <para>Departments</para> + </listitem> + + <listitem> + <para>Projects</para> + </listitem> + + <listitem> + <para>...</para> + </listitem> + </itemizedlist> + + <para>Relationships:</para> + + <itemizedlist> + <listitem> + <para>Each employee is assigned to exactly one + department</para> + </listitem> + + <listitem> + <para>Each employee may participate in an arbitrary number of + projects. 
For each project, an employee's weekly + working hours are defined.</para> + </listitem> + + <listitem> + <para>Each department may have a parent department</para> + </listitem> + </itemizedlist> + + <para>We start by creating a database table representing + <code>Department</code> entities:</para> + + <programlisting>CREATE TABLE Department( + id BIGINT PRIMARY KEY <co linkends="departmentTable-1" + xml:id="deptTableSurrogateKey"/> + ,name CHAR(20) NOT NULL UNIQUE <co linkends="departmentTable-2" + xml:id="deptTableBusinessKey"/> +);</programlisting> + + <calloutlist> + <callout arearefs="deptTableSurrogateKey" xml:id="departmentTable-1"> + <para>The <code>Department</code> table's + <emphasis>surrogate</emphasis> key</para> + </callout> + + <callout arearefs="deptTableBusinessKey" xml:id="departmentTable-2"> + <para>The <code>Department</code> table's unique name, e.g. + <quote>Sales</quote>.</para> + </callout> + </calloutlist> + + <para>Each <code>Employee</code> is affiliated (see <coref + linkend="employeeAffiliation"/> below) with exactly one + <code>Department</code>. The fax number is an employee's only nullable + property.</para> + + <programlisting>CREATE TABLE Employee ( + id BIGINT PRIMARY KEY + ,number INTEGER NOT NULL UNIQUE + ,name CHAR(20) NOT NULL + ,surname CHAR(20) NOT NULL + ,department BIGINT NOT NULL REFERENCES Department <co + xml:id="employeeAffiliation"/> + ,telephone CHAR(4) NOT NULL + ,fax CHAR(4) +);</programlisting> + + <qandaset> + <qandadiv> + <qandaentry> + <question> + <para>Explain the technical (<acronym linkend="gloss_Sql">SQL</acronym>) + difference between a <code>PRIMARY KEY</code> and a + <code>UNIQUE</code> constraint in <acronym linkend="gloss_SqlDdl">DDL</acronym> statements + with respect to data integrity.</para> + </question> + + <answer> + <para>A primary key attribute must not be <code>NULL</code>, and + its values must be unique within a given table. 
A + <code>UNIQUE</code> attribute allows <code>NULL</code> values. Complementing it + with a <code>NOT NULL</code> constraint yields a <link + linkend="glossCandidateKey">candidate key</link> that + identifies <emphasis>all</emphasis> datasets of a + given table.</para> + </answer> + </qandaentry> + </qandadiv> + + <qandadiv> + <qandaentry> + <question> + <para>Explain the above notion of <emphasis>surrogate + keys</emphasis>, business keys, and their roles for application + developers. Why should surrogate keys be preferred when defining + foreign key targets?</para> + </question> + + <answer> + <para>A surrogate key is defined as a technical means to + reference objects at database level via foreign keys in a + <quote>sound</quote> manner. Its values are not intended to be + visible to end users and should thus not appear in e.g. <abbrev + linkend="gloss_Gui">GUI</abbrev> components. This allows + surrogate keys to be implemented by immutable attributes.</para> + + <para>In contrast, business keys should not be used as reference + targets. This avoids the transactional hassles imposed by e.g. + <code>ON UPDATE CASCADE</code> and other transitive foreign key + attribute updates caused by changing business key values. 
The + following variant uses a business key as a foreign key + target:</para> + + <programlisting>CREATE TABLE Department( + id BIGINT PRIMARY KEY + ,name CHAR(20) NOT NULL UNIQUE <co linkends="cascadeBusinessKey-1" + xml:id="employeeDepartmentBusiness"/> +); + +CREATE TABLE Employee ( + id BIGINT PRIMARY KEY + ,number INTEGER NOT NULL UNIQUE + ,name CHAR(20) NOT NULL + ,surname CHAR(20) NOT NULL + ,department CHAR(20) NOT NULL REFERENCES Department(name) <co + linkends="cascadeBusinessKey-2" + xml:id="employeeDepartmentCascade"/> + ,telephone CHAR(4) NOT NULL + ,fax CHAR(4) +);</programlisting> + + <calloutlist> + <callout arearefs="employeeDepartmentBusiness" + xml:id="cascadeBusinessKey-1"> + <para>Defining a business key</para> + </callout> + + <callout arearefs="employeeDepartmentCascade" + xml:id="cascadeBusinessKey-2"> + <para>Referencing a business key</para> + </callout> + </calloutlist> + + <para>If we update a <code>Department.name</code> attribute, all + <code>Employee</code> datasets referencing this department have + to be changed as well. Thus the intended change of a single + attribute may trigger a large number of transitive updates, + adding to potential lock conflicts in a concurrent database + application.</para> + + <para>If this is nevertheless intended, we should at least add an + <code>ON UPDATE CASCADE</code> clause:</para> + + <programlisting>CREATE TABLE Employee ( +... 
+ ,department CHAR(20) NOT NULL REFERENCES Department(name) + <emphasis role="bold">ON UPDATE CASCADE</emphasis> +...</programlisting> + </answer> + </qandaentry> + </qandadiv> + </qandaset> + + <para>We now add projects <coref linkend="projectEntity"/> and employee + affiliations to projects <coref linkend="projectAffiliation"/>:</para> + + <programlisting>CREATE TABLE Project ( <co xml:id="projectEntity"/> + id BIGINT PRIMARY KEY + ,number INTEGER NOT NULL UNIQUE + ,name CHAR(20) NOT NULL UNIQUE + ,status SMALLINT NOT NULL + ,CHECK (status IN (1,2,3,4)) -- 1: planned, 2: active, 3: on hold, 4: finished -- +); + +CREATE TABLE EmployeeProject ( <co xml:id="projectAffiliation"/> + employee BIGINT NOT NULL REFERENCES Employee + ,project BIGINT NOT NULL REFERENCES Project + ,PRIMARY KEY(employee, project) + ,weeklyHours INTEGER NOT NULL +);</programlisting> + + <qandaset> + <qandadiv> + <qandaentry> + <question> + <para>The property <code>status</code> has been implemented as + type <code>SMALLINT</code>. 
Consider the following + alternative:</para> + + <programlisting>CREATE TABLE Project ( + id BIGINT PRIMARY KEY + ,number INTEGER NOT NULL UNIQUE + ,name CHAR(20) NOT NULL UNIQUE + ,<emphasis role="bold">status CHAR(10) NOT NULL</emphasis> + ,CHECK (<emphasis role="bold">status IN ('planned', 'active', 'onHold', 'finished')</emphasis>) +);</programlisting> + + <para>What are the pros and cons?</para> + </question> + + <answer> + <glosslist> + <glossentry> + <glossterm>Advantage of <code>CHAR(10)</code> implementation</glossterm> + + <glossdef> + <para>Data residing in a database can easily be retrieved + by <code>SELECT</code> statements, and + both <code>WHERE</code> clauses + and result sets are easy to interpret:</para> + + <programlisting>SELECT * FROM Project +WHERE status = 'active'</programlisting> + + <para>In contrast, the <code>SMALLINT</code> + implementation requires:</para> + + <programlisting>SELECT * FROM Project +WHERE status = 2 -- any clue about status definitions when lacking documentation? --</programlisting> + </glossdef> + </glossentry> + + <glossentry> + <glossterm>Advantage of <code>SMALLINT</code> + implementation</glossterm> + + <glossdef> + <para>When building applications, our <code>SMALLINT</code> + status field is best mapped to a user-defined + <code>enum</code>:</para> + + <programlisting>public enum ProjectStatus { + planned + ,active + ,onHold + ,finished; +}</programlisting> + + <para>This allows for easily understandable <code>switch</code> + statements:</para> + + <programlisting>ProjectStatus status; + ... +switch (status) { + case planned: + ... + break; + + case active: + ... + break; + default: + ... +} +...</programlisting> + + <para>In fact, starting with Java 7, the above + <code>switch</code> statement works for + <classname>String</classname> objects as well:</para> + + <programlisting>String status; + ... +switch (status) { + case "planned": + ... + break; + + case "active": + ... + break; + default: + ... 
+}</programlisting> + + <para>However, this is merely syntactic sugar: the above code + is conceptually equivalent to an <code>if ... else if ... + else</code> block comparing strings via + <methodname>String.equals()</methodname> rather than + switching on plain integer values:</para> + + <programlisting>String status; + ... + if (status.equals("planned")) { + ... + } else if (status.equals("active")) { + ... + } else { + ... + }</programlisting> + </glossdef> + </glossentry> + </glosslist> + </answer> + </qandaentry> + </qandadiv> + </qandaset> + </section> + + <section> + <title>A media store</title> + + <para>The media store example has been chosen as a second main example + in addition to the company database. You will develop it + step by step, thereby exploring lecture-related features and + technological standards.</para> + + <para>We have outlined a schema forming a building block for a company's + internal organization. You are now expected to develop a data + model suitable for building a media store. After talking to potential + customers, the following entities have been identified as + required:</para> + + <glosslist> + <glossentry> + <glossterm>Album</glossterm> + + <glossdef> + <para>The basic building block to be sold. An album + consists of tracks. Each album has a name, which is not necessarily + unique across your media store. Albums may be related to + performing artists.</para> + </glossdef> + </glossentry> + + <glossentry> + <glossterm>Track</glossterm> + + <glossdef> + <para>Each track (song) belongs to exactly one album. 
It is the smallest media + item and has the following properties:</para> + + <glosslist> + <glossentry> + <glossterm>name</glossterm> + + <glossdef> + <para>A song's name is unique within its + <quote>owning</quote> album.</para> + </glossdef> + </glossentry> + + <glossentry> + <glossterm>format</glossterm> + + <glossdef> + <para>One of <code>mp3</code>, <code>flac</code> or + <code>mp4</code>.</para> + </glossdef> + </glossentry> + </glosslist> + </glossdef> + </glossentry> + + <glossentry> + <glossterm>Artist</glossterm> + + <glossdef> + <para>An artist may participate in an arbitrary number of albums. It shall be + possible to annotate an artist's <quote>lead</quote> role with + respect to an album he/she is affiliated with.</para> + + <itemizedlist> + <listitem> + <para>name, surname</para> + </listitem> + + <listitem> + <para>sex</para> + </listitem> + </itemizedlist> + </glossdef> + </glossentry> + </glosslist> + </section> + </chapter> + + <xi:include href="glossary.xml" xpointer="element(/1)"/> +</book>