<?xml version="1.0" encoding="UTF-8"?>
<chapter version="5.1" xml:id="sd1ReadCharStreams"
         xmlns="http://docbook.org/ns/docbook"
         xmlns:xlink="http://www.w3.org/1999/xlink"
         xmlns:xi="http://www.w3.org/2001/XInclude"
         xmlns:svg="http://www.w3.org/2000/svg"
         xmlns:ns="http://docbook.org/ns/transclusion"
         xmlns:m="http://www.w3.org/1998/Math/MathML"
         xmlns:html="http://www.w3.org/1999/xhtml"
         xmlns:db="http://docbook.org/ns/docbook">
  <title>Reading character streams</title>

  <section xml:id="sd1ReadCharStreamsPrepare">
    <title>Preparations</title>

    <itemizedlist>
      <listitem>
        <para>Section <quote>Interfaces</quote> of chapter 6 in .</para>
      </listitem>

      <listitem>
        <para>Chapter 8 of excluding the last section <quote>The Standard
        Streams</quote>.</para>
      </listitem>

      <listitem>
        <para><link linkend="sd1FigExceptionBasics">Exception basics</link>
        as already discussed in <xref
        linkend="sw1ChapterErrorHandling"/>.</para>
      </listitem>
    </itemizedlist>
  </section>

  <section xml:id="sd1GnuWc">
    <title>Exercises</title>

    <qandaset defaultlabel="qanda" xml:id="sd1QandaLinenumbers">
      <title>Adding line numbers to text files</title>

      <qandadiv>
        <qandaentry>
          <question>
            <para>We want to add line numbers to arbitrary text files that
            are not necessarily related to programming. Consider the following
            HTML example input:</para>

            <programlisting language="xml">&lt;html&gt;
    &lt;head&gt;
        &lt;title&gt;A simple HTML example&lt;/title&gt;
    &lt;/head&gt;
    &lt;body&gt;
        &lt;p&gt;Some text ... &lt;/p&gt;
    &lt;/body&gt;
&lt;/html&gt;</programlisting>

            <para>Your application shall add line numbers:</para>

            <programlisting language="xml">1: &lt;html&gt;
2:     &lt;head&gt;
3:         &lt;title&gt;A simple HTML example&lt;/title&gt;
4:     &lt;/head&gt;
5:     &lt;body&gt;
6:         &lt;p&gt;Some text ... &lt;/p&gt;
7:     &lt;/body&gt;
8: &lt;/html&gt;</programlisting>

            <para>Hints:</para>

            <orderedlist>
              <listitem>
                <para>Given the name of an existing file you may create an
                instance of <classname
                xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/BufferedReader.html">BufferedReader</classname>:</para>

                <programlisting language="java">final FileReader fileReader = new FileReader(inputFileName);
final BufferedReader inputBufferedReader = new BufferedReader(fileReader);</programlisting>
              </listitem>

              <listitem>
                <para>You will have to handle possible <classname
                xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/FileNotFoundException.html">FileNotFoundException</classname>
                problems and provide meaningful error messages.</para>
              </listitem>

              <listitem>
                <para>The <classname
                xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/BufferedReader.html">BufferedReader</classname>
                class provides a method <methodname
                xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/BufferedReader.html#readLine--">readLine()</methodname>
                that lets you read a given file's content line by line.</para>

                <caution>
                  <para>Even if a file exists you may still encounter <classname
                  xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/IOException.html">IOException</classname>
                  problems related <acronym>e.g.</acronym> to missing
                  permissions.</para>
                </caution>
              </listitem>
            </orderedlist>
          </question>

          <answer>
            <annotation role="make">
              <para role="eclipse">P/Sd1/Wc/readFile</para>
            </annotation>

            <para>This solution reacts both to nonexistent files and to
            general IO problems:</para>

            <screen>File not found: Testdata/input.java</screen>

            <para>Two test cases deal with a readable file and a non-existing
            file, the latter expecting an exception:</para>

            <programlisting language="java">@Test
public void testReadFileOk() throws FileNotFoundException, IOException {
  ReadFile.openStream("Testdata/input.txt"); // Existing file
}
@Test (expected=FileNotFoundException.class) // We expect this exception to be
                                             // thrown.
public void testReadMissingFile() throws FileNotFoundException, IOException {
  ReadFile.openStream("Testdata/input.java"); // Does not exist
}</programlisting>

            <para>Note that the second test will only succeed if a
            <classname
            xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/FileNotFoundException.html">FileNotFoundException</classname>
            is thrown.</para>
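
            <para>The line numbering itself might be sketched as follows.
            This is a minimal sketch under stated assumptions rather than the
            reference solution: the class <classname>LineNumbers</classname>
            and its method <methodname>addLineNumbers()</methodname> are
            hypothetical names chosen for illustration. It combines the hints
            above, using <methodname>readLine()</methodname> in a loop and
            separate error messages for missing files and general IO
            problems:</para>

            <programlisting language="java">import java.io.BufferedReader;
import java.io.FileNotFoundException;
import java.io.FileReader;
import java.io.IOException;

// Hypothetical class for illustration, not the reference solution
public class LineNumbers {

  /** Prefix each line supplied by the given reader with "n: ". */
  static String addLineNumbers(final BufferedReader reader) throws IOException {
    final StringBuilder result = new StringBuilder();
    int lineNumber = 1;
    String line;
    while (null != (line = reader.readLine())) { // null signals end of input
      result.append(lineNumber++).append(": ").append(line).append('\n');
    }
    return result.toString();
  }

  public static void main(final String[] args) {
    if (args.length != 1) {
      System.err.println("Usage: LineNumbers filename");
      return;
    }
    // try-with-resources closes the reader even if reading fails
    try (BufferedReader reader = new BufferedReader(new FileReader(args[0]))) {
      System.out.print(addLineNumbers(reader));
    } catch (final FileNotFoundException e) {
      System.err.println("File not found: " + args[0]);
    } catch (final IOException e) {
      System.err.println("Error reading file '" + args[0] + "': " + e.getMessage());
    }
  }
}</programlisting>

            <para>Catching <classname>FileNotFoundException</classname> before
            <classname>IOException</classname> matters since the former is a
            subclass of the latter.</para>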
          </answer>
        </qandaentry>
      </qandadiv>
    </qandaset>

    <qandaset defaultlabel="qanda" xml:id="sd1QandaWc">
      <title>A partial implementation of GNU UNIX
      <command>wc</command></title>

      <qandadiv>
        <qandaentry>
          <question>
            <para>In this exercise we will partly implement the (GNU) UNIX
            command line tool <command
            xlink:href="https://www.gnu.org/software/coreutils/manual/html_node/wc-invocation.html">wc</command>
            (word count). Prior to starting this exercise you may want
            to:</para>

            <itemizedlist>
              <listitem>
                <para>Execute <command
                xlink:href="https://www.gnu.org/software/coreutils/manual/html_node/wc-invocation.html">wc</command>
                on sample text files, e.g. a Java source file or
                similar:</para>

                <screen>goik &gt;wc BoundedIntegerStore.java
  58  198 1341 BoundedIntegerStore.java
</screen>

                <para>What do these three numbers 58, 198 and 1341 mean?
                Execute <command>wc</command> <option>--help</option> or
                <command>man</command> <option>wc</option> or read the <link
                xlink:href="https://www.gnu.org/software/coreutils/manual/html_node/wc-invocation.html">HTML
                documentation</link>.</para>
              </listitem>

              <listitem>
                <para><command>wc</command> may process several files in a
                single invocation, producing an extra line <coref
                linkend="sd1PlWcExtraLine"/> summing up all values:</para>

                <screen>goik &gt;wc bibliography.xml swd1.xml
    69     83   2087 bibliography.xml
  6809  18252 248894 swd1.xml
  <emphasis role="bold">6878  18335 250981 total</emphasis> <co
                    xml:id="sd1PlWcExtraLine"/>
</screen>
              </listitem>

              <listitem>
                <para><command>wc</command> can be used in <link
                xlink:href="https://en.wikipedia.org/wiki/Pipeline_(Unix)">pipes</link>
                like:</para>

                <screen>goik &gt;grep int BoundedIntegerStore.java | wc
     12      76     516</screen>

                <para>The first value of the above output <quote>12 76
                516</quote> tells us that our file
                <filename>BoundedIntegerStore.java</filename> has 12 lines
                containing the string <quote>int</quote>.</para>
              </listitem>
            </itemizedlist>

            <para>A partial implementation shall offer all features mentioned
            in the introduction above. The following steps are a proposal
            for your implementation:</para>

            <orderedlist>
              <listitem>
                <para>Write a method counting the number of words within a
                given string. We assume words to be separated by at least one
                white space character (<code>space</code> or <code>\t</code>).
                Write some tests to ensure correct behaviour.</para>
              </listitem>

              <listitem>
                <para>Read input either from a list of files or from standard
                input depending on the number of arguments to <code
                language="java">main(String[] args)</code>:</para>

                <itemizedlist>
                  <listitem>
                    <para>If <code language="java">args.length == 0</code>
                    read from standard input.</para>
                  </listitem>

                  <listitem>
                    <para>If <code language="java">0 &lt; args.length</code>
                    try to interpret the arguments as filenames.</para>
                  </listitem>
                </itemizedlist>
              </listitem>

              <listitem>
                <para>Write a class <classname>TextFileStatistics</classname>
                able to count the characters, words and lines of a
                single input file. Instances of this class may be initialized
                from a <classname
                xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/BufferedReader.html">BufferedReader</classname>.</para>

                <para>Write corresponding tests.</para>
              </listitem>

              <listitem>
                <para>You may create an instance of <classname
                xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/BufferedReader.html">BufferedReader</classname>
                from <link
                xlink:href="https://docs.oracle.com/javase/10/docs/api/java/lang/System.html#in">System.in</link>
                via:</para>

                <programlisting language="java">new BufferedReader(new InputStreamReader(System.in))</programlisting>
              </listitem>

              <listitem>
                <para>Create an executable Jar archive and execute some
                examples. The UNIX command <command
                xlink:href="https://www.gnu.org/software/coreutils/manual/html_node/cat-invocation.html#cat-invocation">cat</command>
                writes a file's content to standard output. This output may be
                piped as input to your application as in <code>cat
                filename.txt | java -jar .../wc-1.0.jar</code>.</para>
              </listitem>
            </orderedlist>
          </question>

          <answer>
            <annotation role="make">
              <para role="eclipse">P/Sd1/Wc/wc</para>
            </annotation>

            <para>Executing <command>mvn</command> <option>package</option>
            creates an executable Jar file
            <filename>../target/wc-1.0.jar</filename>. We test both modes of
            operation:</para>

            <glosslist>
              <glossentry>
                <glossterm>Reading from standard input</glossterm>

                <glossdef>
                  <screen>goik &gt;cat Testdata/input.html | java -jar target/wc-1.0.jar
  9    14    137</screen>
                </glossdef>
              </glossentry>

              <glossentry>
                <glossterm>Passing file names as parameters</glossterm>

                <glossdef>
                  <screen>goik &gt;java -jar target/wc-1.0.jar Testdata/*
  9    14    137  Testdata/input.html
  4     5     41  Testdata/model.css
 13    19    178  total</screen>
                </glossdef>
              </glossentry>
            </glosslist>
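
            <para>Choosing between both modes of operation only requires
            inspecting <code language="java">args.length</code>. The following
            is a minimal sketch and not the reference solution: the class
            <classname>InputSelection</classname> and its method
            <methodname>countLines()</methodname> are hypothetical, and
            counting lines merely stands in for the full statistics:</para>

            <programlisting language="java">import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.io.InputStreamReader;

// Hypothetical class for illustration, not the reference solution
public class InputSelection {

  /** Count the lines supplied by a reader; a stand-in for the real statistics. */
  static int countLines(final BufferedReader reader) throws IOException {
    int lines = 0;
    while (null != reader.readLine()) {
      lines++;
    }
    return lines;
  }

  public static void main(final String[] args) {
    if (args.length == 0) { // No arguments: read from standard input
      try (BufferedReader in = new BufferedReader(new InputStreamReader(System.in))) {
        System.out.println(countLines(in));
      } catch (final IOException e) {
        System.err.println("Error reading standard input: " + e.getMessage());
      }
    } else { // Interpret each argument as a file name
      for (final String fileName : args) {
        try (BufferedReader in = new BufferedReader(new FileReader(fileName))) {
          System.out.println(countLines(in) + " " + fileName);
        } catch (final IOException e) { // Includes FileNotFoundException
          System.err.println("Cannot read '" + fileName + "': " + e.getMessage());
        }
      }
    }
  }
}</programlisting>

            <para>Since both branches end up with a
            <classname>BufferedReader</classname>, the counting code does not
            need to know where its input comes from.</para>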

            <para><xref linkend="glo_Junit"/> tests of internal
            functionality:</para>

            <glosslist>
              <glossentry>
                <glossterm>Counting words in a given string:</glossterm>

                <glossdef>
                  <programlisting language="java">@Test
public void testNoWord() {
  Assert.assertEquals("Just white space", 0,
       TextFileStatistics.findNoOfWords(" \t"));
}

@Test
public void testSingleWord() {
  final String s = "We're";
  Assert.assertEquals("text='" + s + "'", 1,
       TextFileStatistics.findNoOfWords(s));
}

@Test
public void testTwoWords() {
  final String s = "We are";
  Assert.assertEquals("text='" + s + "'", 2,
       TextFileStatistics.findNoOfWords(s));
}

@Test
public void testWordsWhiteHead() {
  final String s = "\t \tBegin_space";
  Assert.assertEquals("text='" + s + "'", 1,
       TextFileStatistics.findNoOfWords(s));
}

@Test
public void testWordsWhiteTail() {
  final String s = "End_space \t ";
  Assert.assertEquals("text='" + s + "'", 1,
       TextFileStatistics.findNoOfWords(s));
}

@Test
public void testWhiteMulti() {
  final String s = "    some\t\tinterspersed   \t  spaces \t\t ";
  Assert.assertEquals("text='" + s + "'", 3,
        TextFileStatistics.findNoOfWords(s));
}</programlisting>
                </glossdef>
              </glossentry>

              <glossentry>
                <glossterm>Analyzing test file data:</glossterm>

                <glossdef>
                  <programlisting language="java">@Test
public void testTwoInputFiles() throws FileNotFoundException, IOException {

  final String model_css_filename =
    "Testdata/model.css",      //  4 lines   5  words  41 characters
      input_html_filename =
    "Testdata/input.html";     //  9 lines  14  words 137 characters
                               //_________________________________________
                               // total 13 lines  19  words 178 characters

  final TextFileStatistics
    model_css = new TextFileStatistics(
      new BufferedReader(new FileReader(model_css_filename)),
           model_css_filename),

    input_html = new TextFileStatistics(new BufferedReader(
        new FileReader(input_html_filename)), input_html_filename);

  // File Testdata/model.css
  Assert.assertEquals( 4, model_css.numLines);
  Assert.assertEquals( 5, model_css.numWords);
  Assert.assertEquals(41, model_css.numCharacters);

  // File Testdata/input.html
  Assert.assertEquals(  9, input_html.numLines);
  Assert.assertEquals( 14, input_html.numWords);
  Assert.assertEquals(137, input_html.numCharacters);

  // Grand total
  Assert.assertEquals( 13, TextFileStatistics.getTotalNumLines());
  Assert.assertEquals( 19, TextFileStatistics.getTotalNumWords());
  Assert.assertEquals(178, TextFileStatistics.getTotalNumCharacters());
}</programlisting>
                </glossdef>
              </glossentry>
            </glosslist>
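
            <para>The word counting tests above may be satisfied by tracking
            whether the current character belongs to a word. The following is
            a minimal sketch, not the reference implementation of
            <methodname>findNoOfWords()</methodname>: the class
            <classname>WordCount</classname> and its method
            <methodname>countWords()</methodname> are hypothetical names, and
            only space and tab are treated as white space as stated in the
            exercise:</para>

            <programlisting language="java">// Hypothetical class for illustration, not the reference solution
public class WordCount {

  /** Count words separated by one or more space or tab characters. */
  static int countWords(final String text) {
    int words = 0;
    boolean insideWord = false;
    for (final char c : text.toCharArray()) {
      final boolean whiteSpace = c == ' ' || c == '\t';
      if (whiteSpace) {
        insideWord = false;
      } else if (!insideWord) { // First character of a new word
        insideWord = true;
        words++;
      }
    }
    return words;
  }

  public static void main(final String[] args) {
    // Prints 3
    System.out.println(countWords("    some\t\tinterspersed   \t  spaces \t\t "));
  }
}</programlisting>

            <para>Leading, trailing and repeated white space does not affect
            the result because a word is only counted at its first
            character.</para>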
          </answer>
        </qandaentry>
      </qandadiv>
    </qandaset>
  </section>
</chapter>