<?xml version="1.0" encoding="UTF-8"?> <chapter version="5.1" xml:id="sd1ReadCharStreams" xmlns="http://docbook.org/ns/docbook" xmlns:xlink="http://www.w3.org/1999/xlink" xmlns:xi="http://www.w3.org/2001/XInclude" xmlns:svg="http://www.w3.org/2000/svg" xmlns:ns="http://docbook.org/ns/transclusion" xmlns:m="http://www.w3.org/1998/Math/MathML" xmlns:html="http://www.w3.org/1999/xhtml" xmlns:db="http://docbook.org/ns/docbook"> <title>Reading character streams</title> <section xml:id="sd1ReadCharStreamsPrepare"> <title>Preparations</title> <itemizedlist> <listitem> <para>Section <quote>Interfaces</quote> of chapter 6 in .</para> </listitem> <listitem> <para>Chapter 8 of excluding the last chapter <quote>The Standard Streams</quote>.</para> </listitem> <listitem> <para><link linkend="sd1FigExceptionBasics">Exception basics</link> already being discussed in <xref linkend="sw1ChapterErrorHandling"/>.</para> </listitem> </itemizedlist> </section> <section xml:id="sd1GnuWc"> <title>Exercises</title> <qandaset defaultlabel="qanda" xml:id="sd1QandaLinenumbers"> <title>Adding line numbers to text files</title> <qandadiv> <qandaentry> <question> <para>We want to add line numbers to arbitrary text files not necessarily being related to programming. Consider the following HTML example input:</para> <programlisting language="java"><html> <head> <title>A simple HTML example</title> </head> <body> <p>Some text ... </p> </body> </html></programlisting> <para>Your application shall add line numbers:</para> <programlisting language="xml">1: <html> 2: <head> 3: <title>A simple HTML example</title> 4: </head> 5: <body> 6: <p>Some text ... 
</p> 7: </body> 8: </html></programlisting>
<para>Hints:</para>
<orderedlist>
<listitem>
<para>Given the name of an existing file you may create an instance of <classname xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/BufferedReader.html">BufferedReader</classname>:</para>
<programlisting language="java">final FileReader fileReader = new FileReader(inputFileName);
final BufferedReader inputBufferedReader = new BufferedReader(fileReader);</programlisting>
</listitem>
<listitem>
<para>You will have to handle possible <classname xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/FileNotFoundException.html">FileNotFoundException</classname> problems by providing meaningful error messages.</para>
</listitem>
<listitem>
<para>The <classname xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/BufferedReader.html">BufferedReader</classname> class provides a method <methodname xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/BufferedReader.html#readLine--">readLine()</methodname> allowing you to read a given file's content line by line.</para>
<caution>
<para>Even if a file exists you may encounter <classname xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/IOException.html">IOException</classname> problems related to <acronym>e.g.</acronym> missing permissions.</para>
</caution>
</listitem>
</orderedlist>
</question>
<answer>
<annotation role="make">
<para role="eclipse">P/Sd1/Wc/readFile</para>
</annotation>
<para>This solution reacts both to nonexistent files and general IO problems:</para>
<screen>File not found: Testdata/input.java</screen>
<para>Two test cases cover both a readable and a non-existing file, the latter including the expected exception:</para>
<programlisting language="java">@Test
public void testReadFileOk() throws FileNotFoundException, IOException {
  ReadFile.openStream("Testdata/input.txt"); // Existing file
}

@Test (expected=FileNotFoundException.class) // We expect this exception to be thrown.
public void testReadMissingFile() throws FileNotFoundException, IOException {
  ReadFile.openStream("Testdata/input.java"); // Does not exist
}</programlisting>
<para>Notice that the second test succeeds only if a <classname xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/FileNotFoundException.html">FileNotFoundException</classname> is thrown.</para>
</answer>
</qandaentry>
</qandadiv>
</qandaset>
<qandaset defaultlabel="qanda" xml:id="sd1QandaWc">
<title>A partial implementation of GNU UNIX <command>wc</command></title>
<qandadiv>
<qandaentry>
<question>
<para>In this exercise we will partly implement the (GNU) UNIX command line tool <command xlink:href="https://www.gnu.org/software/coreutils/manual/html_node/wc-invocation.html">wc</command> (word count). Prior to starting this exercise you may want to:</para>
<itemizedlist>
<listitem>
<para>Execute <command xlink:href="https://www.gnu.org/software/coreutils/manual/html_node/wc-invocation.html">wc</command> on sample text files, e.g. a Java source file or similar:</para>
<screen>goik >wc BoundedIntegerStore.java
58 198 1341 BoundedIntegerStore.java
</screen>
<para>What do these three numbers 58, 198 and 1341 mean?
Execute <command>wc</command> <option>--help</option> or <command>man</command> <option>wc</option> or read the <link xlink:href="https://www.gnu.org/software/coreutils/manual/html_node/wc-invocation.html">HTML documentation</link>.</para>
</listitem>
<listitem>
<para><command>wc</command> may process several files at once, thereby producing an extra line <coref linkend="sd1PlWcExtraLine"/> summing up all values:</para>
<screen>goik >wc bibliography.xml swd1.xml
69 83 2087 bibliography.xml
6809 18252 248894 swd1.xml
<emphasis role="bold">6878 18335 250981 total</emphasis> <co xml:id="sd1PlWcExtraLine"/>
</screen>
</listitem>
<listitem>
<para><command>wc</command> can be used in <link xlink:href="https://en.wikipedia.org/wiki/Pipeline_(Unix)">pipes</link> like:</para>
<screen>goik >grep int BoundedIntegerStore.java | wc
12 76 516</screen>
<para>The above output <quote>12 76 516</quote> tells us that our file <filename>BoundedIntegerStore.java</filename> has 12 lines containing the string <quote>int</quote>.</para>
</listitem>
</itemizedlist>
<para>A partial implementation shall offer all features mentioned in the introduction above. The following steps are a proposal for your implementation:</para>
<orderedlist>
<listitem>
<para>Write a method counting the number of words within a given string. We assume words to be separated by at least one white space character (<code>space</code> or <code>\t</code>).
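As a starting point, such a method might be sketched as follows. This is only one possible approach and not a reference solution: the class name <classname>WordCountSketch</classname> is a hypothetical placeholder, and the method name <methodname>findNoOfWords</methodname> is merely borrowed from the tests shown in the answer to this exercise.
<programlisting language="java">/** Hypothetical helper class; its name is an assumption made for illustration. */
public final class WordCountSketch {

  private WordCountSketch() {} // Utility class, no instances

  /**
   * Counts words separated by one or more space or tab characters.
   *
   * @param line A single line of text, must not be null
   * @return The number of words, 0 if the line is empty or white space only
   */
  public static int findNoOfWords(final String line) {
    final String trimmed = line.trim(); // Strip leading and trailing white space
    if (trimmed.isEmpty()) {
      return 0;                         // Nothing but white space
    }
    return trimmed.split("[ \\t]+").length; // Split at runs of spaces or tabs
  }
}</programlisting>
Trimming first avoids the empty leading token that <methodname>split</methodname> would otherwise return for text starting with white space.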
Write some tests to ensure correct behaviour.</para>
</listitem>
<listitem>
<para>Read input either from a list of files or from standard input depending on the number of arguments to <code language="java">main(String[] args)</code>:</para>
<itemizedlist>
<listitem>
<para>If <code language="java">args.length == 0</code> read from standard input.</para>
</listitem>
<listitem>
<para>If <code language="java">0 < args.length</code> try to interpret the arguments as file names.</para>
</listitem>
</itemizedlist>
</listitem>
<listitem>
<para>Write a class <classname>TextFileStatistics</classname> able to count the characters, words and lines of a single input file. Instances of this class may be initialized from a <classname xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/BufferedReader.html">BufferedReader</classname>.</para>
<para>Write corresponding tests.</para>
</listitem>
<listitem>
<para>You may create an instance of <classname xlink:href="https://docs.oracle.com/javase/10/docs/api/java/io/BufferedReader.html">BufferedReader</classname> from <link xlink:href="https://docs.oracle.com/javase/10/docs/api/java/lang/System.html#in">System.in</link> via:</para>
<programlisting language="java">new BufferedReader(new InputStreamReader(System.in))</programlisting>
</listitem>
<listitem>
<para>Create an executable Jar archive and execute some examples. The UNIX command <command xlink:href="https://www.gnu.org/software/coreutils/manual/html_node/cat-invocation.html#cat-invocation">cat</command> writes a file's content to standard output. This output may be piped as input to your application as in <code>cat filename.txt | java -jar .../wc-1.0.jar</code>.</para>
</listitem>
</orderedlist>
</question>
<answer>
<annotation role="make">
<para role="eclipse">P/Sd1/Wc/wc</para>
</annotation>
<para>Executing <command>mvn</command> <option>package</option> creates an executable Jar file <filename>../target/wc-1.0.jar</filename>.
We test both ways of operation:</para>
<glosslist>
<glossentry>
<glossterm>Reading from standard input</glossterm>
<glossdef>
<screen>goik >cat Testdata/input.html | java -jar target/wc-1.0.jar
9 14 137</screen>
</glossdef>
</glossentry>
<glossentry>
<glossterm>Passing file names as parameters</glossterm>
<glossdef>
<screen>goik >java -jar target/wc-1.0.jar Testdata/*
9 14 137 Testdata/input.html
4 5 41 Testdata/model.css
13 19 178 total</screen>
</glossdef>
</glossentry>
</glosslist>
<para><xref linkend="glo_Junit"/> tests of internal functionality:</para>
<glosslist>
<glossentry>
<glossterm>Counting words in a given string:</glossterm>
<glossdef>
<programlisting language="java">@Test
public void testNoWord() {
  Assert.assertEquals("Just white space", 0, TextFileStatistics.findNoOfWords(" \t"));
}

@Test
public void testSingleWord() {
  final String s = "We're";
  Assert.assertEquals("text='" + s + "'", 1, TextFileStatistics.findNoOfWords(s));
}

@Test
public void testTwoWords() {
  final String s = "We are";
  Assert.assertEquals("text='" + s + "'", 2, TextFileStatistics.findNoOfWords(s));
}

@Test
public void testWordsWhiteHead() {
  final String s = "\t \tBegin_space";
  Assert.assertEquals("text='" + s + "'", 1, TextFileStatistics.findNoOfWords(s));
}

@Test
public void testWordsWhiteTail() {
  final String s = "End_space \t ";
  Assert.assertEquals("text='" + s + "'", 1, TextFileStatistics.findNoOfWords(s));
}

@Test
public void testWhiteMulti() {
  final String s = " some\t\tinterspersed \t spaces \t\t ";
  Assert.assertEquals("text='" + s + "'", 3, TextFileStatistics.findNoOfWords(s));
}</programlisting>
</glossdef>
</glossentry>
<glossentry>
<glossterm>Analyzing test file data:</glossterm>
<glossdef>
<programlisting language="java">@Test
public void testTwoInputFiles() throws FileNotFoundException, IOException {

  final String
    model_css_filename = "Testdata/model.css",   // 4 lines 5 words 41 character
    input_html_filename = "Testdata/input.html"; // 9 lines 14 words 137 character
                                                 //_________________________________________
                                                 // total 13 lines 19 words 178 character

  final TextFileStatistics
    model_css = new TextFileStatistics(
      new BufferedReader(new FileReader(model_css_filename)), model_css_filename),
    input_html = new TextFileStatistics(new BufferedReader(
      new FileReader(input_html_filename)), input_html_filename);

  // File Testdata/model.css
  Assert.assertEquals( 4, model_css.numLines);
  Assert.assertEquals( 5, model_css.numWords);
  Assert.assertEquals(41, model_css.numCharacters);

  // File Testdata/input.html
  Assert.assertEquals( 9, input_html.numLines);
  Assert.assertEquals( 14, input_html.numWords);
  Assert.assertEquals(137, input_html.numCharacters);

  // Grand total
  Assert.assertEquals( 13, TextFileStatistics.getTotalNumLines());
  Assert.assertEquals( 19, TextFileStatistics.getTotalNumWords());
  Assert.assertEquals(178, TextFileStatistics.getTotalNumCharacters());
}</programlisting>
</glossdef>
</glossentry>
</glosslist>
</answer>
</qandaentry>
</qandadiv>
</qandaset>
</section>
</chapter>