Tutorial Exercises Using pipes, files and filters

The demos directory

Some of these exercises refer to filenames to which access is required on your computer. You will either be given or have already been given instructions for accessing a file demo.tgz which extracts to a directory of files. Later on in these notes this directory will be referred to simply as the demos directory. You are likely to need to refer to the lecture notes in order to complete these exercises.

Using shell redirections and pipes

Create a new directory under your home directory and make it the working directory using cd. Next, use touch to create a few empty files called f1, f2, f3 and so on. Copy the programs cautious and anxious from the demos directory. The next exercise uses these to demonstrate substitution of interactive input. cautious and anxious perform a similar operation to rm. Try them out and confirm that they work on your temporary files.

In general, to run a program called program_file from the current directory, first check that it is executable using the command:

ls -l

If it doesn't have user x rights, use the command:

chmod +x program_file

to give it x (execute) rights. Then use the command:

./program_file

to execute the program. To get cautious and anxious to delete files you will have to supply the names of the files to be deleted as parameters to these commands.
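
The check, chmod and run sequence can be sketched as follows. This is only an illustration: the scratch directory and the contents of program_file are assumptions made up for the example, not one of the demo programs.

```shell
# Work in a scratch directory (an assumption, so nothing real is touched).
cd "$(mktemp -d)"

# Hypothetical stand-in for a program copied from the demos directory.
printf '#!/bin/sh\necho "program_file ran"\n' > program_file

ls -l program_file        # no x in the user permissions yet
chmod +x program_file     # add execute rights
ls -l program_file        # the permissions now include x
output=$(./program_file)  # ./ runs the program from the current directory
echo "$output"
```

The leading ./ matters because the current directory is usually not on the shell's search path.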

Then, after making sure f1 exists, use the echo command to pipe a single y into the standard input of the ./cautious f1 command. You will need to use the pipe symbol between 2 commands on one line to do this. This should delete the f1 file, demonstrating substitution of interactive input by piped standard input.
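
The same idea can be tried with the standard rm -i command, used here as a stand-in because it also asks for confirmation before deleting. cautious itself may prompt differently, so treat this as a sketch of the technique only.

```shell
# Scratch directory (an assumption, so nothing real is deleted).
cd "$(mktemp -d)"
touch f1

# rm -i normally waits for a typed answer; here the answer arrives via the pipe.
echo y | rm -i f1

ls    # f1 should no longer be listed
```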

Next create a file called yy using echo twice. yy must contain 2 lines, each line containing just a single y character. You will need to use stdout redirection with one command to create the yy file, and then a second command which appends to the existing file to add the second line. Check the correctness of this file using either less yy or cat yy.
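
A minimal sketch of the two kinds of redirection, run in a scratch directory (an assumption for the example):

```shell
cd "$(mktemp -d)"

echo y > yy     # > creates the file (or overwrites it) with the first line
echo y >> yy    # >> appends the second line to the existing file
cat yy          # check the result
```

Using > for the second command as well would overwrite the file and leave only one line.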

Make sure you have an empty file f2 and delete it using the ./anxious command, but this time redirecting the standard input to ./anxious directly from the yy file without using a pipe. Finally repeat the automated use of cautious and anxious but this time also redirecting stdout to a file called rubbish to avoid displaying the unnecessary messages on the screen.
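
Redirecting both standard input and standard output might look like the sketch below. The script ask_twice is a hypothetical stand-in written for this example; it assumes a program that asks two questions on standard output, which may not match exactly how anxious behaves.

```shell
# Scratch directory (an assumption, so nothing real is deleted).
cd "$(mktemp -d)"

# Hypothetical stand-in: asks twice, deletes the file only on two y answers.
cat > ask_twice <<'EOF'
#!/bin/sh
printf 'Delete %s? ' "$1"
read -r first
printf 'Really delete %s? ' "$1"
read -r second
if [ "$first" = y ] && [ "$second" = y ]; then
    rm -- "$1"
    echo "deleted $1"
fi
EOF
chmod +x ask_twice

touch f2
printf 'y\ny\n' > yy    # the two-line answer file

# < feeds yy to standard input; > sends prompts and messages to rubbish.
./ask_twice f2 < yy > rubbish

ls    # f2 is gone; rubbish holds the messages
```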

Recording and reusing software you are learning

Open a text editor and remember to file your command history every 10 commands or so during the next exercises (use the history command and a cut and paste operation, and edit out commands which don't work). You then won't have to relearn these commands from scratch when you want to reuse them in a shell script as a later exercise, so don't log out and forget to save this information. The idea is to build up, in a file, a set of working commands that do useful things. Add comments as appropriate, e.g. on separate lines starting with # characters. You can then copy and paste these commands into future shell scripts and print them out as part of your logbook for future reference.

A text file containing a modern software database

The demos directory (module website, as updated 10 Oct 2009) contains a file apropos.txt, obtained using the command:

apropos -w '*' > apropos.txt

This contains a one line description, similar to that obtainable using man -k, for all manual entries.
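
Lines in such a file usually have the form name (section) - description. The sketch below builds a tiny sample in that layout and counts entries with grep and wc; the sample lines and the file name sample.txt are made up for illustration, and the real apropos.txt is far longer.

```shell
cd "$(mktemp -d)"

# A few sample lines in the usual "name (section) - description" layout.
cat > sample.txt <<'EOF'
cat (1) - concatenate files and print on the standard output
grep (1) - print lines matching a pattern
passwd (5) - the password file
sort (1) - sort lines of text files
EOF

total=$(wc -l < sample.txt)                       # number of entries
section1=$(grep -c '(1)' sample.txt)              # entries in manual section 1
about_files=$(grep -i 'file' sample.txt | wc -l)  # entries mentioning "file"
echo "$total $section1 $about_files"
```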

Using pipes and filters, explore and find out:

A text file containing an ancient software database: more advanced exercises

The demos directory contains a file called hensa.index, a large index to a software collection. Try to avoid spending more than 2-3 minutes reading this file manually; just give it a quick scan to grasp the kind of information in it and how this is arranged. After an introductory section, this index has one line for each of more than 3000 software packages. Each such line has an access code, an acquisition or most recent change date, a key field to denote whether it is shareware and whether it has been reviewed, a name and a short description.

Selection of rows in a file

Create a directory called hensa, copy hensa.index into it and use grep to find the line number of the line containing e001, the code for the first product line, in the hensa.index file. Pipe this line from the grep command into awk to output just this line number. Hint: to avoid having to re-enter the grep command preceding awk in the pipeline, you could use the up and down arrow keys to scroll through the command history, then edit the command selected. Within some command consoles on Linux it is also possible to select a section of text with the mouse and press the left and right mouse buttons together (this simulates the middle button of the 3 button mice for which X-Windows was designed) to paste the highlighted buffer at the cursor.
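
One way to approach this is sketched below on a small made-up file. The file name sample.index, its contents and the dd/mm/yy date layout are all assumptions for illustration; the real hensa.index will differ in detail.

```shell
cd "$(mktemp -d)"

# Hypothetical miniature index: 3 introductory lines, then product lines.
cat > sample.index <<'EOF'
Introductory text (hypothetical)
More introductory text

e001 12/03/94 SR scanvir Virus scanner
e002 01/11/95 -- packer File compression tool
e003 25/07/95 S- vguard Resident anti-virus monitor
e004 30/01/93 -R calc Desk calculator
EOF

# grep -n prefixes each match with its line number and a colon.
grep -n 'e001' sample.index

# awk splits on the colon and keeps only the first field, the line number.
line=$(grep -n 'e001' sample.index | awk -F: '{print $1}')
echo "$line"
```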

Use either head or sed to place the top part of the hensa.index file up to but not including the e001 product line, into a new file called index.head in your hensa directory. Then use tail or sed to place all the rest of the file into another file in hensa called index.tail.
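
Splitting a file at a known line can be sketched like this, again using a made-up miniature index (the file sample.index and its contents are assumptions for illustration):

```shell
cd "$(mktemp -d)"

# Hypothetical miniature index: 3 introductory lines, then product lines.
cat > sample.index <<'EOF'
Introductory text (hypothetical)
More introductory text

e001 12/03/94 SR scanvir Virus scanner
e002 01/11/95 -- packer File compression tool
e003 25/07/95 S- vguard Resident anti-virus monitor
e004 30/01/93 -R calc Desk calculator
EOF

# Line number of the first product line.
n=$(grep -n 'e001' sample.index | head -n 1 | awk -F: '{print $1}')

head -n "$((n - 1))" sample.index > index.head   # everything before that line
tail -n +"$n" sample.index > index.tail          # that line and all after it

# Equivalent sed forms:
#   sed -n "1,$((n - 1))p" sample.index > index.head
#   sed -n "$n,\$p" sample.index > index.tail
```

Concatenating index.head and index.tail with cat should reproduce the original file, which is a quick way to check the split.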

Use wc -l to count the total number of lines in index.tail. This is the total number of software products covered. Use grep to search for anti-virus products and check which are the most recently updated. (The hensa.index file is too old to be of much practical use, but this is still a useful exercise in learning incrementally how to make shell commands and scripts carry out automated data processing.) The grep -i option will ignore the case of letters. You will probably do better to search for descriptions containing the word virus than a longer form or expression. How many anti-virus products are there?
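
Counting lines and case-insensitive matches can be sketched as follows. The four product lines are made up for illustration and only imitate the layout described for the index.

```shell
cd "$(mktemp -d)"

# Hypothetical product lines standing in for index.tail.
cat > index.tail <<'EOF'
e001 12/03/94 SR scanvir Virus scanner
e002 01/11/95 -- packer File compression tool
e003 25/07/95 S- vguard Resident anti-virus monitor
e004 30/01/93 -R calc Desk calculator
EOF

products=$(wc -l < index.tail)               # one line per product
virus_count=$(grep -ic 'virus' index.tail)   # -i ignores case, -c counts lines
grep -i 'virus' index.tail                   # show the matching lines
echo "$products products, $virus_count mention virus"
```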

Use of awk to select columns

Use awk to display just the product dates and CHEST access codes from the index.tail file. Use sed to change the / characters separating the day/month/year fields to periods (.).
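
A sketch of column selection followed by a character substitution, assuming the access code is the first whitespace-separated field and the date the second (the sample lines are made up for illustration):

```shell
cd "$(mktemp -d)"

# Hypothetical product lines standing in for index.tail.
cat > index.tail <<'EOF'
e001 12/03/94 SR scanvir Virus scanner
e002 01/11/95 -- packer File compression tool
e003 25/07/95 S- vguard Resident anti-virus monitor
e004 30/01/93 -R calc Desk calculator
EOF

# awk prints field 2 (date) then field 1 (code);
# sed replaces every / with a period, using | as the s command delimiter.
result=$(awk '{print $2, $1}' index.tail | sed 's|/|.|g')
echo "$result"
```

Choosing | as the delimiter for the sed s command avoids having to escape the / being replaced.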

Use of sort

Read sort(1) and passwd(5) and experiment by sorting the /etc/passwd file either in ascending order on the string userid column or in descending order on the integer UID column. For a more interesting and difficult sorting exercise, sort the index.tail file into descending date order using sort(1) and place the output into a file called date.order. Then use cat to recombine index.head and date.order. You will probably need to pre-process the index.tail file with awk before you try sorting it.
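
Both kinds of sort can be sketched on small made-up files. The sample passwd entries, the product lines and the dd/mm/yy date layout are assumptions for illustration; with 2 digit years this approach only orders dates correctly within one century.

```shell
cd "$(mktemp -d)"

# Hypothetical entries in passwd(5) layout: fields separated by colons.
cat > sample.passwd <<'EOF'
root:x:0:0:root:/root:/bin/sh
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
alice:x:1000:1000:Alice:/home/alice:/bin/sh
bob:x:999:999:Bob:/home/bob:/bin/sh
EOF

sort -t: -k1,1 sample.passwd > by_name     # ascending on the userid string
sort -t: -k3,3nr sample.passwd > by_uid    # descending on the numeric UID

# Hypothetical product lines standing in for index.tail.
cat > index.tail <<'EOF'
e001 12/03/94 SR scanvir Virus scanner
e002 01/11/95 -- packer File compression tool
e003 25/07/95 S- vguard Resident anti-virus monitor
e004 30/01/93 -R calc Desk calculator
EOF

# awk prefixes each line with a yymmdd key built from the date field,
# sort -r orders on that key (newest first), cut removes the key again.
awk '{split($2, d, "/"); print d[3] d[2] d[1], $0}' index.tail |
    sort -r | cut -d' ' -f2- > date.order

cat date.order
```

Adding a temporary sort key and stripping it afterwards keeps the original lines unchanged in the output.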

Selection of fields

See if you can combine some of the techniques you have just learned into a single shell command pipeline which outputs just the product date and Hensa access code of the latest anti-virus product present in hensa.index.
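
One possible shape for such a pipeline is sketched below on a made-up miniature index. The file sample.index, its contents and the dd/mm/yy date layout are assumptions for illustration, so the real file may need a different pattern or key.

```shell
cd "$(mktemp -d)"

# Hypothetical miniature index: 3 introductory lines, then product lines.
cat > sample.index <<'EOF'
Introductory text (hypothetical)
More introductory text

e001 12/03/94 SR scanvir Virus scanner
e002 01/11/95 -- packer File compression tool
e003 25/07/95 S- vguard Resident anti-virus monitor
e004 30/01/93 -R calc Desk calculator
EOF

# Select virus lines, add a yymmdd sort key, sort newest first,
# keep the top line, then drop the key to leave "date code".
latest=$(grep -i 'virus' sample.index |
    awk '{split($2, d, "/"); print d[3] d[2] d[1], $2, $1}' |
    sort -r | head -n 1 | cut -d' ' -f2-)
echo "$latest"
```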