Scripting on Linux

Origins of scripting languages

Hardwired programs and embedded OS allowed for very little system control

Job control language allowed IBM mainframe operators to "batch" a set of programs into a routine operation or job, e.g. deducting payments made from accounts outstanding.

Job parameters allowed rerun of failed (partially completed) jobs.

Benefits:

Mainframe features then repeatedly reinvented on smaller and cheaper machines.

Many command languages developed:

E.g. TSO/CLIST (IBM on-line), PCL (Prime Primos), DCL (DEC VAX/VMS), Aegis (Apollo DomainOS), MS-DOS batch files (.BAT).

Microsoft froze the batch file language in the early 1990s. Very early versions of Microsoft Basic were packaged with MS-DOS, while later versions were sold separately.

Unix saw competitive shell development and transfer of ideas between shell languages.

Unix shell scripts have all the usual 3rd generation features, including loop constructs, variables, arrays, branches and functions.
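As a minimal sketch of these features in one place, the following POSIX sh fragment uses a variable, a function, a loop and a branch. The function name classify and the sample numbers are assumptions made for illustration. POSIX sh itself has no true arrays (bash and ksh add them), so the positional parameters stand in for a simple list here.

```shell
#!/bin/sh
# Sketch: variables, a function, a loop and a branch in POSIX sh.

# classify is a hypothetical helper: prints "large" for 10 or more, else "small".
classify() {
  if [ "$1" -ge 10 ]; then
    echo large
  else
    echo small
  fi
}

# The positional parameters act as a simple list; true arrays need bash or ksh.
set -- 3 12 7
sum=0
for n in "$@"; do
  sum=$((sum + n))
  echo "$n is $(classify "$n")"
done
echo "sum is $sum"
```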

Learning scripting languages

Many features shared between shell languages and 'C', Perl, Awk.

The 80/20 rule: 80% of the benefit derives from knowledge of the 20% most useful subset.

Programmers learn by:

Those requiring an in-depth understanding of these languages will need to read the appropriate books and on-line tutorials and carry out a comprehensive series of programming exercises. In other cases a useable subset of knowledge can be obtained by reading the source code of existing programs and executing these, and by conducting a number of small experiments supplemented with tactical use of the reference information provided with these languages. By this means, your programming knowledge can grow on an as-needed basis.

A simple example of a shell script

This example program puts a wrapper around rm.

#!/bin/sh
# cautious shell script
#
# performs similar function to rm but cautiously

if [ $# -eq 0 ]; then
  echo "usage:"
  echo "cautious name_of_file_to_be_deleted"
  exit 1    # non-zero exit status signals the error to the caller
fi

echo "are you sure you want to delete $1 ? (y/n)"
read ans
if [ "$ans" = "y" ] || [ "$ans" = "Y" ]; then
  # "$1" is quoted so that a name containing spaces stays one argument
  if [ -w "$1" ]; then
    rm -- "$1"
    echo "$1 has been deleted"
  else
    echo "cautious: $1 access denied or does not exist"
  fi
else
  echo "$1 not deleted"
fi

Return values and tests

The Unix shell convention is for a program which exits successfully to end with a return of 0 (which the shell considers as true), and for an error exit to result in a return of 1 or more (which the shell considers false).

This is the opposite way round to 'C', where zero is treated as false and any non-zero value as true. The value returned by a program (e.g. using the return statement in main() in 'C') is different from the standard output of the same program. The shell command:

echo $?
gives you the return code of the last foreground command. You can also obtain this value within shell scripts using the special variable $? .
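The convention can be sketched as follows. The helper name status_of is an assumption made for illustration; true, false and grep -q are standard utilities. The last part shows that if tests the exit status of a command directly, so $? rarely needs to be compared by hand.

```shell
#!/bin/sh
# Sketch: exit status 0 means success (true), non-zero means failure (false).

# status_of is a hypothetical helper: run a command quietly, print its exit status.
status_of() {
  if "$@" >/dev/null 2>&1; then
    echo 0
  else
    echo "$?"    # in the else branch, $? still holds the tested command's status
  fi
}

echo "true returns $(status_of true)"
echo "false returns $(status_of false)"

# grep -q prints nothing; its exit status alone says whether a match was found.
if echo "usr11361 console" | grep -q console; then
  echo "match found"
else
  echo "no match"
fi
```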

A script with a loop and debugging

#!/bin/sh
# skeleton shell script to provide menu framework
#
# Author: Richard Kay

# default to "nodebug" when no argument is given
debug=${1:-nodebug}
if [ "$debug" = "debug" ]; then
  set -vx
fi

finish=no
while [ $finish = "no" ]; do
  echo
  echo SHELL SCRIPT MENU
  echo =================
  echo
  echo "enter   for option"
  echo "-----   ----------"
  echo "  1     first option"
  echo "  2     second option"
  echo "  3     third option"
  echo "  Q     to quit"
  echo
  echo "please enter option"
  read option
  option=${option:-c}
  # "$option" is quoted so that input containing spaces does not break the test
  if [ "$option" = "1" ]; then
    echo you have selected option 1
  elif [ "$option" = "2" ]; then
    echo you have selected option 2
  elif [ "$option" = "3" ]; then
    echo you have selected option 3
  elif [ "$option" = "q" ] || [ "$option" = "Q" ]; then
    finish=yes
  else
    echo invalid entry. please try again
    echo
  fi
  echo
  echo press return to continue
  read dummy
done
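The if/elif chain in the menu can also be written with the shell's case statement. The following is a sketch only: the function name handle_option is an assumption, and the loop is driven by canned choices rather than read so that it runs unattended.

```shell
#!/bin/sh
# Sketch: menu dispatch with case instead of an if/elif chain.

# handle_option is a hypothetical helper; it returns non-zero when asked to quit.
handle_option() {
  case "$1" in
    1|2|3) echo "you have selected option $1" ;;
    q|Q)   echo "quitting"; return 1 ;;
    *)     echo "invalid entry. please try again" ;;
  esac
}

# Canned input stands in for the interactive read of the menu script.
for choice in 1 x Q; do
  handle_option "$choice" || break
done
```

Each pattern list before the closing bracket may hold several alternatives separated by |, which replaces the paired tests for q and Q.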

Processing a table of data by selecting rows and columns

#!/bin/sh
# login analysis shell program

rm -f temp1    # -f: no error message if temp1 does not exist yet
grep console may.logins | grep ')' | awk '{ print $9 }' \
  | sed 's/[()+:]/ /g' > temp1

total=0
numrecs=`wc -l < temp1`
count=0
while [ $count -lt $numrecs ]; do
  count=`expr $count + 1`
  record=`sed -n ${count}p temp1`
  fields=`echo $record | wc -w`
  if [ $fields -eq 3 ]; then
    days=`echo $record | awk '{ print $1 }'`
    mdays=`expr $days \* 24 \* 60`
    hours=`echo $record | awk '{ print $2 }'`
    mhours=`expr $hours \* 60`
    mins=`echo $record | awk '{ print $3 }'`
  else
    mdays=0
    hours=`echo $record | awk '{ print $1 }'`
    mhours=`expr $hours \* 60`
    mins=`echo $record | awk '{ print $2 }'`
  fi
  total=`expr $total + $mdays + $mhours + $mins`
# echo $total $fields
done
loggedhours=`expr $total / 60`
echo total logged hours is $loggedhours

This example combines several features of the previous examples, using grep, awk and sed filters to select specific rows and columns and to exclude unwanted data from the analysis. The input data is a set of login records. This application was used to analyse average usage of 20 workstations during particular months. Here are some records from the may.logins input file:

reboot    ~                  Tue May 11 13:14
shutdown  ~                  Tue May 11 13:15
usr11361  console            Tue May 11 09:04 - 10:22  (01:18)
usr11187  console            Mon May 10 18:53 - 20:30  (01:36)
usr11187  console            Mon May 10 18:50 - 18:53  (00:02)
usr11187  console            Mon May 10 18:38 - 18:50  (00:12)
usr11513  console            Mon May 10 15:15 - 16:27  (01:11)
usr11451  console            Mon May 10 12:11   still logged in
usr11456  console            Mon May 10 10:53 - 15:14  (04:21)
usr11138  console            Mon May 10 09:03 - 10:40  (01:36)
usr12069  console            Sat May  8 11:05 - 09:01 (1+21:55)
usr12069  console            Sat May  8 11:00 - 11:04  (00:04)
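To make the filter stage concrete, the sketch below runs three of these records through the same row and column selection and prints what would end up in temp1. The function name extract_durations is an assumption, and tr is swapped in for the sed substitutions purely to keep the output free of stray spaces.

```shell
#!/bin/sh
# Sketch: select console rows that have a duration, keep column 9,
# and turn the punctuation into field separators.

# extract_durations is a hypothetical helper reading login records on stdin.
extract_durations() {
  grep console | grep ')' | awk '{ print $9 }' | tr -d '()' | tr '+:' '  '
}

durations=$(extract_durations <<'EOF'
usr11361  console            Tue May 11 09:04 - 10:22  (01:18)
usr11451  console            Mon May 10 12:11   still logged in
usr12069  console            Sat May  8 11:05 - 09:01 (1+21:55)
EOF
)
echo "$durations"
```

The "still logged in" record is dropped by the second grep, a two-field line holds hours and minutes, and a three-field line holds days, hours and minutes.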

Doing arithmetic internally within the shell

Some of the earlier Unix shells didn't have builtin arithmetic operators, so they farmed this job out to external programs such as expr as we saw above. This can be done more quickly using the Bash shell let builtin, as in the following example script:

#!/bin/bash
# bash shell arithmetic example (let is a bash builtin, not part of POSIX sh)
echo 'enter 2 numbers'
read first second

let "plus = $first + $second"
let "minus = $first - $second"
let "times = $first * $second"
let "divide = $first / $second"

echo plus $plus minus $minus times $times divide $divide
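For scripts that must run under a plain POSIX sh, the same four operations can be sketched with arithmetic expansion, $(( )), which is part of the POSIX shell language. The function name calc is an assumption; it takes the two numbers as arguments instead of reading them, so that the example runs unattended.

```shell
#!/bin/sh
# Sketch: the four operations with POSIX $(( )) arithmetic expansion.

# calc is a hypothetical helper taking two integers as arguments.
calc() {
  first=$1
  second=$2
  plus=$((first + second))
  minus=$((first - second))
  times=$((first * second))
  divide=$((first / second))    # integer division; an error if second is 0
  echo "plus $plus minus $minus times $times divide $divide"
}

calc 17 5
```

Shell arithmetic reads a number with a leading zero as octal, so zero-padded input such as 08 needs its padding removed first.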

Extreme scripting: Perl, Python, Tcl and Ruby

Shell scripts provide glue logic together with other utilities, pipelines and redirection.

Advantage: Very rapid development of systems/network administration and automated operations.

Disadvantage: Slow and difficult to develop very large programs, possibly system dependent.

Have to load and execute other small programs very many times?

Adding 100 users to system:
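A minimal sketch of the per-command cost, assuming a batch of 100 items: both functions below count to the number given, but the first starts a separate expr process on every pass while the second stays inside the shell. The function names are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: the same count done with an external program and with a builtin.

# count_with_expr loads and runs one expr process per iteration.
count_with_expr() {
  i=0
  while [ "$i" -lt "$1" ]; do
    i=$(expr "$i" + 1)
  done
  echo "$i"
}

# count_with_builtin does the arithmetic inside the shell process.
count_with_builtin() {
  i=0
  while [ "$i" -lt "$1" ]; do
    i=$((i + 1))
  done
  echo "$i"
}

count_with_expr 100
count_with_builtin 100
```

Running each call under the shell's time keyword, with a larger count if necessary, gives a rough measure of the difference on a particular machine.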

Fully portable scripting languages: Perl, Tcl, Python and Ruby can be used to handle simple scripting-type applications together with more complex applications, e.g. requiring object-oriented active website development. These languages build upon the features available in Unix shell languages, using almost identical syntax for many purposes, e.g. handling regular expressions.

These languages trade machine efficiency for programmer efficiency when compared against 'C' and 'C++'.

Can combine benefits by using 'C' modules in scripting interpreter or split application design after profiling to find where cycles are used.

Perl example

The login analysis program described above was rewritten in Perl.

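#!/usr/bin/perl
# login analysis
$total=0;
# stop with a message if the input file cannot be opened
open(LOGINS, "<", "may.logins") or die "cannot open may.logins: $!\n";
while(<LOGINS>){
  # only console records with a bracketed duration, as in the shell version
  if(/console/ && /\)/) {
    @rec=split;
    $_ = $rec[8];      # a single array element is a scalar, so it takes $
    s/\)//;
    s/\(//;
    s/\+/:/;
    @fields=split /:/;
    $numf=@fields;     # an array in scalar context gives its length
    if ($numf == 3){
      $mdays=$fields[0] * 24 * 60;
      $mhours=$fields[1] * 60;
      $mins=$fields[2];
    } else {
      $mdays = 0;
      $mhours = $fields[0] * 60;
      $mins=$fields[1];
    }
    $total=$total + $mdays + $mhours + $mins;
    # print "$total\n";
  }
}
close(LOGINS);
$loggedhours=$total/60;
print "total hours logged is $loggedhours\n";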

This program runs much faster than the shell script, because everything is done inside the same process. There are many syntactic similarities, but Perl borrows array and loop notation from 'C' and some other notation from grep, awk and sed. As in the shell, $ introduces single (scalar) variables, although Perl also writes the $ when assigning to them. @ introduces whole arrays, while a single element of an array is itself a scalar. $_ is the default scalar variable, so a substitution such as s/\)//; which strips a closing round bracket from a string, doesn't need to name the string it operates upon.
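The borrowing from sed can be seen by writing the same three substitutions as a shell filter. This is a sketch: the function name normalise_duration is an assumption, and in sed's basic regular expressions the brackets and the plus sign are ordinary characters, so they need no backslash.

```shell
#!/bin/sh
# Sketch: the Perl substitutions s/\)//, s/\(// and s/\+/:/ written for sed.

# normalise_duration is a hypothetical helper reading one field per line on stdin.
normalise_duration() {
  sed -e 's/)//' -e 's/(//' -e 's/+/:/'
}

echo '(1+21:55)' | normalise_duration
echo '(01:18)' | normalise_duration
```

Where sed always works on its input line, Perl applies the same operator to $_ unless another variable is named.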