Pointers to Functions in 'C' and Code Reuse

Richard Kay, 5 January 2001
 

1. Introduction


The 'C' language was designed before the principles of Object Oriented Programming (OOP) were widely known. Later programming languages such as C++ and Java  had OOP design principles encouraging greater code reuse such as encapsulation, inheritance, dynamic instantiation of programmer-defined classes (comprising types and associated functions) as core language design features. However, those interested in designing compilers, embedded systems, database engines and operating system components - which can still benefit from the performance advantages of 'C' in comparison with other languages - can use 'C' in an object-oriented manner. This approach is likely to be needed in order to make large projects manageable due to the increasing complexity of modern systems and the difficulties older software-engineering methodologies have had in coping with this complexity.

A Java programmer who has created a simple applet will be aware that functionality and data made available through a publicly-available class package (e.g. the Java Applet class) can be extended by the programmer adding methods (the Java name for functions) which are auto-magically called from the extended class whenever the applet needs to be started, stopped or redrawn.

This makes it possible for complex and interesting programs to be created which provide sophisticated graphical user interfaces without the programmer needing to know very much about how all the repetitive details are done. Because the class package is designed in a reuseable manner the programmer just needs to know how to write the functions it automatically calls which specialise the more general class as required for the application.

Code reuse is at the heart of OOP methodologies. For one programmer to be able to write code that can be used by many others this needs to be generalised, which requires that classes or functions made available for other programmers through software libraries must be able to handle a very wide variety of tasks which come within the scope of the kind of action being performed. This handout uses sorting as anrequirement to demonstrate the issues involved and develop practical solutions.
 
 

2. Generalised Sorting


The example program which is developed below handles a task which needs to be performed many times in many different applications.  This requirement can therefore benefit from reuseable code. A sort program has to provide 3 facilities:
 

2.1 Iteration


Comparisons and swaps are carried out repetitively until the data is in the required order. For a fast sort a recursive algorithm such as quick sort or merge sort will be used to control this sequence of operations, but sorting can be demontrated more simply using non-recursive algrorithms. Either way the same algorithm can be applied to many different sorting applications. (It is however unlikely that the same approch will be used for sorting data stored in entirely different structures such as arrays, linked lists and binary trees as the sort ordering is more likely to determine how dynamically linked structures are generated.)
 

2.2 Comparison


data from 2 records are compared to decide whether or not these records need to be swapped. This part of the sort operation is application specific so needs to be customised. Different sort applications will require comparisons on different fields or parts of the data to be carried out in different ways, e.g. ascending, descending, string and numeric, case sensitive and insensitive based on different alphabets (ASCII, Greek, Japanese) etc.
 

2.3 Swapping


data in dynamic structures is conveniently swapped by relocating the addresses of the data while the data can stay in the same place. Data in arrays is generally swapped directly. Direct swapping requires that the records be of the same size and that this size be known. If addresses or references to variable length data are in an array, or if the data  is stored directly in an array the swapping operation can be generalised.
 
 

3. Approaches To Reuseable Sort Program Customisation

 

3.1 Parameterisation


The GNU command-line sort program currently accepts up to 19 different command line parameters, which allow the user to specify how the comparison is to be carried out. These allow for alphabetic and numeric sorts in ascending and descending order based on specified positions for primary, secondary and ternary etc. sort fields within variable and fixed-length data. Someone using this sort might need to spend an hour or so reading the documentation and experimenting the first time it is used, unless the requirement is relatively trivial (e.g. single field records) , or the user is already familiar with Unix style console documentation conventions. The next time this user is likely to be able to figure out how to apply it to (or program it into) a different application somewhat more quickly. (This approach also allows end users with command line skills but without programming skills to manipulate data more effectively and efficiently than would otherwise be possible. This approach also tends to enable the skilled end user to become a programmer which can help ease skill shortages.)

3.2 Cut, Paste and Modification of Source Code

A programmer who has already written a sort function which works in one program can relatively easily cut and paste the entire function into another program and modify the program statement which carries out the comparison. This approach will work more quickly if the function is already suitably parameterised so that it can handle different array sizes and record lengths. The downside of cut and pasted code is cut and pasted bugs and a potential long-term software maintenance cost explosion. However, this approach is valid for disposable code used in throw-away applications, e.g. for one off data conversions during a system upgrade.
 

3.3 Enabling the Code Reuser to Supply a Comparison Function


The only part of the sort program likely to need changing is the part that carries out the comparison.  The application programmer  can use a generalised sort function provided by a library programmer and supply this sort function with a customised comparison function.  To achieve this using 'C' the application programmer supplies the address of the comparison function as a sort function parameter.
 
 

4. An Application to Sort Student Records by Name and Mark


These 2 sorts are carried out using the same sort function and passing it the address of the comparison function to be used. 2 different comparison functions have to be written to compare by student name or by mark.

The prototype of the gsort (general sort) function has to include the prototype of the comparison function as a parameter.

void gsort(RECORD *a,int rows, int (*comp)(RECORD *x,RECORD *y));
  /* gsort parameters:
  *a: address of array to be sorted
  rows: number of records in the array
  *comp: address of the user-defined comparison function */
The last parameter of gsort(): int (*comp)(RECORD *x,RECORD *y) is the prototype of the comparison function, which gsort() will call as comp().  The use of brackets between the return type of int and the function pointer int (*comp)  are needed to indicate that this parameter of gsort is a pointer to a function. Without these brackets the declaration would suggest that the return type of comp was  a pointer to int, int* rather than int. The final part of this parameter: (RECORD *x,RECORD *y) indicates that the function known to gsort() as comp() will require 2 pointers to data type RECORD as its parameters when comp() is called by gsort(). These will be the addresses of the 2 records to be compared.

The first implementation of gsort() requires the user to define a data type called RECORD as this type is used by gsort() and swap(). The second implementation avoids this requirement by casting the relevant pointers to (null*). This avoids potential conflict with the symbol RECORD if this is used for a different purpose within the program so allows the code to become slightly more general purpose. The information lost when RECORD pointers are cast to null pointers is the record length, so this information needs to be passed as a seperate parameter in the second iteration of the program.
 
 

5. First Implementation Requiring Use of RECORD typedef

/* gensort.c
   Performs a general sort on data stored as an array. Demonstrates
   calling a function gsort() which calls a comparison function passed as
   a parameter. The gsort() function is called twice to sort a set of
   records firstly by name, then by mark.

   This first version requires the user of gsort() to define
   a data type called RECORD

   Richard Kay
   3 Jan 2001
*/

#include <stdio.h>
#include <string.h>

struct student {
  char name[30];
  float mark;
};
typedef struct student RECORD;

/* use of RECORD typedef within swap() and gsort() functions places
   requirement that type RECORD is defined globally.
*/


void gsort(RECORD *a,int rows, int (*comp)(RECORD *x,RECORD *y));
  /* gsort parameters:
     *a: address of array to be sorted
     rows: number of records in the array
     *comp: address of the user-defined comparison function 

     comp parameters:
     *x,*y: address of first and second records to compare    
     comp return values 
        -1 indicates x less than y, 
        0 indicates x == y, 
        1 indicates  x greater than y. */

void swap(RECORD a[], int x,int y);
  /* swap parameters:
     *v the address of the start of the array
     x and y: indexes of array records to be swapped */

void pause(void); /* pauses program */

RECORD a[]={{"Ashok",92.6},
            {"Seamus",74.7},
            {"Jaswinder",56.4},
            {"George",63.2},
            {"Suresh",34.2},
            {"Stuart",11.1},
            {"Harry",87.4},
            {"Delroy",55.5}};
/* 8 records */

/* comp_name() and comp_mark() functions are defined above the
   call of gsort() in main() so that their addresses are known
   beforehand. */


int comp_name(RECORD *x,RECORD *y){
  /* compares 2 names to see if in order, same or in reverse sort order */
  return strcasecmp(x->name,y->name);
      /* Linux strcasecmp() is the same as Borland stricmp() */
    /* both return -1 , 0 or +1 */
}

int comp_mark(RECORD *x,RECORD *y){
  /* compares sort position based on mark */
  if(x->mark > y->mark)
    return 1;
  else if(y->mark > x->mark)
    return -1;
  else
    return 0;
}


int main(void){
  int n,i;
  n=sizeof(a)/sizeof(RECORD); /* should be 8 */
  printf("there are %d records\n",n);

  /* sort and display by name */
  gsort(a,n,comp_name);
  printf("data sorted by name\n");
  for(i=0;i<n;i++)
    printf("%30s %f\n",a[i].name,a[i].mark);
  pause();

  /* sort and display by mark */
  gsort(a,n,comp_mark);
  printf("data sorted by mark\n");
  for(i=0;i<n;i++)
    printf("%30s %f\n",a[i].name,a[i].mark);

  pause();
  return 0;
}


void pause(void){
  printf("press enter to continue\n");
  getchar();
}
 
void gsort(RECORD a[],int rows, int (*comp)(RECORD *x,RECORD *y) ){
   /* realistic general sorts functions are more likely to use the
      quicksort recursive algorithm. This one uses bubble
      sort for simplicity.  */
   int i,j=rows;
   for(j=rows;j>0;j--){
     for(i=0;i<j-1;i++){
       if((*comp)(a+i,a+i+1) > 0 )
         swap(a,i,i+1);
     }
   }
}
 
void swap(RECORD *v,int x,int y){
  RECORD temp;
  temp=v[x];
  v[x]=v[y];
  v[y]=temp;
}

6. Second Implementation Using Void Pointers .

/* gensort2.c
   Performs a general sort on data stored as an array. Demonstrates
   calling a function gsort() which calls a comparison function passed as
   a parameter. The gsort() function is called twice to sort a set of
   records firstly by name, then by mark.

   This second version is more general in that it will sort any
   data. It does not require the user of gsort() to define
   a data type for swapping or comparison purposes, but results in
   more complex swap and (user defined) comparison functions.

   Richard Kay
   4 Jan 2001
*/

#include <stdio.h>
#include <string.h>

struct student {
  char name[30];
  float mark;
};
typedef struct student RECORD;

/* To make gsort truly general we need to pass the record length
   as a parameter to it and swap() functions. swap() has to
   copy 1 byte (char) at a time up to the record length . This
   is easier with variable record length data as with this we just
   swap fixed size pointers to sort it.
*/


void gsort(void *a, size_t rleng, int rows, int (*comp)(void *x,void *y));
  /* gsort parameters:
     *a: address of array to be sorted
     rleng: record length in bytes
     rows: number of records in the array
     *comp: address of the user-defined comparison function */

  /* comp parameters:
     *x,*y: addresses of first and second records to compare    */

void swap(void *x,void *y, size_t rleng);
  /* swap parameters:
     *x and *y the addresses of the starts of records to be swapped
     rleng: record length in bytes */

void pause(void); /* pauses program */

RECORD a[]={{"Ashok",92.6},
            {"Seamus",74.7},
            {"Jaswinder",56.4},
            {"George",63.2},
            {"Suresh",34.2},
            {"Stuart",11.1},
            {"Harry",87.4},
            {"Delroy",55.5}};

/* comp_name() and comp_mark() functions are defined above the
   call of gsort() in main() so that their addresses are known
   beforehand. */


int comp_name(void *x,void *y){
  /* compares sort position based on name */
  RECORD *a,*b;
  a=(RECORD*)x; /* cast addresses in x and y to type RECORD */
  b=(RECORD*)y;
  /* compares 2 names to see if in order, same or in reverse sort order */
  return strcasecmp(a->name,b->name);
      /* Linux strcasecmp() is the same as Borland stricmp(),
        to compare strings case insensitively
        both return -1 , 0 or +1 */
}

int comp_mark(void *x,void *y){
  /* compares sort position based on mark */
  RECORD *a,*b;
  a=(RECORD*)x; /* cast addresses in x and y to type RECORD* */
  b=(RECORD*)y;
  if(a->mark > b->mark)
    return 1;
  else if(a->mark > b->mark)
    return -1;
  else
    return 0;
}


int main(void){
  int n,i;
  n=sizeof(a)/sizeof(RECORD); /* should be 8 */
  printf("there are %d records\n",n);

  /* sort and display by name */
  gsort((void*)a,sizeof(RECORD),n,comp_name);
  printf("data sorted by name\n");
  for(i=0;i<n;i++)
    printf("%30s %f\n",a[i].name,a[i].mark);
  pause();

  /* sort and display by mark */
  gsort((void*)a,sizeof(RECORD),n,comp_mark);
  printf("data sorted by mark\n");
  for(i=0;i<n;i++)
    printf("%30s %f\n",a[i].name,a[i].mark);

  pause();
  return 0;
}


void pause(void){
  printf("press enter to continue\n");
  getchar();
}

void gsort(void *a,size_t rleng,int rows, int (*comp)(void *x,void *y) ){
   /* realistic general sorts functions are more likely to use the
      quicksort recursive algorithm. This one uses bubble
      sort for simplicity.  */
   int i,j=rows;
   for(j=rows;j>0;j--){
     for(i=0;i<j-1;i++){
        /* Using a void array pointer we now have to multiply array indexes
         * by the record length in order to get the 2 record addresses */
       if((*comp)(a+i*rleng,a+(i+1)*rleng) > 0 )
         swap(a+i*rleng,a+(i+1)*rleng,rleng);
     }
   }
}
 
void swap(void *x,void *y,size_t rleng){
  /* swaps rleng bytes at address x with rleng bytes at address y.
   * This depends upon char being 1 byte wide. Unicode users beware! 
   * Is lack of a byte primitive data type a gap in the ANSI 'C' standard ?
   * We could change the loop arithmetic based on rleng/sizeof(char) 
   * to make this swap function more portable, but then we'd have to
   * consider word alignment issues. YUKK! 
*/      
  char temp, *a,*b;
  int i;
  a=(char*)x; /* cast void* pointers to char* pointers */
  b=(char*)y;
  for(i=0;i<rleng;i++){
    temp=*(a+i);
    *(a+i)=*(b+i);
    *(b+i)=temp;
  }
}

7. Output of both programs

This demonstrates that the hard coded input data is sorted firstly by name then by mark by passing pointers to different comparison functions to the same sort function.
there are 8 records
data sorted by name
                         Ashok 92.599998
                        Delroy 55.500000
                        George 63.200001
                         Harry 87.400002
                     Jaswinder 56.400002
                        Seamus 74.699997
                        Stuart 11.100000
                        Suresh 34.200001
press enter to continue
 
data sorted by mark
                        Stuart 11.100000
                        Suresh 34.200001
                        Delroy 55.500000
                     Jaswinder 56.400002
                        George 63.200001
                        Seamus 74.699997
                         Harry 87.400002
                         Ashok 92.599998
press enter to continue