The 'C' language was designed before the principles of Object Oriented
Programming (OOP) were widely known. Later programming languages such as
C++ and Java had OOP design principles encouraging greater code reuse
such as encapsulation, inheritance, dynamic instantiation of programmer-defined
classes (comprising types and associated functions) as core language design
features. However, those interested in designing compilers, embedded systems,
database engines and operating system components - which can still benefit
from the performance advantages of 'C' in comparison with other languages
- can use 'C' in an object-oriented manner. This approach is likely to
be needed in order to make large projects manageable due to the increasing
complexity of modern systems and the difficulties older software-engineering
methodologies have had in coping with this complexity.
A Java programmer who has created a simple applet will be aware that functionality and data made available through a publicly-available class package (e.g. the Java Applet class) can be extended by the programmer adding methods (the Java name for functions) which are auto-magically called from the extended class whenever the applet needs to be started, stopped or redrawn.
This makes it possible for complex and interesting programs to be created which provide sophisticated graphical user interfaces without the programmer needing to know very much about how all the repetitive details are done. Because the class package is designed in a reuseable manner the programmer just needs to know how to write the functions it automatically calls which specialise the more general class as required for the application.
Code reuse is at the heart of OOP methodologies. For one programmer
to be able to write code that can be used by many others this needs to
be generalised, which requires that classes or functions made available
for other programmers through software libraries must be able to handle
a very wide variety of tasks which come within the scope of the kind of
action being performed. This handout uses sorting as anrequirement to demonstrate
the issues involved and develop practical solutions.
The example program which is developed below handles a task which
needs to be performed many times in many different applications.
This requirement can therefore benefit from reuseable code. A sort program
has to provide 3 facilities:
Comparisons and swaps are carried out repetitively until the data
is in the required order. For a fast sort a recursive algorithm such as
quick sort or merge sort will be used to control this sequence of operations,
but sorting can be demontrated more simply using non-recursive algrorithms.
Either way the same algorithm can be applied to many different sorting
applications. (It is however unlikely that the same approch will be used
for sorting data stored in entirely different structures such as arrays,
linked lists and binary trees as the sort ordering is more likely to determine
how dynamically linked structures are generated.)
data from 2 records are compared to decide whether or not these
records need to be swapped. This part of the sort operation is application
specific so needs to be customised. Different sort applications will require
comparisons on different fields or parts of the data to be carried out
in different ways, e.g. ascending, descending, string and numeric, case
sensitive and insensitive based on different alphabets (ASCII, Greek, Japanese)
etc.
data in dynamic structures is conveniently swapped by relocating
the addresses of the data while the data can stay in the same place. Data
in arrays is generally swapped directly. Direct swapping requires that
the records be of the same size and that this size be known. If addresses
or references to variable length data are in an array, or if the data
is stored directly in an array the swapping operation can be generalised.
The GNU command-line sort program currently accepts up to 19 different
command line parameters, which allow the user to specify how the comparison
is to be carried out. These allow for alphabetic and numeric sorts in ascending
and descending order based on specified positions for primary, secondary
and ternary etc. sort fields within variable and fixed-length data. Someone
using this sort might need to spend an hour or so reading the documentation
and experimenting the first time it is used, unless the requirement is
relatively trivial (e.g. single field records) , or the user is already
familiar with Unix style console documentation conventions. The next time
this user is likely to be able to figure out how to apply it to (or program
it into) a different application somewhat more quickly. (This approach
also allows end users with command line skills but without programming
skills to manipulate data more effectively and efficiently than would otherwise
be possible. This approach also tends to enable the skilled end user to
become a programmer which can help ease skill shortages.)
The only part of the sort program likely to need changing is the
part that carries out the comparison. The application programmer
can use a generalised sort function provided by a library programmer and
supply this sort function with a customised comparison function.
To achieve this using 'C' the application programmer supplies the address
of the comparison function as a sort function parameter.
These 2 sorts are carried out using the same sort function and passing
it the address of the comparison function to be used. 2 different comparison
functions have to be written to compare by student name or by mark.
The prototype of the gsort (general sort) function has to include the prototype of the comparison function as a parameter.
void gsort(RECORD *a,int rows, int (*comp)(RECORD *x,RECORD *y)); /* gsort parameters: *a: address of array to be sorted rows: number of records in the array *comp: address of the user-defined comparison function */The last parameter of gsort(): int (*comp)(RECORD *x,RECORD *y) is the prototype of the comparison function, which gsort() will call as comp(). The use of brackets between the return type of int and the function pointer int (*comp) are needed to indicate that this parameter of gsort is a pointer to a function. Without these brackets the declaration would suggest that the return type of comp was a pointer to int, int* rather than int. The final part of this parameter: (RECORD *x,RECORD *y) indicates that the function known to gsort() as comp() will require 2 pointers to data type RECORD as its parameters when comp() is called by gsort(). These will be the addresses of the 2 records to be compared.
The first implementation of gsort() requires the user to define a data
type called RECORD as this type is used by gsort() and swap(). The second
implementation avoids this requirement by casting the relevant pointers
to (null*). This avoids potential conflict with the symbol RECORD if this
is used for a different purpose within the program so allows the code to
become slightly more general purpose. The information lost when RECORD
pointers are cast to null pointers is the record length, so this information
needs to be passed as a seperate parameter in the second iteration of the
program.
/* gensort.c
Performs a general sort on data stored as an array. Demonstrates
calling a function gsort() which calls a comparison function passed as
a parameter. The gsort() function is called twice to sort a set of
records firstly by name, then by mark.
This first version requires the user of gsort() to define
a data type called RECORD
Richard Kay
3 Jan 2001
*/
#include <stdio.h>
#include <string.h>
struct student {
char name[30];
float mark;
};
typedef struct student RECORD;
/* use of RECORD typedef within swap() and gsort() functions places
requirement that type RECORD is defined globally.
*/
void gsort(RECORD *a,int rows, int (*comp)(RECORD *x,RECORD *y));
/* gsort parameters:
*a: address of array to be sorted
rows: number of records in the array
*comp: address of the user-defined comparison function
comp parameters:
*x,*y: address of first and second records to compare
comp return values
-1 indicates x less than y,
0 indicates x == y,
1 indicates x greater than y. */
void swap(RECORD a[], int x,int y);
/* swap parameters:
*v the address of the start of the array
x and y: indexes of array records to be swapped */
void pause(void); /* pauses program */
RECORD a[]={{"Ashok",92.6},
{"Seamus",74.7},
{"Jaswinder",56.4},
{"George",63.2},
{"Suresh",34.2},
{"Stuart",11.1},
{"Harry",87.4},
{"Delroy",55.5}};
/* 8 records */
/* comp_name() and comp_mark() functions are defined above the
call of gsort() in main() so that their addresses are known
beforehand. */
int comp_name(RECORD *x,RECORD *y){
/* compares 2 names to see if in order, same or in reverse sort order */
return strcasecmp(x->name,y->name);
/* Linux strcasecmp() is the same as Borland stricmp() */
/* both return -1 , 0 or +1 */
}
int comp_mark(RECORD *x,RECORD *y){
/* compares sort position based on mark */
if(x->mark > y->mark)
return 1;
else if(y->mark > x->mark)
return -1;
else
return 0;
}
int main(void){
int n,i;
n=sizeof(a)/sizeof(RECORD); /* should be 8 */
printf("there are %d records\n",n);
/* sort and display by name */
gsort(a,n,comp_name);
printf("data sorted by name\n");
for(i=0;i<n;i++)
printf("%30s %f\n",a[i].name,a[i].mark);
pause();
/* sort and display by mark */
gsort(a,n,comp_mark);
printf("data sorted by mark\n");
for(i=0;i<n;i++)
printf("%30s %f\n",a[i].name,a[i].mark);
pause();
return 0;
}
void pause(void){
printf("press enter to continue\n");
getchar();
}
void gsort(RECORD a[],int rows, int (*comp)(RECORD *x,RECORD *y) ){
/* realistic general sorts functions are more likely to use the
quicksort recursive algorithm. This one uses bubble
sort for simplicity. */
int i,j=rows;
for(j=rows;j>0;j--){
for(i=0;i<j-1;i++){
if((*comp)(a+i,a+i+1) > 0 )
swap(a,i,i+1);
}
}
}
void swap(RECORD *v,int x,int y){
RECORD temp;
temp=v[x];
v[x]=v[y];
v[y]=temp;
}
/* gensort2.c
Performs a general sort on data stored as an array. Demonstrates
calling a function gsort() which calls a comparison function passed as
a parameter. The gsort() function is called twice to sort a set of
records firstly by name, then by mark.
This second version is more general in that it will sort any
data. It does not require the user of gsort() to define
a data type for swapping or comparison purposes, but results in
more complex swap and (user defined) comparison functions.
Richard Kay
4 Jan 2001
*/
#include <stdio.h>
#include <string.h>
struct student {
char name[30];
float mark;
};
typedef struct student RECORD;
/* To make gsort truly general we need to pass the record length
as a parameter to it and swap() functions. swap() has to
copy 1 byte (char) at a time up to the record length . This
is easier with variable record length data as with this we just
swap fixed size pointers to sort it.
*/
void gsort(void *a, size_t rleng, int rows, int (*comp)(void *x,void *y));
/* gsort parameters:
*a: address of array to be sorted
rleng: record length in bytes
rows: number of records in the array
*comp: address of the user-defined comparison function */
/* comp parameters:
*x,*y: addresses of first and second records to compare */
void swap(void *x,void *y, size_t rleng);
/* swap parameters:
*x and *y the addresses of the starts of records to be swapped
rleng: record length in bytes */
void pause(void); /* pauses program */
RECORD a[]={{"Ashok",92.6},
{"Seamus",74.7},
{"Jaswinder",56.4},
{"George",63.2},
{"Suresh",34.2},
{"Stuart",11.1},
{"Harry",87.4},
{"Delroy",55.5}};
/* comp_name() and comp_mark() functions are defined above the
call of gsort() in main() so that their addresses are known
beforehand. */
int comp_name(void *x,void *y){
/* compares sort position based on name */
RECORD *a,*b;
a=(RECORD*)x; /* cast addresses in x and y to type RECORD */
b=(RECORD*)y;
/* compares 2 names to see if in order, same or in reverse sort order */
return strcasecmp(a->name,b->name);
/* Linux strcasecmp() is the same as Borland stricmp(),
to compare strings case insensitively
both return -1 , 0 or +1 */
}
int comp_mark(void *x,void *y){
/* compares sort position based on mark */
RECORD *a,*b;
a=(RECORD*)x; /* cast addresses in x and y to type RECORD* */
b=(RECORD*)y;
if(a->mark > b->mark)
return 1;
else if(a->mark > b->mark)
return -1;
else
return 0;
}
int main(void){
int n,i;
n=sizeof(a)/sizeof(RECORD); /* should be 8 */
printf("there are %d records\n",n);
/* sort and display by name */
gsort((void*)a,sizeof(RECORD),n,comp_name);
printf("data sorted by name\n");
for(i=0;i<n;i++)
printf("%30s %f\n",a[i].name,a[i].mark);
pause();
/* sort and display by mark */
gsort((void*)a,sizeof(RECORD),n,comp_mark);
printf("data sorted by mark\n");
for(i=0;i<n;i++)
printf("%30s %f\n",a[i].name,a[i].mark);
pause();
return 0;
}
void pause(void){
printf("press enter to continue\n");
getchar();
}
void gsort(void *a,size_t rleng,int rows, int (*comp)(void *x,void *y) ){
/* realistic general sorts functions are more likely to use the
quicksort recursive algorithm. This one uses bubble
sort for simplicity. */
int i,j=rows;
for(j=rows;j>0;j--){
for(i=0;i<j-1;i++){
/* Using a void array pointer we now have to multiply array indexes
* by the record length in order to get the 2 record addresses */
if((*comp)(a+i*rleng,a+(i+1)*rleng) > 0 )
swap(a+i*rleng,a+(i+1)*rleng,rleng);
}
}
}
void swap(void *x,void *y,size_t rleng){
/* swaps rleng bytes at address x with rleng bytes at address y.
* This depends upon char being 1 byte wide. Unicode users beware!
* Is lack of a byte primitive data type a gap in the ANSI 'C' standard ?
* We could change the loop arithmetic based on rleng/sizeof(char)
* to make this swap function more portable, but then we'd have to
* consider word alignment issues. YUKK!
*/
char temp, *a,*b;
int i;
a=(char*)x; /* cast void* pointers to char* pointers */
b=(char*)y;
for(i=0;i<rleng;i++){
temp=*(a+i);
*(a+i)=*(b+i);
*(b+i)=temp;
}
}
there are 8 records data sorted by name Ashok 92.599998 Delroy 55.500000 George 63.200001 Harry 87.400002 Jaswinder 56.400002 Seamus 74.699997 Stuart 11.100000 Suresh 34.200001 press enter to continue data sorted by mark Stuart 11.100000 Suresh 34.200001 Delroy 55.500000 Jaswinder 56.400002 George 63.200001 Seamus 74.699997 Harry 87.400002 Ashok 92.599998 press enter to continue