Richard Kay updated October 2005
A key is one or more fields used to identify a record or row in a database table.
For the key to be useful in identifying records this key should be unique, i.e. any value it may have should never be present in more than one record.
A key can be kept unique if:
A suitable natural key may or may not be available. E.G. 2 persons sharing an email address are required to share the same account if they both intend accessing the same on-line system. Accounts on this system may use the email address as a natural key. The advantage of this would be that users can be expected to remember their email address when they login, so they don't have to remember any other account identifier. The email address would be a natural key because it is unique, but having other purposes within the system, e.g. for sending account passwords.
Sometimes a surrogate (made-up or artificial) key is preferred, because no suitable natural unique key exists. An email address would be unsuitable if 2 accounts could share the same email address. It might also be unsuitable if the system design required account anonymity. Another reason for having a surrogate key would be to minimise key size, in order to speed manual input of data required for record identification. Perhaps one of the best known surrogate keys is the UK postcode system. This only provides a unique address record key when combined with other parts of the address, e.g. house number.
Many system designs will prevent modification of key fields. Changing a value which is a foreign key in another table will leave records referring to it unlinked. Multi-table system designs will often require records containing a foreign key to be removed before the record they refer to can be removed.
Arrays are often the simplest and most convenient data structure used for maintaining sets of related data. Having defined a record we can use an array to store as many records as memory availability allows. The array can be allocated either statically or dynamically. If we use dynamic allocation we can calculate the memory needed by counting the records in the file before allocating the required amount of space.
This assumes we:
Static allocation is simpler but less flexible. With static allocation the maximum number of records is decided as a design consideration of the program at coding time, most conveniently defined through a global constant e.g. MAXRECS which may be changed prior to compilation. In either case, if we want to add records while the program is operating this can only be accomplished safely, without overflowing the array bounds, if a check is made prior to the record being added to ensure that there is unused space within the array for the additional record. If we allocate the array as having space for MAX e.g. RECORD a[MAXRECS]; we can safely use positions from a[0] to a[MAXRECS-1] .
If the array size allows for more records than are in use at any given time 2 approaches are often used which enable identification of used and unused array positions:
/* global definitions */
typedef struct record { char name[30]; float mark; } RECORD;
RECORD *a; /* end of global definitions */
...
/* somewhere near the beginning of main() */
int nrecs,maxrecs;
nrecs=count_records(); /* count number of records in file and assign to nrecs */
maxrecs=nrecs + 5; /* allows for up to 5 records to be added manually */
a=(RECORD*)malloc(sizeof(RECORD)*maxrecs); /* dynamic array allocation */
/* array a[] can now be used just as if it had been declared: RECORD a[maxrecs]; */
if(a==NULL){ /* couldn't get enough memory from heap so quit now */
fprintf(stderr,"not enough memory in heap to read file !\n");
exit(1); /
}
load_recs(nrecs); /* function to read nrecs records into array a[] */
...
... /* within loop controlled by menu such that addrec() and
/* other functions might be called as needed */
addrec(&nrecs); /* addrec() adds a record */
...
/* addrec function definition */
void addrec(int *nrecs, int maxrecs){ /* function which adds record to array a */
if(*nrecs<maxrecs){ /*a[] is global */
printf("enter student name\n");
fgets(a[nrecs].name,30,stdin); /* safer than scanf("%s",a[nrecs].name) */
printf("enter student mark\n");
scanf("%f",&a[nrecs].mark);
*nrecs = *nrecs+1; /* must keep score ! */
} else {
printf("max allocation for new records used: stop and restart program\n");
/* better to call save_recs and load_recs here ? */
}
}
The above program code dynamically allocates space for maxrecs records in the array a[] with maxrecs being set as the number of records found in the input file plus 5, to allow for the user to add up to 5 extra records during the program run. The program keeps track of how many records within the array a[] are used at any given time.
The approach coded above is naive, because it does not prevent addition of duplicate keys.
In the example above, where there are maxrecs positions within the array of which nrecs positions are used, it has been assumed that at any given time all array indices between 0 and nrecs-1 are in use and remaining indices from nrecs to maxrecs-1 are vacant. If we adopt this approach there are 2 methods by which we could delete a record.
The first involves finding the record to be deleted and shuffling all higher records down by one in turn starting with the record above the delete position and then taking one away from (decrementing) nrecs. If the record we wish to delete is randomly positioned within the used slots we will have to move nrecs/2 records on average to accomplish this:
/* this should be within a del() function */
int delpos,i,key;
printf("which key do you want deleted ? \n");
scanf("%d",&key);
delpos=find(key,*nrecs);/* find() returns index of key or -1 */
if(delpos>=0){ /* key found */
for(i=delpos+1;i<nrecs;i++)
/* move all records above delpos down by one*/
a[i-1]=a[i];
*nrecs--; /* one less record in array a[] */
} else
printf("key %d not found, can't delete record\n",key);
The above approach will be required if the array is to be kept in sorted order as it doesn't change the order of the remaining records. However moving on average nrecs/2 records is inefficient for a very large unsorted array if there is no need to maintain the record order. A faster approach is to to move the top record used to the deletion position and decrement nrecs. This only moves a single record e.g:
int delpos,i,key;
printf("which key do you want deleted ? \n");
scanf("%d",&key);
delpos=find(key,*nrecs); /* find() returns index of key or -1 */
if(delpos!= -1)
a[delpos]=a[--nrecs];
/* decrements nrecs before using it to index array a[] */
| Maintaining Record Order | Record Order Unimportant |
![]() |
![]() |
A function which modifies a record should generally not modify a key. If a key needs to be modified, this can be done more safely by using the functions to delete the old and add the new version of the record. A well-designed delete() function will prevent removal of a referent from a foreign key in a multi-table database.
Other fields can be modified so long as the same validation constraints are maintained as when a record is added. The index of the record to be modified must be identified. An add() function must identify that the key does not already exist to prevent duplicates. A del() function must identify the index of the record to be removed. This requirement is shared between these 3 modules. To avoid breaking modularity, finding the index corresponding to a key or identifying the absence of a key within the array must be programmed as a separate find() function.
The add() and del() functions may need to alter nrecs, the number of records in the used partition of the array. The mod() function will not need to change this value.