Unix pathnames notes

The file system and pathnames

Within a Unix system, files are organised into a structure conventionally visualised as an inverted tree, like the pattern of roots which may exist underground beneath a single trunk. This structure consists of a number of directory files, usually referred to as directories or folders, and ordinary files sometimes referred to as "plain files".

Every file and directory within a Unix filesystem has a pathname which may be used to refer to that object when carrying out various operations such as displaying the contents of a file, setting the current or working directory or copying files. At the top of the filesystem is the root directory whose pathname is the forward slash character / . Directories may contain other directories and plain files.

                     /  (root directory)
                     |
     ------------------------------------------------------
     |           |           |           |                |
    dir1       file1       file2        dir2            home
     |                                   |                |
  ----------------        -------------------        ----------
  |              |        |                 |        |        |
 file3          dir3      file4             file5   fred     bill
                 |                                   |
        -------------------------------          ----------
        |              |              |          |        |
      file6          file7          file8   cupboard   wardrobe
                                                           |
                                         ----------------------
                                         |          |         |
                                       socks      vests    shirts

In the above example the pathname of file2 would be /file2 , the pathname of file6 would be /dir1/dir3/file6 and the pathname of dir3 would be /dir1/dir3 . In Unix (and in the path part of web URLs) forward slashes (/) are used to delimit the components of a pathname. In Windows, backslashes (\) are used.

The Unix filesystem supports working or current directories and relative pathnames; a relative pathname is relative to the working directory and does not start with / . For example to change working directory from dir1 to dir3 you could either enter cd dir3 (pathname relative to dir1 ) or cd /dir1/dir3 (absolute pathname from root).
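
As a minimal sketch of the two forms, the commands below rebuild dir1/dir3 from the diagram inside a scratch directory and reach dir3 both ways. The scratch location (from mktemp) and the variable names are assumptions made for illustration; on a real system the tree would hang directly off / .

```shell
# Build a scratch copy of dir1/dir3; $root stands in for / here (an assumption).
root=$(mktemp -d) || exit 1
root=$(cd "$root" && pwd -P)   # resolve any symbolic links so the comparison is reliable
mkdir -p "$root/dir1/dir3"

# Relative pathname: only meaningful while the working directory is dir1.
cd "$root/dir1"
cd dir3
via_relative=$(pwd -P)

# Absolute pathname: works from any working directory.
cd /
cd "$root/dir1/dir3"
via_absolute=$(pwd -P)

echo "relative: $via_relative"
echo "absolute: $via_absolute"

# Tidy up the scratch tree.
cd / && rm -rf "$root"
```

Both echo lines should print the same directory, because the two cd commands name the same object by different routes.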

Here are some more examples of pathnames, with comments:

pathname                     comment
wardrobe                     the name of a directory within /home/fred (the current working directory)
Wardrobe                     a different name from wardrobe, because pathnames are case-sensitive
fred/wardrobe/socks          a file or directory called socks within a directory called wardrobe, which is within a directory called fred within /home (the current working directory)
/home/fred/wardrobe/vests    an absolute pathname to the object vests, which could be used from any directory on the system, including the current working directory
.                            a period on its own indicates the current working directory
..                           two periods together indicate the parent of the current working directory, e.g. the parent of dir3 is dir1
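
The . and .. entries can be tried out with a short sketch. It assumes a scratch copy of the /home/fred part of the diagram; the scratch location and the variable names are illustrative, not part of the original example.

```shell
# Scratch stand-in for the /home part of the diagram (location is an assumption).
top=$(mktemp -d) || exit 1
top=$(cd "$top" && pwd -P)
mkdir -p "$top/home/fred/wardrobe/socks" "$top/home/fred/cupboard"

cd "$top/home/fred"
here=$(cd . && pwd -P)                         # . is the working directory itself
parent=$(cd .. && pwd -P)                      # .. is its parent, here .../home
sibling=$(cd wardrobe/../cupboard && pwd -P)   # into wardrobe, back up, then into cupboard

echo "$here"
echo "$parent"
echo "$sibling"

# Tidy up the scratch tree.
cd / && rm -rf "$top"
```

Each cd runs inside $( ), so the working directory of the main shell stays at fred throughout.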

Unlike Windows, Unix has no concept of drive letters and does not use physical device names in pathnames in the same way. A Unix file tree is made up of one or more physical or network devices mounted at various points within the same tree. So if the floppy disk drive is mounted at the directory /mnt/floppy , the Windows pathname a:\report.txt becomes /mnt/floppy/report.txt when the floppy disk is used on Unix.

Links

Unix also supports multiple paths to the same file through the use of links. This enables a file or directory to be logically present wherever it may be needed, without multiple copies that waste disk space and are not kept up to date. Hard links are simply additional names for the same object, while soft (symbolic) links contain the pathname of the linked object and can be used in some situations where hard links cannot, e.g. when the link and the linked object reside on different disk partitions. For example:

ln /sales/personnel/simon /bonuses/recipients/simon

gives the file /sales/personnel/simon the additional name /bonuses/recipients/simon using a hard link.

ln -s /usr/wizard/wands/magic/software/binaries /wiz

creates a soft link called /wiz so the relevant binaries can be executed and accessed more simply.
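
The difference between the two kinds of link shows up when the original name is removed. The following is a sketch only, run in a scratch directory; the file names report, report.hard and report.soft are hypothetical.

```shell
# Work in a scratch directory; all file names below are illustrative assumptions.
work=$(mktemp -d) || exit 1
cd "$work"

echo "original text" > report
ln report report.hard        # hard link: a second name for the same file
ln -s report report.soft     # soft link: a small object holding the pathname "report"

echo "appended via hard link" >> report.hard
lines_in_report=$(( $(wc -l < report) ))   # the change is visible under the first name too

rm report                                  # remove the first name only
hard_after=$(cat report.hard)              # the data survives while one name remains
if [ -e report.soft ]; then soft_after=resolves; else soft_after=dangling; fi

echo "lines seen through original name: $lines_in_report"
echo "soft link after rm: $soft_after"

# Tidy up the scratch directory.
cd / && rm -rf "$work"
```

The hard link keeps the contents alive after rm, whereas the soft link is left pointing at a pathname that no longer exists.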

Wildcards

Wildcards and the constructs built from them are often loosely called "regular expressions", but Unix documentation more usually calls them "patterns" (or "globs") and reserves "regular expression" for the related but different syntax used by commands such as grep and sed. Some Unix commands interpret patterns of their own, which usually work in a similar manner to those interpreted by the shell. Note that when you use a pattern in a shell command, e.g. to apply the command to a number of files, the pattern is expanded by the shell. The program called by the shell to perform the command is passed the matched filenames as separate arguments. Therefore this program does not know that you used a pattern and does not need to know how to interpret it.

This makes the shell both a powerful tool and dangerous in the wrong hands. Putting a superfluous space between a prefix and a * wildcard used with the rm (remove or delete) command has meant the difference between clearing out 2 unwanted files with the same prefix and accidentally destroying a main project directory. Enough of the theory, let's look at the practice.

wildcard    comment
*           matches any filename or part of a filename, except filenames starting with a period (.)
?           matches any single character
[123]       matches the single character 1, 2 or 3
[1-5]       matches any single character in the range 1-5
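
The four wildcards can be tried against a handful of files. This is a minimal sketch in a scratch directory; the note* file names are made up for the purpose.

```shell
# Scratch directory with a few illustrative file names (all assumptions).
work=$(mktemp -d) || exit 1
cd "$work"
touch note1 note2 note4 note7 note12 .hidden

all=$(echo *)               # every name except the one starting with a period
single=$(echo note?)        # exactly one character after "note"
set123=$(echo note[123])    # that one character must be 1, 2 or 3
range=$(echo note[1-5])     # that one character must lie in the range 1-5

echo "*         -> $all"
echo "note?     -> $single"
echo "note[123] -> $set123"
echo "note[1-5] -> $range"

# Tidy up the scratch directory.
cd / && rm -rf "$work"
```

Note that note12 is matched only by * , since each of the other three wildcards stands for exactly one character.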

Technically the string abc is itself a pattern, which matches any occurrence of itself, so patterns can be combined to give greater flexibility. For example:

fred*        matches anything prefixed by fred
fred*.?      matches fred.1 or frederick.Z but not freddy.bak
fred[A-Z]*   matches fredB or fredXor but not fred or freddy
fred *       passes fred as it stands and then EVERYTHING matched by * . Note that the space will cause the shell to treat fred and * as separate arguments.
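
To see what a program actually receives, the sketch below counts the arguments the shell passes after expansion. count_args is a hypothetical stand-in for any command, the file names are taken from the examples above plus an unrelated project.c , and LC_ALL=C is set on the assumption that [A-Z] should mean exactly the uppercase ASCII letters (in some locales a range may also match lowercase letters).

```shell
# Make [A-Z] mean exactly the uppercase ASCII letters (an assumption for this demo).
LC_ALL=C
export LC_ALL

# count_args is a hypothetical stand-in for any program started by the shell:
# it reports how many separate arguments it was given.
count_args() { echo "$#"; }

work=$(mktemp -d) || exit 1
cd "$work"
touch fred fred.1 frederick.Z freddy.bak fredB fredXor project.c

prefix_only=$(count_args fred*)   # the 6 names starting with fred
spaced=$(count_args fred *)       # "fred" itself plus all 7 names matched by *
dotted=$(echo fred*.?)            # a period followed by exactly one character
upper=$(echo fred[A-Z]*)          # an uppercase letter straight after fred

echo "fred*      -> $prefix_only arguments"
echo "fred *     -> $spaced arguments"
echo "fred*.?    -> $dotted"
echo "fred[A-Z]* -> $upper"

# Tidy up the scratch directory.
cd / && rm -rf "$work"
```

Had the command been rm rather than count_args, the version with the stray space would have removed project.c along with everything else in the directory.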

Note that . at the beginning of a filename must be matched explicitly, and / must always be matched explicitly. Special characters such as *,?$^()[]\/<>|"'! and spaces should not normally be included in actual filenames (for reasons which should by now be becoming increasingly obvious). However, not everyone who creates files for use on a Unix system is aware of this, and a user may sometimes have reasons for an exception which outweigh the awkwardness involved, so you may sometimes have to find ways of telling the shell that a special character is actually part of a filename. In this event the backslash special character \ will usually escape the special character following it, so that rm \$fred will delete the file called $fred .
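
A short sketch of escaping in practice, using three deliberately awkward file names in a scratch directory. The names are illustrative, and the single-quote form on the last rm is an addition to the backslash method described above: it is another standard way of making the shell take characters literally.

```shell
# Scratch directory; the awkward file names are illustrative assumptions.
work=$(mktemp -d) || exit 1
cd "$work"

touch \$fred 'two words' star\*      # creates three files: $fred, "two words" and star*
before=$(( $(ls | wc -l) ))

rm \$fred          # backslash: $ is taken literally, so no variable is expanded
rm two\ words      # backslash-space keeps the name together as a single argument
rm 'star*'         # single quotes also stop the shell treating * as a wildcard

after=$(( $(ls | wc -l) ))

echo "files before: $before, after: $after"

# Tidy up the scratch directory.
cd / && rm -rf "$work"
```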

Other characters worth mentioning now include period (.), which on its own means the current working directory, and tilde (~), which is shorthand for your home directory. Hyphen (-), period (.) and underscore (_) characters can be used safely within filenames, but a period at the start of a filename hides the file from ordinary directory listings and stops it matching wildcard searches. For this reason, files created by and for the use of system software often have names beginning with a period, and these files should not be deleted unless you know what effect this will have.
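
The effect of a leading period can be checked with a small sketch. The file names .profile and my-file_v1.txt are only examples created in a scratch directory, not files that need to exist on your system.

```shell
# Scratch directory; both file names are illustrative assumptions.
work=$(mktemp -d) || exit 1
cd "$work"
touch .profile my-file_v1.txt

plain=$(ls)            # ordinary listing: the file starting with a period is not shown
with_all=$(ls -a)      # -a lists everything, including . and ..
star=$(echo *)         # * does not match a leading period
dot_star=$(echo .p*)   # the period has to be matched explicitly

echo "ls    -> $plain"
echo "*     -> $star"
echo ".p*   -> $dot_star"
echo "~     -> $(echo ~)"   # tilde expands to your home directory

# Tidy up the scratch directory.
cd / && rm -rf "$work"
```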