Linux Training

City LinUX Training Courses

Section 13. Linux file store.


"Making files is easy under the UNIX operating system. Therefore, users tend to create numerous files using large amounts of file space. It has been said that the only standard thing about all UNIX systems is the message-of-the-day telling users to clean up their files."

System V.2 administrator’s guide.

13. Unix / Linux file store.

13.1. Linux file hierarchy.

For a fuller description of the main directories common to most Linux distributions see

sa101$ man hier

Image imgs/sa101-13.png

Image imgs/sa101-14.png

Figure 3 - File hierarchy (extract)

Files:

are not typed by name.
have random access at the byte level.
are grouped into directories which are organised in a multilevel hierarchical tree.
can be multi-volume - may be spread across many physical and logical devices.

File access is standard across all versions of UNIX/Linux.

Peripheral devices are accessed in the same fashion as data files.

A user’s home directory is set in the password file /etc/passwd. When a user logs onto the system their session begins with their home directory set as the current working directory.

sa101$ pwd
sa101$ cd /var
sa101$ cd log
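
The home directory is the sixth colon-separated field of a password file entry. The following is a minimal sketch using a made-up entry rather than a line read from a real system; the user name, numeric IDs and paths are assumptions for illustration only.

```shell
# Hypothetical password file entry; the values are illustrative only.
entry='fulford:x:1000:1000:Example User:/home/fulford:/bin/bash'

# Fields are separated by colons; field 6 holds the home directory.
home_dir=$(printf '%s\n' "$entry" | cut -d: -f6)
echo "$home_dir"
```

On a live system the same cut command could be applied to the matching line of the password file, provided the account is held in the local file rather than in a network directory service.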

To return to the home directory you can use the environment variable HOME, tilde expansion, or cd with no argument, which defaults to the home directory, e.g.

sa101$ cd $HOME
sa101$ cd ~
sa101$ cd

all produce the same result.
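
The equivalence of the three forms can be checked with a short script. This is a sketch only: it points HOME at a scratch directory (an assumption made so that the example is harmless to run) and records where each form of cd ends up.

```shell
# Use a scratch directory as HOME so the demonstration is harmless to run.
HOME=$(mktemp -d)
export HOME

# Start from / each time, then return home by each of the three routes.
cd / && cd "$HOME" && via_variable=$(pwd)
cd / && cd ~       && via_tilde=$(pwd)
cd / && cd         && via_default=$(pwd)

echo "$via_variable"
echo "$via_tilde"
echo "$via_default"
```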

13.2. File naming in Linux.

One of the fundamental objects in Linux is the disk-file. When we execute a program we copy the contents of a disk-file into semiconductor memory where it then runs, usually accessing other files containing data or text. Every file has a name, which on most Linux file systems may consist of up to 255 bytes. Any character may be used except the slash (/), which separates the components of a path, and the null character. The following are valid distinct filenames:

fileName
filename
FileName

YY_UR_YY_UB_IC_UR_YY_4ME

program1.dat.version11


This_is_a_VALID_Unix_filename_which_would_be_most_tedious_\
to_type_everytime_we_wished_to_use_it.Please_note_the_\
back-slash_character_allows_continuation_onto_the_\
next_line
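
That names differing only in case are distinct can be confirmed with a short experiment. The sketch below assumes a case-sensitive file system, as is usual on Linux, and works in a scratch directory created with mktemp.

```shell
# Work in a scratch directory so nothing else is disturbed.
dir=$(mktemp -d)
cd "$dir" || exit 1

# Three names that differ only in the case of their letters.
touch fileName filename FileName

# Count the directory entries that now exist.
count=$(ls | wc -l | tr -d ' ')
echo "$count"
```

On a case-insensitive file system, such as those commonly used on other operating systems, the three names would refer to a single file.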

Although punctuation and non-printing characters may be used to construct filenames, generally speaking it is best to avoid them for reasons which should become obvious. Usually the filename should provide more than a hint to the use of the file. The use of long file names should always be discouraged but the disadvantage of having to type long names is compensated for to a degree by bash’s filename completion mechanism and the use of metanotation.

13.3. File name substitution

Filename metanotation is similar to that in other software tools. The shell expands metanotation before it executes the command. E.g.

sa101$ cd eg
sa101$ ls
d  d00    d01  d02  d03  f1  f2  f3
sa101$ echo d*
d d00 d01 d02 d03

If no match is found the metacharacters are interpreted literally.

E.g.

sa101$ cd eg
sa101$ echo k*
k*
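
The behaviour shown above can be reproduced in a scratch directory. This is a sketch under the assumption that the shell's default settings are in force (in bash, options such as nullglob or failglob change what happens when nothing matches). It also uses ? and [ ], which are standard shell metacharacters.

```shell
# Recreate the example directory in a scratch location.
dir=$(mktemp -d)
cd "$dir" || exit 1
touch d d00 d01 d02 d03 f1 f2 f3

all_d=$(echo d*)            # every name beginning with d
one_or_two=$(echo d0[12])   # d0 followed by either 1 or 2
single=$(echo f?)           # f followed by exactly one character
no_match=$(echo k*)         # nothing matches, so the pattern is left as typed

echo "$all_d"
echo "$one_or_two"
echo "$single"
echo "$no_match"
```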

13.3.1. Metanotation in filename substitution.

Image imgs/sa101-15.png

13.4. File name completion.

When you type the first characters of a filename and then press <tab> (or <esc>, depending upon the "flavor" of Unix you are running), the shell attempts several other types of completion before it tries to complete the file name by matching the characters entered against the file names available in the current directory.

Using filename completion encourages the practice of placing the unique part of a file name at the beginning rather than at the end. For example, if I had log files from my daily work in the month of May, and named them:

logfiledata052200
logfiledata052300
logfiledata052400

I would need to type the following information to have the shell complete the file name for me:

sa101$ less logfiledata0522<tab>

While had I named my log files like this:

220500logfiledata
230500logfiledata
240500logfiledata

All I would need to type is:

sa101$ less 22<tab>

The shell would complete the file name. This would save 13 keystrokes! However, with the use of file name metanotation we could find the former with

sa101$ less *22*

and the latter with

sa101$ less 22*

It should be noted that using the day, month and year in the DDMMYY format as above means that a standard listing with ls will not produce a date-ordered list. Options such as -t solve this problem by listing in order of the modification time stamp, but the problem becomes more frustrating if we need to step through an ordered list in a shell program loop.

The string "logfiledata" in the above example contributes little to our understanding of the file contents. It would be far better to collect all the log files together in a directory created for the purpose and to use the date in the form YYMMDD, thus:

000522
000523
000524
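
The effect of the date format on listing order can be seen with a few scratch files. The sketch below uses three hypothetical DDMMYY names (not files from the course material) and one possible way of rearranging them with sed; it assumes every name is exactly six digits.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

# Hypothetical DDMMYY names: 22 May, 1 June and 15 April 2000.
touch 220500 010600 150400
before=$(echo *)    # sorted by name, which is not date order

# Rearrange each name from DDMMYY to YYMMDD.
for f in *; do
    new=$(printf '%s\n' "$f" | sed 's/^\(..\)\(..\)\(..\)$/\3\2\1/')
    mv "$f" "$new"
done
after=$(echo *)     # name order and date order now agree

echo "$before"
echo "$after"
```

With the year first, a plain glob or ls delivers the files in date order, which is what a shell program loop needs.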

The operating system imposes no structure rules on filenames. File extensions and version numbers are application dependent. For example, the C compiler expects source code to be stored in files with an extension of .c, e.g. prog.c, and web servers conventionally expect web pages to end in either .html or .htm.

It is common practice to name the instruction files for sed and awk scripts as below.

namechng.awk - (an awk script)
lowerhtml.sed - (a sed script)

File naming is very important, and good practice can save a lot of time and confusion.

Although Linux filename syntax is very flexible, there will be times when files are used on multiple operating systems; in such cases the syntax rules of the more restrictive operating system will prevail.

Some operating systems allow spaces within the file name, e.g. "My documents". Since the Linux shells use white space (one or more spaces or tabs) as the delimiter between arguments, if you transferred such a file to a Linux machine and then tried to issue the following command:

sa101$ cat My documents

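the shell would pass My and documents to cat as two separate arguments. The command would look for a file called "My" and another called "documents" and, failing to find either, would issue error messages.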

sa101$ cd eg
sa101$ ls -l M*
-rw-r--r-- 1 fulford fulford 0 Dec  7 01:58 My documents
sa101$ cat My documents
cat: My: No such file or directory
cat: documents: No such file or directory

If we must access My documents on a UNIX / Linux file system we can either use the escape character (\) before the intermediate space or place the whole file name in single or double quotes.

sa101$ ls -l M*
-rw-r--r-- 1 fulford fulford 0 Dec  7 01:58 My documents
sa101$ cat My\ documents
sa101$ cat "My documents"
sa101$ cat 'My documents'
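
The three quoting methods can be compared in a scratch directory. This is a sketch; the file content "hello" is an assumption added so that cat has something to print.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
printf 'hello\n' > 'My documents'

escaped=$(cat My\ documents)      # backslash protects the space
double=$(cat "My documents")      # double quotes protect the space
single=$(cat 'My documents')      # single quotes protect the space

# Unquoted, the shell hands cat two arguments and neither file exists.
if cat My documents >/dev/null 2>&1; then
    unquoted=found
else
    unquoted=failed
fi

echo "$escaped $double $single $unquoted"
```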

13.5. File contents

If we want to know the content type of a file the command is

sa101$ file <filename>

Text files can be examined with a variety of tools. Pagers format text files and input streams for viewing on screen. The default BSD pager was more. The standard Linux pager has become less, named by analogy with more, with the old saw that "less is more" in mind.

The default System V pager for many years was pg.

All three are now commonly available in Linux.

To quickly check the first few lines in a file, perhaps to see the source code control statements, copyright statements etc. we can use head.

To see the end of a file use tail.

The command tail has a particularly useful flag, -f, which causes the program to wait for further input when it reaches the end of the file. This allows us to monitor log files, e.g.

tail -f /var/log/syslog
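
The commands head and tail can be tried safely on a scratch file. The sketch below writes ten numbered lines to a temporary file (the file and its contents are assumptions for illustration) and uses the -n option to choose how many lines are shown. It omits tail -f, which would wait indefinitely.

```shell
log=$(mktemp)

# Write ten numbered lines to the scratch file.
i=1
while [ "$i" -le 10 ]; do
    echo "line $i"
    i=$((i + 1))
done > "$log"

first=$(head -n 2 "$log")   # the first two lines
last=$(tail -n 2 "$log")    # the last two lines

echo "$first"
echo "$last"
```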

13.6. Linux Filestore protection.

Each file, directory and device has protection attributes in 3 categories: user (u), group (g) and others (o).

Each category has 3 modes of access: read (r), write (w) and execute / search (x). Protection information is given by a long listing (ls -l) and the protection (mode) is changed with chmod. For the basic modes 9 bits are used, each of which can be either on (1) or off (0).

Image imgs/sa101-16.png

In the example above the permissions are:

Image imgs/sa101-17.png

The easy way to work out the octal value is to treat each set of 3 permissions as a separate binary number so that the 3 columns equal 4,2 and 1 from left to right.
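
The arithmetic described above can be written out as a small script. This is a sketch only; the function names triplet_to_digit and mode_to_octal are hypothetical helpers, not standard commands, and they handle only the nine basic permission characters.

```shell
# Convert one triplet such as rw- to its octal digit: r=4, w=2, x=1.
triplet_to_digit() {
    digit=0
    case $1 in r??) digit=$((digit + 4)) ;; esac
    case $1 in ?w?) digit=$((digit + 2)) ;; esac
    case $1 in ??x) digit=$((digit + 1)) ;; esac
    printf '%s' "$digit"
}

# Convert nine permission characters such as rw-r--r-- to three octal digits.
mode_to_octal() {
    user=$(printf '%s\n' "$1" | cut -c1-3)
    group=$(printf '%s\n' "$1" | cut -c4-6)
    other=$(printf '%s\n' "$1" | cut -c7-9)
    printf '%s%s%s\n' "$(triplet_to_digit "$user")" \
        "$(triplet_to_digit "$group")" "$(triplet_to_digit "$other")"
}

mode_to_octal rw-r--r--   # prints 644
mode_to_octal rwxrwxrwx   # prints 777
mode_to_octal rw-rw----   # prints 660
```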

sa101$ export PS1='sa101$ '
sa101$ touch demo;ls -l demo
-rw-r--r-- 1 fulford operators 0 Nov 25 07:32 demo
sa101$ chmod 777 demo;ls -l demo
-rwxrwxrwx 1 fulford operators 0 Nov 25 07:32 demo
sa101$ chmod 664 demo;ls -l demo
-rw-rw-r-- 1 fulford operators 0 Nov 25 07:32 demo
sa101$ chmod 660 demo;ls -l demo
-rw-rw---- 1 fulford operators 0 Nov 25 07:32 demo
sa101$ chmod o+r demo;ls -l demo
-rw-rw-r-- 1 fulford operators 0 Nov 25 07:32 demo
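
The same changes can be scripted and checked without reading the whole listing. In this sketch perms is a hypothetical helper that keeps only the first ten characters of ls -l; the last chmod uses the symbolic form to set all three categories at once.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch demo

# Keep only the type character and the nine permission characters.
perms() { ls -l "$1" | cut -c1-10; }

chmod 664 demo;            after_664=$(perms demo)
chmod 660 demo;            after_660=$(perms demo)
chmod o+r demo;            after_o_plus_r=$(perms demo)
chmod u=rwx,g=rx,o= demo;  after_symbolic=$(perms demo)   # same as chmod 750

echo "$after_664"
echo "$after_660"
echo "$after_o_plus_r"
echo "$after_symbolic"
```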

The long file listing also shows the owner and the group assignment of the file, in the examples above fulford and operators respectively.

The owner of the file can change the group to another group of which they are a member. An ordinary user cannot give away ownership of a file. The root user can reassign the ownership and group of any file to any owner or group.

E.g.

sa101$ touch demo;ls -l demo
-rw-rw-r-- 1 fulford fulford 0 Nov 25 08:09 demo
sa101$ chgrp bin demo;ls -l demo
chgrp: changing group of 'demo': Operation not permitted
-rw-rw-r-- 1 fulford fulford 0 Nov 25 08:09 demo
sa101$ grep ^bin /etc/group
bin:x:1:root,bin
sa101$ chgrp operator demo;ls -l demo
-rw-rw-r-- 1 fulford operator 0 Nov 25 08:09 demo
sa101$ grep operator /etc/group
operator:x:503:fulford,smith
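
Before attempting chgrp it can help to list the groups to which we belong. The following sketch uses numeric group IDs, so that it does not depend on any particular group names, and changes a scratch file to the user's own primary group, which should always be permitted.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch demo

# Numeric IDs of every group the current user belongs to.
my_groups=$(id -G)
echo "$my_groups"

# The owner may change the file's group to one of those groups.
if chgrp "$(id -g)" demo; then
    changed=yes
else
    changed=no
fi
echo "$changed"
```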

13.7. Linux file types.

If we take a long file listing the file type is shown on the left of each output line.

Image imgs/sa101-18.png

There are seven file types.

Image imgs/sa101-19.png

Ordinary or regular (-) files are files that are not directories or "special" files. These may be text files, binaries, graphic images, sound files or any other regular file.

Directories (d) are the mechanism whereby groups of files are collected together in a hierarchy of file groups.

There are five types of special file.

In Linux / Unix all devices (with the exception of network devices) are handled as files and have a location in the file system. The device file is used to apply access rights and direct operations to the appropriate device drivers.

There are two types of device file, character special files (c) and block special files (b). Character devices provide for a serial stream of input or output. Block special devices allow random access.

Unix domain sockets (s) are special files used for inter-process communication. Sockets are full duplex, that is, they allow two-way communication through the file.

FIFO (first in, first out) files are named pipes (p). Named pipes allow inter-process communication between processes that exist in different user spaces. They can be created on demand anywhere in the file hierarchy.

Symbolic links (l) are files which reference another file. The file stores a text representation of the referenced file’s path. The referenced path may be absolute or relative, and the file it names need not exist at all.

Symbolic links may be made between directories and may be made across file systems.
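
Several of the file types described above can be created in a scratch directory and identified from the first character of a long listing. This is a sketch; type_char is a hypothetical helper, and /dev/null is included on the assumption that it is present as a character special file, as on most systems.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1

touch regular            # ordinary file
mkdir directory          # directory
mkfifo pipe              # named pipe (FIFO)
ln -s regular symlink    # symbolic link

# The first character of a long listing is the file type.
type_char() { ls -ld "$1" | cut -c1; }

for name in regular directory pipe symlink /dev/null; do
    echo "$(type_char "$name") $name"
done
```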

13.8. Exercise

sa101$ mkdir test
sa101$ cd test
sa101$ touch 310511 140611 300611 090512 120512 130512
sa101$ ls

Rename the files to produce a date ordered list with the ls command.

sa101$ cat >main.c
#include <stdio.h>
int main(void)
{ printf("Hello, World\n");
  return 0;
}
^D
sa101$ gcc main.c
sa101$ file main.c a.out
sa101$ ./a.out
sa101$ file /bin/bash
sa101$ file /bin/sh

sa101$ pwd
/home/fulford/sa101
sa101$ mkdir eg;ls -ld eg
drwxr-xr-x 2 fulford fulford 4096 Nov 25 10:47 eg
sa101$ cd eg
sa101$ mkdir d
sa101$ ls -ld d
drwxr-xr-x 2 fulford fulford 4096 Nov 25 10:48 d
sa101$ chmod 000 d;ls -ld d
d--------- 2 fulford fulford 4096 Nov 25 10:48 d
sa101$ cd d
bash: cd: d: Permission denied
sa101$ chmod 100 d;ls -ld d
d--x------ 2 fulford fulford 4096 Nov 25 10:48 d
sa101$ cd d
sa101$ touch file1
touch: cannot touch 'file1': Permission denied
sa101$ chmod 300 .;ls -ld .
d-wx------ 2 fulford fulford 4096 Nov 25 10:48 .
sa101$ touch file1;ls
ls: cannot open directory .: Permission denied
sa101$ sudo ls
file1

sa101$ sudo ls -l
total 0
-rw-r--r-- 1 fulford fulford 0 Nov 25 10:50 file1
sa101$ cat >file1
contents
^d
sa101$ cat file1
contents


sa101$ cd eg;ls -l
total 4
d-wx------ 2 fulford fulford 4096 Nov 25 10:50 d
sa101$ touch f1
sa101$ link f1 f2;ls -l
total 4
d-wx------ 2 fulford fulford 4096 Nov 25 10:50 d
-rw-r--r-- 2 fulford fulford    0 Nov 25 21:25 f1
-rw-r--r-- 2 fulford fulford    0 Nov 25 21:25 f2

NB: The files f1 and f2 are identical. They are in fact the same file. We have created 2 directory entries for the same file.
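
That the two names refer to one file can be confirmed by comparing inode numbers. The sketch below works in a scratch directory and uses ln without options, which creates a hard link just as link does.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
touch f1
ln f1 f2    # a second directory entry for the same file

# ls -i prints the inode number before the name.
inode_f1=$(ls -i f1 | awk '{print $1}')
inode_f2=$(ls -i f2 | awk '{print $1}')

# Data written through one name is visible through the other.
echo contents > f2
via_f1=$(cat f1)

echo "$inode_f1 $inode_f2 $via_f1"
```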

sa101$ ln -s f1 f3
sa101$ ls -l
total 4
d-wx------ 2 fulford fulford 4096 Nov 25 10:50 d
-rw-r--r-- 2 fulford fulford    0 Nov 25 21:25 f1
-rw-r--r-- 2 fulford fulford    0 Nov 25 21:25 f2
lrwxrwxrwx 1 fulford fulford    2 Nov 25 21:26 f3 -> f1

Note the difference between the results of link (which, like ln without options, creates a hard link) and ln -s. In the second instance we create a new file with a symbolic reference to the old file. If f3 is deleted the file f1 remains.

13.9. Exercise.

Try deleting f3 and take a new long listing.

Delete f1 or f2 and take a long listing.

Deleting the target file leaves broken symbolic links on the system.
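
A broken symbolic link can be detected from a script with the shell's file tests: -L is true for any symbolic link, while -e follows the link and is false when the target is missing. The following sketch repeats the experiment in a scratch directory.

```shell
dir=$(mktemp -d)
cd "$dir" || exit 1
echo contents > f1
ln f1 f2       # second hard link to the same file
ln -s f1 f3    # symbolic link to the name f1

rm f1

# The data survives under its remaining hard link.
via_hard=$(cat f2)

# f3 is still a symbolic link, but its target no longer exists.
if [ -L f3 ] && [ ! -e f3 ]; then
    link_state=broken
else
    link_state=ok
fi

echo "$via_hard $link_state"
```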

Using the table below revise the material covered so far.

13.10. Tools and metanotation:

Image imgs/sa101-20.png

 

