
Key Features

Unix Architecture page on Wikipedia

Files are stored on disk in a hierarchical file system, with a single top location throughout the system (root, or "/"), with both files and directories, subdirectories, sub-subdirectories, and so on below it.

With few exceptions, devices and some types of communication between processes are managed and visible as files or pseudo-files within the file system hierarchy. This is known as "everything is a file".

Doug McIlroy (inventor of Unix pipes)

This is the Unix philosophy: Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface.

File system

Key differences from Windows

  • there are mount points instead of A:, C:, etc.,
  • directories and files are case sensitive, and
  • the separation character is / instead of \

What would appear as a separate media hierarchy in Windows (e.g., A:\MyDir\MyCode.c) simply appears under a separate directory (known as a mount point) in Unix (e.g., /media/disk/MyDir/MyCode.c).

Root (/)

/boot
boot loader files
/etc
configuration files
/dev
device files
/bin
user programs required for booting
/sbin
system programs required for booting
/lib
libraries required for booting
/usr
programs, libraries, and such not required for booting
/root
superuser directory
/home
user home directories (shared by all clusters)
/tmp
temporary files
/var
variable data (spool files, log files, etc.)
/opt
add-on package directory
/media
mount point for removable media
/proc
process information pseudo-file system
/sys
system information pseudo-file system

User (/usr)

The /usr directory is split off from the / directory mostly because disk space used to be precious.

/usr/bin
user programs not required for booting
/usr/sbin
system programs not required for booting
/usr/lib
libraries not required for booting
/usr/games
game programs
/usr/share
architecture independent data
/usr/share/man
on-line manuals
/usr/src
source code
/usr/include
header files

User Local (/usr/local)

The /usr/local directory is a place to locally install programs without messing up /usr.

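/usr/local/bin
user programs not required for booting
/usr/local/sbin
system programs not required for booting
/usr/local/lib
libraries not required for booting
/usr/local/games
game programs
/usr/local/share
architecture independent data
/usr/local/share/man
on-line manuals
/usr/local/src
source code
/usr/local/include
header files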


user data files (shared by all clusters)
user temporary data files (local to each cluster)


Some of the special /dev files are

/dev/null
discards all data written and provides no data
/dev/zero
provides a constant stream of NUL characters
/dev/random
provides a stream of random characters
/dev/urandom
provides a constant stream of pseudo-random characters
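
The behaviour of two of these device files can be checked directly. The following is a minimal sketch, assuming the conventional names /dev/null and /dev/zero and a head that accepts -c; the variable names are only for illustration.

```shell
# Writing to /dev/null discards the data; reading from it gives end of file at once.
echo "discarded" > /dev/null
null_bytes=$(wc -c < /dev/null)

# Reading from /dev/zero yields as many NUL bytes as are asked for.
zero_bytes=$(head -c 16 /dev/zero | wc -c)

# $(( )) strips any padding that wc may add around the number.
echo "bytes read from /dev/null: $((null_bytes))"
echo "bytes read from /dev/zero: $((zero_bytes))"
```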


Programs are run by specifying the command followed by the arguments separated by spaces.

program [argument...]

By convention, arguments are switches followed by strings (e.g., regexps, paths, file names, etc.). Switches are usually either a single dash followed by one letter per switch or a double dash followed by a descriptive string (e.g., rm -fr mydir or rm --force --recursive mydir). Most commands also understand

-
as a file name means read from standard input or write to standard output (by default the terminal)
--
the end of switches and the start of the strings (in case the string needs to start with - or --).
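
A short sketch of why the end-of-switches marker matters, assuming a file whose name happens to start with a dash (the name -notes.txt is made up for the example):

```shell
# Work in a scratch directory so nothing else is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create a file whose name starts with a dash; -- stops switch parsing.
touch -- -notes.txt

# Without --, rm would try to read -notes.txt as a group of switches.
rm -- -notes.txt
remaining=$(ls -A | wc -l)
echo "files left: $((remaining))"

cd / && rmdir "$workdir"
```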


Traditionally, man pages (a single help page) have been the de facto documentation source. However, some software suites have been switching to info pages (a collection of hyperlinked pages). Help for the shell's built-in commands is available through the built-in help command.

man command
on-line reference manuals
apropos [-a] keyword ...
search on-line reference manuals (same as man -k)
info item
info documents


The current directory is . (one dot) and the parent directory is .. (two dots).

pwd
print current directory
cd directory
change directory
mkdir directory
make directory
rmdir directory
remove directory


Files beginning with . are considered hidden and not normally shown.

ls [-a] [-l] destination
list files
cp [-a|-p] [-r] [-s] source ... destination
copy files
ln [-s] target name
link to file
mv source ... destination
move files
rm [-r] [-f] destination ...
remove files
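
The directory and file commands above can be tried safely in a scratch directory. This is a sketch only; the directory and file names are invented for the example, and ls -A (list hidden files except . and ..) is used instead of -a so that the counts are easy to read.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir project
echo "data" > project/input.txt
touch project/.hidden

cp project/input.txt project/copy.txt      # copy
mv project/copy.txt project/renamed.txt    # move (rename)
ln -s input.txt project/link.txt           # symbolic link, resolved relative to project/

visible=$(ls project | wc -l)      # hidden files are not counted
all=$(ls -A project | wc -l)       # -A also lists hidden files
via_link=$(cat project/link.txt)   # reading through the link gives the target's content

echo "$((visible)) $((all)) $via_link"
cd / && rm -r "$workdir"
```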


Standard permissions are read, write, and execute for user, group, and other. They are frequently abbreviated as three octal digits (0=000, 1=001, 2=010, 3=011, 4=100, 5=101, 6=110, 7=111), one digit each for user, group, and other, where the three bits of each digit stand for read, write, and execute.

For directories, read allows the contents to be listed, write allows files to be added or removed, and execute allows the directory to be traversed.

chmod [u|g|o|a]...[+|-|=][r|w|x|X]... [-R] destination ...
change mode (user/group/other permissions)
chown [-R] user destination ...
change owner
chgrp [-R] group destination ...
change group
setfacl [-m|-x] [-R] [[u|g|o|m]...:user:[r|w|x|X]...] destination ...
set file access control list (individual users)
getfacl destination ...
get file access control list (individual users)
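
The octal and symbolic forms of chmod can be compared on a scratch file. A minimal sketch, assuming the usual ls -l output in which the first ten characters show the file type and permissions (report.txt is a made-up name):

```shell
workdir=$(mktemp -d)
file="$workdir/report.txt"
touch "$file"

# 640 = 110 100 000 = rw- r-- ---
chmod 640 "$file"
mode_octal=$(ls -l "$file" | cut -c1-10)
echo "$mode_octal"

# Symbolic form: add execute for the user, remove read for the group.
chmod u+x,g-r "$file"
mode_symbolic=$(ls -l "$file" | cut -c1-10)
echo "$mode_symbolic"

rm -r "$workdir"
```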

View Files

The space key will advance a page and the q key will quit in more and less. In addition, the arrow keys will move in the appropriate direction in less.

more file
view one page at a time
less file
view forward and backwards
cat [file ...]
concatenate files in sequence
head [-n lines] [file ...]
first part of files
tail [-n lines] [-f] [file ...]
last part of files
paste [-d delimiter] [file ...]
concatenate files in parallel
cut [-d delimiter] [-f range] [file ...]
extract columns
sort [-g] [-f] [-u] [file ...]
sort lines
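
These tools are usually chained together. A small sketch, using an invented colon-separated file people.txt:

```shell
workdir=$(mktemp -d)
printf 'carol:30\nalice:25\nbob:41\nalice:25\n' > "$workdir/people.txt"

# Take the first column, sort it, and drop duplicate lines.
names=$(cut -d : -f 1 "$workdir/people.txt" | sort -u)
echo "$names"

# Show only the first two lines of the file.
first_two=$(head -n 2 "$workdir/people.txt")
echo "$first_two"

rm -r "$workdir"
```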


Digests are numbers computed from the content of files such that it is extremely difficult to come up with two different files with the same number.

diff [-w] [-i] [-u number|-y] file1 file2
compare files line by line
sdiff [-W] file1 file2
compare files side by side (similar to diff -y)
md5sum [file ...]
compute MD5 digest
sha256sum [file ...]
compute SHA256 digest
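
Digests give a quick way to check whether two files have the same content. A sketch, assuming sha256sum is installed (it is part of GNU coreutils) and using three invented files:

```shell
workdir=$(mktemp -d)
printf 'hello\n'  > "$workdir/a.txt"
printf 'hello\n'  > "$workdir/b.txt"
printf 'hello!\n' > "$workdir/c.txt"

# Keep only the digest (first field); the second field is the file name.
sum_a=$(sha256sum "$workdir/a.txt" | cut -d ' ' -f 1)
sum_b=$(sha256sum "$workdir/b.txt" | cut -d ' ' -f 1)
sum_c=$(sha256sum "$workdir/c.txt" | cut -d ' ' -f 1)

if [ "$sum_a" = "$sum_b" ]; then same="yes"; else same="no"; fi
if [ "$sum_a" = "$sum_c" ]; then changed="no"; else changed="yes"; fi

echo "a and b identical: $same"
echo "a and c differ: $changed"
rm -r "$workdir"
```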


egrep [-i] [-v] regexp [file ...]
find lines matching regexp in files (same as grep -E)
fgrep [-i] [-v] strings [file ...]
find lines matching strings in files (same as grep -F)
find directory ... predicates
find files satisfying predicates in directories
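
A short sketch of the search commands, written with grep -E and grep -F (the forms that egrep and fgrep stand for) and an invented log file:

```shell
workdir=$(mktemp -d)
printf 'Error: disk full\nwarning: low memory\nerror: timeout\n' > "$workdir/app.log"
touch "$workdir/notes.txt"

# -i ignores case, so both the "Error" and the "error" line match.
matches=$(grep -E -i '^error' "$workdir/app.log" | wc -l)
echo "error lines: $((matches))"

# -v inverts the match: lines that do not contain the fixed string.
others=$(grep -F -v -i 'error' "$workdir/app.log")
echo "$others"

# find with predicates: regular files whose name ends in .log
logs=$(find "$workdir" -type f -name '*.log' | wc -l)
echo "log files: $((logs))"
rm -r "$workdir"
```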


Each process (a running program) is identified by a unique number.

ps [-A|-U user] [-H] [-f]
process list
kill [-s signal] process ...
signal process
nohup command
keep command running after the user disconnects, on clusters that are set up to allow this. Most clusters nowadays do not require this command.
nice command
low priority command


ssh [user@]host [command]
login to remote system
scp [[user@]host:]source ... [[user@]host:]destination
copy remote files
unix2dos file ...
convert to DOS line breaks
dos2unix file ...
convert to Unix line breaks


sleep seconds
waits given number of seconds
echo [-n] [-e] strings
prints strings
test tests
perform various string (e.g., equality) or file (e.g., existence) tests


The two most popular Unix editors are vi and emacs. Both are extremely powerful and very complex. A simpler editor is nano.

vi [file ...]
common Unix editor
emacs [-nw] [file ...]
common Unix editor
nano [file ...]
simple Unix editor


Vi distinguishes between command and insert mode. Command mode allows you to move around and enter commands. Insert mode allows you to edit text.

:w[!] [file]
write file (exclamation forces it)
:e file
edit file
:q[!]
quit Vi (exclamation forces it)
:n[!]
next file (exclamation forces it)
[a|A]
append after cursor or at end of line
[i|I]
insert (capital for beginning of line)
[cw|cc|C]
change word/line or to end of line
[dw|dd|D]
delete word/line or to end of line
[yw|yy|y$]
copy word/line or to end of line
[P|p]
paste before or after cursor/line
J
join lines
[u|U]
undo (capital for current line)
ESC
revert to command mode


Emacs is a more traditional single mode editor. Partially typed entries can be completed by pressing TAB (twice to list).

CTRL+h [b|k]
help (b lists keys and k describes a key)
CTRL+g
abort current operation
CTRL+x [1|2|3]
single window or split vertical/horizontal window
CTRL+x CTRL+s
save current buffer
CTRL+x b
switch current buffer
CTRL+x k
quit current buffer
CTRL+x CTRL+c
quit Emacs
CTRL+SPACE
mark start of region
ALT+w
copy from start of region to cursor
CTRL+y
paste copied region
CTRL+k
delete to end of line or line if start of line
CTRL+s
search for text
ALT+x
enter command (TAB twice to list)

Command Line

The shell is a command line interpreter that lets users run programs. It provides ways to start programs and to set up and manipulate the context in which they run. The main parts of this context are

  • arguments,
  • environment,
  • standard input (stdin),
  • standard output (stdout),
  • standard error (stderr), and
  • return value

A standard command looks like so

command [<stdinfile] [>[>]stdoutfile] [2>[>]stderrfile] [&]


Arguments are options passed to the program to tweak its behaviour. Traditionally they are switches (e.g., -xzf or --extract --gzip --file) followed by strings (e.g., regexps, paths, file names, etc.). Partially typed file names and directories can be completed by pressing TAB (twice to list).

...{...}... (brace expansion)
if not quoted, expands once for each item in a comma separated list or once for each number in a .. separated range
~... (tilde expansion)
if not quoted, expands to home directory of user following the tilde or the current user if no user specified
${...} (parameter and variable expansion)
if not single quoted, expands to environment variable specified or the corresponding parameter if number specified ({ and } are not always required)
$(...) (command substitution)
if not single quoted, expands to output for command (`...` is an alternative syntax)
$((...)) (arithmetic substitution)
if not single quoted, expands to evaluated result of the expression
... (word splitting)
if not quoted, splits into separate arguments anywhere an IFS character (by default space, tab, and newline) occurs
...[*|?|[...]]... (path name expansion)
if not quoted, is considered a pattern and replaced with matching file names (* matches any string, ? matches any single character, and [...] matches any one of the enclosed characters)
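
Several of these expansions can be seen in a few lines. This is a sketch that assumes bash (brace expansion is not available in every shell); the variable names are only for illustration.

```shell
# Brace expansion: one word per number in the range.
braces=$(echo file{1..3}.txt)
echo "$braces"

# Arithmetic and command substitution.
sum=$((2 + 3 * 4))
word_count=$(echo one two three | wc -w)
echo "$sum $((word_count))"

# Word splitting: the unquoted variable becomes three arguments,
# while the quoted one stays a single argument.
list="a b c"
set -- $list;   unquoted=$#
set -- "$list"; quoted=$#
echo "$unquoted $quoted"
```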


Special characters can be escaped with \ to remove their special meaning. Single and double quoting strings affects escaping as well as which expansions and substitutions are performed.

'...'
no expansions or substitutions are performed
"..."
only escaping, parameter and variable expansion, command substitutions, and arithmetic substitutions occur
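
The difference between the quoting styles is easiest to see side by side. A minimal sketch with an invented variable called name:

```shell
name="world"

single='hello $name'       # single quotes: nothing is expanded
double="hello $name"       # double quotes: $name is expanded
escaped="hello \$name"     # the backslash removes the special meaning of $

echo "$single"
echo "$double"
echo "$escaped"
```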


The environment is a set of key value pairs (e.g., USER=root) that programs can look up and use. Each program gets a fresh copy (i.e., changing it will not change the original) of all environment variables marked for export.

key=value
make a local environment variable
export key[=value]
mark an environment variable for export
unset key
delete an environment variable
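
The effect of export can be checked by asking a child shell what it sees. A sketch, assuming bash is on the PATH; local_note and target are made-up variable names.

```shell
# A plain assignment stays local to the current shell.
local_note="hello"
# export marks the variable so that child processes receive a copy.
export target="world"

# The child shell sees only the exported variable.
child_view=$(bash -c 'echo "${local_note:-unset} ${target:-unset}"')
echo "$child_view"

# Changes made in the child do not flow back to the parent.
bash -c 'target="changed"'
parent_view=$target
echo "$parent_view"

# unset deletes the variable altogether.
unset target
after_unset=${target:-unset}
echo "$after_unset"
```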

Two important environment variables are

PATH
list of : separated directories to look for programs in
LD_LIBRARY_PATH
list of : separated directories to look for libraries in (ahead of the system defaults specified in /etc/

Input and Output

Programs are run with a standard place to read input from, a standard place to write output to, and a standard place to write error messages to. By default these are all the terminal window in which the program is run. This can be changed via

< file
read standard input from file
[>|>>] file
write standard output to file (overwriting or appending)
[2>|2>>] file
write standard error to file (overwriting or appending)
[&>|&>>] file
write standard output and error to file (overwriting or appending)
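
A brief sketch of the redirections, using scratch files with invented names:

```shell
workdir=$(mktemp -d)

# > overwrites, >> appends.
echo "first"  >  "$workdir/out.txt"
echo "second" >> "$workdir/out.txt"

# 2> captures only standard error; here ls complains about a missing file.
ls "$workdir/no-such-file" 2> "$workdir/err.txt"

# < feeds a file to standard input.
line_count=$(wc -l < "$workdir/out.txt")

# -s is true when the file exists and is not empty.
if [ -s "$workdir/err.txt" ]; then err_captured="yes"; else err_captured="no"; fi

echo "lines: $((line_count)), error captured: $err_captured"
rm -r "$workdir"
```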


Programs return an integer exit status. The status of the most recently executed foreground command is available as $?.

TORQUE Resource Manager scheduler exit code, job could not start (list)
program completed successfully
program specific error code
program terminated by signal (exit status is 128+signal)
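
The exit status can be inspected right after a command finishes. A sketch, assuming bash, which reports 128 plus the signal number when a child is terminated by a signal:

```shell
true;  ok=$?        # a command that always succeeds
false; failed=$?    # a command that always fails

# A child shell that sends itself SIGTERM (signal 15).
bash -c 'kill -s TERM $$'
signalled=$?

echo "$ok $failed $signalled"
```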

Job Control

Programs run in the foreground by default. Background jobs will be suspended if they require input. Existing jobs will be sent SIGHUP when the shell exits.

jobs
list jobs
fg id
switch job to foreground
bg id ...
switch jobs to background
disown id ...
release jobs from job control

Foreground jobs usually respond to the following key combinations

CTRL+z
suspend program
CTRL+c
abort program
CTRL+d
end of input

Multiple Commands

Commands can be combined in several ways.

... ; ...
run first command and then second (same as pressing ENTER)
... & ...
run first command in background at the same time as second
... | ...
run first command in background with its output going to the second as input
... && ...
run first command and then second only if first returns success
... || ...
run first command and then second only if first returns failure
{ ...; }
group command in current shell -- has to end with ; or newline
( ... )
group command in sub shell -- does not have to end with ; or newline
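
The combinations above, including the difference between grouping with braces and with parentheses, can be sketched as follows (counter and the other variable names are made up for the example):

```shell
# && runs the second command only on success, || only on failure.
result_and=$(true  && echo "ran")
result_or=$(false || echo "fallback")

# Braces group in the current shell, so the assignment survives.
counter=0
{ counter=1; }
after_braces=$counter

# Parentheses group in a sub shell, so the assignment is lost.
( counter=2 )
after_parens=$counter

# A pipe connects the output of one command to the input of the next.
piped=$(printf 'b\na\n' | sort | head -n 1)

echo "$result_and $result_or $after_braces $after_parens $piped"
```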


Executable text files that start with #!command (#!/bin/bash for shell scripts) are run as command file.


$#
number of parameters
$0
name of shell or shell script
$1, $2, ...
positional parameter
$*
all positional parameters (in double quotes expands as one argument)
$@
all positional parameters (in double quotes expands as separate arguments)

The following functions manipulate parameters

shift [number]
drop specified number of parameters (one if unspecified)
set parameter ...
set parameters to given parameters
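
Positional parameters can be explored without writing a separate script, because set also works in the current shell. A minimal sketch (the -- after set guards against values that start with a dash):

```shell
# set -- replaces the positional parameters of the current shell.
set -- "red apple" banana cherry
count_before=$#
first=$1

# In double quotes, $* joins everything into one argument,
# while $@ keeps each parameter separate.
set -- "$*";  as_star=$#
set -- "red apple" banana cherry
set -- "$@";  as_at=$#

# shift drops parameters from the front.
shift 2
remaining=$1

echo "$count_before | $first | $as_star $as_at | $remaining"
```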


if command ...; then command ...; [elif command ...; then command ...;] ... [else command ...;] fi
conditionally run commands depending on the success of the if and elif commands
for key in value ...; do command ...; done
for each value, set key to value and run commands
while command ...; do command ...; done
repeatedly run commands until while commands fail
case value in [pattern [| pattern]... ) command ...;;] ... esac
run the commands of the first pattern that matches (patterns work the same as in path name expansion)
continue [number]
next iteration of the given number of enclosing loops (innermost loop if not specified)
break [number]
exit the given number of enclosing loops (innermost loop if not specified)
function name { command; } ...
create a command that runs the commands with passed parameters
return [number]
return from function with given exit status (last command if not specified)
exit [number]
quit shell with given exit status (last command if not specified)
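
A small sketch that combines several of these constructs, assuming bash. The function classify and the file names are hypothetical and exist only for this example.

```shell
# classify is a hypothetical helper: it labels a file name by its extension.
function classify {
    case $1 in
        *.c | *.h ) echo "C source" ;;
        *.txt )     echo "text" ;;
        * )         echo "other" ;;
    esac
}

# for: run the body once per value.
labels=""
for name in main.c notes.txt image.png; do
    labels="$labels$(classify "$name");"
done
echo "$labels"

# while: the loop stops once the test command fails.
n=3
countdown=""
while [ "$n" -gt 0 ]; do
    countdown="$countdown$n"
    n=$((n - 1))
done
echo "$countdown"

# if: the branch taken depends on the exit status of the pipeline.
if classify x.h | grep -q "C source"; then verdict="header ok"; else verdict="unexpected"; fi
echo "$verdict"
```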

Regular Expressions

Regular expressions are strings where several of the non-alphanumeric characters have special meaning. They provide a concise and flexible means for matching strings, and are used by several Unix programs.


^
match start of line
$
match end of line


c
the indicated character
.
any character
[...]
any character in the list or range (^ inverts)


...|...
match either the expression on the left or the one on the right


?
match zero or one times
*
match zero or more times
+
match one or more times
{n,m}
match between n and m times
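
These operators can be tried out with grep -E, which uses extended regular expressions. A sketch, assuming bash; matches is a hypothetical helper defined only for this example.

```shell
# matches PATTERN STRING prints yes if the regexp matches the string, no otherwise.
# (matches is a made-up helper name used only in this sketch.)
function matches {
    if printf '%s\n' "$2" | grep -E -q -- "$1"; then echo yes; else echo no; fi
}

anchor=$(matches '^ab' 'abc')            # ^ anchors at the start of the line
anchor_miss=$(matches '^bc' 'abc')       # bc is not at the start
class=$(matches '^[0-9]+$' '2024')       # one or more digits, whole line
class_miss=$(matches '^[0-9]+$' '20x4')  # x is not in the range
either=$(matches '^(cat|dog)$' 'dog')    # alternation
optional=$(matches '^colou?r$' 'color')  # ? makes the u optional
range=$(matches '^a{2,3}$' 'aaaa')       # two or three a's, so four do not match

echo "$anchor $anchor_miss $class $class_miss $either $optional $range"
```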