A Toolkit of commonly used external commands
The following commands are very frequently used in shell scripts. Many of them are used
in the examples in these notes. This is just a brief recap -- see the man pages for details on usage.
The most useful are flagged with *.
Most of these commands will operate on one or more named files, or will operate on a stream of
data from standard input if no files are named.
Listing, copying and moving files and directories
ls
*
- list contents of a directory, or list details of files and directories.
mkdir
; rmdir
*
- Make and Remove directories.
rm
; cp
; mv
*
- Remove (delete), Copy and Move (rename) files and directories
touch
*
- Update the last modified timestamp on a file, to make it appear to have just been written.
tee
- Make a duplicate copy of a data stream - used in pipelines to send one copy to a log file
and a second copy on to another program. (Think plumbing).
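As a rough illustration of the commands in this section, the sketch below works in a scratch directory created with mktemp (a command not covered in this list). All file and directory names are invented for the example.

```shell
#!/bin/sh
# Work in a throwaway directory so nothing real is touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir reports                     # make a directory
touch reports/today.txt           # create an empty file (or update its timestamp)
cp reports/today.txt backup.txt   # copy
mv backup.txt old.txt             # rename
ls reports                        # list directory contents

# tee writes one copy of the stream to a log file and passes another down the pipe.
# tr strips the padding that some versions of wc put around the number.
line_count=$(printf 'one\ntwo\nthree\n' | tee pipeline.log | wc -l | tr -d ' ')
echo "lines seen: $line_count"

rm old.txt                        # remove a file
```

After this runs, pipeline.log holds the same three lines that wc counted.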
Displaying text, files or parts of files
echo
*
- Echo the arguments to standard output -- used for messages from scripts.
Some versions of "sh", and all csh/ksh/bash shells, implement "echo" as a built-in command.
cat
*
- Copy and concatenate files; display contents of a file
head
, tail
*
- Display the beginning of a file, or the end of it.
cut
- Extract selected fields from each line of a file. Often awk is easier to use, even though it is
a more complex program.
wc
- Count lines, words and characters in the input.
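A minimal sketch of the display commands above, run against a small colon-delimited sample file. The file contents and variable names are invented for the example.

```shell
#!/bin/sh
# Build a small colon-delimited sample file (contents are invented).
sample=$(mktemp)
cat > "$sample" <<'EOF'
alice:1001:/home/alice
bob:1002:/home/bob
carol:1003:/home/carol
EOF

first_line=$(head -n 1 "$sample")          # first line of the file
last_line=$(tail -n 1 "$sample")           # last line of the file
names=$(cut -d: -f1 "$sample")             # first field of every line
total=$(wc -l < "$sample" | tr -d ' ')     # line count, padding stripped

echo "first: $first_line"
echo "last:  $last_line"
echo "names:" $names                       # unquoted on purpose: joins the names on one line
echo "lines: $total"
rm -f "$sample"
```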
Compression and archiving
compress
; gzip
, zip
; tar
*
- Various utilities to compress/uncompress individual files, combine multiple files into a single archive, or
do both.
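One common way to "do both" is to pipe tar into gzip. The sketch below assumes both programs are installed; the directory and file names are invented.

```shell
#!/bin/sh
work=$(mktemp -d)                 # scratch directory for the demonstration
mkdir "$work/project"
echo 'some data' > "$work/project/notes.txt"

# tar writes the archive to standard output (-), and gzip compresses the stream.
(cd "$work" && tar cf - project | gzip > project.tar.gz)

# Reverse the pipeline to list the archive contents without unpacking it.
listing=$(gzip -dc "$work/project.tar.gz" | tar tf -)
echo "$listing"
rm -rf "$work"
```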
Sorting and searching for patterns
sort
*
- Sort data alphabetically or numerically.
grep
*
- Search a file for lines containing character patterns. The patterns can be simple fixed text, or very complex
regular expressions.
uniq
*
- Remove adjacent duplicate lines (so the input is usually sorted first), and optionally generate a count of repeated lines.
wc
*
- Count lines, words and characters in a file.
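These commands are most often combined in a pipeline. The sketch below, using invented sample data, finds the most frequent line and counts the lines matching a simple pattern.

```shell
#!/bin/sh
# Invented sample data: one word per line, with repeats.
words=$(printf 'pear\napple\npear\nfig\napple\npear\n')

# uniq only collapses adjacent duplicates, so sort first; -c prefixes a count.
# The second sort orders by that count, largest first.
top=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | head -n 1)
echo "most frequent: $top"

# grep -c counts matching lines; ^p anchors the pattern to the start of the line.
p_lines=$(printf '%s\n' "$words" | grep -c '^p')
echo "lines starting with p: $p_lines"
```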
System information (users, processes, time)
date
*
- Display the current date and time (flexible format). Useful for conditional execution based on
time, and for timestamping output.
ps
*
- List the running processes.
kill
*
- Send a signal (interrupt) to a running process.
id
- Print the user name and UID and group of the current user (e.g. to distinguish privileged users before
attempting to run programs which may fail with permission errors)
who
- Display who is logged on the system, and from where they logged in.
uname
*
- Display information about the system, OS version, hardware architecture etc.
mail
*
- Send mail, from a file or standard input, to named recipients. Since scripts are often used to automate
long-running background jobs, sending notification of completion by mail is a common trick.
logger
- Place a message in the central system logging facility. Scripts can submit messages
with all the facilities available to compiled programs.
hostname
- Display the hostname of the current host - useful to keep track of where your programs are running.
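A sketch of how a few of these commands might appear together in a script. The date format string is just one possible choice, and the background sleep exists only to give kill something harmless to signal.

```shell
#!/bin/sh
# Timestamp in a sortable format, plus OS name and node name from uname.
stamp=$(date '+%Y-%m-%d %H:%M:%S')
echo "[$stamp] job started on $(uname -sn)"

# id -u prints the numeric UID; 0 conventionally means root.
if [ "$(id -u)" -eq 0 ]; then
    echo "running as root"
else
    echo "running as unprivileged UID $(id -u)"
fi

# Send a termination signal to a background process we started ourselves.
sleep 30 &
child=$!
kill -TERM "$child"
wait "$child" 2>/dev/null || true   # collect the child; its non-zero status is expected
echo "background job $child terminated"
```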
Conditional tests
test
; [
*
- The conditional test, used extensively in scripts, is also an external program which evaluates
the expression given as an argument and returns a true (0) or false (1) exit status. The name "[" is a
link to the "test" program, so a line like:
if [ -w logfile ]
actually runs a program "[", with arguments "-w logfile ]", and returns a true/false value to the "if"
command.
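The sketch below shows that "[" and "test" are interchangeable, and that the result is an ordinary exit status. The log file here is a temporary file made up for the demonstration.

```shell
#!/bin/sh
logfile=$(mktemp)     # hypothetical log file for the demonstration

# Both forms run the same test; "[" just requires a closing "]" argument.
if [ -w "$logfile" ]; then
    echo "can write to $logfile"
fi
if test -w "$logfile"; then
    echo "test agrees"
fi

# The result is an ordinary exit status, so it can be captured like any other.
status=0
[ -d "$logfile" ] || status=$?
echo "exit status of [ -d ] on a regular file: $status"
rm -f "$logfile"
```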
Stream Editing
awk
*
- A pattern matching and data manipulation utility, which has its own scripting language. It also duplicates
much functionality from 'sed','grep','cut','wc', etc.
sed
*
- Stream Editor. A flexible editor which operates by applying editing rules to every line in a data stream
in turn.
tr
- Transliterate - perform very simple single-character edits on a data stream (tr reads only from standard input, not from named files).
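A minimal sketch of each stream editor applied to the same invented input, to show the kind of job each one suits.

```shell
#!/bin/sh
# Invented input: name and score separated by whitespace.
data=$(printf 'ann 10\nbob 25\ncy 7\n')

# awk: sum the second column and print the total at end of input.
total=$(printf '%s\n' "$data" | awk '{ sum += $2 } END { print sum }')

# sed: apply one substitution rule to every line.
renamed=$(printf '%s\n' "$data" | sed 's/^bob /robert /')

# tr: map lowercase letters to uppercase, one character at a time.
upper=$(printf '%s\n' "$data" | tr '[:lower:]' '[:upper:]' | head -n 1)

echo "total: $total"
echo "$renamed"
echo "upper: $upper"
```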
Finding and comparing files
find
*
- Search the filesystem and find files matching certain criteria (name pattern, age, owner, size,
last modified etc.)
xargs
*
- Read arguments (typically filenames) from standard input, append them to a named command, and run it.
diff
*
- Compare two files and list the differences between them.
basename
pathname
- Returns the base filename portion of the named pathname, stripping off all the directories
dirname
pathname
- Returns the directory portion of the named pathname, stripping off the filename
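The sketch below combines the commands in this section on a small invented directory tree. It assumes the file names contain no spaces or newlines; names that do need extra care when passed through xargs.

```shell
#!/bin/sh
tree=$(mktemp -d)                 # scratch directory; all names below are invented
mkdir -p "$tree/src/lib"
echo 'hello' > "$tree/src/a.txt"
echo 'hello' > "$tree/src/lib/b.txt"
echo 'world' > "$tree/src/lib/c.log"

# find prints matching paths; xargs appends them to the command line of cat.
txt_count=$(find "$tree" -type f -name '*.txt' | xargs cat | wc -l | tr -d ' ')
echo "lines in .txt files: $txt_count"

# diff exits 0 when the files are identical and 1 when they differ.
if diff "$tree/src/a.txt" "$tree/src/lib/b.txt" >/dev/null; then
    echo "a.txt and b.txt are identical"
fi

path="$tree/src/lib/c.log"
base=$(basename "$path")          # filename only
dir=$(dirname "$path")            # everything before the filename
echo "$base lives in $dir"
rm -rf "$tree"
```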
Arithmetic and String Manipulation
expr
*
- The "expr" command takes a numeric or text pattern expression as an argument, evaluates it, and
returns a result to stdout. The original Bourne shell had no built-in arithmetic operators.
E.g.
expr 2 + 1
expr 2 '*' '(' 21 + 3 ')'
Used with text strings, "expr" can match regular expressions and extract sub expressions. Similar functionality
can be achieved with sed.
e.g.
expr SP99302L.Z00 : '[A-Z0-9]\{4\}\([0-9]\{3\}\)L\.*'
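In a script the result of expr is normally captured with command substitution. The sketch below reuses the expressions above; the variable names and the last pattern are added for illustration.

```shell
#!/bin/sh
sum=$(expr 2 + 1)
# * and ( ) are quoted to hide them from the shell.
product=$(expr 2 '*' '(' 21 + 3 ')')
# The \( \) group selects the part of the match that expr prints.
digits=$(expr SP99302L.Z00 : '[A-Z0-9]\{4\}\([0-9]\{3\}\)L\.*')
# Without a group, expr prints the number of characters matched.
matched=$(expr abcdef : 'abc')
echo "$sum $product $digits $matched"
```

This prints "3 48 302 3". The pattern is always anchored at the start of the string.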
dc
- Desk Calculator - an RPN calculator, using arbitrary precision arithmetic and
user-specified bases. Useful for more complex arithmetic expressions than can be performed
internally or using expr.
bc
- A preprocessor for dc which provides infix notation and a C-like syntax for
expressions and functions.
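A small sketch contrasting the two notations. Neither program is installed on every system, so the example checks for each before using it.

```shell
#!/bin/sh
# bc and dc are not installed everywhere, so check before calling them.
if command -v bc >/dev/null 2>&1; then
    # Infix notation; scale sets the number of decimal places kept by division.
    quotient=$(echo 'scale=3; 22 / 7' | bc)
    echo "bc: 22/7 = $quotient"
fi
if command -v dc >/dev/null 2>&1; then
    # RPN: push 2 and 10, raise 2 to the 10th power, print the top of the stack.
    power=$(echo '2 10 ^ p' | dc)
    echo "dc: 2^10 = $power"
fi
```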
Merging files
paste
- Merge lines from multiple files into tab-delimited columns.
join
- Perform a join (in the relational database sense) of lines in two sorted input files.
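The difference between the two is easiest to see on the same pair of files. The sketch below uses invented data, already sorted on the first field as join requires.

```shell
#!/bin/sh
work=$(mktemp -d)                 # scratch directory; file names and data are invented
printf '1 ann\n2 bob\n3 cy\n' > "$work/names"
printf '1 red\n3 blue\n'      > "$work/colours"

# join matches lines whose first fields are equal; unmatched lines (2 bob) are dropped.
joined=$(join "$work/names" "$work/colours")
echo "$joined"

# paste glues corresponding lines together by position, separated by a tab.
pasted=$(paste "$work/names" "$work/colours" | head -n 1)
echo "$pasted"
rm -rf "$work"
```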