Assumptions:
It is assumed that you already know the basics of using a Unix system interactively. Many commands are shown with links to their full man pages (e.g. sh). Some descriptions in these notes have more detail available, and are denoted like this:

    More details of this item would appear here. The printed notes include all of the additional information.

These notes are updated from time to time. The "development" set of notes is at http://northstar-www.dartmouth.edu/~richard/classes/ksh (Dartmouth only).

Richard Brittain, Dartmouth College Computing Services.
© 2003,2004,2010 Dartmouth College.
Comments and questions: contact Richard.Brittain@dartmouth.edu
Table of Contents
(1) What is a shell script?
There is no difference in syntax between interactive command line use and placing the commands in a file. Some commands are only useful when used interactively (e.g. command line history recall) and other commands are too complex to use interactively.
The first line of a script identifies the interpreter to use: #!/path/to/shell (e.g. #!/bin/ksh). The #! characters tell the system to locate the following pathname, start it up and feed it the rest of the file as input. Any program which can read commands from a file can be started up this way, as long as it recognizes the # comment convention. The program is started, and then the script file is given to it as an argument. Because of this, the script must be readable as well as executable. Examples are perl, awk, tcl and python.
A script can also be run by naming it as an argument to the shell, e.g. ksh myscript. If the script file is made executable with chmod, it becomes a new command and available for use (subject to the usual $PATH search):

chmod +x myscript
#!/bin/sh
date
pwd
du -k
(2) Why use shell scripts?
- To automate frequently repeated operations (e.g. apply the same analysis to every data file on a CD, without needing to repeat the commands).
- To wrap existing programs: e.g. set environment variables, switch to a special directory, create or select a configuration file, redirect output, log usage, and then run the program.
- To build portable installers. Other tools may create fancier installers (e.g. tcl/tk), but can not be assumed to be installed already. Shell scripts are used because they are very portable. Some software comes with a complete installation of the tool it wants to use (tcl/tk/python) in order to be self contained, but this leads to software bloat.
- To run jobs unattended at scheduled times (with cron or at).
AUTOMATE, AUTOMATE, AUTOMATE
(3) Shell history and flavours
sh - The original Bourne shell, present on every Unix system and the traditional language of shell scripting.

csh - The C shell, from Berkeley Unix, with a C-like syntax and better interactive features than the original sh.

tcsh - An enhanced C shell, with command-line editing and completion. We use it as the default interactive shell for new accounts on all of our public systems. Not many people write scripts in [t]csh. See Csh Programming Considered Harmful by Tom Christiansen for a discussion of problems with programming csh scripts.

ksh - The Korn shell, written by David Korn at AT&T: a Bourne-compatible shell with many interactive and programming enhancements. It was slow to gain acceptance because earlier versions were encumbered by AT&T licensing. This shell is now freely available on all systems, but sometimes not installed by default on "free" Unix. There are two major versions. ksh88 was the version incorporated into AT&T SVR4 Unix, and may still be installed by some of the commercial Unix vendors. ksh93 added more features, primarily for programming, and better POSIX compliance. On most systems, /bin/sh is now a POSIX compliant shell. Korn shell and Bash are POSIX compliant, but have many features which go beyond the standard. On Solaris, the POSIX/XPG4 commands which differ slightly in behaviour from traditional SunOS commands are located in /usr/xpg4/bin.

bash - The GNU "Bourne-again shell", largely compatible with sh and ksh, and the default shell on Linux systems.

zsh - A feature-rich shell combining many of the extensions of ksh, bash and tcsh.
(4) Basic shell features
Features common to all the shells:

- Commands are located via a $PATH search.
- A script can be run with shellname scriptfile, or a single command can be run using shellname -c "command".
- Filename wildcards: [ ] * ?. Each shell has some additional wildcard metacharacters, but these are common to all shells.
- I/O redirection and pipelines: <, >, >>, |.
- A current directory, changed with cd.
- Background execution of commands with &.
- Quoting: "double quotes" protect most metacharacters, but allow $var interpretation; 'single quotes' protect all metacharacters from interpretation.
- Home directory expansion with ~user (except for sh).
- # comments.
- `command` (backticks) command substitution.
- $varname syntax for variable values.
- Conditional execution with && and ||.
- \ to escape (remove the special meaning of) the following character.

Differences to watch for in csh:

- setenv vs export.
- Startup files (.cshrc and .login, vs .profile) and default options.
- Reading commands from a file (source filename, vs . filename).
(5) Other scripting languages

Many other programs read commands from a file, and the "#!/path/to/program" convention allows any of them to be used as a scripting language to create new commands. Some are highly specialized, and some are much more efficient than the equivalent shell scripts at certain tasks. There is never only one way to perform a function, and often the choice comes down to factors like familiarity, efficiency, and what is already installed on the target systems. Some common alternatives:
awk - A pattern matching and data (text and numeric) manipulation tool. Predates perl. Installed on all Unix systems. Often used in combination with shell scripts.
perl - The most used scripting language for Web CGI applications and system administration tasks. Perl is harder to learn, and is usually installed by default now. It is more efficient and has an enormous library of functions available. You could use Perl for almost all scripting tasks, but the syntax is very different to the shell command line.
python - An object-oriented scripting language. Commonly installed by default on modern systems.
tcl/tk - Tool Command Language. Another general purpose scripting language. The "tk" component is a scripted interface to standard X-windows graphical components, so the combination is often used to create graphical user interfaces.

Ksh93 can be extended by linking to shared libraries providing additional internal commands. One example of an extended shell is tksh, which incorporates Tcl/Tk with ksh and allows generation of scripts using both languages. It can be used for prototyping GUI applications.
(6) Built-in vs. external commands
The philosophy of separate Unix tools each performing a single operation was followed closely by the designers of the original shell, so it had very few internal commands and used external tools for very trivial operations (like echo and [). Ksh and bash internally perform many of the basic string and numeric manipulations and conditional tests. Occasional problems arise because the internal versions of some commands like echo are not fully compatible with the external utility they replaced.

Every time a shell needs to run an external program, it must locate the program (via $PATH), fork() (which creates a second copy of the shell), adjust the standard input/output for the external program, and exec() (which replaces the second shell with the external program). This process is computationally expensive (relatively), so when the script does something trivial many times over in a loop, it saves a lot of time if the function is handled internally.
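The cost difference is easy to demonstrate by timing a trivial loop implemented both ways. This is a minimal sketch (assuming ksh or bash; the loop bodies are illustrative, and the exact timings vary by system):

#!/bin/ksh
# Count to 1000 twice: once with an external command, once internally.
slow() {
    i=0
    while [ $i -lt 1000 ]; do
        i=`expr $i + 1`        # fork/exec of the external expr each time
    done
}
fast() {
    i=0
    while [ $i -lt 1000 ]; do
        i=$((i + 1))           # handled entirely within the shell
    done
}
time slow
time fast

The second loop is typically faster by orders of magnitude.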
If you follow textbooks on Bourne shell programming, all of the advice should apply no matter which of the Bourne-derived shells you use. Unfortunately, many vendors have added features over the years and achieving complete portability can be a challenge. Explicitly writing for ksh (or bash) and insisting on that shell being installed, can often be simpler.
The sh and ksh man pages use the term special command for the internal commands - handled by the shell itself.
(7) Basic shell script syntax
A script consists of commands, exactly as they could be typed interactively, preceded by the #! magic header. All the parsing rules, filename wildcards, $PATH searches etc., which were summarized above, apply. In addition:

- # as the first non-whitespace character on a line marks the rest of the line as a comment.
- \ as the last character on a line continues the command onto the next line. This is actually just a particular instance of \ being used to escape, or remove the special meaning from, the following character.
- ; as a separator between words on a line allows multiple commands to be written on one line.
 1: #!/bin/ksh
 2: # For the purposes of display, parts of the script have
 3: # been rendered in glorious technicolor.
 4: ## Some comments are bold to flag special sections
 5:
 6: # Line numbers on the left are not part of the script.
 7: # They are just added to the HTML for reference.
 8:
 9: # Built-in commands and keywords (e.g. print) are in blue
10: # Command substitutions are purple. Variables are black
11: print "Disk usage summary for $USER on `date`"
12:
13: # Everything else is red - mostly that is external
14: # commands, and the arguments to all of the commands.
15: print These are my files     # end of line comment for print
16: # List the files in columns
17: ls -C
18: # Summarize the disk usage
19: print
20: print Disk space usage
21: du -k
22: exit 0
Every script terminates with an exit status, set explicitly with exit N, or defaulting to the exit status of the last command run.
The exit status is an integer 0-255. Conventionally 0=success and any other value indicates a problem. Think of it as only one way for everything to work, but many possible ways to fail. If the command was terminated by a signal, the value is 128 plus the signal value.
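For example, a script can save and branch on the status of a previous command (a minimal sketch):

grep '^root:' /etc/passwd > /dev/null
status=$?
if [ $status -eq 0 ]; then
    echo "root has a password file entry"
else
    echo "grep failed with status $status"
fi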
(8) Filename wildcards
Wildcards may be used in the directory parts of a pathname as well as the filename part. If no files match the wildcard, it is left unchanged. Wildcards are not full regular expressions. Sed, grep, awk etc. work with more flexible (and more complex) string matching operators.
* - matches any string of characters, including the null string
? - matches any single character
[...] - matches any one of the enclosed characters
[ - ] - a pair of characters separated by a hyphen matches any character lexically between them
[!...] - matches any character not enclosed
For example, chapter[1-5].* could match chapter1.tex, chapter4.tex, chapter5.tex.old. It would not match chapter10.tex or chapter1.
(9) Variables
Variables are assigned with name=value (no spaces around the =), e.g. srcfile=dataset1. All currently defined variables can be displayed with set. A variable is removed with unset srcfile; assigning an empty value (srcfile=) leaves the variable defined, with a null value. export srcfile places the variable in the environment, so it is passed to subsequently executed commands; export alone lists the currently exported variables. The value is referenced as $srcfile, or ${srcfile} where the braces are needed to delimit the variable name from adjacent text.

Example:
datafile=census2000
# Tries to find $datafile_part1, which doesn't exist
echo $datafile_part1.sas
# This is what we intended
echo ${datafile}_part1.sas
${datafile-default} - Use $datafile, if it has been defined, otherwise use the string "default". This is an easy way to allow for optional variables, and have sensible defaults if they haven't been set. If datafile was undefined, it remains so.
${datafile=default} - If datafile has not been defined, set it to the string "default".
${datafile+default} - If datafile has been defined, use the string "default", otherwise use null. In this case the actual value $datafile is not used.
${datafile?"error message"} - Use $datafile, if it has been defined, otherwise display datafile: error message. This is used for diagnostics when a variable should have been set and there is no sensible default value to use.
Placing a colon (:) before the operator character in these constructs has the effect of counting a null value the same as an undefined variable. Variables may be given a null value by setting them to an empty string, e.g. datafile=.

Example: echo ${datafile:-mydata.dat} echoes the value of variable datafile if it has been set and is non-null, otherwise it echoes "mydata.dat".
A variable can also be set for the duration of a single command by prefixing the command with the assignment: var=value command args. The variable is placed in the environment of that command only, and the shell's own value (if any) is unchanged.
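A minimal sketch exercising these constructs (the variable names are illustrative):

unset datafile                 # make sure it starts undefined
echo ${datafile-default}       # prints "default"; datafile remains unset
echo ${datafile=default}       # prints "default" and sets datafile
echo $datafile                 # prints "default"
datafile=                      # set to null: defined, but empty
echo ${datafile-fallback}      # prints an empty line; datafile is defined
echo ${datafile:-mydata.dat}   # ':' counts null as unset: prints "mydata.dat"
TZ=GMT0 date                   # sets TZ only for this one command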
(10) Standard shell variables
Several variables are set by the login process and inherited by the shell (e.g. $USER), while others are used only by the shell. All can be listed with set or env.

$USER, $LOGNAME - your login name
$PATH - the list of directories searched for commands
$TERM - the terminal type
$PAGER - the program used to display output one page at a time (e.g. by man). This isn't actually used by the shell itself, but shell scripts should honour it if they need to page output to the user.
$EDITOR - the preferred text editor, used by programs which start an editor on your behalf
$PWD - the current directory
$OLDPWD - the previous directory (before the most recent cd command). However, changing directories in a script is often dangerous.
$? (readonly) - the exit status of the last command
$- - the shell options currently set
$IFS - the internal field separator characters used to split words
$$ (readonly) - the process ID of the current shell
$PPID (readonly) - the process ID of the parent process
$! (readonly) - the process ID of the last background job
$SECONDS (readonly) - the number of seconds since the shell was started
$RANDOM - returns a random integer in the range 0-32k. RANDOM may be set to "seed" the random number generator.
$LINENO (readonly) - the current line number in the script
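A tiny script that prints a few of these (the values are, of course, system- and session-dependent):

#!/bin/ksh
echo "User:         $USER"
echo "Search path:  $PATH"
echo "Shell PID:    $$  parent: $PPID"
echo "Elapsed time: $SECONDS seconds (at line $LINENO)"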
(11) Command-line arguments
These are often filenames, but can be interpreted by the script in any way. Options are often specified using the "-flag" convention used by most Unix programs, and a ksh command getopts is available to help parse them.
The shell expands wildcards and makes variable and command substitutions as normal, then parses the resulting words by whitespace (actually the special variable $IFS), and places the resulting text strings into the positional variables as follows:
$0, $1, $2, ... $9 - The first nine arguments are available directly; further arguments are reached using shift, or $*, $@. The variable $0 contains the name of the script itself.

${10}, ${11}, ... - (ksh and bash) Any argument can be referenced directly with the braced syntax.

shift - Discards $1 and renumbers the remaining variables; "shift N" will shift N arguments at once.

$# - The number of arguments.

$* - All the arguments as a single string.

$@ - All the arguments as individually quoted strings ("$1" "$2" ...), so arguments containing whitespace stay intact.
Example: run the following script with the arguments a1 a2 "a3 which contains spaces" a4:

#!/bin/sh
#
# Check positional argument handling
echo "Number of arguments: $#"
echo "\$0 = $0"

echo "Loop over \$*"
for a in $*; do
    echo \"$a\"
done

echo "Loop over \"\$@\""
for a in "$@"; do
    echo \"$a\"
done
The set command, followed by a set of arguments, creates a new set of positional arguments. This is often used, assuming the original arguments are no longer needed, to parse a set of words (possibly using different field separators). Arguments may be reset any number of times.
#!/bin/sh
# Find an entry in the password file
pwent=`grep '^root:' /etc/passwd`
# Turn off globbing - passwd lines often contain '*'
set -o noglob
# The "full name" and other comments are in
# field 5, colon delimited. Get this field using shell word splitting
OIFS=$IFS; IFS=: ; set $pwent; IFS=$OIFS
echo $5
Example: pickrandom. Selects a random file from a directory, using the ksh RANDOM feature.
#!/bin/ksh

# Select a random image from the background logo collection
# This could be used to configure a screen saver, for example.
#
# This works even if the filenames contain spaces.

# switch to the logos directory to avoid long paths
logos=/afs/northstar/common/usr/lib/X11/logos/backgrounds
cd $logos

# '*' is a filename wildcard to match all files in the current directory
set *

# Use the syntax for arithmetic expressions. "%" is the modulo operator
# Shift arguments by a random number between 0 and the number of files
shift $(($RANDOM % $#))

# Output the resulting first argument
echo "$logos/$1"
(12) Shell options
Shell options may be set on invocation (ksh -options scriptname) or within a script using set (e.g. set -x, which turns on execution tracing). The options currently in effect are listed in $-. Many options have both a single-letter flag and a long name (-o noglob instead of -f, both of which disable filename wildcard expansion). Many options are unique to ksh or bash.
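Execution tracing is the option most often used when debugging; a minimal sketch:

#!/bin/sh
set -x            # from here on, echo each command (after expansion) to stderr
ls *.dat
set +x            # turn tracing off again
echo "tracing is off"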
(13) Command substitution
`command` (backticks) - The output of the command replaces the construct on the command line. If the result is then used as arguments to another command (e.g. echo), newlines and multiple spaces will be removed.

$(command) - (ksh and bash) An alternative syntax, which is easier to read and may be nested.

$(<file) - (ksh) Equivalent to `cat file`, but implemented internally for efficiency.
#!/bin/ksh

echo Today is `date`

file=/etc/hosts
echo The file $file has $(wc -l < $file) lines

hostname -s > myhostname
echo This system has host name $(<myhostname)
(14) I/O redirection and pipelines
> filename - Redirect standard output to filename. This fails if the file exists and the noclobber option is set. The file is created if it does not exist.

The special device file /dev/null can be used to explicitly discard unwanted output. Reading from /dev/null results in an End of File status.

>> filename - Append standard output to filename.

>| filename - Redirect to filename, overwriting it even if noclobber is set.

< filename - Take standard input from filename.

command | command [ | command ...] - A pipeline: the standard output of each command becomes the standard input of the next. No more than one command in a pipeline should be interactive (attempt to read from the terminal). This construct is much more efficient than using temporary files, and most standard Unix utilities are designed such that they work well in pipelines. The exit status of a pipeline is the exit status of the last command. In compound commands, a pipeline can be used anywhere a simple command could be used.
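For example, a short pipeline of standard utilities (a minimal sketch) counts how many accounts use each login shell:

cut -d: -f7 /etc/passwd | sort | uniq -c | sort -rn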
(15) Output and input: echo, print and read
echo args - Write the arguments, separated by spaces, to standard output. Beware that the shell may process backslashes before echo sees them (you may need to double backslashes). Internal in most shells, but was originally external. Common escape sequences:

\0n   the 8-bit character whose ASCII code is the 1-, 2- or 3-digit octal number n
\b    backspace
\c    print line without new-line (some versions)
\f    form-feed
\n    new-line
\r    carriage return
\t    tab
\v    vertical tab
\\    backslash
print args - (ksh internal) Ksh's replacement for echo, with consistent behaviour across systems; options include -n (no trailing newline) and -r (raw mode: don't interpret \ escape sequences).
read var1 var2 rest - Read a line from standard input, split it into words on $IFS, and assign the words to the named variables. Any leftover words all go into the last variable.
#!/bin/sh
echo "Testing interactive user input: enter some keystrokes and press return"
read x more
echo "First word was \"$x\""
echo "Rest of the line (if any) was \"$more\""
(16) Conditional tests
Conditions are evaluated by the external test command, or its alias [, or the ksh/bash built-in [[ ... ]] command, which has slightly different options, or by any command which returns a suitable exit status. Zero is taken to be "True", while any non-zero value is "False". Note that this is backwards from the C language convention.
File tests:

-e file - file exists
-f file - file exists and is a regular file
-d file - file exists and is a directory
-r file - file is readable; -w = writable, -x = executable, -L = is a symlink
-s file - file exists and has a size greater than zero
-t filedescriptor - the given file descriptor is connected to a terminal

String tests:

-n "string" - the string has non-zero length
-z "string" - the string has zero length
With [, the argument must be quoted, because if it is a variable that has a null value, the resulting expansion ( [ -z ] ) is a syntax error. An expansion resulting in "" counts as a null string. For [ only, a quoted string alone is equivalent to the -n test, e.g. [ "$var" ]. In older shells for which [ is an external program, the only way to test for a null string is:

if [ "X$var" = "X" ]

This is rarely needed now, but is still often found.
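A minimal sketch of the quoting pitfall:

var=""
if [ -z "$var" ]; then      # quoted: expands to [ -z "" ], a valid test
    echo "var is null"
fi
# Unquoted, [ -z $var ] would expand to [ -z ], which is a syntax
# error in some shells.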
String comparisons:

$variable = text - True if $variable exactly matches the text (use != for "not equal").
$variable < text - True if $variable sorts lexically before the text (> = comes after).
(17) More conditional tests
Numeric comparisons:

$variable -eq number - True if the variable, evaluated as an integer, equals the number.
$variable -ne number - True if it does not. Similarly, -lt = less than, -le = less than or equal, -gt = greater than, -ge = greater than or equal.
$variable = pattern - ([[ ... ]] only) True if the variable matches the wildcard pattern. The pattern must not be quoted. Since [[...]] is internal to the shell, the pattern in this case is treated differently and not filename-expanded as an external command would require.
File comparisons:

file1 -nt file2 - True if file1 is newer than file2 (-ot = older than).
file1 -ef file2 - True if both names refer to the same file.
Any test may be negated with the ! operator, and combined with boolean AND and OR operators using the syntax:

conditional -a conditional, conditional -o conditional - for test and [
conditional && conditional, conditional || conditional - for [[ ... ]]
Examples:
if [[ -x /usr/local/bin/lserve && \
      -w /var/logs/lserve.log ]]; then
    /usr/local/bin/lserve >> /var/logs/lserve.log &
fi

pwent=`grep '^richard:' /etc/passwd`
if [ -z "$pwent" ]; then
    echo richard not found
fi
(18) Flow control
The flow-control constructs operate on lists. A list can be a single command, or a set of simple commands or pipelines separated by ";", "&", "&&", "||" or "|&". For the compound commands which branch on the success or failure of some list, the test is usually [ or [[, but can be anything.
list && list - Execute the second list only if the first succeeds (exit status zero).
list || list - Execute the second list only if the first fails.

Example:

mkdir tempdir && cp workfile tempdir
sshd || echo "sshd failed to start"

You can use both forms together (with care) - they are processed left to right, and && must come first.

Example:

mkdir tempdir && cp workfile tempdir || \
    echo "Failed to create tempdir"
if list; then list; elif list; then list; else list; fi - Branch on the exit status of the first list.

Example:

if [ -r $myfile ]
then
    cat $myfile
else
    echo $myfile not readable
fi
while list; do list; done - Loop for as long as the test list succeeds.
until list; do list; done - The until form just negates the test.
#!/bin/ksh
count=0
max=10
while [[ $count -lt $max ]]
do
    echo $count
    count=$((count + 1))
done
echo "Value of count after loop is: $count"
for identifier [ in words ]; do list; done - Set identifier to each word in turn, and execute the list. If "in words" is omitted, the positional arguments are used.

Example:

for file in *.dat
do
    echo Processing $file
done
The exit status variable $? can be tested directly, instead of some of the above constructs. Compound commands can be thought of as running in an implicit subshell. They can have I/O redirection independent of the rest of the script. Setting of variables in a real subshell does not leave them set in the parent script. Setting variables in implicit subshells varies in behaviour among shells. Older sh could not set variables in an implicit subshell and then use them later, but current ksh can do this (mostly).
Example: ex11. Reading a file line by line. The book by Randal Michael contains 12 example ways to read a file line by line, which vary tremendously in efficiency. This example shows the simplest and fastest way.
#!/bin/sh

# Demonstrate reading a file line-by-line, using I/O
# redirection in a compound command
# Also test variable setting inside an implicit subshell.
# Test this under sh and ksh and compare the output.

line="TEST"
save=

if [ -z "$1" ]; then
    echo "Usage: $0 filename"
else
    if [ -r $1 ]; then
        while read line; do
            echo "$line"
            save=$line
        done < $1
    fi
fi
echo "End value of \$line is $line"
echo "End value of \$save is $save"
(19) More flow control: case, break and continue
case word in pattern) list;; esac - Compare word against each pattern in turn, and execute the list for the first pattern that matches. Patterns use the filename-wildcard syntax.

(ksh and bash only) A pattern-list is a list of one or more patterns separated from each other with a |. Composite patterns can be formed with one or more of the following:

?(pattern-list) - Optionally matches any one of the given patterns.
*(pattern-list) - Matches zero or more occurrences of the given patterns.
+(pattern-list) - Matches one or more occurrences of the given patterns.
@(pattern-list) - Matches exactly one of the given patterns.
!(pattern-list) - Matches anything, except one of the given patterns.
Example:

case $filename in
    *.dat)
        echo Processing a .dat file
        ;;
    *.sas)
        echo Processing a .sas file
        ;;
    *)
        # catch anything else that doesn't match patterns
        echo "Don't know how to deal with $filename"
        ;;
esac
break [n] - Exit from the innermost (or n-th enclosing) loop.
continue [n] - Skip the rest of the current pass, resuming the next iteration of the enclosing while or until, or processing the next element of a for. Both are shown in the sketch below.
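Example (a minimal sketch; the filenames and search string are illustrative):

for file in *.dat; do
    [ -r "$file" ] || continue          # skip unreadable files
    if grep END "$file" > /dev/null; then
        echo "found END in $file"
        break                           # stop at the first file that matches
    fi
done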
. filename - Read the commands in filename and execute them in the current shell, without starting a new process (compare source filename in csh). Often used to load shared function definitions or configuration settings.
( ... ) - Command grouping: run the enclosed list in a subshell. Redirection may be applied to the group as a whole, and variable assignments and directory changes inside the group do not affect the rest of the script.
(20) Example: testing the first character of a variable
There are usually several ways to perform a given test. For example, to check whether a variable begins with "/":

Use case with a pattern:

case $var in
    /*) echo "starts with /" ;;
esac

Use an external command such as cut:

if [ "`echo $var | cut -c1`" = "/" ] ; then

Use the ksh pattern-stripping operators:

if [ "${var%${var#?}}" = "/" ]; then

Use the ksh [[ ]] pattern test:

if [[ $var = /* ]]; then

The [[...]] syntax is handled internally by the shell and can therefore interpret "wildcard" patterns differently than an external command. An unquoted wildcard is interpreted as a pattern to be matched, while a quoted wildcard is taken literally. The [...] syntax, even if handled internally, is treated as though it were external for backward compatibility. This requires that wildcard patterns be expanded to matching filenames.

Use the ksh93/bash substring syntax ${varname:start:length}:

if [ "${var:0:1}" = "/" ]; then
(21) Miscellaneous built-in commands
eval args - The args are scanned and substituted a second time, and the resulting command executed. This allows, for example, variable names to be constructed at run time:

netdev=NETDEV_
NETDEV_1=hme0            # As part of an initialization step defining multiple devices
devnum=1                 # As part of a loop over those devices
ifname=$netdev$devnum    # construct a variable name NETDEV_1
eval device=\$$ifname    # evaluate it - device is set to hme0
exec command args - Run the command in place of the current shell, without creating a new process. There is no return from an exec.

: - The null command: does nothing and sets a zero (true) exit status. Its arguments are still evaluated for any side effects, and it can be used wherever a command is syntactically required:

while :; do
    # this loop will go forever until broken by
    # a conditional test inside, or a signal
done
unset var ... - Remove the named variables.
typeset [+/- options] [ name[=value] ] ... - (ksh only, bash uses declare for similar functions) Set attributes and values for shell variables:

-L[n] - Left justify, removing leading blanks; field width n if given
-R[n] - Right justify; field width n if given
-Z[n] - As for -R, but fill with zeroes if the value is a number
-i - The named variables are integers; integer is an alias for typeset -i
-l - Lower-case convert the named variables
-u - Upper-case convert the named variables
-r - Mark the variables as readonly
-x - Export the named variables to the environment
-ft - The variables are taken as function names. Turn on execution tracing.
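A minimal ksh sketch of some of these attributes:

typeset -u shout=hello     # upper-case convert
echo $shout                # HELLO
typeset -Z5 num=42         # zero-fill to width 5
echo $num                  # 00042
integer count=0            # same as: typeset -i count=0
count=count+3              # integer variables accept arithmetic assignments
echo $count                # 3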
(22) ksh/bash string and arithmetic operations
${#var} - The length (in characters) of $var.
${var%pattern} - The value of $var with the shortest suffix matching pattern removed.
${var%%pattern} - The value of $var with the longest suffix matching pattern removed.
${var#pattern} - The value of $var with the shortest prefix matching pattern removed.
${var##pattern} - The value of $var with the longest prefix matching pattern removed.
$(( integer expression )) - The value of the arithmetic expression, using C-style operators.
Examples:

datapath=/data/public/project/trials/set1/datafile.dat

filename=${datapath##*/} - filename is set to "datafile.dat", since the longest prefix pattern matching "*/" is the leading directory path (compare basename).

path=${datapath%/*} - path is set to "/data/public/project/trials/set1", since the shortest suffix pattern matching "/*" is the filename in the last directory (compare dirname).

i=$((i+1)) - integer arithmetic, often used in while loops.
(23) Shell functions
All shells derived from sh allow you to define shell functions, which are visible only to the shell script and can be used like any other command. Shell functions take precedence over external commands if the same name is used. Functions execute in the same process as the caller, and must be defined before use (appear earlier in the file). They allow a script to be broken into maintainable chunks, and encourage code reuse between scripts. Functions are defined with either syntax:

identifier() { list; }
function identifier { list; }
A function may read or modify any shell variable that exists in the calling script. Such variables are global. (ksh and bash only) Functions may also declare local variables in the function using typeset or declare. Local variables are visible to the current function and any functions called by it.
return [n] returns from the function with exit status n, while exit [n] terminates the whole script. With no explicit return, the function returns when it reaches the end, and the value is the exit status of the last command it ran.
Example:
die()
{
    # Print an error message and exit with given status
    # call as: die status "message" ["message" ...]
    exitstat=$1; shift
    for i in "$@"; do
        print -R "$i"
    done
    exit $exitstat
}
Example:

[ -w $filename ] || \
    die 1 "$filename not writeable" "check permissions"
Example: a backgrounded function call (ex12).
#!/bin/sh

background()
{
    sleep 10
    echo "Background"
    sleep 10
    # Function will return here - if backgrounded, the subprocess will exit.
}

echo "ps before background function"
ps
background &
echo "My PID=$$"
echo "Background function PID=$!"
echo "ps after background function"
ps
exit 0
Example:
vprint()
{
    # Print or not depending on global "$verbosity"
    # Change the verbosity with a single variable.
    # Arg. 1 is the level for this message.
    level=$1; shift
    if [[ $level -le $verbosity ]]; then
        print -R $*
    fi
}

verbosity=2
vprint 1 This message will appear
vprint 3 This only appears if verbosity is 3 or higher
A collection of function definitions can be kept in a file and loaded into a script or an interactive shell with the "." operator.
Functions may generate output to stdout, stderr, or any other file or filehandle. Messages to stdout may be captured by command substitution (`myfunction`), which provides another way for a function to return information to the calling script. Beware of side-effects (and reducing reusability) in functions which perform I/O.
(24) Advanced I/O
I/O can be redirected for the remainder of the script, or additional file descriptors opened, with the exec command:

exec > outfile < infile - With no command name, exec just reassigns the I/O of the current shell.

exec n>outfile - Open file descriptor n (a small integer) writing to outfile. Descriptors 0, 1 and 2 are stdin, stdout and stderr; others opened this way can be used with the ksh read -u or print -u constructs, or with the redirection operators:

>&n - redirect standard output to file descriptor n
<&n - take standard input from file descriptor n
n>file - redirect the output of descriptor n to file
n>&1 - make descriptor n a copy of the current standard output

Example:
echo "Error: program failed" >&2

Echo always writes to stdout, but stdout can be temporarily reassigned to duplicate stderr (or other file descriptors). Conventionally Unix programs send error messages to stderr to keep them separated from stdout.
Writing to file descriptors other than stdout: print -u n args (ksh).
Reading from file descriptors other than stdin: read -u n var1 var2 rest (ksh).
Closing file descriptors: <&- closes standard input; >&- closes standard output.
I/O redirection operators are evaluated left-to-right. This makes a difference in a statement like "command >filename 2>&1" (both stdout and stderr go to filename) versus "command 2>&1 >filename" (stderr goes to the original stdout, and only stdout to the file). Many books with example scripts get this wrong.

<< [-]string - A "here document": standard input is taken from the script itself, up to a line containing only string. With the - variant, leading tabs are stripped. Unless the terminating string is quoted, variable and command substitutions are performed on the text.
#!/bin/sh
echo "Example of unquoted here document, with variable and command substitution"

cat <<EOF
This text will be fed to the "cat" program as
standard input. It will also have variable
and command substitutions performed.
I am logged in as $USER and today is `date`
EOF
echo
echo "Example of quoted here document, with no variable or command substitution"
# The terminating string must be at the start of a line.
cat <<"EndOfInput"
This text will be fed to the "cat" program as standard
input. Since the text string marking the end was quoted, it does not get
variable and command substitutions.
I am logged in as $USER and today is `date`
EndOfInput
#!/bin/sh
# Add in the magic postscript preface to perform
# duplex printer control for Xerox docuprint.

# To have this script send the files directly to the printer, use
# a subshell to collect the output of the two 'cat' commands.

## (
cat << EOP
%!PS
%%BeginFeature: *Duplex DuplexTumble
<</Duplex true /Tumble false>> setpagedevice
%%EndFeature
EOP
cat "$@"
## ) | lpr
(25) Manipulating stderr and exit status in pipelines
This short test script (ex13) can be used to generate suitable output on both stdout and stderr, with a non-zero exit status:

echo "This goes to stdout"
echo "This goes to stdout and has foo in the line"
echo "This goes to stderr" >&2
exit 99
exec 3>&1
./ex13.sh 2>&1 1>&3 3>&- | sed 's/stderr/STDERR/' 1>&2

We duplicate stdout to another file descriptor (3), then run the first command with stderr redirected to stdout and stdout redirected to the saved descriptor (3). The result is piped into other commands as needed. The output of the pipeline is redirected back to stderr, so that stdout and stderr of the script as a whole are what we expect.
#!/bin/sh
# Example 14
# Take stderr from a command and pass it into a pipe
# for further processing.

# Uses ex13.sh to generate some output to stderr
# stdout of ex13 is processed normally

# Save a copy of original stdout
exec 3>&1

# stdout from ex13.sh is directed to the original stdout (3)
# stderr is passed into the pipe for further processing.
# stdout from the pipe is redirected back to stderr
./ex13.sh 2>&1 1>&3 3>&- | sed 's/stderr/STDERR/' 1>&2

# 3 is closed before running the command, just in case it cares
# about inheriting open file descriptors.
exec 3>&1
ex13stat=`((./ex13.sh; echo $? >&4) | grep 'foo' 1>&3) 4>&1`

This script uses nested subshells captured in backticks. Again we first duplicate stdout to another file descriptor (3). The inner subshell runs the first command, then writes the exit status to fd 4. The outer subshell redirects 4 to stdout so that it is captured by the backticks. Standard output from the first command (inner subshell) is passed into the pipeline as normal, but the final output of the pipeline is redirected to 3 so that it appears on the original stdout and is not captured by the backticks.
If any of the commands really care about inheriting open file descriptors that they don't need then a more correct command line closes the descriptors before running the commands.
#!/bin/sh
# Example 15

# Uses ex13.sh to generate some output and give us an
# exit status to capture.

# Get the exit status of ex13 into $ex13stat.
# stdout of ex13 is processed normally

# Save a copy of stdout
exec 3>&1
# Run a subshell, with 4 duplicated to 1 so we get it in stdout.
# Capture the output in ``
# ex13stat=`( ... ) 4>&1`
# Inside the subshell, run another subshell to execute ex13,
# and echo the status code to 4
# (./ex13.sh; echo $? >&4)
# stdout from the inner subshell is processed normally, but the
# subsequent output must be directed to 3 so it goes to the
# original stdout and not be captured by the ``
ex13stat=`((./ex13.sh; echo $? >&4) | grep 'foo' 1>&3) 4>&1`

echo Last command status=$?
echo ex13stat=$ex13stat

# If any of the commands really care about inheriting open file
# descriptors that they don't need then a more correct command line
# closes the descriptors before running the commands
exec 3>&1
ex13stat=`((./ex13.sh 3>&- 4>&- ; echo $? >&4) | \
    grep 'foo' 1>&3 3>&- 4>&- ) 4>&1`
echo Last command status=$?
echo ex13stat=$ex13stat
Combine the above two techniques:

exec 3>&1
ex13stat=`((./ex13.sh 2>&1 1>&3 3>&- 4>&- ; echo $? >&4) | \
    sed s/err/ERR/ 1>&2 3>&- 4>&- ) 4>&1`
#!/bin/sh
# Example 16

# Uses ex13.sh to generate some output and give us an
# exit status to capture.

# Get the exit status of ex13 into ex13stat.
# stderr of ex13 is processed by the pipe, stdout
# is left alone.

# Save a copy of stdout
exec 3>&1

# Run a subshell, with 4 copied to 1 so we get it in stdout.
# Capture the output in backticks
# ex13stat=`( ) 4>&1`

# In the subshell, run another subshell to execute ex13, and
# echo the status code to 4
# (./ex13.sh; echo $? >&4)

# stdout from the inner subshell is directed to the original stdout (3)
# stderr is passed into the pipe for further processing.
# stdout from the pipe is redirected back to stderr

# Close the extra descriptors before running the commands
exec 3>&1
ex13stat=`((./ex13.sh 2>&1 1>&3 3>&- 4>&- ; echo $? >&4) | \
    sed s/err/ERR/ 1>&2 3>&- 4>&- ) 4>&1`

echo Last command status=$?
echo ex13stat=$ex13stat

A practical application of this would be running a utility such as dd, where the exit status is important to capture, but the error output is overly chatty and may need to be filtered before delivering to other parts of a script.
(26) Background jobs and coprocesses
command & - Run the command in the background: the shell continues without waiting for it to finish. Afterwards, $! contains the process ID of the last background job that was started. You can save that (bgpid=$!) and examine the process later (ps -p $bgpid) or send it a signal (kill -HUP $bgpid).
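A minimal sketch, using sleep as a stand-in for a long-running command:

sleep 60 &             # start a background job
bgpid=$!               # remember its process ID
ps -p $bgpid           # examine it
kill -TERM $bgpid      # or send it a signal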
command |& - (ksh only) Run the command as a coprocess: a background job whose standard input and output are connected to the shell. Write to the coprocess with print -p args, and read from it with read -p var. The script's own I/O can be attached to the coprocess with exec <&p and exec >&p, and the coprocess streams can be moved to numbered file descriptors (e.g. exec 4>&p) so that more than one coprocess can be used (see ex10 below).
Example: ex9. A script wants to save a copy of all output in a file, but also wants a copy to the screen. This is equivalent to always running the script as

script | tee outfile
#!/bin/ksh

# If we have not redirected standard output, save a copy of
# the output of this script into a file, but still send a
# copy to the screen.

if [[ -t 1 ]] ; then
    # Only do this if fd 1 (stdout) is still connected
    # to a terminal

    # We want the standard output of the "tee" process
    # to go explicitly to the screen (/dev/tty)
    # and the second copy goes into a logfile named $0.out

    tee $0.out >/dev/tty |&

    # Our stdout all goes into this coprocess
    exec 1>&p
fi

# Now generate some output
print "User activity snapshot on $(hostname) at $(date)"
print
who
Example: ex10. Start a coprocess to look up usernames in some database. It is faster to run a single process than to run a separate lookup for each user.
#!/bin/ksh
# This example uses a locally written tool for Dartmouth Name Directory lookups

# Start the dndlookup program as a coprocess
# Tell it to output only the canonical full name, and to not print multiple matches
dndlookup -fname -u |&

# move the input/output streams so we
# can use other coprocesses too
exec 4>&p
exec 5<&p

echo "Name file contents:"
cat namefile
echo

# read the names from a file "namefile"
while read uname; do
    print -u4 $uname
    read -u5 dndname
    case $dndname in
        *many\ matches*)
            # handle case where the name wasn't unique
            print "Multiple matches to \"$uname\" in DND"
            ;;
        *no\ match*)
            # handle case where the name wasn't found
            print "No matches to \"$uname\" in DND"
            ;;
        *)
            # we seem to have a hit - process the
            # canonical name retrieved from dndlookup
            print "Unique DND match: full name for \"$uname\" is \"$dndname\""
            ;;
    esac
    sleep 2
done < namefile

# We've read all the names, but the coprocess
# is still running. Close the pipe to tell it
# we have finished.
exec 4>&-
(27) Arrays
ksh distinguishes between numerically indexed (small) arrays, and string indexed (associative) arrays. bash uses integers for all array indexing, but the integers need not be consecutive and unassigned array elements do not exist. Arrays must be declared before use, e.g. typeset -A myarray (ksh associative array), or typeset -a myarray (bash). Array elements are set with the syntax:

myarray[index]=value

and referenced with the syntax ${myarray[index]}.
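A minimal sketch of an associative array (ksh93 syntax; in bash the indexes would have to be numeric, or bash 4+ declare -A used):

typeset -A price                     # associative array, indexed by strings
price[apple]=3
price[pear]=2
echo "an apple costs ${price[apple]}"
echo "known fruits: ${!price[@]}"    # list the indexes assigned so far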
This example shows use of an array indexed by IP addresses, as strings in ksh or as non-consecutive numbers in bash. It also demonstrates use of getopts for options processing.
Example: getauthlogs
#!/bin/bash
# $Header: $
# First attempt at a consolidated auth log collection from kaserver
# Timestamps in the raw files are NOT designed for easy sorting.
#
# Options:
#   -i       -- translate hex IP addresses to dotted-decimal (relatively quick)
#   -h       -- translate hex IP addresses to DNS names (somewhat slower - DNS lookups)
#   -u user  -- filter for the named user before translating addresses

hextodec()
{
    # convert the IP address in reverse-hex to dotted-decimal
    echo $((0x${1:6:2})).$((0x${1:4:2})).$((0x${1:2:2})).$((0x${1:0:2}))
}

hostlookup()
{
    # Convert a decimal IP to hostname - calls 'host' each time
    hostname=$(host $1)
    case $hostname in
        *\ not\ found*)
            # Just echo the address we tried to look up
            echo "$1"
            ;;
        *)
            # The result is word 5. Lower-case it for consistency
            set $hostname
            echo "$5" | tr 'A-Z' 'a-z'
            ;;
    esac
}

# Options
iptranslate=0
gethostnames=0
filter=cat
while getopts ihu: o ; do
    case $o in
        i) iptranslate=1 ;;
        h) gethostnames=1; iptranslate=1 ;;
        u) filter="grep $OPTARG" ;;
    esac
done
shift $(($OPTIND-1))

# We could get the DB server names from 'fs checkservers', but it isn't
# obvious what is from our cell. We could also grep CellServDB. I cop out
# and hard code one known DB server and get the others from it.
masterserver=halley.dartmouth.edu
serverlist=$(bos listhosts -server $masterserver | grep 'Host .* is ' | awk '{print $4}')

# If we want to filter usernames, it is more efficient to do it inline,
# before sorting, translation and hostname lookups

# Array to hold IP address/name conversions (associative array, ksh only)
# ksh - use -A for associative array. bash - use -a and numeric array
typeset -a hostnames

(
for dbserver in $serverlist; do
    bos getlog -server $dbserver -file /usr/afs/logs/AuthLog
done
) | grep -v 'Fetching log file' | $filter | sed -e 's/^... //' -e 's/ \([1-9]\) / 0\1 /' | sort --month-sort | \
    sed '-e s/ \([0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f]\)$/ 0\1/' |
while read line; do
    if [[ $iptranslate == 1 ]] ; then
        # Ugly!
        # Sometimes we get a 7-digit hex code in the log - the kaserver
        # apparently drops leading zeros. The second 'sed' in the pipe
        # catches these and fixes them.
        case $line in
            *\ from\ [0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f][0-9a-f])
                # translate the reverse-hex address
                iphex=${line##* from }
                # bash version - index by numeric value only, but can be
                # a sparse array -- use the raw IP
                ipdec=$((0x$iphex))
                frontpart=${line% from *}
                if [[ $gethostnames == 1 ]]; then
                    # ksh - index on hex value as a string (iphex)
                    # bash - index on numeric value (ipdec)
                    index=$ipdec
                    if [[ -z "${hostnames[$index]}" ]]; then
                        hostnames[$index]="$(hostlookup $(hextodec $iphex))"
                    fi
                    echo "$frontpart from ${hostnames[$index]}"
                else
                    echo "$frontpart from $(hextodec $iphex)"
                fi
                ;;
            *)
                echo "$line"
                ;;
        esac
    else
        # No ip translation, just echo the whole line
        echo "$line"
    fi
done
(28) Signals and traps
Unix signals (see kill) can be caught and handled by a script:

trap handler sig ... - handler is a command (often a function call) to be read and executed on receipt of the named sigs. A handler of - resets the signals to their default values, and '' (null) ignores the signals.

Some special trap names are handled by the shell itself:

EXIT - the handler runs when the script exits by any route
ERR - (ksh) the handler runs whenever a command returns a non-zero status
DEBUG - (ksh) the handler runs after every command
Exit handlers can be defined to clean up temporary files or reset the state of devices. This can be useful if the script has multiple possible exit points.

#!/bin/bash
# Try this under bash, ksh and sh

trap huphandler HUP
trap '' QUIT
trap exithandler TERM INT

huphandler()
{
    echo 'Received SIGHUP'
    echo "continuing"
}

exithandler()
{
    echo 'Received SIGTERM or SIGINT'
    exit 1
}
## Execution starts here - infinite loop until interrupted
# Use ":" or "true" for infinite loop
# SECONDS is built-in to bash and ksh. It is number of seconds since script started
# ':' is like a comment, but it is evaluated for side effects and evaluates to true
seconds=0
while : ; do
# while true; do
    sleep 5
    seconds=$((seconds + 5))
    echo -n "$SECONDS $seconds - "
done
(29) Security issues
Don't write set-UID shell scripts. Most systems don't even allow a script to be made set-UID. It is impossible (due to inherent race conditions) to ensure that a set-uid script cannot be compromised. Use wrapper programs like sudo instead.
Set $PATH explicitly at the start of a script, so that you know exactly which external programs will be used.
Honour $TMPDIR, and create temporary files safely (e.g. with mktemp). Often scripts will write to a fixed, or trivially generated temporary filename in /tmp. If the file already exists and you don't have permission to overwrite it, the script will fail. If you do have permission to overwrite it, you will delete the previous contents. Since /tmp is public write, another user may create files in it, or possibly fill it completely.
Example of the classic symlink attack:
- A link is created by an unprivileged user in /tmp: /tmp/scratch -> /vmunix
- A root user runs a script that blindly writes a scratch file to /tmp/scratch, and overwrites the operating system.
The environment variable $TMPDIR is often used to indicate a preferred location for temporary files (e.g., a per-user directory). Some systems may use $TMP or $TEMP. Safe scratch files can be made by creating a new directory, owned and writeable only by you, then creating files in there.
Example:

(umask 077 && mkdir /tmp/tempdir.$$) || exit 1

or (deluxe version)

tmp=${TMPDIR:-/tmp}
tmp=$tmp/tempdir.$RANDOM.$RANDOM.$RANDOM.$$
(umask 077 && mkdir $tmp) || {
    echo "Could not create temporary directory" 1>&2
    exit 1
}

Alternatively, many systems have mktemp to safely create a temporary file and return the filename, which can be used by the script and then deleted.
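A minimal sketch using mktemp together with an EXIT trap (the template argument is illustrative; mktemp invocation details vary slightly between systems):

tmpfile=`mktemp /tmp/myscript.XXXXXX` || exit 1
trap 'rm -f "$tmpfile"' EXIT     # remove the scratch file on any exit
date > "$tmpfile"
# ... use $tmpfile ...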
Beware of unusual characters in filenames produced by ls or find and then processed by the script.

Example: Consider the effects of a file named "myfile;cd /;rm *" if processed, unquoted, by your script.

One possible way to protect against weirdo characters in file names:

# A function to massage a list of filenames
# to protect weirdo characters
# e.g. find ... | protect_filenames | xargs command
#
# We are backslash-protecting the characters \'" ?*;
protect_filenames()
{
    sed -es/\\\\/\\\\\\\\/g \
        -es/\\\'/\\\\\'/g \
        -es/\\\"/\\\\\"/g \
        -es/\\\;/\\\\\;/g \
        -es/\\\?/\\\\\?/g \
        -es/\\\*/\\\\\*/g \
        -es/\\\ /\\\\\ /g
}

If using GNU find and xargs, there is a much cleaner option to null-terminate generated pathnames.
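A minimal sketch of that approach (GNU find/xargs assumed; the path, pattern and command are illustrative). Null-terminated pathnames pass through the pipe untouched, whatever characters they contain:

find /data -name '*.dat' -print0 | xargs -0 grep -l pattern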
(30)
(31) Example scripts
Download a compressed tar file of all example scripts used in these notes. One further example script searches the directories in $PATH for possibly conflicting programs.
This entire tutorial was created from individual HTML pages using a content management system written as ksh scripts (heavily using sed to edit the pages), coordinated by make.
You can even write an entire web server as a shell script. This one is part of the LEAF (Linux Embedded Appliance Firewall) project. This wouldn't be suitable for much load, but handles occasional queries on static HTML and CGI scripts. (www.nisi.ab.ca/lrp/Packages/weblet.htm)
(32) Useful utility programs
Most of these commands will operate on one or more named files, or will operate on a stream of data from standard input if no files are named.

ls - list files and their attributes
mkdir; rmdir - make and remove directories
rm; cp; mv - remove (delete), copy, and move (rename) files
touch - update the last-modified timestamp of a file. If the file does not exist, a new zero-byte file is created, which is often useful to signify that an event has occurred.
tee - copy standard input to standard output and to one or more named files
echo - echo the arguments to standard output. Conflicts sometimes arise over the syntax for echoing a line with no trailing CR/LF. Some use "\c" and some use option "-n". To avoid these problems, ksh also provides the "print" command for output.
cat - copy and concatenate files to standard output
head, tail - show the first or last few lines of a file
cut - extract selected fields or character columns from each line
wc - count characters, words and lines
compress; gzip, zip; tar - compress files, or collect many files into an archive
sort - sort lines of text by various criteria
grep - search for regular expressions in lines of text. The name comes from "Global Regular Expression and Print" -- a function from the Unix editors which was used frequently enough to warrant getting its own program.
uniq - report or remove adjacent duplicate lines
date - display the current date and time, with control over the format
ps - list running processes
kill - send a signal to processes
id - show the current user and group IDs
who - show who is logged on to the system
uname - display information about the system
mail - send email messages
logger - add entries to the system log
hostname - display the name of the system
test; [ - evaluate conditional expressions, e.g. if [ -w logfile ]. In ksh and most newer versions of sh, "[" is replaced with a compatible internal command, but the argument parsing is performed as if it were an external command. Ksh also provides the internal "[[" operator, with simplified syntax.
awk - pattern matching and data manipulation. Complex scripts can be written entirely using awk, but it is frequently used just to extract fields from lines of a file (similar to 'cut').
sed - stream editor: apply editing operations to a stream of text. Since it makes a single pass through the file, keeping only a few lines in memory at once, it can be used with infinitely large data sets. It is mostly used for global search and replace operations. It is a superset of 'tr', 'grep', and 'cut', but is more complicated to use.
tr - transliterate, squeeze or delete characters
find - search the filesystem for files matching given criteria
xargs - construct and run command lines using arguments read from standard input. Xargs is often used in combination with "find" to apply some command to all the files matching certain criteria. Since "find" may result in a very large list of pathnames, using the results directly may overflow command line buffers. Xargs avoids this problem, and is much more efficient than running a command on every pathname individually.
diff - compare two files and report the differences
basename pathname - strip the directory part from a pathname, leaving the filename
dirname pathname - extract the directory part of a pathname
expr - evaluate expressions, e.g. expr 2 + 1 and expr 2 '*' '(' 21 + 3 ')' (operators must be quoted to protect them from the shell). The pattern matching features of expr overlap with sed, e.g. expr SP99302L.Z00 : '[A-Z0-9]\{4\}\([0-9]\{3\}\)L\.*'
dc - an arbitrary precision calculator, using reverse Polish notation
bc - a front end to dc which provides infix notation and a C-like syntax for expressions and functions
paste - merge corresponding lines of files
join - join lines of two files on a common field
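For instance, scripts frequently lean on expr or bc for arithmetic beyond what $((...)) offers; a minimal sketch:

expr 2 '*' '(' 21 + 3 ')'     # prints 48; operators and parentheses are quoted
echo "scale=2; 7/3" | bc      # prints 2.33; bc handles fractional arithmetic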
(33) Books and other resources
The man pages for sh and ksh are quite complete, but not easy to learn from. The following is a sampling of the many available books on the subject. The Bolsky and Korn book might be viewed as the standard "reference". The Blinn book is Bourne shell, but everything in it should work for either shell.