C.2. Awk

Awk is a full-featured text processing language with a syntax reminiscent of C. It has an extensive set of operators and capabilities, but we will cover only a few of them here: the ones most useful in shell scripts.

Awk breaks each line of input passed to it into fields. By default, a field is a string of consecutive non-whitespace characters, and fields are separated by whitespace, though the -F option (or the FS variable) changes the delimiter. Awk can then operate on each field separately. This makes awk ideal for handling structured text files, especially tables: data organized into consistent chunks, such as rows and columns.
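As a minimal sketch of changing the delimiter, the snippet below splits a colon-separated record with the -F option. The sample line is a made-up passwd-style record, and the variable names (line, login, shell) are assumptions for illustration only.

```shell
# Hypothetical passwd-style record; fields are separated by ':'.
line='root:x:0:0:root:/root:/bin/sh'

# -F: tells awk to split on ':' instead of whitespace.
login=$(echo "$line" | awk -F: '{print $1}')
shell=$(echo "$line" | awk -F: '{print $7}')

echo "$login uses $shell"
# root uses /bin/sh
```

Without -F:, the whole record would count as a single field, since it contains no whitespace.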

Strong quoting (single quotes) and curly brackets enclose segments of awk code within a shell script.

echo one two | awk '{print $1}'
# one

echo one two | awk '{print $2}'
# two


awk '{print $3}' "$filename"
# Prints field #3 of file $filename to stdout.

awk '{print $1 $5 $6}' "$filename"
# Prints fields #1, #5, and #6 of file $filename, run together with no separator.
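A short sketch of how print treats its arguments: fields written side by side are concatenated, while fields separated by commas are printed with a space between them (the default output field separator). The variable names joined and spaced are only for illustration.

```shell
# Juxtaposed fields are concatenated.
joined=$(echo one two three | awk '{print $1 $3}')

# Comma-separated fields are joined by the output field separator (a space).
spaced=$(echo one two three | awk '{print $1, $3}')

echo "$joined"
# onethree
echo "$spaced"
# one three
```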

We have just seen the awk print command in action. The only other feature of awk we need to deal with here is variables. Awk handles variables similarly to shell scripts, though a bit more flexibly.

{ total += $column_number }
This adds the value of the field whose number is held in column_number to the running total, "total". Finally, to print "total", there is an END command block, executed after the script has processed all its input.
END { print total }

Corresponding to the END, there is a BEGIN, for a code block to be performed before awk starts processing its input.
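Putting the pieces together, here is a minimal sketch that totals one column of a small table. The table contents and the names data and col are assumptions made up for this illustration. The column number is handed to awk with -v, which sets an awk variable from the command line.

```shell
# Hypothetical three-line table: item name, then quantity.
data='apples 3
pears 4
plums 5'

column_number=2   # Assumption: the column we want to total.

total=$(echo "$data" | awk -v col="$column_number" '
    BEGIN { total = 0 }      # Runs once, before any input is read.
          { total += $col }  # Runs for every line: add field number "col".
    END   { print total }    # Runs once, after the last line.
')

echo "Total of column $column_number: $total"
# Total of column 2: 12
```

The BEGIN block is optional here, since awk treats an unset variable as zero in arithmetic, but it shows where one-time setup belongs.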

The following example illustrates how awk can add text-parsing tools to a shell script.

Example C-1. Counting Letter Occurrences

#! /bin/sh
# letter-count2.sh: Counting letter occurrences in a text file.
#
# Script by nyal [[email protected]].
# Used with permission.
# Recommented by document author.
# Version 1.1: Modified to work with gawk 3.1.3.
#              (Will still work with earlier versions.)


INIT_TAB_AWK=""
# Parameter to initialize awk script.
count_case=0
FILE_PARSE=$1

E_PARAMERR=65

usage()
{
    echo "Usage: letter-count2.sh file letters" >&2
    # For example:   ./letter-count2.sh filename.txt a b c
    exit $E_PARAMERR  # Not enough arguments passed to script.
}

if [ ! -f "$1" ] ; then
    echo "$1: No such file." >&2
    usage                 # Print usage message and exit.
fi 

if [ -z "$2" ] ; then
    echo "No letters specified." >&2
    usage
fi 

shift                      # Letters specified.
for letter in "$@"         # For each one . . .
  do
  INIT_TAB_AWK="$INIT_TAB_AWK tab_search[${count_case}] = \"$letter\"; final_tab[${count_case}] = 0; " 
  # Pass as parameter to awk script below.
  count_case=$((count_case + 1))
done

# DEBUG:
# echo $INIT_TAB_AWK;

cat "$FILE_PARSE" |
# Pipe the target file to the following awk script.

# ----------------------------------------------------------------------------------
# Earlier version of script used:
# awk -v tab_search=0 -v final_tab=0 -v tab=0 -v nb_letter=0 -v chara=0 -v chara2=0 \

awk \
"BEGIN { $INIT_TAB_AWK } \
{ split(\$0, tab, \"\"); \
for (chara in tab) \
{ for (chara2 in tab_search) \
{ if (tab_search[chara2] == tab[chara]) { final_tab[chara2]++ } } } } \
END { for (chara in final_tab) \
{ print tab_search[chara] \" => \" final_tab[chara] } }"
# ----------------------------------------------------------------------------------
#  Nothing all that complicated, just . . .
#+ for-loops, if-tests, and a couple of specialized functions.

exit $?

# Compare this script to letter-count.sh.
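
The script above builds part of its awk program out of shell variables. As a hedged alternative sketch (not the original author's method), a letter can be passed in with -v and the line walked one character at a time with substr(), which avoids relying on how split() treats an empty separator. The names sample, letter, and hits are hypothetical stand-ins.

```shell
# Hypothetical input standing in for the contents of a text file.
sample='banana
bandana'
letter=a          # Stand-in for one letter given on the command line.

count=$(printf '%s\n' "$sample" | awk -v letter="$letter" '
    {
        # Examine each character of the current line in turn.
        for (i = 1; i <= length($0); i++)
            if (substr($0, i, 1) == letter)
                hits++
    }
    END { print letter " => " (hits + 0) }   # "+ 0" prints 0 if no hits.
')

echo "$count"
# a => 6
```

This handles a single letter per awk invocation. Counting several letters in one pass would still need an array, as in the example above.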

For simpler examples of awk within shell scripts, see:

  1. Example 11-12

  2. Example 16-8

  3. Example 12-29

  4. Example 33-5

  5. Example 9-23

  6. Example 11-18

  7. Example 27-2

  8. Example 27-3

  9. Example 10-3

  10. Example 12-55

  11. Example 9-28

  12. Example 12-4

  13. Example 9-13

  14. Example 33-16

  15. Example 10-8

  16. Example 33-4

That's all the awk we'll cover here, folks, but there's lots more to learn. See the appropriate references in the Bibliography.

 
 
Published under the terms of the GNU General Public License