Introduction to awk
What is awk
- The name awk is derived from the surnames of its inventors: Aho, Weinberger, and Kernighan.
- The original awk paper published by Bell Labs describes it as follows:

“Awk is a programming language designed to make many common information retrieval and text manipulation tasks easy to state and to perform.”
- Simply put, awk is a programming language designed to search files for lines that match patterns and to perform actions on those lines.
 
awk versions
- awk – the original Bell Labs awk (Version 7 UNIX, around 1978); the name is also used for the current POSIX awk.
- nawk – New awk (widely available from System V Release 3.1 in 1987, updated in SVR4 around 1989).
- gawk – the GNU implementation of awk.
- mawk – Michael’s awk.

... and the list goes on.
All of these are essentially the same, apart from minor differences in the
features they provide. This presentation assumes the widely used POSIX awk
(also called simply “awk”).
A few basic things about awk
- awk reads from a file or from its standard input and writes to its standard output.
- awk recognizes the concepts of “file”, “record”, and “field”.
- A file consists of records, which by default are the lines of the file: one line becomes one record.
- awk operates on one record at a time.
- A record consists of fields, which by default are separated by any number of spaces or tabs.
- Field 1 is accessed with $1, field 2 with $2, and so forth. $0 refers to the whole record.
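
As a small illustration of records and fields, the sketch below feeds two made-up lines to awk on standard input and prints the second field followed by the first. The sample names and numbers are invented for the example; NR is awk's built-in record counter, and the shell variable name result is only a convenience.

```sh
# Two records (lines), each with two fields separated by spaces or a tab.
# The sample data is made up for illustration.
result=$(printf 'alice   30\nbob\t25\n' | awk '{ print NR ": " $2, $1 }')
printf '%s\n' "$result"
# prints:
# 1: 30 alice
# 2: 25 bob
```

Note that the run of spaces in the first line and the tab in the second both count as a single field separator under the default settings.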
 
Program Structure in Awk
- An awk program is a sequence of statements of the form:
 
pattern { action }
pattern { action }
…
- The pattern in front of an action acts as a selector that determines whether the action is executed.
- Patterns can be regular expressions, arithmetic relational expressions, string-valued expressions, and arbitrary boolean combinations of these.
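
The following sketch shows three of these pattern kinds selecting records from the same input. The data (name, department, salary) and the shell variable names are hypothetical and chosen only for the example.

```sh
# Hypothetical sample data: name, department, salary.
data='ann sales 4200
bob ops 3100
cy sales 2800'

# Regular-expression pattern: select records containing "sales".
by_regex=$(printf '%s\n' "$data" | awk '/sales/ { print $1 }')

# Relational pattern: select records whose third field exceeds 3000.
by_relation=$(printf '%s\n' "$data" | awk '$3 > 3000 { print $1 }')

# Boolean combination: both conditions must hold for the action to run.
by_both=$(printf '%s\n' "$data" | awk '/sales/ && $3 > 3000 { print $1 }')

printf 'regex: %s\n' "$by_regex"
printf 'relation: %s\n' "$by_relation"
printf 'both: %s\n' "$by_both"
```

With this input, the regular expression selects ann and cy, the relational pattern selects ann and bob, and the combined pattern selects only ann.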