
Saturday, February 16, 2013

AWK scripting: What is AWK and how to use it?

This is our first post on AWK. In it we will cover some AWK basics: its history, advantages and disadvantages, syntax, and how it works.


1. A brief history about AWK

2. Advantages and disadvantages of AWK

3. AWK syntax

4. How AWK works

AWK is a command/tool available on virtually all Linux/Unix flavors for text filtering, manipulation and similar tasks. It is mainly meant for processing text files and producing reports. AWK can be treated as a programming language because it supports arithmetic operations, binary operations, conditions, loops, functions and so on. AWK is an interpreted language. It was developed in 1977 by Alfred V. Aho, Peter J. Weinberger and Brian W. Kernighan, and it takes its name from the initials of its creators' family names.

Because of these capabilities AWK has earned the nickname “the Swiss Army knife of the Unix toolkit”. The name fits, because AWK handles many text processing jobs with ease compared to other text parsing tools available on Linux/Unix.

Below are some advantages and disadvantages of AWK that I have come across while using it.

AWK Advantages

  • Validating data
  • Managing small database files
  • Generating reports
  • Parsing command output
  • Parsing log files
  • Parsing more than one file at a time


AWK Disadvantages


  • Many flavors exist (awk, nawk, gawk, mawk, tawk), which causes portability issues.
  • It is not a full-fledged scripting language like Perl, Python or Ruby.
  • It is useful mainly for data processing.


AWK syntax

An AWK command is run using one of the following forms:

awk options 'awk-code' filename

Unix-command | awk options 'awk-code'

awk options -f awkscript-file filename
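As a minimal sketch of these three forms, the snippet below runs the same one-line program each way. The sample file fruits.txt, the script file print_first.awk and the temporary directory are hypothetical names made up for this illustration; they are not part of the original post.

```shell
# Work in a throwaway directory so no existing files are touched.
workdir=$(mktemp -d)

# Hypothetical sample data: a name and a quantity separated by a space.
printf 'apple 3\nbanana 5\ncherry 7\n' > "$workdir/fruits.txt"

# Form 1: awk code given inline, input read from a file.
form1=$(awk '{ print $1 }' "$workdir/fruits.txt")

# Form 2: input piped in from another Unix command.
form2=$(cat "$workdir/fruits.txt" | awk '{ print $1 }')

# Form 3: awk code stored in a script file and loaded with -f.
printf '{ print $1 }\n' > "$workdir/print_first.awk"
form3=$(awk -f "$workdir/print_first.awk" "$workdir/fruits.txt")

# All three forms print the first column: apple, banana, cherry.
echo "$form1"
```

The -f form is usually the better choice once the AWK code grows beyond a line or two, since the script file can be edited and reused without quoting it on the command line.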

How AWK works


AWK works line by line: Many people assume that AWK works column-wise, but that is not true. Like sed, AWK works horizontally, reading one line (record) after another.
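A small sketch of this line-by-line behavior, assuming a made-up three-line input. NR is AWK's built-in counter of records read so far; it is used here only to make the processing order visible.

```shell
# The action block runs once per input line, in order.
# NR holds the number of the line currently being processed.
numbered=$(printf 'first\nsecond\nthird\n' | awk '{ print NR ": " $0 }')

# Prints:
# 1: first
# 2: second
# 3: third
echo "$numbered"
```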

AWK treats a file as a group of columns: As it reads each line, AWK splits the data into columns (fields) according to the field separator, which by default is whitespace.
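To illustrate how the field separator decides where columns begin and end, here is a short sketch using two invented sample lines. The -F option sets the separator from the command line.

```shell
# Default separator: any run of spaces or tabs splits the line into fields.
default_fs=$(printf 'alice   30 london\n' | awk '{ print $2 }')

# -F: makes the colon the field separator instead.
colon_fs=$(printf 'alice:30:london\n' | awk -F: '{ print $2 }')

# Both commands pick out the second column, 30.
echo "$default_fs $colon_fs"
```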

Column number: AWK assigns each column a number, and the columns are referenced as $1, $2, $3 and so on up to the last column. The entire line/record is represented by $0. NF (Number of Fields) is a built-in variable that holds the total number of columns in the current line, so the last column can always be referenced as $NF. Depending on the conditions you give it, AWK works on these columns to produce the desired output.
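The following sketch shows $1, $0, NF and $NF on an invented four-word line, plus one condition on NF. The sample text is an assumption made only for illustration.

```shell
line='one two three four'

# $1 is the first column; $0 is the whole line.
first_field=$(printf '%s\n' "$line" | awk '{ print $1 }')
whole_line=$(printf '%s\n' "$line" | awk '{ print $0 }')

# NF is the number of columns; $NF is therefore the last column.
field_count=$(printf '%s\n' "$line" | awk '{ print NF }')
last_field=$(printf '%s\n' "$line" | awk '{ print $NF }')

# A condition in front of the action: only lines with exactly 3 columns match.
three_cols=$(printf 'a b\nc d e\n' | awk 'NF == 3 { print $1 }')

# Prints: one 4 four c
echo "$first_field $field_count $last_field $three_cols"
```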

In our next post we will look at AWK's built-in variables. These variables are important because AWK refers to them frequently as it does its work.
