
Thursday, April 25, 2013

AWK Scripting: Learn AWK Built-in variables with examples

AWK inbuilt variables: FS, OFS, RS, ORS, NR, NF, FNR, FILENAME


AWK comes with a good number of built-in variables that come in handy when working with data files. We will go through each of them with one or two examples. Without these built-in variables, even simple AWK code is hard to write. They are used to set the input field separator, to format the output of an AWK command, and to give the script access to details such as the name of the current input file. Some of the AWK concepts already covered:

AWK scripting: What is an AWK and how to use it?

AWK built-in variables:

  • NR: The number of input records read so far.
  • NF: The number of fields in the current record.
  • FILENAME: The name of the current input file.
  • FNR: The record number within the current input file.
  • FS: The input "field separator".
  • RS: The input "record separator" (row separator).
  • OFS: The "output field separator".
  • ORS: The "output record separator".
Our sample DB file for this post is db.txt

cat db.txt

John,29,MS,IBM,M,Married
Barbi,45,MD,JHH,F,Single
Mitch,33,BS,BofA,M,Single
Tim,39,Phd,DELL,M,Married
Lisa,22,BS,SmartDrive,F,Married

To keep things simple, we can divide the above built-in variables into groups based on what they do.

Group1: FS (input field separator) and OFS (output field separator)
Group2: RS (record separator) and ORS (output record separator)
Group3: NR, NF and FNR
Group4: FILENAME variable

Group1: FS(input field separator), OFS


Let us start with FS and OFS built-in variables.

FS AWK variable: This variable stores the input field separator. By default AWK splits each line on runs of spaces and tabs, and separates output fields with a single space. If your file uses some other character as its separator, AWK will not split on it unless told to. The Linux password file, for example, uses ‘:’ as its separator. To specify the input field separator we set this built-in variable.

We will see what issue we face if we don’t mention the field separator for our db.txt.

Example1: Print first column data from db.txt file.

awk '{print $1}' db.txt

Output:

John,29,MS,IBM,M,Married
Barbi,45,MD,JHH,F,Single
Mitch,33,BS,BofA,M,Single
Tim,39,Phd,DELL,M,Married
Lisa,22,BS,SmartDrive,F,Married

As you can see, every line is printed in full, which shows that AWK does not treat "," as the field separator in db.txt: with no spaces or tabs in a line, the whole line counts as one field. We have to tell AWK what the field separator is.

Example2: List only first column data from db.txt file which have field separator as ‘,’.

awk 'BEGIN{FS=","}{print $1}' db.txt

Output:

John
Barbi
Mitch
Tim
Lisa

Example3: We can also use the AWK option -F to specify the input field separator, as in the example below, which prints the 4th column.

awk -F',' '{print $4}' db.txt

Output:

IBM
JHH
BofA
DELL
SmartDrive
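The same approach covers the colon-separated password file mentioned earlier. The sketch below uses two made-up passwd-style lines instead of a real /etc/passwd so the result is predictable; the variable names passwd_sample and users are only for illustration.

```shell
# Two made-up passwd-style records; fields are separated by ':'.
passwd_sample='root:x:0:0:root:/root:/bin/bash
alice:x:1000:1000:Alice:/home/alice:/bin/sh'

# -F: sets FS to ':' so $1 is the login name and $7 the login shell.
users=$(printf '%s\n' "$passwd_sample" | awk -F: '{print $1, $7}')
printf '%s\n' "$users"
```

This should print each login name followed by its shell, separated by a single space (the default OFS).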

OFS AWK variable: This variable specifies the output field separator, which print places between the comma-separated values in its argument list.

Example4: Display only the 1st and 4th columns, separated by $ in the output.

awk 'BEGIN{FS=",";OFS=" $ "}{print $1,$4}' db.txt

Output:

John $ IBM
Barbi $ JHH
Mitch $ BofA
Tim $ DELL
Lisa $ SmartDrive

Note: I put a space before and after $ in the OFS variable to make the output easier to read. You can remove the spaces if required.

I will leave it to readers to print only the first and fourth columns without setting OFS and see the issue.
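For readers who want to check their answer to that exercise, here is a minimal sketch using one record from db.txt. The variable names line, joined and spaced are made up for this illustration.

```shell
# One record from db.txt is enough to see the difference.
line='John,29,MS,IBM,M,Married'

# No comma between the fields: awk concatenates them, so OFS is never used.
joined=$(printf '%s\n' "$line" | awk -F',' '{print $1 $4}')

# Comma between the fields, OFS left at its default (a single space).
spaced=$(printf '%s\n' "$line" | awk -F',' '{print $1, $4}')

printf '%s\n%s\n' "$joined" "$spaced"
```

The first command runs the two values together as JohnIBM, while the second prints John IBM. Setting FS to "," does not change OFS, so the comma never reappears in the output on its own.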

Group2: RS(Row separator) and ORS(Output record separator)



RS AWK variable: The record separator (row separator) defines what separates rows in the input. By default AWK uses a newline as the record separator. We can change this with the RS built-in variable.

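Example5: I want to convert a sentence to one word per line. We can use the RS variable to do it. Note that the newline echo adds stays attached to the last word, so the real output ends with one extra blank line.

echo "This is how it works" | awk 'BEGIN{RS=" "}{print $0}'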

Output:

This
is
how
it
works
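As a small variation on Example5, the sketch below feeds the sentence with printf, which adds no trailing newline, and prefixes each word with NR to show that every word really is a separate record. The variable name words is only for illustration.

```shell
# printf (unlike echo) adds no trailing newline, so the last record is exactly "works".
words=$(printf 'This is how it works' | awk 'BEGIN{RS=" "}{print NR ": " $0}')
printf '%s\n' "$words"
```

With RS set to a single space, AWK should report five records, numbered 1 to 5.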

ORS(Output Record Separator): This variable is useful for defining the record separator for the AWK command output. By default ORS is set to new line.

Example6: Print all the company names in single line which are in 4th column.

awk -F',' 'BEGIN{ORS=" "}{print $4}' db.txt

Output:

IBM JHH BofA DELL SmartDrive
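One thing to be aware of in Example6 is that ORS is written after every record, including the last, so the line ends with a trailing space and no final newline. If that matters, one alternative (a different technique, using printf rather than ORS) is sketched below. It recreates db.txt in a scratch directory; the names tmpdir, sep and companies are assumptions made for this example.

```shell
# Recreate the sample db.txt from this post in a scratch directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/db.txt" <<'EOF'
John,29,MS,IBM,M,Married
Barbi,45,MD,JHH,F,Single
Mitch,33,BS,BofA,M,Single
Tim,39,Phd,DELL,M,Married
Lisa,22,BS,SmartDrive,F,Married
EOF

# sep is empty for the first record and ", " afterwards, so nothing trails the last name.
# The END block prints an empty string followed by ORS, which ends the line.
companies=$(awk -F',' '{printf "%s%s", sep, $4; sep=", "} END{print ""}' "$tmpdir/db.txt")
printf '%s\n' "$companies"

rm -rf "$tmpdir"
```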

Group3: NF, NR and FNR

NF AWK variable: This variable holds the number of fields in the current row. The last field of a row can therefore be referenced as $NF.

Example7: Print the number of fields in each row of abc.txt, a space-separated sample file (the same one used in Example11).

awk '{print NF}' abc.txt




Output:

5
5
4
5
4

Example8: Print the last field in each row of abc.txt, a space-separated sample file (the same one used in Example11).

awk '{print $NF}' abc.txt



Output:

77
45
37
95
47

Note: In the two examples above, plain NF gives the count of fields in a row, while $NF displays the last field of each row. $NF comes in handy when you do not know the number of the last column.
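The same variables work on the comma-separated db.txt used earlier in this post, as long as the field separator is given. The sketch below recreates db.txt in a scratch directory and also prints $(NF-1), the second-to-last field. The names tmpdir and fields are made up for this example.

```shell
# Recreate the sample db.txt from this post in a scratch directory.
tmpdir=$(mktemp -d)
cat > "$tmpdir/db.txt" <<'EOF'
John,29,MS,IBM,M,Married
Barbi,45,MD,JHH,F,Single
Mitch,33,BS,BofA,M,Single
Tim,39,Phd,DELL,M,Married
Lisa,22,BS,SmartDrive,F,Married
EOF

# NF is the field count, $NF the last field, $(NF-1) the one before it.
fields=$(awk -F',' '{print NF, $NF, $(NF-1)}' "$tmpdir/db.txt")
printf '%s\n' "$fields"

rm -rf "$tmpdir"
```

Every row of db.txt has six fields, so each output line should start with 6, followed by the marital status and the gender column. Without -F',' the same command would report a single field per row.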

NR AWK variable: This variable holds the number of the current record, counted from the start of the input across all input files. It comes in handy when you want to print line numbers for a file.

Example9: Print the line number for each line of abc.txt, a space-separated sample file (the same one used in Example11).

awk '{print NR, $0}' abc.txt

Output:

1 Jones 2143 78 84 77
2 Gondrol 2321 56 58 45
3 RinRao 2122234 38 37
4 Edwin 253734 87 97 95
5 Dayan 24155 30 47

This is similar to the -n option of the cat command, which displays line numbers for a file.

FNR AWK variable: This variable holds the record number within the current input file. It resets to 1 when AWK starts reading the next file, whereas NR keeps counting. In an END block it holds the number of records in the last file read, so for a single file the command below is equivalent to wc -l.

Example10: Print total number of lines in a given file.

awk 'END{print FNR}' db.txt

Output:

5

From the above output we can conclude that number of lines present in db.txt file is 5.
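With a single input file NR and FNR are always equal, so the difference only shows up when AWK reads more than one file. The sketch below is a minimal illustration; the two small files first.txt and second.txt and the variable names tmpdir and counts are made up for it.

```shell
# Two throwaway input files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'a1\na2\n' > "$tmpdir/first.txt"
printf 'b1\nb2\nb3\n' > "$tmpdir/second.txt"

# NR counts records across both files; FNR restarts at 1 for each file.
counts=$(cd "$tmpdir" && awk '{print FILENAME, NR, FNR}' first.txt second.txt)
printf '%s\n' "$counts"

rm -rf "$tmpdir"
```

In the output, the second column (NR) should run from 1 to 5, while the third column (FNR) goes back to 1 on the first line of second.txt.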

Group4: FILENAME variable



FILENAME AWK variable: This variable contains the name of the file the awk command is currently processing.

Example11: Print filename for each line in a given file.

 awk '{print FILENAME, NR, $0}' abc.txt

Output:

abc.txt 1 Jones 2143 78 84 77
abc.txt 2 Gondrol 2321 56 58 45
abc.txt 3 RinRao 2122234 38 37
abc.txt 4 Edwin 253734 87 97 95
abc.txt 5 Dayan 24155 30 47
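Printing the file name on every line gets noisy, so a common alternative is to print it once per file. The sketch below combines FILENAME with the pattern FNR==1, which is true only on the first line of each file. The file names and the variables tmpdir and listing are assumptions made for this example.

```shell
# Two throwaway input files in a scratch directory.
tmpdir=$(mktemp -d)
printf 'a1\na2\n' > "$tmpdir/first.txt"
printf 'b1\n' > "$tmpdir/second.txt"

# FNR==1 matches the first record of each file, so the header appears once per file.
listing=$(cd "$tmpdir" && awk 'FNR==1{print "== " FILENAME " =="} {print}' first.txt second.txt)
printf '%s\n' "$listing"

rm -rf "$tmpdir"
```

Note that FILENAME is set while records are being read, so it should not be relied on inside a BEGIN block.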

In our next post we will see how to use arrays in AWK scripting.
