---
language: awk
filename: learnawk.awk
contributors:
     - ["Marshall Mason", "http://github.com/marshallmason"]

---

AWK is a standard tool on every POSIX-compliant UNIX system. It's like
flex/lex, but run from the command line, and is well suited to text-processing
tasks and other scripting needs. It has a C-like syntax, but without mandatory
semicolons (although you should use them anyway, because they are required
when you're writing one-liners, something AWK excels at), manual memory
management, or static typing. You can call it from a shell script, or you can
use it as a stand-alone scripting language.

Why use AWK instead of Perl? Readability. AWK is easier to read
than Perl. For simple text-processing scripts, particularly ones that read
files line by line and split on delimiters, AWK is probably the right tool for
the job.
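
As a rough sketch of the one-liner style mentioned above, the snippet below
calls AWK from a POSIX shell script. The sample records and the `users` and
`sh_users` variable names are made up for illustration; `-F` is the standard
option for setting the field separator.

```sh
#!/bin/sh
# Hypothetical sample data: colon-separated "user:shell" records.
users='alice:/bin/sh
bob:/bin/bash
carol:/bin/sh'

# -F: splits each record on ':'. The pattern $2 == "/bin/sh" selects records,
# and the action prints the first field of every selected record.
sh_users=$(printf '%s\n' "$users" | awk -F: '$2 == "/bin/sh" { print $1; }')

# Prints "alice" and "carol", one per line.
printf '%s\n' "$sh_users"
```

The semicolon inside the braces is optional here, but keeping it makes it easy
to add further statements to the same one-liner later.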

```awk
#!/usr/bin/awk -f

# Comments are like this


# AWK programs consist of a collection of patterns and actions.
pattern1 { action; } # just like lex
pattern2 { action; }

# There is an implied loop: AWK automatically reads and parses each
# record of each file supplied. Each record is split by the FS delimiter,
# which defaults to whitespace (runs of spaces and tabs count as one).
# You can assign FS either on the command line (-F C) or in your BEGIN
# pattern.

# One of the special patterns is BEGIN. The BEGIN pattern is true
# BEFORE any of the files are read. The END pattern is true after
# an end-of-file from the last file (or standard input if no files are
# specified). There is also an output field separator (OFS) that you can
# assign, which defaults to a single space.

BEGIN {

    # BEGIN will run at the beginning of the program. It's where you put all
    # the preliminary set-up code, before you process any text files. If you
    # have no text files, then think of BEGIN as the main entry point.

    # Variables are global. Just set them or use them, no need to declare.
    count = 0;

    # Operators just like in C and friends
    a = count + 1;
    b = count - 1;
    c = count * 1;
    d = count / 1; # division (floating point, not integer)
    e = count % 1; # modulus
    f = count ^ 1; # exponentiation

    a += 1;
    b -= 1;
    c *= 1;
    d /= 1;
    e %= 1;
    f ^= 1;

    # Incrementing and decrementing by one
    a++;
    b--;

    # As a prefix operator, it returns the incremented value
    ++a;
    --b;

    # Notice, also, that a newline is enough to terminate a statement; the
    # semicolons used here are optional

    # Control statements
    if (count == 0)
        print "Starting with count of 0";
    else
        print "Huh?";

    # Or you could use the ternary operator
    print (count == 0) ? "Starting with count of 0" : "Huh?";

    # Blocks consisting of multiple lines use braces
    while (a < 10) {
        # A statement ends at the newline, so continue it with a backslash
        print "String concatenation is done" " with a series" " of" \
            " space-separated strings";
        print a;

        a++;
    }

    for (i = 0; i < 10; i++)
        print "Good ol' for loop";

    # As for comparisons, they're the standards:
    # a < b   # Less than
    # a <= b  # Less than or equal
    # a != b  # Not equal
    # a == b  # Equal
    # a > b   # Greater than
    # a >= b  # Greater than or equal

    # Logical operators as well
    # a && b  # AND
    # a || b  # OR

    # In addition, there's the super useful regular expression match
    if ("foo" ~ "^fo+$")
        print "Fooey!";
    if ("boo" !~ "^fo+$")
        print "Boo!";

    # Arrays
    arr[0] = "foo";
    arr[1] = "bar";
    
    # You can also initialize an array with the built-in function split().
    # It clears the array first and numbers the pieces starting at 1.

    n = split("foo:bar:baz", arr, ":");
   
    # You also have associative arrays (actually, they're all associative arrays)
    assoc["foo"] = "bar";
    assoc["bar"] = "baz";

    # And multi-dimensional arrays, with some limitations I won't mention here
    multidim[0,0] = "foo";
    multidim[0,1] = "bar";
    multidim[1,0] = "baz";
    multidim[1,1] = "boo";

    # You can test for array membership
    if ("foo" in assoc)
        print "Fooey!";

    # You can also use the 'in' operator to traverse the keys of an array
    for (key in assoc)
        print assoc[key];

    # The command line is in a special array called ARGV
    for (argnum in ARGV)
        print ARGV[argnum];

    # You can remove elements of an array
    # This is particularly useful to prevent AWK from assuming the arguments
    # are files for it to process
    delete ARGV[1];

    # The number of command line arguments is in a variable called ARGC
    print ARGC;

    # AWK has several built-in functions. They fall into three categories. I'll
    # demonstrate each of them in their own functions, defined later.

    return_value = arithmetic_functions(a, b, c);
    string_functions();
    io_functions();
}

# Here's how you define a function
function arithmetic_functions(a, b, c,     localvar) {

    # Probably the most annoying part of AWK is that there is no way to declare
    # a local variable. Variables are global by default. For short scripts,
    # this is fine, even useful, but for longer scripts, this can be a problem.

    # There is a work-around (ahem, hack). Function arguments are local to the
    # function, and AWK allows you to define more function arguments than it
    # needs. So just stick local variables in the function declaration, like I
    # did above. As a convention, stick in some extra whitespace to distinguish
    # between actual function parameters and local variables. In this example,
    # a, b, and c are actual parameters, while localvar is merely a local
    # variable.

    # Now, to demonstrate the arithmetic functions

    # Most AWK implementations have some standard trig functions
    localvar = sin(a);
    localvar = cos(a);
    localvar = atan2(b, a); # arc tangent of b / a

    # And logarithmic stuff
    localvar = exp(a);
    localvar = log(a);

    # Square root
    localvar = sqrt(a);

    # Truncate floating point to integer
    localvar = int(5.34); # localvar => 5

    # Random numbers
    srand(); # Supply a seed as an argument. By default, it uses the time of day
    localvar = rand(); # Random number between 0 and 1.

    # Here's how to return a value
    return localvar;
}

function string_functions(    localvar, arr) {

    # AWK, being a string-processing language, has several string-related
    # functions, many of which rely heavily on regular expressions.

    # Search and replace, first instance (sub) or all instances (gsub)
    # Both return number of matches replaced
    localvar = "fooooobar";
    sub("fo+", "Meet me at the ", localvar); # localvar => "Meet me at the bar"
    gsub("e+", ".", localvar); # localvar => "M.t m. at th. bar"

    # Search for a string that matches a regular expression
    # index() does the same thing, but doesn't allow a regular expression
    match(localvar, "t"); # => 3, since the 't' is the third character

    # Split on a delimiter
    n = split("foo-bar-baz", arr, "-"); # arr[1] = "foo"; arr[2] = "bar"; arr[3] = "baz"; n = 3

    # Other useful stuff
    sprintf("%s %d %d %d", "Testing", 1, 2, 3); # => "Testing 1 2 3"
    substr("foobar", 2, 3); # => "oob"
    substr("foobar", 4); # => "bar"
    length("foo"); # => 3
    tolower("FOO"); # => "foo"
    toupper("foo"); # => "FOO"
}

function io_functions(    localvar) {

    # You've already seen print
    print "Hello world";

    # There's also printf
    printf("%s %d %d %d\n", "Testing", 1, 2, 3);

    # AWK doesn't have file handles, per se. It will automatically open a file
    # handle for you when you use something that needs one. The string you used
    # for this can be treated as a file handle, for purposes of I/O. This makes
    # it feel sort of like shell scripting, but to refer to the same file again,
    # the string must match exactly, so use a variable:

    outfile = "/tmp/foobar.txt";

    print "foobar" > outfile;

    # Now the string outfile is a file handle. You can close it:
    close(outfile);

    # Here's how you run something in the shell
    system("echo foobar"); # => prints foobar

    # Reads the next line of the main input (the file arguments, or standard
    # input if there are none) and stores it in localvar
    getline localvar;

    # Reads a line from a pipe (again, keep the command in a variable so you
    # can close it properly)
    cmd = "echo foobar";
    cmd | getline localvar; # localvar => "foobar"
    close(cmd);

    # Reads a line from a file and stores in localvar
    infile = "/tmp/foobar.txt";
    getline localvar < infile; 
    close(infile);
}

# As I said at the beginning, AWK programs consist of a collection of patterns
# and actions. You've already seen the BEGIN pattern. Other
# patterns are used only if you're processing lines from files or standard
# input.
#
# When you pass arguments to AWK, they are treated as file names to process.
# It will process them all, in order. Think of it like an implicit for loop,
# iterating over the lines in these files. These patterns and actions are like
# switch statements inside the loop.

/^fo+bar$/ {
    
    # This action will execute for every line that matches the regular
    # expression, /^fo+bar$/, and will be skipped for any line that fails to
    # match it. Let's just print the line:

    print;

    # Whoa, no argument! That's because print has a default argument: $0.
    # $0 holds the current line being processed. It is created
    # automatically for you.

    # You can probably guess there are other $ variables. Every line is
    # implicitly split before every action is called, much like the shell
    # does. And, like the shell, each field can be accessed with a dollar sign

    # This will print the second and fourth fields in the line
    print $2, $4;

    # AWK automatically defines many other variables to help you inspect and
    # process each line. The most important one is NF

    # Prints the number of fields on this line
    print NF;

    # Print the last field on this line
    print $NF;
}

# Every pattern is actually a true/false test. The regular expression in the
# last pattern is also a true/false test, but part of it was hidden. If you
# don't give it a string to test, it will assume $0, the line that it's
# currently processing. Thus, the complete version of it is this:

$0 ~ /^fo+bar$/ {
    print "Equivalent to the last pattern";
}

a > 0 {
    # This will execute once for each line, as long as a is positive
}

# You get the idea. Processing text files, reading in a line at a time, and
# doing something with it, particularly splitting on a delimiter, is so common
# in UNIX that AWK is a scripting language that does all of it for you, without
# you needing to ask. All you have to do is write the patterns and actions
# based on what you expect of the input, and what you want to do with it.

# Here's a quick example of a simple script, the sort of thing AWK is perfect
# for. It will read a name from standard input and then will print the average
# age of everyone with that first name. Let's say you supply as an argument the
# name of this data file:
#
# Bob Jones 32
# Jane Doe 22
# Steve Stevens 83
# Bob Smith 29
# Bob Barker 72
#
# Here's the script:

BEGIN {

    # First, ask the user for the name
    print "What name would you like the average age for?";

    # Get a line from standard input, not from files on the command line
    getline name < "/dev/stdin";
}

# Now, match every line whose first field is the given name
$1 == name {

    # Inside here, we have access to a number of useful variables, already
    # pre-loaded for us:
    # $0 is the entire line
    # $3 is the third field, the age, which is what we're interested in here
    # NF is the number of fields, which should be 3
    # NR is the number of records (lines) seen so far
    # FILENAME is the name of the file being processed
    # FS is the field separator being used, which is " " here
    # ...etc. There are plenty more, documented in the man page.

    # Keep track of a running total and how many lines matched
    sum += $3;
    nlines++;
}

# Another special pattern is called END. Unlike BEGIN, which runs before any
# input is read, END runs after all the input (the named files, or standard
# input if there are none) has been read and processed according to the rules
# and actions you've provided. Its purpose is usually to output some kind of
# final report, or do something with the aggregate of the data you've
# accumulated over the course of the script.

END {
    if (nlines)
        print "The average age for " name " is " sum / nlines;
}

```
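
The average-age example can also be run unattended from a shell script. The
sketch below is a condensed variant, not the script above verbatim: it passes
the name with the standard `-v` option instead of prompting with `getline`,
and it formats the result with `printf`. The `data` and `average` shell
variable names are assumptions made for illustration.

```sh
#!/bin/sh
# The sample records from the tutorial above.
data='Bob Jones 32
Jane Doe 22
Steve Stevens 83
Bob Smith 29
Bob Barker 72'

# -v assigns the awk variable "name" before the program starts, so no
# interactive prompt is needed.
average=$(printf '%s\n' "$data" | awk -v name="Bob" '
    # Accumulate the age (third field) of every record whose first field matches
    $1 == name { sum += $3; nlines++; }

    # Report once all input has been read; skip the division if nothing matched
    END { if (nlines) printf("%.2f\n", sum / nlines); }
')

# (32 + 29 + 72) / 3, rounded to two decimals: prints "Average age for Bob: 44.33"
echo "Average age for Bob: $average"
```

Passing values with `-v` keeps standard input free for the data itself, which
is usually what you want when AWK sits in the middle of a pipeline.
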

Further Reading:

* [Awk tutorial](http://www.grymoire.com/Unix/Awk.html)
* [Awk man page](https://linux.die.net/man/1/awk)
* [The GNU Awk User's Guide](https://www.gnu.org/software/gawk/manual/gawk.html) GNU Awk is found on most Linux systems.
* [AWK one-liner collection](http://tuxgraphics.org/~guido/scripts/awk-one-liner.html)