The awk command is a powerful tool for text processing, and central to its functionality is the print statement. This statement is your primary way to generate output from awk, allowing you to display data extracted and manipulated from your input. While seemingly simple, print offers flexibility in how you format and present your information. This guide walks through several examples of the print statement, demonstrating its capabilities and common usage scenarios.
Understanding the Basics of awk print
At its core, the print statement in awk writes the items you give it as a single output record. Each print ends its output with the output record separator (ORS), which is a newline by default, so whatever is printed next starts on a new line. That does not limit print to single-line output, however.
Consider strings that already contain embedded newline characters. When print encounters such a string, it writes each newline as part of the string, producing multi-line output from a single print statement. Inside an awk string constant, the newline character is written as the escape sequence \n.
$ awk 'BEGIN { print "line one\nline two\nline three" }'
line one
line two
line three
Example of the awk command printing a string with embedded newlines, resulting in multi-line output.
In this example, we use the BEGIN pattern to execute the print statement before processing any input. The string passed to print includes \n to represent newlines, so awk writes the string across three separate lines.
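To check this behavior yourself, here is a minimal sketch that assumes a POSIX shell with awk on the PATH. The variable names two_prints and one_print are illustrative only. It captures the output of two separate print statements and of a single print with an embedded \n, so the two results can be compared.

```shell
# Two print statements: each one ends its output with a newline.
two_prints=$(awk 'BEGIN { print "first"; print "second" }')

# One print statement whose string contains an embedded \n escape.
one_print=$(awk 'BEGIN { print "first\nsecond" }')

# Show both results; they should be identical two-line outputs.
printf '%s\n' "$two_prints"
printf '%s\n' "$one_print"
```

Both variables should hold the same two lines, which is why a single print with \n can stand in for several print statements.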
Printing Fields with awk print
Beyond strings, awk is particularly adept at processing structured data, typically organized into fields within records (lines). The print statement is how you extract and display specific fields from this data.
Let’s examine how to print fields using an example with the inventory-shipped file. Suppose this file contains records where the first field ($1) represents the month and the second field ($2) represents the number of crates shipped. To print these two fields, separated by a space, you would use a comma within the print statement:
$ awk '{ print $1, $2 }' inventory-shipped
Jan 13
Feb 15
Mar 15
...
Demonstration of awk printing the first and second fields from the ‘inventory-shipped’ file; the comma in the print statement produces a space in the output.
The comma between $1 and $2 in the print statement is important. It tells awk to separate the output of these two fields with the output field separator (OFS), which is a single space by default.
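One way to see that the comma really stands for OFS is to change OFS and watch the separator change. The sketch below is illustrative: it pipes three sample records (the same values shown above, standing in for the inventory-shipped file) into awk, and the variable names sample and dashed are made up for this example.

```shell
# Sample records standing in for the inventory-shipped file (assumption).
sample='Jan 13
Feb 15
Mar 15'

# Set OFS before any input is read; the comma in print now emits "-".
dashed=$(printf '%s\n' "$sample" | awk 'BEGIN { OFS = "-" } { print $1, $2 }')

printf '%s\n' "$dashed"
```

With the default OFS the same rule prints a space between the fields; here the separator should become a dash instead.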
The Impact of Omitting the Comma
A common point of confusion arises when the comma is omitted between items in a print statement. Without the comma, awk treats the adjacent items as a string concatenation rather than as separate items joined by a separator.
Consider the same example without the comma:
$ awk '{ print $1 $2 }' inventory-shipped
Jan13
Feb15
Mar15
...
Example of awk printing the first and second fields without a comma, which concatenates the fields in the output without any space.
As you can see, the output now lacks the space, and the month and crate numbers are joined directly together. This is because awk concatenates $1 and $2 as strings, producing a single string with nothing between them.
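Concatenation is useful once you control it deliberately. As a rough sketch (the sample record and the variable names are assumptions for illustration), you can place string constants between the fields to choose exactly what separates them:

```shell
# A single sample record standing in for one line of inventory-shipped.
record='Jan 13'

# No comma and no string constant: the fields are joined directly.
joined=$(printf '%s\n' "$record" | awk '{ print $1 $2 }')

# String constants between the fields become part of the concatenation.
labeled=$(printf '%s\n' "$record" | awk '{ print $1 ": " $2 " crates" }')

printf '%s\n' "$joined" "$labeled"
```

Because no comma is involved, OFS plays no part in either result; only the literal strings you write appear between the fields.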
Enhancing Output with Headers using BEGIN
When dealing with tabular data, adding headers significantly improves readability. The BEGIN rule in awk is well suited to this. As we saw earlier, BEGIN lets you execute actions before awk processes any input lines, which makes it ideal for printing header lines at the start of your output.
Let’s enhance our previous example to include headers “Month” and “Crates”:
$ awk 'BEGIN { print "Month Crates"; print "----- ------" } { print $1, $2 }' inventory-shipped
Month Crates
----- ------
Jan 13
Feb 15
Mar 15
...
When executed, this awk script first prints the header line “Month Crates” and the separator line “----- ------” from the BEGIN rule. Then, for each line in inventory-shipped, it prints the month and crates as before.
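Because BEGIN runs before any input is read, the header appears even when there is no data at all. The following sketch illustrates this under the assumption of a Unix-like system where /dev/null acts as an empty input file; header_only is a hypothetical variable name.

```shell
# /dev/null supplies zero records, so only the BEGIN rule produces output.
header_only=$(awk 'BEGIN { print "Month Crates"; print "----- ------" } { print $1, $2 }' /dev/null)

printf '%s\n' "$header_only"
```

Only the two header lines should be printed, since the main rule never fires without input records.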
Column Alignment Considerations
While adding headers is a step forward, you might notice that in the previous example the columns do not line up, because the header words are wider than the three-letter month names. Padding the output with extra spaces in the print statement can work, but it becomes cumbersome with multiple columns and varying data lengths.
Let’s try to add spaces to align the columns in our header example:
$ awk 'BEGIN { print "Month Crates"; print "----- ------" } { print $1, " ", $2 }' inventory-shipped
Month Crates
----- ------
Jan   13
Feb   15
Mar   15
...
Managing column alignment with hand-placed spaces becomes increasingly complex as the number of columns grows. For more sophisticated formatting and precise column alignment, awk provides the printf statement. printf (discussed in detail in “Using printf Statements for Fancier Printing”) offers powerful formatting capabilities, including field widths and alignment, making it a more robust way to produce well-structured output.
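As a brief preview, and not a full treatment of printf, the sketch below formats the same two columns with fixed widths. The format strings (%-5s, %6s, %6d) and the sample data are choices made for this illustration; adjust the widths to suit your own data.

```shell
# Sample records standing in for the inventory-shipped file (assumption).
sample='Jan 13
Feb 15
Mar 15'

# %-5s left-justifies the month in 5 columns; %6s and %6d right-justify
# the header word and the crate count in 6 columns. Unlike print, printf
# adds no newline on its own, so each format string ends with \n.
table=$(printf '%s\n' "$sample" | awk '
BEGIN { printf "%-5s %6s\n", "Month", "Crates" }
      { printf "%-5s %6d\n", $1, $2 }')

printf '%s\n' "$table"
```

Every output line has the same width, so the crate counts line up under the right edge of the “Crates” header regardless of how long each month name is (up to five characters in this sketch).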
Note: You can break long print or printf statements across multiple lines for readability by inserting a newline after a comma. This is helpful for complex output formatting.
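For instance, the following sketch splits one print statement after its comma. The strings are arbitrary sample values, and wrapped is an illustrative variable name.

```shell
# The newline directly after the comma continues the same print statement.
wrapped=$(awk 'BEGIN { print "Month",
                             "Crates" }')

printf '%s\n' "$wrapped"
```

The two items are still printed on one output line, separated by OFS, exactly as if the statement had been written on a single line.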
In summary, the awk print statement is a fundamental tool for outputting data in awk. Understanding its behavior with newlines, field separators, and concatenation is crucial for using awk effectively to process and present text data. Basic alignment with spaces is possible, but for more intricate formatting needs, printf is the next step.