awk
Summary
awk is a text-processing language designed for structured data. It reads input line by line, splits each line into
fields (separated by whitespace by default), and runs programs against them. Think of it as a programmable grep
that can also compute and reformat output.
Usage
The basic invocation pattern is:
awk 'pattern { action }' file
- pattern: a condition (or regex) that selects lines; omit it to match every line.
- action: what to do with matching lines; omit it to print the whole line.
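The two omission rules can be checked against a few lines of inline input. This is a minimal sketch; the sample data and the variable names (`data`, `pattern_only`, `action_only`) are made up for illustration.

```shell
# Hypothetical sample input: three lines of "name score".
data='alice 90
bob 45
carol 72'

# Pattern only: the action defaults to printing the whole line.
pattern_only=$(printf '%s\n' "$data" | awk '$2 > 50')

# Action only: with no pattern, the action runs on every line.
action_only=$(printf '%s\n' "$data" | awk '{ print $1 }')

echo "$pattern_only"
echo "$action_only"
```

The first command keeps only the lines whose second field exceeds 50; the second prints the first field of every line.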
Examples
Print a specific column
$1, $2, … refer to individual fields (whitespace-delimited by default). $0 is the whole line.
# Print the username and UID (fields 1 and 3) of /etc/passwd
awk -F: '{ print $1, $3 }' /etc/passwd
root 0
daemon 1
bin 2
-F: sets the field separator to :.
Sum a column of numbers
ls -l | awk '{ total += $5 } END { print "Total bytes:", total }'
Total bytes: 204800
The END block runs once after all input is consumed.
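Because ls -l output varies between systems, the same accumulate-then-report pattern is easier to verify on fixed input. The sketch below assumes a made-up two-column listing (`sizes`) and also derives a mean from NR; neither is part of the original example.

```shell
# Hypothetical sample input: "filename size-in-bytes".
sizes='a.txt 100
b.txt 250
c.txt 250'

# Add field 2 to a running total on every line, then report once in END.
# The NR > 0 guard avoids a division by zero on empty input.
summary=$(printf '%s\n' "$sizes" | awk '
  { total += $2 }
  END { if (NR > 0) print "total=" total, "mean=" total / NR }
')

echo "$summary"
```

Uninitialized awk variables start out as zero in numeric context, so total needs no explicit setup.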
Filter lines by pattern
# Show only entries whose UID (the third colon-separated field) is greater than 1000
awk -F: '$3 > 1000 { print $1, $3 }' /etc/passwd
nobody 65534
Print line count (like wc -l)
awk 'END { print NR }' /var/log/syslog
NR is the built-in record (line) counter.
Reformat CSV output
# Convert space-separated process list to CSV
ps aux | awk 'NR > 1 { print $1 "," $2 "," $11 }' | head -5
root,1,/sbin/init
root,2,kthreadd
root,3,rcu_gp
One-liner: extract JSON-ish values from logs
# Pull request IDs out of a log where lines look like: ... requestId=abc123 ...
grep 'requestId' app.log | awk -F'requestId=' '{ print $2 }' | awk '{ print $1 }'
Multi-field conditional
# Show processes consuming more than 10% CPU
ps aux | awk '$3 > 10 { print $2, $3, $11 }'
BEGIN and END blocks
awk 'BEGIN { print "PID\tCOMMAND" }
NR > 1 { print $2 "\t" $11 }
END { print "---\nDone." }' <(ps aux)
BEGIN runs before any input is read; useful for printing headers or initializing variables.
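BEGIN is also a natural place to set the separators, since it runs before the first record is split. A minimal sketch, assuming made-up passwd-style records (`records`) rather than live ps output:

```shell
# Hypothetical passwd-style records: name:password:uid.
records='root:x:0
alice:x:1001'

# BEGIN sets both separators and prints a header before any record is read.
table=$(printf '%s\n' "$records" | awk '
  BEGIN { FS = ":"; OFS = "\t"; print "USER", "UID" }
  { print $1, $3 }
  END { print "rows:", NR }
')

echo "$table"
```

The commas in each print statement are what insert OFS, so every output line here is tab-separated, including the header and the final row count.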
Common Built-in Variables
| Variable | Meaning |
|---|---|
| NR | Current line number (total records read) |
| NF | Number of fields on the current line |
| FS | Input field separator (default: whitespace) |
| OFS | Output field separator (default: space) |
| RS | Input record separator (default: newline) |
| $0 | Entire current line |
| $1…$NF | Individual fields |
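Several of these variables can be seen working together on a small input. The sketch below uses made-up lines (`lines`) with a different number of fields on each; the variable names `report` and `csv` are illustrative only.

```shell
# Hypothetical input with a varying number of fields per line.
lines='one two three
four five
six'

# For each record: its number, its field count, and its last field.
report=$(printf '%s\n' "$lines" | awk '{ print NR, NF, $NF }')

# Assigning to any field makes awk rebuild $0 using OFS,
# which turns each whitespace-separated line into a comma-separated one.
csv=$(printf '%s\n' "$lines" | awk -v OFS=',' '{ $1 = $1; print }')

echo "$report"
echo "$csv"
```

Note that a plain print of an unmodified $0 does not apply OFS; the $1 = $1 assignment is what triggers the rebuild.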