Linux text processing tools, the building blocks of any shell pipeline: grep (search), awk (fields and programs), sed (stream editor), cut / sort / uniq / xargs.
grep
grep flags
| Flag | Description |
|---|---|
| -i | Case-insensitive match |
| -r / -R | Recursive search (R follows symlinks) |
| -l | Print only filenames with matches |
| -L | Files WITHOUT matches |
| -n | Show line numbers |
| -c | Count matching lines |
| -v | Invert match |
| -w | Match whole word |
| -x | Match whole line |
| -E | Extended regex (egrep) |
| -P | Perl-compatible regex (PCRE) |
| -F | Fixed string (no regex) |
| -o | Print only the matching part of the line |
| -A N | N lines after each match |
| -B N | N lines before each match |
| -C N | N lines around each match |
| -m N | Stop after N matching lines |
| --include="*.py" | Search only in .py files |
| --exclude-dir=".git" | Exclude a directory |
grep examples
| Command | Description |
|---|---|
| grep -rn "TODO" src/ --include="*.py" | Find TODO in Python files |
| grep -E "^(ERROR\|WARN)" app.log | Lines starting with ERROR or WARN |
| grep -oP "(?<=Host: )\S+" access.log | Extract Host header values |
| grep -v "^#" /etc/ssh/sshd_config \| grep -v "^$" | Config without comments and blank lines |
| grep -c "ERROR" app.log | Count error lines |
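
The flags above combine freely. The following is a minimal sketch, assuming a small made-up log file (its name and contents are hypothetical), that shows how `-c`, `-i`, `-E`, `-n`, and `-m` change the result on the same input.

```shell
# Build a small sample log in a temporary directory (hypothetical contents).
tmpdir=$(mktemp -d)
cat > "$tmpdir/app.log" <<'EOF'
INFO service started
ERROR disk full
WARN retrying write
error: lowercase message
ERROR timeout
EOF

# -c counts matching lines; matching is case-sensitive by default.
error_count=$(grep -c "ERROR" "$tmpdir/app.log")

# Adding -i makes the match case-insensitive, so "error:" is counted too.
error_count_ci=$(grep -ci "error" "$tmpdir/app.log")

# -E enables alternation; ^ anchors the match to the start of the line.
alerts=$(grep -E "^(ERROR|WARN)" "$tmpdir/app.log")

# -n prefixes the match with its line number; -m1 stops after the first matching line.
first_error=$(grep -n -m1 "ERROR" "$tmpdir/app.log")

echo "$error_count $error_count_ci"
echo "$alerts"
echo "$first_error"

rm -r "$tmpdir"
```

Note that `-c` counts lines, not occurrences: a line containing ERROR twice still counts once.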
awk
Core constructs
| Expression | Description |
|---|---|
| awk '{print $1}' | First field (whitespace delimiter) |
| awk '{print $NF}' | Last field |
| awk '{print $1, $3}' | Fields 1 and 3 separated by space |
| awk -F: '{print $1}' /etc/passwd | Use : as delimiter |
| awk 'NR==5' | Print line 5 |
| awk 'NR>=3 && NR<=7' | Lines 3–7 |
| awk '/pattern/' | Lines matching pattern |
| awk '!/pattern/' | Lines NOT matching |
| awk '$3 > 100 {print}' | Lines where field 3 > 100 |
| awk '{sum+=$1} END{print sum}' | Sum first column |
| awk 'END{print NR}' | Line count (like wc -l) |
| awk '{gsub(/old/,"new"); print}' | Global substitution on each line |
| awk '!seen[$0]++' | Remove duplicates (preserve order) |
| awk 'BEGIN{FS=":"; OFS="\t"} {print $1,$3}' | Input and output field separators |
| awk '{a[$1]+=$2} END{for(k in a) print k,a[k]}' | Group by key with sum |
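Built-in awk variables: NR (current record number; the line number by default) · NF (field count of the current record) · FS (input field separator) · OFS (output field separator) · RS (input record separator) · ORS (output record separator) · FILENAME (name of the current input file)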
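
As an illustration of the constructs and variables listed above, here is a small sketch on made-up "region amount" data (the data and variable names are assumptions for this example only). It groups by key, inspects NR and NF, and sets FS and OFS.

```shell
# Hypothetical sales data: "region amount" per line.
data='east 10
west 5
east 7
north 3
west 20'

# Group by field 1 and sum field 2. The iteration order of "for (k in a)"
# is unspecified in awk, so the result is piped through sort.
totals=$(printf '%s\n' "$data" |
  awk '{a[$1]+=$2} END{for(k in a) print k, a[k]}' | sort)

# NR is the current record number, NF the number of fields in that record.
shape=$(printf '%s\n' "$data" | awk 'NR==2 {print NR, NF}')

# FS splits the input on ":", OFS joins the printed fields with "-".
joined=$(printf 'root:x:0\n' | awk 'BEGIN{FS=":"; OFS="-"} {print $1,$3}')

echo "$totals"
echo "$shape"
echo "$joined"
```

OFS only applies when fields are separated by a comma in `print`; `print $1 $3` would concatenate them with nothing in between.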
sed
sed commands
| Expression | Description |
|---|---|
| sed 's/old/new/' | Replace first occurrence per line |
| sed 's/old/new/g' | Replace all occurrences |
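| sed 's/old/new/gi' | Replace all occurrences, case-insensitively (the i flag is a non-POSIX extension, available in GNU sed) |
| sed -i 's/old/new/g' file | In-place replacement (GNU sed; BSD/macOS sed requires -i '') |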
| sed -i.bak 's/.../.../' file | In-place with .bak backup |
| sed -n '5p' | Print only line 5 |
| sed -n '3,7p' | Lines 3–7 |
| sed -n '/pattern/p' | Lines matching pattern |
| sed -n '/start/,/end/p' | Block between two patterns |
| sed '3d' | Delete line 3 |
| sed '/pattern/d' | Delete lines matching pattern |
| sed '/^#/d; /^$/d' | Remove comments and blank lines |
| sed '5a\new line' | Append line after line 5 |
| sed '5i\new line' | Insert line before line 5 |
| sed 'y/abc/ABC/' | Transliterate characters |
| sed 'G' | Add blank line after every line |
| sed -e 's/a/b/' -e 's/c/d/' | Multiple commands |
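
A short sketch of several sed commands from the table, run against a throwaway config file (the file contents are invented for the example). It uses `-i.bak` for the in-place edit because that form is accepted by both GNU and BSD sed.

```shell
# Hypothetical config file used only for this demonstration.
conf=$(mktemp)
cat > "$conf" <<'EOF'
# main settings
port=8080

host=old.example.com
# end
EOF

# Two commands separated by ";": drop comment lines, then blank lines.
clean=$(sed '/^#/d; /^$/d' "$conf")

# Without the g flag, s/// replaces only the first match on each line.
first_only=$(printf 'a-a-a\n' | sed 's/a/b/')
all=$(printf 'a-a-a\n' | sed 's/a/b/g')

# -n suppresses automatic printing; "2p" then prints only line 2.
line2=$(sed -n '2p' "$conf")

# -i.bak edits the file in place and keeps the original as "$conf.bak".
sed -i.bak 's/old/new/' "$conf"
edited=$(grep '^host=' "$conf")
backup=$(grep '^host=' "$conf.bak")

echo "$clean"
echo "$first_only $all"
echo "$line2"
echo "$edited (was: $backup)"

rm -f "$conf" "$conf.bak"
```

Keeping a backup suffix is a cheap safety net when testing a new expression on real files.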
cut, sort, uniq, xargs
cut — field extraction
| Command | Description |
|---|---|
| cut -d: -f1 /etc/passwd | Field 1 with : delimiter |
| cut -d, -f2-4 | Fields 2, 3, 4 |
| cut -d: -f1,3 | Fields 1 and 3 |
| cut -c1-10 | Characters 1–10 |
| cut -c-5 | First 5 characters |
| cut -c10- | From character 10 to end |
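
The field and character selectors can be tried on a single passwd-style record. The record below is made up for illustration; the last two lines sketch one behavior worth knowing, namely that cut treats every delimiter as a field boundary while awk collapses runs of whitespace.

```shell
# Hypothetical passwd-style record.
record='alice:x:1001:1001:Alice Smith:/home/alice:/bin/bash'

# -d sets the delimiter, -f selects fields (counted from 1).
user=$(printf '%s\n' "$record" | cut -d: -f1)

# A comma-separated list picks individual fields; the delimiter is kept in the output.
user_uid=$(printf '%s\n' "$record" | cut -d: -f1,3)

# A range selects consecutive fields.
ids=$(printf '%s\n' "$record" | cut -d: -f3-4)

# -c works on character positions instead of fields.
prefix=$(printf '%s\n' "$record" | cut -c-5)

# Two spaces in a row mean an empty field 2 for cut, but awk skips the run.
cut_field=$(printf 'a  b\n' | cut -d' ' -f2)
awk_field=$(printf 'a  b\n' | awk '{print $2}')

echo "$user | $user_uid | $ids | $prefix"
echo "cut: '$cut_field' awk: '$awk_field'"
```

For column-aligned output padded with multiple spaces, awk is therefore usually the safer choice.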
sort
| Command | Description |
|---|---|
| sort | Alphabetical sort |
| sort -n | Numeric sort |
| sort -rn | Reverse numeric sort |
| sort -u | Sort and drop duplicate lines |
| sort -k2,2n | Sort by field 2 numerically |
| sort -t: -k3,3n /etc/passwd | Sort passwd by UID |
| sort -h | Human-readable numbers (1K, 2M) |
| sort -R | Random order (GNU sort; identical lines stay grouped together, use shuf for a true shuffle) |
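
The difference between alphabetical and numeric ordering, and the effect of a key definition, are easiest to see side by side. This is a minimal sketch on invented data; `LC_ALL=C` is set for the alphabetical case only so that the result does not depend on the locale.

```shell
nums='10
9
100
2'

# Default sort compares lines as text, so "100" comes before "2".
lexical=$(printf '%s\n' "$nums" | LC_ALL=C sort | paste -sd' ' -)

# -n compares by numeric value; -r reverses the order.
numeric=$(printf '%s\n' "$nums" | sort -n | paste -sd' ' -)
reverse=$(printf '%s\n' "$nums" | sort -rn | paste -sd' ' -)

# Hypothetical "name:value" table: -t sets the separator and -k2,2n sorts
# numerically on field 2 only.
table='carol:300
alice:20
bob:1000'
by_value=$(printf '%s\n' "$table" | sort -t: -k2,2n | cut -d: -f1 | paste -sd' ' -)

# -u sorts and removes duplicate lines in one step.
unique=$(printf 'b\na\nb\n' | sort -u | paste -sd' ' -)

echo "$lexical"
echo "$numeric"
echo "$reverse"
echo "$by_value"
echo "$unique"
```

Writing the key as `-k2,2` rather than `-k2` limits the comparison to field 2; `-k2` alone would extend the key to the end of the line.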
uniq
| Command | Description |
|---|---|
| uniq | Collapse adjacent duplicate lines (sort first to remove all duplicates) |
| uniq -c | Prefix each line with its number of occurrences |
| uniq -d | Print one copy of each repeated line |
| uniq -u | Print only lines that are not repeated |
| sort \| uniq -c \| sort -rn | Most frequent lines first |
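
Because uniq only compares neighbouring lines, its result depends on whether the input was sorted. The sketch below, using a made-up word list, shows that effect and then the frequency-count pipeline from the table.

```shell
words='apple
banana
apple
cherry
banana
apple'

# Unsorted input: no two equal lines are adjacent, so uniq removes nothing.
unsorted_count=$(printf '%s\n' "$words" | uniq | wc -l | tr -d ' ')

# Sorted input: duplicates become adjacent and are collapsed.
distinct_count=$(printf '%s\n' "$words" | sort | uniq | wc -l | tr -d ' ')

# Frequency table, most frequent first; awk reorders "count word" to "word count".
top=$(printf '%s\n' "$words" | sort | uniq -c | sort -rn | head -n1 |
  awk '{print $2, $1}')

# -d prints one copy of each repeated line, -u only lines that occur once.
repeated=$(printf '%s\n' "$words" | sort | uniq -d | paste -sd' ' -)
singles=$(printf '%s\n' "$words" | sort | uniq -u)

echo "$unsorted_count $distinct_count"
echo "$top"
echo "$repeated"
echo "$singles"
```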
xargs — argument passing
| Command | Description |
|---|---|
| find . -name "*.log" \| xargs rm | Delete all .log files (breaks on filenames containing whitespace) |
| find . -name "*.py" \| xargs grep "TODO" | grep across found files |
| cat hosts.txt \| xargs -I{} ping -c1 {} | Ping each host |
| echo "a b c" \| xargs -n1 | One argument at a time |
| xargs -P4 -I{} cmd {} | Run up to 4 processes in parallel |
| find . -name "*.log" -print0 \| xargs -0 rm | Null delimiters (safe for filenames with spaces) |
| xargs -n3 echo | 3 arguments per invocation |
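
The following sketch exercises the xargs options above in a temporary directory, so nothing outside it is touched. The file and host names are invented for the example; `-print0` and `-0` are assumed to be available, as they are in GNU and BSD find/xargs.

```shell
workdir=$(mktemp -d)
touch "$workdir/a.log" "$workdir/b c.log" "$workdir/keep.txt"

# -print0 and -0 use NUL as the separator, so "b c.log" is passed to rm
# as one argument instead of being split at the space.
find "$workdir" -name '*.log' -print0 | xargs -0 rm
remaining=$(ls "$workdir")

# -n1 runs the command (echo by default) once per argument.
one_per_line=$(echo "a b c" | xargs -n1 | wc -l | tr -d ' ')

# -n2 passes two arguments per invocation.
pairs=$(echo "1 2 3 4" | xargs -n2 echo | paste -sd';' -)

# -I{} runs the command once per input line, substituting the line for {}.
greetings=$(printf 'web1\nweb2\n' | xargs -I{} echo "host={}" | paste -sd' ' -)

echo "$remaining"
echo "$one_per_line"
echo "$pairs"
echo "$greetings"

rm -r "$workdir"
```

Replacing `rm` with `echo rm` is a simple way to preview what a destructive xargs pipeline would run.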