Linux Basics
Searching & Filtering
You have a log file with hundreds or thousands of lines. You are not going to scroll through it line by line hoping to spot the problem. That is not how anyone works in production. You search. This section is where Linux becomes truly powerful — where you go from reading files to interrogating them.
After this page, you should be able to:
- Use grep to search for specific patterns in files and command output
- Chain commands together using the pipe operator (|) to filter and transform output
- Count lines, words, and characters with wc
- Locate files anywhere on the system using find
- Combine multiple commands into a single line to answer real troubleshooting questions
grep — Finding What Matters
grep is one of the most-used commands in Linux. It searches through text and returns every line that matches a pattern you specify. If you only learn one search tool, make it this one.
Basic Usage
The simplest form is grep "pattern" filename. It scans the file line by line and prints every line containing that pattern.
grep "ERROR" app.log
This returns every line in app.log that contains the word "ERROR." Simple and insanely useful. If you created the app.log file in the previous step, try this now.
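If you skipped the earlier step, a few echo lines will give you a plausible app.log to practice on (the timestamps and messages here are invented for the demo):

```shell
# Build a three-line sample log: > creates the file, >> appends to it
echo "2024-01-15 10:01:05 INFO Application started" > app.log
echo "2024-01-15 10:02:10 WARNING High memory usage" >> app.log
echo "2024-01-15 10:03:22 ERROR Database connection timeout" >> app.log

# Only the one ERROR line comes back
grep "ERROR" app.log
```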
Case-Insensitive Search: grep -i
By default, grep is case-sensitive — "ERROR" won't match "error" or "Error." Add -i to ignore case.
grep -i "error" app.log
This catches "ERROR," "Error," "error," and any other capitalization. Use this when you are not sure how the log formats its messages.
Show Line Numbers: grep -n
When you find a match, you often need to know where in the file it is. The -n flag adds line numbers to the output.
grep -n "ERROR" app.log
Output looks like 14:2024-01-15 10:03:22 ERROR Database connection timeout — the 14 tells you it is on line 14.
Recursive Search: grep -r
Search through every file in a directory and all its subdirectories. This is critical when you know a string exists somewhere but do not know which file it is in.
grep -r "database" /etc/
This searches every file under /etc/ for the word "database." Incredibly useful for tracking down configuration values.
Invert Match: grep -v
Sometimes you want everything except lines matching a pattern. The -v flag inverts the match — it shows every line that does NOT contain the pattern.
grep -v "INFO" app.log
This filters out all the routine INFO messages and shows you only the interesting stuff — warnings, errors, and anything unexpected.
Count Matches: grep -c
Instead of printing the matching lines, -c just tells you how many lines matched.
grep -c "ERROR" app.log
Returns a single number — the count of lines containing "ERROR." Quick way to gauge how bad things are.
Combining Flags
You can combine flags together. Want a case-insensitive search with line numbers? Stack them.
grep -in "error" app.log
Case-insensitive and shows line numbers. Flags can be combined in any order: -in, -ni, or -i -n all do the same thing.
The Pipe Operator — Chaining Commands
The pipe character | is arguably the most important concept in this entire lab. It is the thing that makes Linux feel like a superpower instead of a collection of disconnected tools.
Here is what it does: the pipe takes the output of the command on the left and feeds it as input to the command on the right. That is it. One command produces text, the next command consumes it.
Simple Example
cat app.log | grep "ERROR"
cat reads the file and outputs its contents. grep receives that output and filters it down to only lines containing "ERROR." Two commands, working together. (You could get the same result with grep "ERROR" app.log alone; the cat form is shown here because it makes the left-to-right data flow easy to see.)
The Power Move: Live Filtered Monitoring
tail -f app.log | grep "ERROR"
tail -f streams the end of the file in real time — new lines appear as they are written. The pipe sends each new line through grep, which only lets through the ones containing "ERROR."
This is real-world troubleshooting. You are watching a live application log and only seeing errors as they happen. No noise, no scrolling, just the problems. Press Ctrl+C to stop.
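You can fake a live application to watch this work. This sketch is a demo under assumptions: live.log and its messages are made up, and the GNU timeout command is used only so the watch ends on its own instead of running forever:

```shell
: > live.log    # start with an empty file

# Background "application": writes one ERROR line per second for 3 seconds
( for i in 1 2 3; do
    echo "10:0$i:00 ERROR simulated failure $i" >> live.log
    sleep 1
  done ) &

# Foreground "troubleshooter": follow the log, keep only ERROR lines.
# timeout stops tail after 5 seconds; || true ignores its exit code.
timeout 5 tail -f live.log | grep "ERROR" > seen.txt || true
cat seen.txt    # all three simulated errors were caught as they arrived
```

In real use you would skip timeout and just press Ctrl+C when you are done watching.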
Triple Chain: Read, Filter, Count
cat app.log | grep "ERROR" | wc -l
Three commands, one line, instant answer. cat reads the file. grep filters for errors. wc -l counts the remaining lines. You get back a single number: the total error count.
Find Files in a Directory
ls -la | grep ".log"
ls -la lists every file with details. The pipe sends that list through grep, which filters for anything containing ".log." Quick way to find log files without scanning a long directory listing. One caveat: to grep, the dot is a wildcard that matches any character, so a name like "catalogs" would also match; grep "\.log" matches a literal dot.
Filter and Sort
grep "ERROR" app.log | sort
Find all errors, then sort them alphabetically. This groups identical error messages together so you can see which ones repeat.
Deduplicate Results
grep "ERROR" app.log | sort | uniq -c | sort -rn
This is a classic chain. Find all errors, sort them, count unique occurrences, then sort by count (highest first). The output tells you which error message occurs most frequently. This is the kind of command that makes people say "wait, you can do that?"
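To see what the chain produces without a full log, here is a tiny self-contained run (sample.log and its three lines are invented for the demo):

```shell
# Build a sample file where one error message repeats
printf '%s\n' \
  "ERROR Database connection timeout" \
  "ERROR Disk space critically low" \
  "ERROR Database connection timeout" > sample.log

# sort groups identical lines, uniq -c prefixes each with its count,
# sort -rn puts the biggest count first
grep "ERROR" sample.log | sort | uniq -c | sort -rn
```

The repeated timeout message floats to the top with a count of 2, the one-off message below it with a count of 1.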
Narrowing Down with Multiple Pipes
cat app.log | grep "ERROR" | grep "database" | wc -l
You can pipe through grep multiple times. This finds lines containing "ERROR" AND "database" — narrowing your results with each step. Then wc -l counts the results.
This is what separates people who use Linux from people who memorize it
Once pipes click, you will start chaining commands everywhere. Instead of memorizing a specialized tool for every task, you combine simple tools into exactly what you need. Each command does one thing well. Pipes let you assemble them into something powerful. That philosophy is the heart of Linux.
wc — Counting Things
wc stands for "word count," but it does more than count words. It counts lines, words, and bytes. You will use it most often with pipes to count the results of a filtered command.
wc -l app.log    # Count total lines in the file
wc -w app.log    # Count total words in the file
wc -c app.log    # Count total bytes in the file (wc -m counts characters)
The most common usage by far is wc -l at the end of a pipe chain. "How many errors?" "How many config files?" "How many lines match this pattern?" The answer is always | wc -l.
grep "WARNING" app.log | wc -l       # How many warnings?
find /var -name "*.log" | wc -l      # How many log files under /var?
ls /etc/ | wc -l                     # How many items in /etc/?
find — Locating Files
grep searches inside files. find searches for the files themselves. When you know a file exists somewhere but do not know where, find is the answer.
Find by Name
find / -name "*.log"
Scans the entire file system (starting from /) for any file ending in .log. This can take a moment on a large system, and you will see "Permission denied" errors for directories you cannot access.
Search a Specific Directory
find /var -name "*.conf"
find /etc -name "*.conf"
Narrow your search to a specific directory tree. Searching /var or /etc is much faster than searching the entire system from /.
Files Only vs. Directories Only
find . -type f -name "*.txt"     # Files only
find . -type d -name "config"    # Directories only
The -type flag lets you specify what you are looking for. f means regular files, d means directories. The . means "start from the current directory."
Find Recently Modified Files
find . -mtime -1          # Modified in the last 1 day
find . -mtime -7          # Modified in the last 7 days
find /var/log -mtime 0    # Modified today
The -mtime flag filters by modification time. -1 means "less than 1 day ago." This is useful when troubleshooting — "what changed recently?"
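The sign flips the comparison: a leading + means "more than," which is handy for finding stale files. A small sketch, assuming GNU touch (its -d flag backdates a timestamp so the demo has an old file to find; mtime-demo and the file names are made up):

```shell
mkdir -p mtime-demo
touch -d "40 days ago" mtime-demo/old.log   # GNU touch: set mtime 40 days back
touch mtime-demo/new.log                    # modified just now

# Only the backdated file is "modified more than 30 days ago"
find mtime-demo -name "*.log" -mtime +30
```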
Suppressing Permission Errors: 2>/dev/null
find / -name "*.log" 2>/dev/null
When you search from /, you will hit directories you do not have permission to read. Each one generates a "Permission denied" error that clutters your output. Adding 2>/dev/null at the end suppresses those errors.
What does 2>/dev/null mean? In Linux, every command has two output streams: stdout (stream 1, normal output) and stderr (stream 2, error messages). 2>/dev/null redirects stderr to /dev/null, which is a special file that discards everything written to it. In plain English: "throw away the error messages and just show me the results."
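A quick way to convince yourself the two streams really are separate (real.txt, missing.txt, and errors.txt are throwaway names for this demo):

```shell
touch real.txt    # one file that exists; missing.txt does not

# stderr discarded: only the normal output (real.txt) appears.
# || true ignores the nonzero exit code from the failed lookup.
ls real.txt missing.txt 2>/dev/null || true

# stderr redirected into a file instead of thrown away
ls real.txt missing.txt 2>errors.txt || true
cat errors.txt    # the "No such file or directory" complaint landed here
```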
Combining It All — Real Troubleshooting
Here is where everything comes together. On the job, you do not use these commands in isolation — you combine them to answer specific questions quickly.
Scenario: How many errors happened in a specific time window?
Your app is throwing errors. You need to know how many errors occurred between 10:00 and 10:09 AM.
grep "ERROR" app.log | grep "10:0" | wc -l
First grep finds all error lines. Second grep narrows to lines with timestamps starting with "10:0" (10:00-10:09). Then wc -l counts them. Three commands, one answer.
Scenario: Find all config files that mention a specific hostname
You are migrating a database server and need to find every configuration file that references the old hostname.
grep -r "db-host-old" /etc/ 2>/dev/null
Recursively searches every file under /etc/ for the hostname string. The 2>/dev/null suppresses permission errors for files you cannot read.
Scenario: What are the most common error types?
You see a lot of errors in the log. You want to know which error messages occur most frequently.
grep "ERROR" app.log | awk '{print $4, $5, $6, $7}' | sort | uniq -c | sort -rn
This extracts the error description portion (fields 4 through 7 of each line), sorts it, counts duplicates, and sorts by frequency. The most common error appears first. Do not worry about the awk syntax for now — the concept of chaining commands is what matters.
Scenario: Find large log files consuming disk space
find /var/log -name "*.log" -type f -exec ls -lh {} \; 2>/dev/null | sort -k5 -h -r
Finds all .log files under /var/log, runs ls -lh on each to get its size, and sorts by the size column (largest first). This is how you track down what is eating your disk space.
Try It — Search and Filter Exercise
Work through these steps on your Linux VM. If you still have your app.log file from the previous step, use that. If not, create one first.
- Expand your log file. Use nano or echo with >> to add more lines to app.log until you have 15-20 lines total. Mix in ERROR, WARNING, and INFO messages with different timestamps.
  echo "2024-01-15 10:05:33 ERROR Disk space critically low" >> app.log
  echo "2024-01-15 10:06:01 WARNING Memory usage above 80%" >> app.log
  echo "2024-01-15 10:07:15 INFO Scheduled backup started" >> app.log
  echo "2024-01-15 11:01:44 ERROR Connection refused on port 5432" >> app.log
- Search for errors. Run grep "ERROR" app.log and observe the output.
- Count the errors. Run grep "ERROR" app.log | wc -l to get a total count.
- Search for warnings. Run grep "WARNING" app.log to see all warning lines.
- Show line numbers. Run grep -n "ERROR" app.log to see which line numbers the errors are on.
- Exclude INFO messages. Run grep -v "INFO" app.log to see everything except the routine messages.
- Find log files. Run find ~ -name "*.log" 2>/dev/null to locate all .log files under your home directory.
- Try a multi-pipe chain. Run cat app.log | grep "ERROR" | grep "10:" | wc -l to count errors that occurred during the 10:00 hour.
- Find and count. Run find / -name "*.log" 2>/dev/null | wc -l to count how many .log files exist on the entire system.
- Experiment. Try combining different commands with pipes. There is no wrong answer here — the goal is to get comfortable chaining things together.
Checkpoint
Before moving on, make sure you can do all of the following:
- Use grep to find specific text in a file
- Use grep -i for case-insensitive searching
- Use grep -v to exclude lines matching a pattern
- Use grep -r to search recursively through directories
- Explain what the pipe operator (|) does — in your own words
- Chain at least two commands together with a pipe
- Use wc -l to count lines in a pipe chain
- Use find to locate files by name
- Explain what 2>/dev/null does
- Build a multi-step pipe chain to answer a specific question (e.g., "how many errors contain the word database?")
If you can check off all of these, you have the searching and filtering skills you need. Pipes especially will come up constantly from here on out — they are not a nice-to-have, they are fundamental.