Linux Basics
Searching & Filtering
You have a log file with hundreds or thousands of lines. You are not going to scroll through it line by line hoping to spot the problem. That is not how anyone works in production. You search. This section is where Linux becomes truly powerful — where you go from reading files to interrogating them.
After this page, you should be able to:
- Use grep to search for specific patterns in files and command output
- Chain commands together using the pipe operator (|) to filter and transform output
- Count lines, words, and characters with wc
- Locate files anywhere on the system using find
- Combine multiple commands into a single line to answer real troubleshooting questions
grep — Finding What Matters
grep is one of the most-used commands in Linux. It searches through text and returns every line that matches a pattern you specify. If you only learn one search tool, make it this one.
Basic Usage
The simplest form is grep "pattern" filename. It scans the file line by line and prints every line containing that pattern.
grep "ERROR" app.log
This returns every line in app.log that contains the word "ERROR." Simple and insanely useful. If you created the app.log file in the previous step, try this now.
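If you skipped the earlier step, a few echo lines will give you a plausible app.log to practice on (the timestamps and messages here are invented for the demo):

```shell
# Build a three-line sample log: > creates the file, >> appends to it
echo "2024-01-15 10:01:05 INFO Application started" > app.log
echo "2024-01-15 10:02:10 WARNING High memory usage" >> app.log
echo "2024-01-15 10:03:22 ERROR Database connection timeout" >> app.log

# Only the one ERROR line comes back
grep "ERROR" app.log
```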
Case-Insensitive Search: grep -i
By default, grep is case-sensitive — "ERROR" won't match "error" or "Error." Add -i to ignore case.
grep -i "error" app.log
This catches "ERROR," "Error," "error," and any other capitalization. Use this when you are not sure how the log formats its messages.
Show Line Numbers: grep -n
When you find a match, you often need to know where in the file it is. The -n flag adds line numbers to the output.
grep -n "ERROR" app.log
Output looks like 14:2024-01-15 10:03:22 ERROR Database connection timeout — the 14 tells you it is on line 14.
Recursive Search: grep -r
Search through every file in a directory and all its subdirectories. This is critical when you know a string exists somewhere but do not know which file it is in.
grep -r "database" /etc/
This searches every file under /etc/ for the word "database." Incredibly useful for tracking down configuration values.
Invert Match: grep -v
Sometimes you want everything except lines matching a pattern. The -v flag inverts the match — it shows every line that does NOT contain the pattern.
grep -v "INFO" app.log
This filters out all the routine INFO messages and shows you only the interesting stuff — warnings, errors, and anything unexpected.
Count Matches: grep -c
Instead of printing the matching lines, -c just tells you how many lines matched.
grep -c "ERROR" app.log
Returns a single number — the count of lines containing "ERROR." Quick way to gauge how bad things are.
Combining Flags
You can combine flags together. Want a case-insensitive search with line numbers? Stack them.
grep -in "error" app.log
Case-insensitive and shows line numbers. Flags can be combined in any order: -in, -ni, or -i -n all do the same thing.
The Pipe Operator — Chaining Commands
The pipe character | is arguably the most important concept in this entire lab. It is the thing that makes Linux feel like a superpower instead of a collection of disconnected tools.
Here is what it does: the pipe takes the output of the command on the left and feeds it as input to the command on the right. That is it. One command produces text, the next command consumes it.
Simple Example
cat app.log | grep "ERROR"
cat reads the file and outputs its contents. grep receives that output and filters it down to only lines containing "ERROR." Two commands, working together. (You could get the same result with grep "ERROR" app.log alone; the cat form is shown here because it makes the left-to-right data flow easy to see.)
The Power Move: Live Filtered Monitoring
tail -f app.log | grep "ERROR"
tail -f streams the end of the file in real time — new lines appear as they are written. The pipe sends each new line through grep, which only lets through the ones containing "ERROR."
This is real-world troubleshooting. You are watching a live application log and only seeing errors as they happen. No noise, no scrolling, just the problems. Press Ctrl+C to stop.
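You can fake a live application to watch this work. This sketch is a demo under assumptions: live.log and its messages are made up, and the GNU timeout command is used only so the watch ends on its own instead of running forever:

```shell
: > live.log    # start with an empty file

# Background "application": writes one ERROR line per second for 3 seconds
( for i in 1 2 3; do
    echo "10:0$i:00 ERROR simulated failure $i" >> live.log
    sleep 1
  done ) &

# Foreground "troubleshooter": follow the log, keep only ERROR lines.
# timeout stops tail after 5 seconds; || true ignores its exit code.
timeout 5 tail -f live.log | grep "ERROR" > seen.txt || true
cat seen.txt    # all three simulated errors were caught as they arrived
```

In real use you would skip timeout and just press Ctrl+C when you are done watching.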
Triple Chain: Read, Filter, Count
cat app.log | grep "ERROR" | wc -l
Three commands, one line, instant answer. cat reads the file. grep filters for errors. wc -l counts the remaining lines. You get back a single number: the total error count.
Find Files in a Directory
ls -la | grep ".log"
ls -la lists every file with details. The pipe sends that list through grep, which filters for anything containing ".log." Quick way to find log files without scanning a long directory listing. One caveat: to grep, the dot is a wildcard that matches any character, so a name like "catalogs" would also match; grep "\.log" matches a literal dot.
Filter and Sort
grep "ERROR" app.log | sort
Find all errors, then sort them alphabetically. This groups identical error messages together so you can see which ones repeat.
Deduplicate Results
grep "ERROR" app.log | sort | uniq -c | sort -rn
This is a classic chain. Find all errors, sort them, count unique occurrences, then sort by count (highest first). The output tells you which error message occurs most frequently. This is the kind of command that makes people say "wait, you can do that?"
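To see what the chain produces without a full log, here is a tiny self-contained run (sample.log and its three lines are invented for the demo):

```shell
# Build a sample file where one error message repeats
printf '%s\n' \
  "ERROR Database connection timeout" \
  "ERROR Disk space critically low" \
  "ERROR Database connection timeout" > sample.log

# sort groups identical lines, uniq -c prefixes each with its count,
# sort -rn puts the biggest count first
grep "ERROR" sample.log | sort | uniq -c | sort -rn
```

The repeated timeout message floats to the top with a count of 2, the one-off message below it with a count of 1.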
Narrowing Down with Multiple Pipes
cat app.log | grep "ERROR" | grep "database" | wc -l
You can pipe through grep multiple times. This finds lines containing "ERROR" AND "database" — narrowing your results with each step. Then wc -l counts the results.
This is what separates people who use Linux from people who memorize it
Once pipes click, you will start chaining commands everywhere. Instead of memorizing a specialized tool for every task, you combine simple tools into exactly what you need. Each command does one thing well. Pipes let you assemble them into something powerful. That philosophy is the heart of Linux.
wc — Counting Things
wc stands for "word count," but it does more than count words. It counts lines, words, and bytes. You will use it most often with pipes to count the results of a filtered command.
wc -l app.log    # Count total lines in the file
wc -w app.log    # Count total words in the file
wc -c app.log    # Count total bytes in the file (wc -m counts characters)
The most common usage by far is wc -l at the end of a pipe chain. "How many errors?" "How many config files?" "How many lines match this pattern?" The answer is always | wc -l.
grep "WARNING" app.log | wc -l       # How many warnings?
find /var -name "*.log" | wc -l      # How many log files under /var?
ls /etc/ | wc -l                     # How many items in /etc/?
find — Locating Files
grep searches inside files. find searches for the files themselves. When you know a file exists somewhere but do not know where, find is the answer.
Find by Name
find / -name "*.log"
Scans the entire file system (starting from /) for any file ending in .log. This can take a moment on a large system, and you will see "Permission denied" errors for directories you cannot access.
Search a Specific Directory
find /var -name "*.conf"
find /etc -name "*.conf"
Narrow your search to a specific directory tree. Searching /var or /etc is much faster than searching the entire system from /.
Files Only vs. Directories Only
find . -type f -name "*.txt"     # Files only
find . -type d -name "config"    # Directories only
The -type flag lets you specify what you are looking for. f means regular files, d means directories. The . means "start from the current directory."
Find Recently Modified Files
find . -mtime -1          # Modified in the last 1 day
find . -mtime -7          # Modified in the last 7 days
find /var/log -mtime 0    # Modified today
The -mtime flag filters by modification time. -1 means "less than 1 day ago." This is useful when troubleshooting — "what changed recently?"
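The sign flips the comparison: a leading + means "more than," which is handy for finding stale files. A small sketch, assuming GNU touch (its -d flag backdates a timestamp so the demo has an old file to find; mtime-demo and the file names are made up):

```shell
mkdir -p mtime-demo
touch -d "40 days ago" mtime-demo/old.log   # GNU touch: set mtime 40 days back
touch mtime-demo/new.log                    # modified just now

# Only the backdated file is "modified more than 30 days ago"
find mtime-demo -name "*.log" -mtime +30
```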
Suppressing Permission Errors: 2>/dev/null
find / -name "*.log" 2>/dev/null
When you search from /, you will hit directories you do not have permission to read. Each one generates a "Permission denied" error that clutters your output. Adding 2>/dev/null at the end suppresses those errors.
What does 2>/dev/null mean? In Linux, every command has two output streams: stdout (stream 1, normal output) and stderr (stream 2, error messages). 2>/dev/null redirects stderr to /dev/null, which is a special file that discards everything written to it. In plain English: "throw away the error messages and just show me the results."
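A quick way to convince yourself the two streams really are separate (real.txt, missing.txt, and errors.txt are throwaway names for this demo):

```shell
touch real.txt    # one file that exists; missing.txt does not

# stderr discarded: only the normal output (real.txt) appears.
# || true ignores the nonzero exit code from the failed lookup.
ls real.txt missing.txt 2>/dev/null || true

# stderr redirected into a file instead of thrown away
ls real.txt missing.txt 2>errors.txt || true
cat errors.txt    # the "No such file or directory" complaint landed here
```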
Combining It All — Real Troubleshooting
Here is where everything comes together. On the job, you do not use these commands in isolation — you combine them to answer specific questions quickly.
Scenario: How many errors happened in a specific time window?
Your app is throwing errors. You need to know how many errors occurred between 10:00 and 10:09 AM.
grep "ERROR" app.log | grep "10:0" | wc -l
First grep finds all error lines. Second grep narrows to lines with timestamps starting with "10:0" (10:00-10:09). Then wc -l counts them. Three commands, one answer.
Scenario: Find all config files that mention a specific hostname
You are migrating a database server and need to find every configuration file that references the old hostname.
grep -r "db-host-old" /etc/ 2>/dev/null
Recursively searches every file under /etc/ for the hostname string. The 2>/dev/null suppresses permission errors for files you cannot read.
Scenario: What are the most common error types?
You see a lot of errors in the log. You want to know which error messages occur most frequently.
grep "ERROR" app.log | awk '{print $4, $5, $6, $7}' | sort | uniq -c | sort -rn
This extracts the error description portion (fields 4 through 7 of each line), sorts it, counts duplicates, and sorts by frequency. The most common error appears first. Do not worry about the awk syntax for now — the concept of chaining commands is what matters.
Scenario: Find large log files consuming disk space
find /var/log -name "*.log" -type f -exec ls -lh {} \; 2>/dev/null | sort -k5 -h -r
Finds all .log files under /var/log, runs ls -lh on each to get its size, and sorts by the size column (largest first). This is how you track down what is eating your disk space.
Try It — Search and Filter Exercise
Work through these steps on your Linux VM. If you still have your app.log file from the previous step, use that. If not, create one first.
- Expand your log file. Use nano or echo with >> to add more lines to app.log until you have 15-20 lines total. Mix in ERROR, WARNING, and INFO messages with different timestamps.
  echo "2024-01-15 10:05:33 ERROR Disk space critically low" >> app.log
  echo "2024-01-15 10:06:01 WARNING Memory usage above 80%" >> app.log
  echo "2024-01-15 10:07:15 INFO Scheduled backup started" >> app.log
  echo "2024-01-15 11:01:44 ERROR Connection refused on port 5432" >> app.log
- Search for errors. Run grep "ERROR" app.log and observe the output.
- Count the errors. Run grep "ERROR" app.log | wc -l to get a total count.
- Search for warnings. Run grep "WARNING" app.log to see all warning lines.
- Show line numbers. Run grep -n "ERROR" app.log to see which line numbers the errors are on.
- Exclude INFO messages. Run grep -v "INFO" app.log to see everything except the routine messages.
- Find log files. Run find ~ -name "*.log" 2>/dev/null to locate all .log files under your home directory.
- Try a multi-pipe chain. Run cat app.log | grep "ERROR" | grep "10:" | wc -l to count errors that occurred during the 10:00 hour.
- Find and count. Run find / -name "*.log" 2>/dev/null | wc -l to count how many .log files exist on the entire system.
- Experiment. Try combining different commands with pipes. There is no wrong answer here — the goal is to get comfortable chaining things together.
Checkpoint
Before moving on, make sure you can do all of the following:
- Use grep to find specific text in a file
- Use grep -i for case-insensitive searching
- Use grep -v to exclude lines matching a pattern
- Use grep -r to search recursively through directories
- Explain what the pipe operator (|) does — in your own words
- Chain at least two commands together with a pipe
- Use wc -l to count lines in a pipe chain
- Use find to locate files by name
- Explain what 2>/dev/null does
- Build a multi-step pipe chain to answer a specific question (e.g., "how many errors contain the word database?")
If you can check off all of these, you have the searching and filtering skills you need. Pipes especially will come up constantly from here on out — they are not a nice-to-have, they are fundamental.