Lesson 06: I/O and Redirection

Difficulty: ⭐⭐⭐

Previous: 05_Functions_and_Libraries.md | Next: 07_String_Processing.md


1. File Descriptors

File descriptors (FDs) are integers that reference open files or I/O streams. Understanding them is fundamental to mastering I/O redirection in bash.

1.1 Standard File Descriptors

Every process has three standard file descriptors:

| FD | Name   | Purpose         | Default  |
|----|--------|-----------------|----------|
| 0  | stdin  | Standard input  | Keyboard |
| 1  | stdout | Standard output | Terminal |
| 2  | stderr | Standard error  | Terminal |

#!/bin/bash

# Read from stdin (FD 0)
read -p "Enter your name: " name
echo "Hello, $name"

# Write to stdout (FD 1)
echo "This goes to stdout" >&1  # Explicit (same as just echo)

# Write to stderr (FD 2)
echo "This is an error message" >&2

1.2 Custom File Descriptors

You can create custom file descriptors (3-9 are commonly used):

#!/bin/bash

# Open file for reading on FD 3
exec 3< input.txt

# Read from FD 3
while read -u 3 line; do
    echo "Line: $line"
done

# Close FD 3
exec 3<&-

# Open file for writing on FD 4
exec 4> output.txt

# Write to FD 4
echo "First line" >&4
echo "Second line" >&4

# Close FD 4
exec 4>&-
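
Hard-coded descriptor numbers can collide with descriptors that are already in use. As a minimal sketch, assuming bash 4.1 or newer, the shell can pick a free descriptor itself and store its number in a variable. The variable names `out_fd` and `in_fd` and the temporary file are illustrative only.

```bash
#!/bin/bash
# Sketch: let bash choose a free descriptor instead of hard-coding 3 or 4.
# Assumes bash 4.1+; the file is a throwaway temp file.

tmpfile=$(mktemp)

# Open for writing; bash stores the chosen descriptor number in $out_fd
exec {out_fd}> "$tmpfile"
echo "First line" >&"$out_fd"
echo "Second line" >&"$out_fd"
exec {out_fd}>&-          # Close it again

# Open the same file for reading on another automatically chosen descriptor
exec {in_fd}< "$tmpfile"
line_count=0
while IFS= read -r -u "$in_fd" line; do
    line_count=$((line_count + 1))
done
exec {in_fd}<&-

echo "Read $line_count lines"
rm -f "$tmpfile"
```

The rest of the script refers to the descriptor only through the variable, so it never needs to know the actual number.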

1.3 Opening File Descriptors for Read/Write

#!/bin/bash

# Open file for both reading and writing on FD 5
exec 5<> datafile.txt

# Read current content
while read -u 5 line; do
    echo "Read: $line"
done

# Write new content. The read loop left the file offset at end-of-file,
# so this write lands after the existing data
echo "New data" >&5

# Close FD 5
exec 5>&-

1.4 Duplicating File Descriptors

#!/bin/bash

# Duplicate stdout (FD 1) to FD 3
exec 3>&1

# Now redirect stdout to a file
exec 1> output.log

# This goes to output.log
echo "Logging to file"

# This still goes to terminal (via FD 3)
echo "Direct to terminal" >&3

# Restore stdout from FD 3
exec 1>&3

# Close FD 3
exec 3>&-

# Now back to terminal
echo "Back to normal stdout"

1.5 File Descriptor Inspection

#!/bin/bash

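# View file descriptors of the current shell ($$ is the shell's PID; needs Linux /proc)
ls -l /proc/$$/fd/
# Note: /dev/fd/ and /proc/self/fd/ describe the process that opens them,
# so this lists the descriptors of ls itself (mostly inherited from the shell)
ls -l /dev/fd/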

# Check if FD is open
if [[ -e /dev/fd/3 ]]; then
    echo "FD 3 is open"
else
    echo "FD 3 is closed"
fi

# Get information about FD
exec 5> myfile.txt
readlink /proc/self/fd/5  # Shows the file path
exec 5>&-
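
Where `/dev/fd` or `/proc` is not available, a descriptor can also be probed by trying to duplicate it. The following is a small sketch of that idea; `fd_is_open` is a hypothetical helper name, not a bash builtin.

```bash
#!/bin/bash
# fd_is_open is a hypothetical helper: succeeds if descriptor $1 is open
fd_is_open() {
    # Duplicating a closed descriptor fails; the braces hide bash's error message
    { true >&"$1"; } 2>/dev/null
}

exec 3> /dev/null
if fd_is_open 3; then before="open"; else before="closed"; fi

exec 3>&-
if fd_is_open 3; then after="open"; else after="closed"; fi

echo "FD 3 before close: $before, after close: $after"
```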

2. Advanced Redirection

Beyond basic > and <, bash offers powerful redirection operators.

2.1 Redirecting Stderr Separately

#!/bin/bash

# Redirect stdout to file1, stderr to file2
command > stdout.log 2> stderr.log

# Example: compile C program
# Test the exit status, not the size of the error log: warnings also go to stderr
if gcc program.c -o program > compile_output.txt 2> compile_errors.txt; then
    echo "Compilation successful!"
else
    echo "Compilation failed:"
    cat compile_errors.txt
fi

2.2 Merging Stdout and Stderr

#!/bin/bash

# Method 1: Redirect stderr to stdout
command > output.log 2>&1

# Method 2: Shorthand (bash extension, not available in POSIX sh)
command &> output.log

# Method 3: Append both
command >> output.log 2>&1

# Example: run test suite
./run_tests.sh &> test_results.log

# This is WRONG (order matters):
command 2>&1 > output.log  # stderr still goes to terminal!
# Correct:
command > output.log 2>&1  # stderr follows stdout to file
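
Redirections are processed left to right, and `2>&1` copies whatever stdout points to at that moment. The sketch below makes the effect visible by capturing output with command substitution. `emit` is a made-up helper that writes one line to each stream.

```bash
#!/bin/bash
# emit is a hypothetical helper that writes one line to each stream
emit() {
    echo "to stdout"
    echo "to stderr" >&2
}

# stdout goes to /dev/null first, then stderr joins it: nothing is captured
both_discarded=$(emit > /dev/null 2>&1)

# stderr is pointed at the capture pipe first, then stdout alone goes to /dev/null
stderr_only=$(emit 2>&1 > /dev/null)

echo "both_discarded='$both_discarded'"
echo "stderr_only='$stderr_only'"
```

The second form is a deliberate use of the "wrong" order: it captures only the error stream.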

2.3 Discarding Output

#!/bin/bash

# Discard stdout
command > /dev/null

# Discard stderr
command 2> /dev/null

# Discard both
command &> /dev/null

# Example: silent operation
if some_command &> /dev/null; then
    echo "Command succeeded (silently)"
fi

# Keep stderr, discard stdout
command > /dev/null

# Example: check if command exists
if command -v python3 > /dev/null 2>&1; then
    echo "python3 is installed"
fi

2.4 Swapping Stdout and Stderr

#!/bin/bash

# Swap stdout and stderr
command 3>&1 1>&2 2>&3 3>&-

# Explanation:
# 3>&1  - Save stdout to FD 3
# 1>&2  - Redirect stdout to stderr
# 2>&3  - Redirect stderr to FD 3 (original stdout)
# 3>&-  - Close FD 3

# Practical example: error messages to stdout, normal output to stderr
swap_outputs() {
    "$@" 3>&1 1>&2 2>&3 3>&-
}

# Now errors appear on stdout (can be captured)
errors=$(swap_outputs some_command)

2.5 Saving and Restoring File Descriptors

#!/bin/bash

# Save original stdout and stderr
exec 3>&1 4>&2

# Redirect stdout and stderr to files
exec 1> output.log 2> error.log

# Commands here write to log files
echo "This goes to output.log"
echo "This is an error" >&2

# Restore original stdout and stderr
exec 1>&3 2>&4

# Close backup FDs
exec 3>&- 4>&-

# Now back to terminal
echo "Back to terminal"
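
When only a few commands need the redirection, a brace group can be simpler than saving and restoring descriptors with `exec`. This is a sketch of that alternative, not a replacement for the pattern above. The log directory is a temporary one created for the demonstration.

```bash
#!/bin/bash
logdir=$(mktemp -d)

# Redirections on a { ...; } group apply only inside the group,
# so nothing has to be saved or restored afterwards
{
    echo "This goes to output.log"
    echo "This is an error" >&2
} > "$logdir/output.log" 2> "$logdir/error.log"

echo "Back to terminal"

logged_out=$(cat "$logdir/output.log")
logged_err=$(cat "$logdir/error.log")
rm -rf "$logdir"
```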

2.6 Appending vs Truncating

#!/bin/bash

# Truncate file (overwrite)
echo "New content" > file.txt

# Append to file
echo "Additional content" >> file.txt

# Append stderr
command 2>> error.log

# Append both stdout and stderr (Bash 4+)
command &>> output.log

3. Here Documents and Here Strings

Here documents provide multi-line input to commands without creating temporary files.

3.1 Basic Here Document

#!/bin/bash

# Basic here document
cat <<EOF
This is a multi-line
here document.
It can contain variables: $HOME
And command substitution: $(date)
EOF

# With indentation (<<- strips leading tab characters, not spaces)
# The body lines below must be indented with real tabs, or the indentation is kept
cat <<-EOF
	This is indented with tabs
	The tabs will be removed
	But the text stays aligned
EOF

3.2 Here Document Without Variable Expansion

#!/bin/bash

# Quote the delimiter to prevent expansion
cat <<'EOF'
Variables are literal: $HOME
Command substitution is literal: $(date)
This is useful for generating scripts or code.
EOF

# Example: generate a bash script
cat <<'SCRIPT' > myscript.sh
#!/bin/bash
echo "Hello from generated script"
echo "Current directory: $PWD"
SCRIPT

chmod +x myscript.sh
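
The difference between a quoted and an unquoted delimiter is easiest to see side by side. A minimal sketch, with `greeting` as an illustrative variable:

```bash
#!/bin/bash
greeting="hello"

# Unquoted delimiter: the shell expands $greeting inside the body
expanded=$(cat <<EOF
value: $greeting
EOF
)

# Quoted delimiter: the body is passed through literally
literal=$(cat <<'EOF'
value: $greeting
EOF
)

echo "$expanded"   # value: hello
echo "$literal"    # value: $greeting
```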

3.3 Here Document to Variables

#!/bin/bash

# Assign here document to variable
# read -d '' returns non-zero at end of input, so "|| true" keeps set -e scripts alive
read -r -d '' sql_query <<EOF || true
SELECT u.name, u.email, o.order_id
FROM users u
JOIN orders o ON u.id = o.user_id
WHERE o.status = 'pending'
ORDER BY o.created_at DESC
LIMIT 10;
EOF

echo "Executing query:"
echo "$sql_query"

# Alternative method (using command substitution)
json_data=$(cat <<EOF
{
    "name": "John Doe",
    "email": "john@example.com",
    "age": 30,
    "roles": ["admin", "user"]
}
EOF
)

echo "$json_data"

3.4 Here Document with Command Input

#!/bin/bash

# Send multi-line input to a command
mysql -u root -p <<SQL
USE mydb;
CREATE TABLE IF NOT EXISTS users (
    id INT PRIMARY KEY AUTO_INCREMENT,
    username VARCHAR(50) NOT NULL,
    email VARCHAR(100) NOT NULL,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
INSERT INTO users (username, email) VALUES ('alice', 'alice@example.com');
SQL

# Python script execution (quoted delimiter: the shell leaves the Python code untouched)
python3 <<'PYTHON'
import sys
import json

data = {
    'message': 'Hello from Python',
    'version': sys.version
}

print(json.dumps(data, indent=2))
PYTHON

3.5 Here Strings

#!/bin/bash

# Here string: single-line input
grep "pattern" <<< "This is a test pattern string"

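# Useful for feeding a variable or string to a command that reads stdin.
# A here string is one line of input, so split it into an array to loop over words
read -r -a words <<< "one two three four five"
for word in "${words[@]}"; do
    echo "Word: $word"
done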

# Example: parse CSV line
IFS=',' read -r name age city <<< "John,30,NYC"
echo "Name: $name, Age: $age, City: $city"

# Base64 encode a string
encoded=$(base64 <<< "Secret message")
echo "Encoded: $encoded"

# Decode it back
decoded=$(base64 -d <<< "$encoded")
echo "Decoded: $decoded"
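
One detail worth knowing: a here string always appends a newline to the text it supplies. The sketch below counts bytes to show it. This matters for the base64 example above, because the encoded value includes that newline.

```bash
#!/bin/bash
# Count bytes to make the trailing newline visible
here_bytes=$(wc -c <<< "abc")
printf_bytes=$(printf '%s' "abc" | wc -c)

# Some wc implementations pad the number with spaces; arithmetic expansion strips that
here_bytes=$((here_bytes))
printf_bytes=$((printf_bytes))

echo "here string: $here_bytes bytes"    # "abc" plus a newline
echo "printf:      $printf_bytes bytes"  # just "abc"
```

When the exact bytes matter, piping from `printf '%s'` avoids the extra newline.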

3.6 Practical Template Generation

#!/bin/bash

generate_html() {
    local title=$1
    local content=$2

    cat <<HTML
<!DOCTYPE html>
<html lang="en">
<head>
    <meta charset="UTF-8">
    <title>$title</title>
</head>
<body>
    <h1>$title</h1>
    <p>$content</p>
    <footer>Generated on $(date)</footer>
</body>
</html>
HTML
}

# Generate HTML page
generate_html "My Page" "Welcome to my website!" > index.html

# Generate configuration file
generate_config() {
    local host=$1
    local port=$2

    cat <<CONFIG > app.conf
# Application Configuration
# Generated: $(date)

[server]
host = $host
port = $port
workers = 4

[database]
host = localhost
port = 5432
name = myapp

[logging]
level = INFO
file = /var/log/myapp.log
CONFIG
}

generate_config "0.0.0.0" "8080"

4. Process Substitution

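Process substitution runs a command asynchronously and exposes its output (or input) under a file name, usually a /dev/fd/N path or, on systems without /dev/fd, a temporary named pipe. This lets a command be used wherever a file name is expected.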
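
To see what a command actually receives, the substitution can be printed instead of read. This is a small sketch; the exact path differs between systems, so treat the printed value as an example only.

```bash
#!/bin/bash
# The substitution expands to a path; echo just prints it instead of reading it
fd_path=$(echo <(true))
echo "Process substitution expanded to: $fd_path"

# Any command that takes a file name can read from that path
first_line=$(head -n 1 <(printf 'alpha\nbeta\n'))
echo "First line: $first_line"
```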

4.1 Input Process Substitution

#!/bin/bash

# Compare output of two commands
diff <(ls dir1) <(ls dir2)

# More complex example: compare sorted lists
diff <(sort file1.txt) <(sort file2.txt)

# Compare running processes on two systems
diff <(ssh server1 ps aux | sort) <(ssh server2 ps aux | sort)

# Example: find common lines in two command outputs
comm -12 <(sort list1.txt) <(sort list2.txt)

4.2 Output Process Substitution

#!/bin/bash

# Write to multiple files simultaneously
tee >(grep "ERROR" > errors.log) \
    >(grep "WARN" > warnings.log) \
    >(grep "INFO" > info.log) \
    < application.log > /dev/null

# Example: split log by severity
process_logs() {
    local logfile=$1

    cat "$logfile" | tee \
        >(grep "ERROR" > errors.log) \
        >(grep "WARN" > warnings.log) \
        > all.log
}

4.3 Avoiding Subshell Variable Scope Issues

#!/bin/bash

# PROBLEM: Variables in pipeline are in subshell
count=0
cat file.txt | while read line; do
    ((count++))
done
echo "Lines: $count"  # Output: 0 (variable not modified!)

# SOLUTION 1: Process substitution
count=0
while read line; do
    ((count++))
done < <(cat file.txt)
echo "Lines: $count"  # Correct count

# SOLUTION 2: Use here string with command substitution (for small files)
count=0
while read line; do
    ((count++))
done <<< "$(cat file.txt)"
echo "Lines: $count"  # Correct count
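
If the goal is only to collect lines, `mapfile` (also called `readarray`) avoids the loop entirely. The following is a sketch that assumes bash 4 or newer and uses a temporary file as sample input.

```bash
#!/bin/bash
# Assumes bash 4+ for mapfile; the input file is a throwaway sample
tmpfile=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmpfile"

# mapfile runs in the current shell, so the array is still set afterwards
mapfile -t lines < "$tmpfile"
count=${#lines[@]}

echo "Lines: $count"
echo "Last line: ${lines[count - 1]}"
rm -f "$tmpfile"
```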

4.4 Multiple Input Streams

#!/bin/bash

# Read from multiple files in parallel
paste <(cut -d',' -f1 file1.csv) \
      <(cut -d',' -f2 file2.csv) \
      <(cut -d',' -f3 file3.csv)

# Example: merge data from multiple sources
while read -u 3 name && read -u 4 age && read -u 5 city; do
    echo "$name is $age years old and lives in $city"
done 3< <(cut -d',' -f1 data.csv) \
     4< <(cut -d',' -f2 data.csv) \
     5< <(cut -d',' -f3 data.csv)

4.5 Practical Examples

#!/bin/bash

# Example 1: Search the contents of files modified in the last 24 hours
# (prints the matching lines, not the file names)
grep "TODO" <(find . -type f -mtime -1 -exec cat {} \;)

# Example 2: Monitor log file and send alerts
while read line; do
    if [[ $line == *"CRITICAL"* ]]; then
        echo "Alert: $line" | mail -s "Critical Error" admin@example.com
    fi
done < <(tail -f /var/log/app.log)

# Example 3: Process compressed file without extracting
while read line; do
    echo "Processing: $line"
done < <(gunzip -c data.txt.gz)

# Example 4: Create temporary file list for processing
tar czf backup.tar.gz -T <(find /data -type f -mtime -7)

5. Named Pipes (FIFOs)

Named pipes allow inter-process communication through the filesystem.

5.1 Creating and Using FIFOs

#!/bin/bash

# Create named pipe
mkfifo mypipe

# Producer (background process)
{
    for i in {1..10}; do
        echo "Message $i"
        sleep 1
    done > mypipe
} &

# Consumer
while read line; do
    echo "Received: $line"
done < mypipe

# Cleanup
rm mypipe

5.2 Producer-Consumer Pattern

#!/bin/bash

PIPE="/tmp/data_pipe_$$"

# Create pipe and set trap for cleanup
mkfifo "$PIPE"
trap "rm -f '$PIPE'" EXIT

# Producer: generate data
producer() {
    local pipe=$1
    echo "Producer starting..."

    for i in {1..100}; do
        echo "Data item $i: $(date +%s)"
        sleep 0.1
    done > "$pipe"

    echo "Producer finished"
}

# Consumer: process data
consumer() {
    local pipe=$1
    echo "Consumer starting..."

    local count=0
    while read line; do
        ((count++))
        # Process data (simulate work)
        [[ $((count % 10)) -eq 0 ]] && echo "Processed $count items"
    done < "$pipe"

    echo "Consumer finished: $count items processed"
}

# Run producer in background
producer "$PIPE" &
producer_pid=$!

# Run consumer in foreground
consumer "$PIPE"

# Wait for producer to finish
wait $producer_pid
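
A predictable path such as `/tmp/data_pipe_$$` can be guessed or created in advance by another user. One possible hardening, sketched here under the assumption that `mktemp -d` is available, is to create the FIFO inside a private temporary directory. The names `workdir` and `pipe` are illustrative.

```bash
#!/bin/bash
# Private directory from mktemp -d, so the FIFO path cannot be pre-created by others
workdir=$(mktemp -d)
pipe="$workdir/data_pipe"
mkfifo "$pipe"
trap 'rm -rf "$workdir"' EXIT

# Producer in the background: opening the FIFO blocks until the consumer opens it too
{
    for i in 1 2 3; do
        echo "item $i"
    done > "$pipe"
} &
producer_pid=$!

# Consumer in the current shell, so $received survives the loop
received=0
while IFS= read -r line; do
    received=$((received + 1))
done < "$pipe"

wait "$producer_pid"
echo "Received $received items"
```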

5.3 Bidirectional Communication

#!/bin/bash

REQUEST_PIPE="/tmp/request_$$"
RESPONSE_PIPE="/tmp/response_$$"

# Create pipes
mkfifo "$REQUEST_PIPE" "$RESPONSE_PIPE"
trap "rm -f '$REQUEST_PIPE' '$RESPONSE_PIPE'" EXIT

# Server process
server() {
    echo "Server started"

    while true; do
        # Read request
        read request < "$REQUEST_PIPE"

        # Process request
        case $request in
            "PING")
                echo "PONG" > "$RESPONSE_PIPE"
                ;;
            "TIME")
                date > "$RESPONSE_PIPE"
                ;;
            "QUIT")
                echo "BYE" > "$RESPONSE_PIPE"
                break
                ;;
            *)
                echo "ERROR: Unknown command" > "$RESPONSE_PIPE"
                ;;
        esac
    done

    echo "Server stopped"
}

# Client function
client() {
    local command=$1

    # Send request
    echo "$command" > "$REQUEST_PIPE"

    # Read response
    read response < "$RESPONSE_PIPE"
    echo "Response: $response"
}

# Start server in background
server &
server_pid=$!

sleep 1  # Give server time to start

# Send requests
client "PING"
client "TIME"
client "QUIT"

# Wait for server
wait $server_pid

5.4 When to Use FIFOs vs Process Substitution

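| Feature                  | FIFO                | Process Substitution   |
|--------------------------|---------------------|------------------------|
| Persistence              | Yes (until deleted) | No (automatic cleanup) |
| Multiple readers/writers | Yes                 | No                     |
| Explicit synchronization | Yes                 | No                     |
| Use in background        | Easy                | Complex                |
| Cleanup required         | Manual              | Automatic              |
| Best for                 | Long-running IPC    | One-time operations    |
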
#!/bin/bash

# Use process substitution for one-time comparison
diff <(command1) <(command2)

# Use FIFO for persistent communication
mkfifo /tmp/logpipe
tail -f /var/log/app.log > /tmp/logpipe &
while read line; do
    process_log_line "$line"
done < /tmp/logpipe

6. Pipe Pitfalls and Solutions

6.1 Subshell Variable Scope Loss

#!/bin/bash

# PROBLEM: Each command in a pipeline runs in a subshell, including the while loop
total=0
cat numbers.txt | while read num; do
    ((total += num))
done
echo "Total: $total"  # Output: 0 (not modified!)

# SOLUTION 1: Process substitution
total=0
while read num; do
    ((total += num))
done < <(cat numbers.txt)
echo "Total: $total"  # Correct

# SOLUTION 2: Use lastpipe (Bash 4.2+; takes effect only when job control is off, as in scripts)
shopt -s lastpipe
total=0
cat numbers.txt | while read num; do
    ((total += num))
done
echo "Total: $total"  # Correct

# SOLUTION 3: Temporary file
tmpfile=$(mktemp)
cat numbers.txt > "$tmpfile"
total=0
while read num; do
    ((total += num))
done < "$tmpfile"
rm "$tmpfile"
echo "Total: $total"  # Correct

6.2 PIPESTATUS Array

#!/bin/bash

# Check exit status of all pipeline commands
command1 | command2 | command3

# PIPESTATUS contains exit codes of all commands
echo "Exit codes: ${PIPESTATUS[@]}"

# Example: detect failure of a specific command in the pipeline
false | true | true
if [[ ${PIPESTATUS[0]} -ne 0 ]]; then
    echo "First command failed"
fi

# Practical example: database pipeline
{
    mysql -u root -p mydb -e "SELECT * FROM users" | \
    grep "active" | \
    sort -k2
} 2>/dev/null

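# Copy PIPESTATUS immediately: the next command overwrites it
pipeline_status=("${PIPESTATUS[@]}")
if [[ ${pipeline_status[0]} -ne 0 ]]; then
    echo "Database query failed"
elif [[ ${pipeline_status[1]} -gt 1 ]]; then
    # grep exits with 1 when nothing matched; 2 or higher is a real error
    echo "Grep failed"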
elif [[ ${pipeline_status[2]} -ne 0 ]]; then
    echo "Sort failed"
else
    echo "Pipeline succeeded"
fi
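
The behavior can be checked with a pipeline whose exit codes are known in advance. A minimal sketch: the middle stage is a subshell that exits with 3, and the loop reports the first stage that failed. `status` and `failed_stage` are illustrative names.

```bash
#!/bin/bash
# Copy PIPESTATUS right away: the very next command replaces it
true | (exit 3) | true
status=("${PIPESTATUS[@]}")

echo "Exit codes: ${status[*]}"

# Report the first failing stage, if any (-1 means every stage succeeded)
failed_stage=-1
for i in "${!status[@]}"; do
    if [[ ${status[i]} -ne 0 ]]; then
        failed_stage=$i
        break
    fi
done
echo "First failing stage index: $failed_stage"
```

Without `pipefail`, the pipeline as a whole still reports success here, because only the last stage counts.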

6.3 Pipeline Error Handling

#!/bin/bash

# Enable pipefail: pipeline fails if any command fails
set -o pipefail

# Now the pipeline returns non-zero if any command fails
if command1 | command2 | command3; then
    echo "Pipeline succeeded"
else
    echo "Pipeline failed"
fi

# Practical example: safe data processing
set -euo pipefail  # Exit on error, undefined variable, or pipeline failure

process_data() {
    local input=$1
    local output=$2

    cat "$input" | \
        grep -v "^#" | \
        sort -u | \
        sed 's/foo/bar/g' \
        > "$output"

    # If any command fails, script exits. Note that grep -v exits with 1
    # when it selects no lines, which also counts as a failure here
}

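# Error handling with pipefail
# pipefail only covers pipelines: it cannot see a failure inside <(...),
# so feed tar through a real pipe (GNU tar: --null -T - reads NUL-separated names from stdin)
set -o pipefail
if ! find /data -type f -print0 | tar czf backup.tar.gz --exclude="*.tmp" --null -T -; then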
    echo "Backup failed" >&2
    exit 1
fi
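
The effect of `pipefail` is easiest to see on the same pipeline run twice. This sketch uses `false | true`, where only the first stage fails; the result variables are illustrative.

```bash
#!/bin/bash
# Without pipefail, only the last command's status counts
set +o pipefail
if false | true; then without_pipefail="success"; else without_pipefail="failure"; fi

# With pipefail, a failure in any stage makes the whole pipeline fail
set -o pipefail
if false | true; then with_pipefail="success"; else with_pipefail="failure"; fi
set +o pipefail   # Restore the default for the rest of the script

echo "without pipefail: $without_pipefail"
echo "with pipefail:    $with_pipefail"
```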

6.4 Named Pipe Deadlock Prevention

#!/bin/bash

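# PROBLEM: Opening a FIFO blocks until the other end is opened too
mkfifo mypipe
# echo "data" > mypipe  # Would BLOCK forever (no reader); commented out so the script can continue

# SOLUTION 1: Open pipe for reading and writing
# (does not block on Linux; POSIX leaves read/write opens of a FIFO undefined)
exec 3<> mypipe  # Open for read/write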

echo "data" >&3  # Write
read line <&3    # Read
exec 3>&-        # Close

rm mypipe

# SOLUTION 2: Background processes with proper synchronization
mkfifo mypipe
trap "rm -f mypipe" EXIT

# Reader in background
cat < mypipe &
reader_pid=$!

# Writer
echo "data" > mypipe

# Wait for reader
wait $reader_pid

7. Practical I/O Patterns

7.1 Tee to Multiple Destinations

#!/bin/bash

# Basic tee: write to file and stdout
echo "Important message" | tee log.txt

# Multiple files
echo "Message" | tee file1.txt file2.txt file3.txt

# Append mode
echo "New entry" | tee -a logfile.txt

# Complex example: split processing
cat data.txt | tee \
    >(grep "ERROR" > errors.log) \
    >(grep "WARN" > warnings.log) \
    >(wc -l > linecount.txt) \
    | grep "INFO" > info.log

7.2 Logging to Console and File

#!/bin/bash

# Setup logging
LOGFILE="application.log"

log() {
    local level=$1
    shift
    local message="$*"
    local timestamp
    timestamp=$(date '+%Y-%m-%d %H:%M:%S')

    # Log to both console and file
    echo "[$timestamp] [$level] $message" | tee -a "$LOGFILE"
}

# Usage
log "INFO" "Application started"
log "WARN" "Configuration file not found, using defaults"
log "ERROR" "Failed to connect to database"

# Alternative: redirect all output
exec > >(tee -a "$LOGFILE")
exec 2>&1

# Now all output goes to both console and file
echo "This appears in both places"
ls /nonexistent  # Error also logged

7.3 Reading and Writing Same File Safely

#!/bin/bash

# WRONG: This truncates the file before reading!
sort file.txt > file.txt  # file.txt becomes empty!

# SOLUTION 1: Use sponge (from moreutils)
sort file.txt | sponge file.txt

# SOLUTION 2: Use temporary file
sort file.txt > file.txt.tmp && mv file.txt.tmp file.txt

# SOLUTION 3: In-place edit with -i flag (if supported)
sed -i 's/foo/bar/g' file.txt

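# Atomic file replacement
update_config() {
    local config_file=$1
    local tmpfile
    # Create the temp file next to the target: mv is atomic only within one filesystem
    tmpfile=$(mktemp "${config_file}.XXXXXX") || return 1

    # Process file; keep the original if processing fails
    if ! process_config < "$config_file" > "$tmpfile"; then
        rm -f "$tmpfile"
        return 1
    fi

    # Atomic replacement
    mv "$tmpfile" "$config_file"
}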

7.4 Atomic File Writes

#!/bin/bash

# Atomic write pattern: write to temp, then move
atomic_write() {
    local target_file=$1
    local content=$2

    local tmpfile
    tmpfile=$(mktemp "${target_file}.XXXXXX") || return 1

    # Write to temporary file and check that the write succeeded
    if printf '%s\n' "$content" > "$tmpfile"; then
        # Atomic move (on same filesystem)
        mv "$tmpfile" "$target_file"
    else
        rm -f "$tmpfile"
        return 1
    fi
}

# Usage
atomic_write "config.json" '{"setting": "value"}'

# Complex example: update critical file
update_critical_file() {
    local file=$1
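    local tmpfile
    # Same directory as the target, so the final mv stays on one filesystem
    tmpfile=$(mktemp "${file}.XXXXXX") || return 1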

    # Set trap for cleanup
    trap "rm -f '$tmpfile'" RETURN

    # Generate new content
    if ! generate_content > "$tmpfile"; then
        echo "Error: Failed to generate content" >&2
        return 1
    fi

    # Validate new content
    if ! validate_content "$tmpfile"; then
        echo "Error: Content validation failed" >&2
        return 1
    fi

    # Set same permissions as original
    chmod --reference="$file" "$tmpfile" 2>/dev/null

    # Atomic replacement
    mv "$tmpfile" "$file"
}

7.5 File Locking for Safe Concurrent Access

#!/bin/bash

# Use flock for file locking
update_counter() {
    local counter_file="counter.txt"
    local lockfile="counter.lock"

    # Acquire exclusive lock (FD 200)
    {
        flock -x 200

        # Read current value
        local count=0
        [[ -f $counter_file ]] && count=$(cat "$counter_file")

        # Increment
        ((count++))

        # Write back
        echo "$count" > "$counter_file"

        echo "Counter updated to: $count"

    } 200>"$lockfile"
}

# Multiple processes can safely call this
for i in {1..10}; do
    update_counter &
done
wait

# Final value
echo "Final count: $(cat counter.txt)"

# Alternative: inline locking
{
    flock -x 200

    # Critical section
    echo "Exclusive access to resource"
    sleep 2

} 200>/tmp/mylock

7.6 Progress Indication with FIFOs

#!/bin/bash

# Create progress pipe
mkfifo /tmp/progress_$$
trap "rm -f /tmp/progress_$$" EXIT

# Progress monitor (background)
{
    while read percent message; do
        printf "\r[%-50s] %d%% %s" \
            "$(printf '#%.0s' $(seq 1 $((percent / 2))))" \
            "$percent" \
            "$message"
    done < /tmp/progress_$$
    echo
} &
monitor_pid=$!

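# Worker process
# Keep the FIFO open for the whole loop: reopening it for every message
# would let the monitor see end-of-file after the first line
{
    total=100
    for i in $(seq 1 $total); do
        # Simulate work
        sleep 0.05

        # Report progress
        percent=$((i * 100 / total))
        echo "$percent Processing item $i"
    done > /tmp/progress_$$
} &
worker_pid=$!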

# Wait for completion
wait $worker_pid
wait $monitor_pid

Practice Problems

Problem 1: Multi-Target Logger

Create a logging system that:
- Accepts log level (DEBUG, INFO, WARN, ERROR) and message
- Writes all logs to all.log
- Writes ERROR logs to error.log
- Writes WARN and ERROR to important.log
- Displays ERROR and WARN on stderr, others on stdout
- Adds timestamp and hostname to each log entry
- Implements log rotation when file exceeds 10MB

Problem 2: Pipeline Monitor

Write a script that:
- Runs a multi-stage pipeline (e.g., download | decompress | process | upload)
- Monitors the exit status of each stage using PIPESTATUS
- Logs progress to a file using process substitution
- Implements retry logic for failed stages
- Reports which stage failed and why
- Calculates total time and throughput

Problem 3: FIFO-Based Queue System

Implement a simple job queue using named pipes:
- Create job_submit command that sends jobs to a queue
- Create job_worker that processes jobs from the queue
- Support multiple concurrent workers
- Implement job status tracking (pending, running, completed, failed)
- Handle worker crashes gracefully
- Provide job_status command to check queue state

Problem 4: Configuration Validator

Build a tool that:
- Reads configuration file from stdin or file argument
- Validates syntax using a validation command
- If valid, atomically replaces the old config
- If invalid, shows errors on stderr and keeps old config
- Creates backup before replacement (keep last 5 backups)
- Logs all changes with timestamp
- Supports dry-run mode (validate without replacing)

Problem 5: Stream Processor with FDs

Create a stream processing framework:
- Opens 3 input streams on FDs 3, 4, 5
- Merges streams with timestamps
- Filters based on regex pattern
- Splits output to different files based on content
- Maintains statistics (lines processed per stream, matches, errors)
- Uses process substitution for real-time monitoring
- Handles stream termination gracefully
