Daily Hack #day87 - System Monitoring and Maintenance Script

Daily Hack #day87 - System Monitoring and Maintenance Script

System Monitoring and Maintenance Script

#!/bin/bash

# Configuration
LOG_FILE="/var/log/system_maintenance.log"

# Function to log messages
log_message() {
    echo "$(date +'%Y-%m-%d %H:%M:%S') - $1" | tee -a $LOG_FILE
}

# Function to display system usage
show_system_usage() {
    log_message "System Usage:"
    echo "----------------------------------" | tee -a $LOG_FILE
    echo "CPU Usage:" | tee -a $LOG_FILE
    mpstat | tee -a $LOG_FILE
    echo "----------------------------------" | tee -a $LOG_FILE
    echo "Memory Usage:" | tee -a $LOG_FILE
    free -h | tee -a $LOG_FILE
    echo "----------------------------------" | tee -a $LOG_FILE
    echo "Disk Usage:" | tee -a $LOG_FILE
    df -h | tee -a $LOG_FILE
    echo "----------------------------------" | tee -a $LOG_FILE
    echo "System Load:" | tee -a $LOG_FILE
    uptime | tee -a $LOG_FILE
    echo "----------------------------------" | tee -a $LOG_FILE
}

# Function to clear cache
clear_cache() {
    log_message "Clearing system cache"
    sync; echo 3 > /proc/sys/vm/drop_caches
    log_message "System cache cleared"
}

# Function to update the system
update_system() {
    log_message "Updating the system"
    apt update && apt upgrade -y
    log_message "System update completed"
}

# Function to check for failed services
check_failed_services() {
    log_message "Checking for failed services"
    systemctl --failed | tee -a $LOG_FILE
    echo "----------------------------------" | tee -a $LOG_FILE
}

# Function to restart a failed service
restart_failed_services() {
    log_message "Restarting failed services"
    FAILED_SERVICES=$(systemctl --failed | grep "●" | awk '{print $2}')
    for SERVICE in $FAILED_SERVICES; do
        log_message "Restarting service: $SERVICE"
        systemctl restart $SERVICE
        if [ $? -eq 0 ]; then
            log_message "Service $SERVICE restarted successfully"
        else
            log_message "Failed to restart service $SERVICE"
        fi
    done
}

# Main script logic
echo "----------------------------------" | tee -a $LOG_FILE
log_message "Starting System Monitoring and Maintenance"

# Show system usage
show_system_usage

# Clear system cache
clear_cache

# Update the system
update_system

# Check for failed services
check_failed_services

# Restart failed services
restart_failed_services

log_message "System Monitoring and Maintenance completed"
echo "----------------------------------" | tee -a $LOG_FILE

exit 0

Explanation

  1. Configuration:

    • LOG_FILE: Path to the log file where the script's output will be logged.
  2. Logging Function:

    • log_message(): Logs messages with timestamps to the log file.
  3. System Usage Function:

    • show_system_usage(): Displays CPU usage, memory usage, disk usage, and system load.
  4. Clear Cache Function:

    • clear_cache(): Clears the system cache using sync and echo commands.
  5. Update System Function:

    • update_system(): Updates the system using apt update and apt upgrade.
  6. Check Failed Services Function:

    • check_failed_services(): Checks for failed services using systemctl --failed.
  7. Restart Failed Services Function:

    • restart_failed_services(): Attempts to restart any failed services.
  8. Main Script Logic:

    • The script logs the start time, performs all maintenance tasks, and logs the completion time.

Usage

  1. Make the script executable:

     chmod +x system_maintenance.sh
    
  2. Run the script as root:

     sudo ./system_maintenance.sh
    

Scheduling with Cron

To automate the maintenance tasks, you can schedule the script using cron:

  1. Edit the crontab:

     crontab -e
    
  2. Add a cron job to run the script daily at 2 AM:

     0 2 * * * /path/to/system_maintenance.sh
    

Make sure to replace /path/to/system_maintenance.sh with the actual path to your script.

This script helps automate essential system monitoring and maintenance tasks, making it easier for system administrators to manage their systems effectively. This is just a super basic example, feel free to copy and modify it to suit your needs.

Improvements

Here are some improvements and enhancements to make the script more robust, efficient, and versatile:

1. Improved Logging

  • Include logging levels (INFO, ERROR, etc.)

  • Log to syslog for centralized logging

2. Configuration Section

  • Use variables for frequently used commands and paths

  • Allow configuration through a separate configuration file

3. Error Handling

  • Improve error handling for commands

  • Include retries for critical operations

4. Modular Functions

  • Separate functions into individual scripts for better modularity and maintainability

5. Notifications

  • Send email or Slack notifications for critical issues or after the script runs

6. Enhanced System Monitoring

  • Include additional checks like network status, temperature sensors, etc.

  • Use tools like iostat and vmstat for more detailed statistics

7. Optional Arguments

  • Allow the script to accept arguments for selective execution of tasks

8. Improved Security

  • Use sudo only where necessary

  • Ensure temporary files are securely handled

9. System Compatibility

  • Add compatibility checks for different Linux distributions

Did you find this article valuable?

Support Cloud Tuned by becoming a sponsor. Any amount is appreciated!