System Monitoring and Maintenance Script
#!/bin/bash
# Configuration
LOG_FILE="/var/log/system_maintenance.log"
# Function to log messages
log_message() {
    echo "$(date +'%Y-%m-%d %H:%M:%S') - $1" | tee -a "$LOG_FILE"
}
# Function to display system usage
show_system_usage() {
    log_message "System Usage:"
    # Group the report so one tee appends the whole thing to the log
    {
        echo "----------------------------------"
        echo "CPU Usage:"
        mpstat    # provided by the sysstat package
        echo "----------------------------------"
        echo "Memory Usage:"
        free -h
        echo "----------------------------------"
        echo "Disk Usage:"
        df -h
        echo "----------------------------------"
        echo "System Load:"
        uptime
        echo "----------------------------------"
    } | tee -a "$LOG_FILE"
}
# Function to clear cache
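clear_cache() {
    log_message "Clearing system cache"
    # Flush pending writes, then drop page cache, dentries, and inodes (requires root)
    sync
    if echo 3 > /proc/sys/vm/drop_caches; then
        log_message "System cache cleared"
    else
        log_message "Failed to clear system cache"
    fi
}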
# Function to update the system
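update_system() {
    log_message "Updating the system"
    # Only report success if both apt commands actually succeeded
    if apt update && apt upgrade -y; then
        log_message "System update completed"
    else
        log_message "System update failed"
    fi
}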
# Function to check for failed services
check_failed_services() {
    log_message "Checking for failed services"
    systemctl --failed | tee -a "$LOG_FILE"
    echo "----------------------------------" | tee -a "$LOG_FILE"
}
# Function to restart failed services
restart_failed_services() {
    log_message "Restarting failed services"
    # --no-legend and --plain drop the header and the "●" marker,
    # so the unit name is always the first column
    FAILED_SERVICES=$(systemctl --failed --no-legend --plain | awk '{print $1}')
    for SERVICE in $FAILED_SERVICES; do
        log_message "Restarting service: $SERVICE"
        if systemctl restart "$SERVICE"; then
            log_message "Service $SERVICE restarted successfully"
        else
            log_message "Failed to restart service $SERVICE"
        fi
    done
}
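# Main script logic
# Clearing caches, updating packages, and restarting services all need root
if [ "$(id -u)" -ne 0 ]; then
    echo "This script must be run as root." >&2
    exit 1
fi
echo "----------------------------------" | tee -a "$LOG_FILE"
log_message "Starting System Monitoring and Maintenance"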
# Show system usage
show_system_usage
# Clear system cache
clear_cache
# Update the system
update_system
# Check for failed services
check_failed_services
# Restart failed services
restart_failed_services
log_message "System Monitoring and Maintenance completed"
echo "----------------------------------" | tee -a "$LOG_FILE"
exit 0
Explanation
Configuration:
- LOG_FILE: Path to the log file where the script's output is recorded.
Logging Function:
- log_message(): Prints a timestamped message to the terminal and appends it to the log file.
System Usage Function:
- show_system_usage(): Displays CPU usage (mpstat), memory usage (free), disk usage (df), and system load (uptime).
Clear Cache Function:
- clear_cache(): Flushes pending writes with sync, then clears the system cache by writing 3 to /proc/sys/vm/drop_caches with echo.
Update System Function:
- update_system(): Updates the system using apt update and apt upgrade.
Check Failed Services Function:
- check_failed_services(): Checks for failed services using systemctl --failed.
Restart Failed Services Function:
- restart_failed_services(): Attempts to restart any failed services.
Main Script Logic:
- The script logs the start time, performs all maintenance tasks, and logs the completion time.
Usage
Make the script executable:
chmod +x system_maintenance.sh
Run the script as root:
sudo ./system_maintenance.sh
Scheduling with Cron
To automate the maintenance tasks, you can schedule the script using cron:
Edit root's crontab, since the script needs root privileges:
sudo crontab -e
Add a cron job to run the script daily at 2 AM:
0 2 * * * /path/to/system_maintenance.sh
Make sure to replace /path/to/system_maintenance.sh with the actual path to your script.
This script automates essential system monitoring and maintenance tasks, making it easier for system administrators to keep their systems healthy. It is only a basic example, so feel free to copy it and modify it to suit your needs.
Improvements
Here are some improvements and enhancements to make the script more robust, efficient, and versatile:
1. Improved Logging
- Include logging levels (INFO, ERROR, etc.)
- Log to syslog for centralized logging
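As a rough sketch of both points, log_message() could take a level as its first argument and forward the message to syslog when the logger command is available. The SYSLOG_TAG name and the level-to-priority mapping are assumptions made for illustration, not part of the original script.

```bash
#!/bin/bash
# Sketch: leveled logging that also forwards to syslog when possible.
LOG_FILE="${LOG_FILE:-/var/log/system_maintenance.log}"
SYSLOG_TAG="system_maintenance"   # hypothetical tag, pick your own

log_message() {
    local level="$1"
    shift
    local priority
    # Map our level names to syslog facility.priority pairs
    case "$level" in
        INFO)  priority="user.info" ;;
        WARN)  priority="user.warning" ;;
        ERROR) priority="user.err" ;;
        *)     priority="user.notice" ;;
    esac
    echo "$(date +'%Y-%m-%d %H:%M:%S') [$level] $*" | tee -a "$LOG_FILE"
    # Forward to syslog only if logger exists; ignore delivery errors
    if command -v logger >/dev/null 2>&1; then
        logger -t "$SYSLOG_TAG" -p "$priority" -- "$*" 2>/dev/null || true
    fi
}
```

With this version, calls would look like log_message INFO "Updating the system" or log_message ERROR "Failed to restart service nginx".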
2. Configuration Section
- Use variables for frequently used commands and paths
- Allow configuration through a separate configuration file
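One possible way to do this is to keep defaults in the script and source an optional config file that overrides them. The path /etc/system_maintenance.conf and the DROP_CACHES_LEVEL setting are hypothetical names used only for this sketch. Because the file is executed as shell code, it should be writable by root only.

```bash
#!/bin/bash
# Built-in defaults; the config file may override any of them.
LOG_FILE="/var/log/system_maintenance.log"
DROP_CACHES_LEVEL=3   # hypothetical setting
CONFIG_FILE="${CONFIG_FILE:-/etc/system_maintenance.conf}"   # hypothetical path

load_config() {
    local file="$1"
    if [ -r "$file" ]; then
        # The config file is plain shell, e.g.: LOG_FILE="/tmp/maintenance.log"
        # shellcheck source=/dev/null
        . "$file"
    fi
}

load_config "$CONFIG_FILE"
```

If the file does not exist, the defaults stay in effect and the script behaves as before.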
3. Error Handling
- Improve error handling for commands
- Include retries for critical operations
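A small retry helper is one way to cover both points. The function name retry and its argument order (attempts, delay in seconds, then the command) are choices made for this sketch.

```bash
#!/bin/bash
# retry MAX_ATTEMPTS DELAY_SECONDS command [args...]
# Runs the command until it succeeds or the attempts are used up.
retry() {
    local max_attempts="$1" delay="$2"
    shift 2
    local attempt=1
    while true; do
        "$@" && return 0
        if [ "$attempt" -ge "$max_attempts" ]; then
            echo "Command failed after $attempt attempts: $*" >&2
            return 1
        fi
        echo "Attempt $attempt failed; retrying in ${delay}s..." >&2
        sleep "$delay"
        attempt=$((attempt + 1))
    done
}
```

For example, retry 3 10 apt update would try the package index refresh up to three times, waiting ten seconds between attempts, and return non-zero if all of them fail.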
4. Modular Functions
- Separate functions into individual scripts for better modularity and maintainability
5. Notifications
- Send email or Slack notifications for critical issues or after the script runs
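As an illustration of the Slack option, the sketch below posts a short text message to an incoming webhook with curl. SLACK_WEBHOOK_URL is an assumption: it has to be set to a webhook URL you create yourself, and the notification is skipped when it is empty. The JSON escaping here only handles backslashes and double quotes, so treat it as a starting point rather than a complete implementation.

```bash
#!/bin/bash
# Assumption: set SLACK_WEBHOOK_URL to your own incoming-webhook URL.
SLACK_WEBHOOK_URL="${SLACK_WEBHOOK_URL:-}"

# Build a minimal JSON payload, escaping backslashes and double quotes.
# Newlines and other control characters are not handled in this sketch.
build_slack_payload() {
    local msg="$1"
    msg=${msg//\\/\\\\}
    msg=${msg//\"/\\\"}
    printf '{"text":"%s"}' "$msg"
}

notify() {
    local msg="$1"
    if [ -z "$SLACK_WEBHOOK_URL" ]; then
        echo "Notification skipped (no webhook configured): $msg" >&2
        return 0
    fi
    # -f makes curl return non-zero on HTTP errors; -sS hides progress but keeps errors
    curl -fsS -X POST -H 'Content-Type: application/json' \
        --data "$(build_slack_payload "$msg")" "$SLACK_WEBHOOK_URL" >/dev/null
}
```

A call such as notify "Failed to restart service nginx" could then be added next to the matching log_message line.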
6. Enhanced System Monitoring
- Include additional checks like network status, temperature sensors, etc.
- Use tools like iostat and vmstat for more detailed statistics
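A sketch of how these extra checks might be added without breaking on machines where a tool is missing. The helper names run_if_available and show_detailed_stats are made up for this example; iostat comes from the sysstat package and sensors from lm-sensors, so neither is guaranteed to be installed.

```bash
#!/bin/bash
LOG_FILE="${LOG_FILE:-/var/log/system_maintenance.log}"

# Run a command and log its output, or note that the tool is missing.
run_if_available() {
    local tool="$1"
    if command -v "$tool" >/dev/null 2>&1; then
        "$@" 2>&1 | tee -a "$LOG_FILE"
    else
        echo "$tool not installed, skipping" | tee -a "$LOG_FILE"
    fi
}

show_detailed_stats() {
    echo "I/O statistics:" | tee -a "$LOG_FILE"
    run_if_available iostat -x 1 2      # extended stats, two 1-second samples
    echo "Virtual memory statistics:" | tee -a "$LOG_FILE"
    run_if_available vmstat 1 2         # two 1-second samples
    echo "Network interfaces:" | tee -a "$LOG_FILE"
    run_if_available ip -brief address
    echo "Temperature sensors:" | tee -a "$LOG_FILE"
    run_if_available sensors
}
```

Calling show_detailed_stats after show_system_usage in the main section would append these reports to the same log.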
7. Optional Arguments
- Allow the script to accept arguments for selective execution of tasks
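One way this could look is a small dispatcher that maps options to the existing functions. The option names (--usage, --cache, and so on) are invented for this sketch, and the one-line functions at the top are placeholders standing in for the real ones so the example runs on its own.

```bash
#!/bin/bash
# Placeholders so this sketch is self-contained; in the real script
# these are the functions defined earlier.
show_system_usage()       { echo "show_system_usage"; }
clear_cache()             { echo "clear_cache"; }
update_system()           { echo "update_system"; }
check_failed_services()   { echo "check_failed_services"; }
restart_failed_services() { echo "restart_failed_services"; }

usage() {
    echo "Usage: $0 [--usage] [--cache] [--update] [--services] [--all]"
}

run_tasks() {
    # With no arguments, run everything, matching the original behavior.
    if [ "$#" -eq 0 ]; then
        set -- --all
    fi
    local arg
    for arg in "$@"; do
        case "$arg" in
            --usage)    show_system_usage ;;
            --cache)    clear_cache ;;
            --update)   update_system ;;
            --services) check_failed_services; restart_failed_services ;;
            --all)      show_system_usage; clear_cache; update_system
                        check_failed_services; restart_failed_services ;;
            -h|--help)  usage; return 0 ;;
            *)          echo "Unknown option: $arg" >&2; usage >&2; return 2 ;;
        esac
    done
}
```

The main section would then shrink to a single run_tasks "$@" call, so that sudo ./system_maintenance.sh --usage --services only prints the usage report and handles failed services.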
8. Improved Security
- Use sudo only where necessary
- Ensure temporary files are securely handled
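For the temporary-file point, a common pattern is to create a private directory with mktemp and remove it in an EXIT trap. The original script does not use temporary files, so the REPORT file below is purely illustrative.

```bash
#!/bin/bash
# New files are readable and writable by the owner only
umask 077

# mktemp -d creates a uniquely named directory with mode 0700
TMP_DIR=$(mktemp -d) || { echo "mktemp failed" >&2; exit 1; }

# Remove the directory whenever the script exits, even after an error
cleanup() {
    rm -rf -- "$TMP_DIR"
}
trap cleanup EXIT

# Illustrative use: stage a report before appending it to the log
REPORT="$TMP_DIR/disk_report.txt"
df -h > "$REPORT"
```

This avoids predictable names such as /tmp/report.txt, which another user could create or symlink before the script runs as root.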
9. System Compatibility
- Add compatibility checks for different Linux distributions
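A minimal sketch of such a check, assuming it is enough to look for a known package manager on the PATH. The function name detect_package_manager is invented here, and the list covers only a handful of common distributions.

```bash
#!/bin/bash
# Print the first supported package manager found on this system.
detect_package_manager() {
    local pm
    for pm in apt-get dnf yum zypper pacman; do
        if command -v "$pm" >/dev/null 2>&1; then
            echo "$pm"
            return 0
        fi
    done
    echo "unknown"
    return 1
}

# Distribution-aware variant of update_system()
update_system() {
    case "$(detect_package_manager)" in
        apt-get) apt-get update && apt-get upgrade -y ;;
        dnf)     dnf upgrade -y ;;
        yum)     yum update -y ;;
        zypper)  zypper --non-interactive refresh && zypper --non-interactive update ;;
        pacman)  pacman -Syu --noconfirm ;;
        *)       echo "Unsupported distribution: no known package manager found" >&2
                 return 1 ;;
    esac
}
```

The same idea applies to the service functions: on a system without systemd, a command -v systemctl check could skip them instead of failing.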