Monitoring notifications
An overview of the proactive monitoring notifications
Our advanced monitoring and alerting system allows our support engineers to act quickly and proactively when needed. The following table shows the monitoring alerts that we have implemented. The list keeps growing as we implement new notifications to cover the technology stacks used by our customers.
Notification | Description | Action |
---|---|---|
bounced-percent-warning | 20% of all sent emails bounce back to the server instead of reaching the destination email address | Investigate the cause of the problem and inform the customer so they can take action. |
mysql-error-warning | The MySQL log contains error messages from the last 3 days | Investigate the problem and inform the customer / partner, including the error message or problem details |
mysql-slave-running-critical | MySQL master/slave replication is not running properly, or the slave lags behind for longer than 1 minute | Restore master/slave sync |
nginx-running-critical | NGINX is no longer running | Start NGINX and investigate why it stopped running |
nginx-test-critical | NGINX test (NGINX reload) fails due to errors in the configuration | Investigate the faulty configuration and inform the customer / partner about this |
redis-evicted-keys-info | Redis instance is evicting keys because of memory limits | Investigate which instance runs out of memory and inform the customer / partner about this |
ssl-expire-info | SSL certificate expires in 3 days | Inform customer / partner |
ssl-expire-warning | SSL certificate expires in 1 day | Inform customer / partner |
ssl-expire-critical | SSL certificate expires in 1 hour | Inform customer / partner |
supervisorctl-service | Supervisor process is not running | Investigate why Supervisor is not running and inform the customer / partner about this |
system-cpu-temp-critical | The temperature of the CPU is too high | Make sure the CPU fan gets replaced |
system-disk-smart-disabled-warning | SMART on disk is not available or is disabled | Investigate why SMART is not available. Inform customers / partners if needed |
system-raid-health-critical | Disk is broken or performance does not meet requirements | Organize disk replacement with customer / partner |
system-disk-pending-sector-critical | Disk contains pending sectors and must be replaced | Organize disk replacement with customer / partner |
system-disk-reallocated-sector-count-warning | Disk contains bad sectors and must be replaced | Organize disk replacement with customer / partner |
system-disk-inodes-warning | 95% of the inodes are in use | Approach customer / partner to clean up inodes |
system-disk-inodes-critical | 99% of the inodes are in use | Clean inodes if possible and contact customer / partner about it |
system-disk-usage-warning | Disk usage is 95% | Approach customer / partner to clean up disk usage |
system-disk-usage-critical | Disk usage is 99% | Clean up disk usage if possible and contact customer / partner about it |
system-load-critical | Server is overloaded | Investigate / resolve the problem and contact customer / partner about it |
system-memory-usage-warning | Server uses 95% of the available memory | Investigate / resolve the problem and contact customer / partner about it |
system-memory-usage-critical | Server uses 99% of the available memory | Investigate / resolve the problem and contact customer / partner about it |
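
Several notifications in the table share a prefix and differ only by severity (for example `ssl-expire-info`, `ssl-expire-warning` and `ssl-expire-critical`). The sketch below illustrates how such thresholds could map to a notification name. It is not our actual monitoring code: the function names are hypothetical, and it assumes that a threshold triggers at or above the listed percentage and at or below the listed time to expiry.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Thresholds copied from the table; checked from most to least severe.
# Assumption: "95%" / "99%" mean "at or above" that value.
USAGE_LEVELS = [(99.0, "critical"), (95.0, "warning")]

# Assumption: an SSL level applies once the remaining lifetime is
# at or below the listed window.
SSL_LEVELS = [
    (timedelta(hours=1), "critical"),
    (timedelta(days=1), "warning"),
    (timedelta(days=3), "info"),
]


def usage_severity(percent_used: float) -> Optional[str]:
    """Severity for disk, inode or memory usage, or None below 95%."""
    for threshold, level in USAGE_LEVELS:
        if percent_used >= threshold:
            return level
    return None


def ssl_severity(expires_at: datetime, now: datetime) -> Optional[str]:
    """Severity for a certificate, or None if expiry is over 3 days away."""
    remaining = expires_at - now
    for window, level in SSL_LEVELS:
        if remaining <= window:
            return level
    return None


def notification_name(prefix: str, level: Optional[str]) -> Optional[str]:
    """Join a prefix such as 'system-disk-usage' with a severity level."""
    return f"{prefix}-{level}" if level else None


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    print(notification_name("system-disk-usage", usage_severity(96.5)))
    print(notification_name("ssl-expire", ssl_severity(now + timedelta(hours=12), now)))
```

Checking the most severe threshold first means a value that satisfies several levels is reported only once, at the highest applicable level.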