path: root/roles/nagios_server/templates/etc
Commit message | Author | Age | Files | Lines
* nagios_server: minor config fixes | Stonewall Jackson | 2023-03-03 | 3 | -1/+11
* set max_retries to UINT32_MAX for asterisk registrations | Stonewall Jackson | 2023-02-06 | 2 | -0/+14
  After a brief internet outage, I noticed that asterisk had given up trying to reconnect to my upstream SIP server (looks like the default value of max_retries is 10). Although the asterisk "registration" object was disconnected, the "endpoint" object still reported being up, so the check_asterisk_endpoints nagios plugin did not alert me to the problem.

  This commit sets max_retries to UINT32_MAX for asterisk registrations by default. It also adds a new nagios plugin, check_asterisk_registrations. Unfortunately, the ARI does not expose registrations via the REST API, so I had to write a hacky bash script to parse the asterisk CLI output.
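  The parsing approach described above can be sketched roughly as follows. This is not the bash script from the commit; it is a Python illustration that assumes the CLI output resembles that of `pjsip show registrations`, with one `<name>/<server URI>` row per registration and the literal word `Registered` in the status column. The sample text, the trunk names, and the function names `parse_registrations` and `evaluate` are hypothetical. The exit codes follow the usual Nagios plugin convention (0 OK, 2 CRITICAL, 3 UNKNOWN).

```python
# Nagios plugin exit codes (standard plugin convention).
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3

# Illustrative sample only (assumption); real CLI output may differ by version.
SAMPLE = """\
 <Registration/ServerURI..................>  <Auth..........>  <Status.......>
==============================================================================

 trunk1/sip:sip.example.com                  trunk1            Registered
 trunk2/sip:sip.example.net                  trunk2            Rejected

Objects found: 2
"""


def parse_registrations(text):
    """Return a list of (registration name, is_registered) pairs."""
    rows = []
    for line in text.splitlines():
        tokens = line.split()
        # Data rows start with "<name>/<server URI>"; the header row starts
        # with "<", and separator/summary lines contain no "/" in token 0.
        if not tokens or "/" not in tokens[0] or tokens[0].startswith("<"):
            continue
        name = tokens[0].split("/", 1)[0]
        rows.append((name, "Registered" in tokens[1:]))
    return rows


def evaluate(registrations):
    """Map parsed registrations to a Nagios exit code and status line."""
    if not registrations:
        return UNKNOWN, "UNKNOWN: no registrations found"
    down = [name for name, registered in registrations if not registered]
    if down:
        return CRITICAL, "CRITICAL: not registered: " + ", ".join(down)
    return OK, "OK: %d registration(s) up" % len(registrations)


if __name__ == "__main__":
    code, message = evaluate(parse_registrations(SAMPLE))
    print(message)
    raise SystemExit(code)
```

  In a real plugin the text would come from the asterisk CLI (for example via `asterisk -rx`, which is an assumption about how the original script obtains it) rather than from an embedded sample.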
* don't perform nagios checks during reboot window | Stonewall Jackson | 2023-02-04 | 2 | -3/+3
  Initially, I configured nagios to suspend notifications during the reboot window, to avoid alert noise while hosts were doing automated reboots. Since our nagios sends only a single notification per state change, this resulted in lost alerts when a "real" problem occurred during the window.

  This commit switches the default host template to suspend the checks themselves, rather than the notifications, during the daily reboot window. If the problem still exists once the reboot window passes, we'll get the notification.
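  The reasoning above can be illustrated with a small simulation. It is a deliberately simplified model, not how nagios is implemented: it ignores soft/hard states and retries, assumes exactly one notification per state change, and uses a made-up 04:00 to 05:00 window (the commit does not state the actual window). In nagios terms the change presumably corresponds to restricting the host template's `check_period` instead of its `notification_period`, but that mapping is an assumption.

```python
from datetime import time

# Hypothetical daily reboot window (assumption; not stated in the commit).
WINDOW_START, WINDOW_END = time(4, 0), time(5, 0)


def in_window(t):
    """True if time-of-day t falls inside the reboot window."""
    return WINDOW_START <= t < WINDOW_END


def notifications(events, suspend):
    """Return the times at which a DOWN notification would be sent.

    events:  chronological list of (time, host_is_up) check opportunities.
    suspend: "notifications" or "checks", i.e. what is paused in the window.
    """
    sent = []
    last_state = True  # last state the monitor observed; host starts UP
    for t, is_up in events:
        if suspend == "checks" and in_window(t):
            continue  # check never runs, so the observed state is unchanged
        if is_up != last_state:
            last_state = is_up  # state change is recorded either way
            muted = suspend == "notifications" and in_window(t)
            if not is_up and not muted:
                sent.append(t)
    return sent


# A host fails at 04:30, inside the window, and stays down afterwards.
EVENTS = [(time(3, 55), True), (time(4, 30), False), (time(5, 5), False)]

print(notifications(EVENTS, "notifications"))
print(notifications(EVENTS, "checks"))
```

  With notifications suspended, the state change is observed at 04:30 but muted, and no further change occurs afterwards, so nothing is ever sent. With checks suspended, the first check after the window sees the change and notifies.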
* initial commit | Stonewall Jackson | 2023-02-04 | 12 | -0/+954