After a brief internet outage, I noticed that asterisk had given up
trying to reconnect to my upstream SIP server (looks like the default
value of max_retries is 10).
Although the asterisk "registration" object was disconnected, the
"endpoint" object still reported being up, so the
check_asterisk_endpoints nagios plugin did not alert me to the problem.
This commit sets max_retries to UINT32_MAX for asterisk registrations by
default. It also adds a new nagios plugin, check_asterisk_registrations.
Unfortunately, ARI (the asterisk REST interface) does not expose
registrations, so I had to write a hacky bash script to parse the
asterisk CLI output.
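For illustration only, the parsing step could look roughly like the sketch below. It is written in Python rather than bash and is not the plugin added by this commit. It assumes the table layout printed by `asterisk -rx "pjsip show registrations"`: a `<Header...>` row, a `====` rule, one row per registration with the status in the third column, and an `Objects found:` footer. The function names are hypothetical.

```python
import subprocess

OK, CRITICAL, UNKNOWN = 0, 2, 3  # standard Nagios plugin exit codes


def parse_registrations(cli_output):
    """Return {registration_name: status} from the CLI table (layout assumed)."""
    statuses = {}
    for line in cli_output.splitlines():
        fields = line.split()
        # Data rows are assumed to start with "name/server-uri"; this skips
        # blank lines, the <Header...> row, the ==== rule, and the footer.
        if len(fields) < 3 or "/" not in fields[0] or fields[0].startswith("<"):
            continue
        name = fields[0].split("/", 1)[0]  # "trunk/sip:host" -> "trunk"
        statuses[name] = fields[2]         # assumed: third column is status
    return statuses


def evaluate(statuses):
    """Map parsed statuses to a Nagios (exit_code, message) pair."""
    if not statuses:
        return UNKNOWN, "UNKNOWN: no registrations found"
    down = sorted(n for n, s in statuses.items() if s != "Registered")
    if down:
        return CRITICAL, "CRITICAL: not registered: " + ", ".join(down)
    return OK, "OK: %d registration(s) up" % len(statuses)


def main():
    """Run the CLI command, print the Nagios message, return the exit code."""
    result = subprocess.run(
        ["asterisk", "-rx", "pjsip show registrations"],
        capture_output=True, text=True,
    )
    code, message = evaluate(parse_registrations(result.stdout))
    print(message)
    return code
```

A script entry point would call `sys.exit(main())`. Anything other than "Registered" is treated as critical, which is a deliberate simplification.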
|
Initially, I configured nagios to suspend notifications during the reboot
window, to avoid alert noise while hosts were doing automated reboots.
Since our nagios only sends a single notification per state change,
a "real" problem that starts during the window never produces an alert,
even if it persists after the window ends.
This commit switches the default host template to suspend the checks
themselves, rather than the notifications, during the daily reboot
window. If the problem still exists once the reboot window passes,
we'll get the notification.
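The difference between the two approaches can be seen in a toy model. The sketch below is not how nagios is implemented (it ignores soft/hard states and retries) and the function name, window, and check times are made up. It only assumes what is described above: one notification per observed state change.

```python
def notifications(check_results, window, suspend):
    """Toy model: one notification per observed state change.

    check_results: list of (minute, state) pairs, state is "UP" or "DOWN".
    window: (start, end) of the reboot window in minutes, end exclusive.
    suspend: "notifications" or "checks".
    Returns the (minute, state) notifications that actually get sent.
    """
    sent = []
    last_state = "UP"
    for minute, state in check_results:
        in_window = window[0] <= minute < window[1]
        if suspend == "checks" and in_window:
            continue  # the check never runs, so no state change is recorded
        if state != last_state:
            last_state = state
            # With notifications suspended, the state change is still
            # recorded, but its one and only notification is dropped.
            if not (suspend == "notifications" and in_window):
                sent.append((minute, state))
    return sent


# Hypothetical host that goes down inside a 180-240 minute window and stays down.
stuck = [(150, "UP"), (180, "DOWN"), (210, "DOWN"), (240, "DOWN"), (270, "DOWN")]
lost = notifications(stuck, (180, 240), "notifications")
caught = notifications(stuck, (180, 240), "checks")
```

Here `lost` is empty, because the only state change happened while notifications were suspended, while `caught` contains the DOWN notification from the first check after the window.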