notify_devilry acts as a router between local server notification sources and remote service destinations.
The name began as a typo of the word "delivery", but it remains part of the name now :)
notify_devilry helps to:
- Route server notifications (monitoring alerts, events etc.) to humans via chats and to services via APIs.
- Decentralize server monitoring (all servers send to chats independently with no central hub to avoid central monitoring service failure or misconfiguration).
- Centralize server monitoring alerts and events in a central monitoring service.
- Apply rule chains depending on whether notification keys match, see notify_devilry.yaml.example.
- Rule chains can apply actions to notifications:
  - Set notification key values, e.g. to set severity for specific resources or clients.
  - Rate limit notifications (useful for human destinations to avoid noise in chats).
  - Suppress notifications.
  - Send notifications to several destinations, including other rules.
- YAML config is rendered with Jinja2 for each notification, for example to modify the flow outside working hours.
All sysadmws-utils components use notify_devilry to send notifications.
notify_devilry mostly follows the alerta Design Principles and Conventions, to stay compatible with alerta as one of the possible API endpoints.
Local server notification sources should send notifications as JSON via stdin.
Remote destination services supported:
Notification keys (* - mandatory keys):
- severity* - according to the alerta severity map:
  - fatal
  - security
  - critical
  - major
  - minor
  - warning
  - ok
  - normal
  - cleared
  - indeterminate
  - informational
  - debug
  - trace
  - unknown
- client - could be used to choose different alerta destinations with different per customer keys, different chat_ids in telegram; default from config will apply if omitted
- environment - default from config will apply if omitted:
  - prod
  - staging
  - dev
  - infra
  - legacy
- service (list in alerta, string here) and resource*:
  - server
    - srv1.example.com
  - disk
    - srv1.example.com:/
    - srv1.example.com:/mnt/partition
  - database
    - srv1.example.com:mysql
  - heartbeat
    - srv1.example.com
  - website
    - https://example.com/
  - dns
    - example.com
- event*:
  - notify_devilry_test
  - notify_devilry_ok
  - notify_devilry_critical
  - cmd_check_alert_cmd_ok
  - cmd_check_alert_cmd_retcode_not_zero
  - cmd_check_alert_cmd_timeout
  - cmd_check_alert_time_limit_ok
  - cmd_check_alert_time_limit_warning
- value:
  - 30s
  - 60s
  - 99%
  - 5d
  - 10d
- group - we use this alerta ui filter for host selection:
  - srv1.example.com - host fqdn here for host specific alerts
  - dns - something more generic for non-host specific alerts
- origin:
  - heartbeat_mesh/receiver.py
  - disk_alert.sh
  - mysql_replica_checker.sh
  - website_checker
- attributes - any key value pair with additional data:
  - datetime: 1970-01-01 00:00:00 +0000 UTC
  - location: Hetzner
- text:
  - host heartbeat lost for 30 seconds
  - mysql slave is 60 seconds behind master
  - partition is 99% full, 100 Mb available
  - certificate will expire in 5 days
  - domain will expire in 10 days
- timeout - override default alert timeout value
- type - for alerta:
  - sysadmws-utils
- correlate - for alerta Correlation:
  - [notify_devilry_test, notify_devilry_critical, notify_devilry_ok]
  - [
- force_send - set to True if program run with --force-send, can be used in match filter
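As an illustration of the keys above, the following Python sketch assembles a notification and serializes it to JSON, ready to be written to the stdin of notify_devilry. The helper name `build_notification` is hypothetical, and treating `severity`, `resource` and `event` as the complete mandatory set is an assumption read from the asterisks in the list; this is not code from notify_devilry itself.

```python
import json

# Keys marked with * in the list above; assumed here to be the full mandatory set.
MANDATORY_KEYS = ("severity", "resource", "event")

# Severity names from the alerta severity map listed above.
SEVERITIES = {
    "fatal", "security", "critical", "major", "minor", "warning", "ok",
    "normal", "cleared", "indeterminate", "informational", "debug",
    "trace", "unknown",
}


def build_notification(**keys):
    """Return a notification dict after basic checks (hypothetical helper)."""
    missing = [k for k in MANDATORY_KEYS if k not in keys]
    if missing:
        raise ValueError("missing mandatory keys: " + ", ".join(missing))
    if keys["severity"] not in SEVERITIES:
        raise ValueError("unknown severity: " + str(keys["severity"]))
    return dict(keys)


notification = build_notification(
    severity="critical",
    service="disk",
    resource="srv1.example.com:/mnt/partition",
    event="notify_devilry_test",
    value="99%",
    group="srv1.example.com",
    origin="disk_alert.sh",
    text="partition is 99% full, 100 Mb available",
    attributes={"location": "Hetzner"},
)

# JSON document to pipe into notify_devilry via stdin.
payload = json.dumps(notification)
print(payload)
```

Optional keys such as `client` and `environment` are left out here so that the defaults from the config apply.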
notify_devilry can rate limit notifications with rate_limit.
Rate limiting has little practical value when routing to alerta, which has its own De-Duplication mechanism, but it helps humans receive fewer messages in chats.
It uses the environment, resource, event and severity notification keys to detect similar notifications.
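The grouping by these four keys can be sketched in Python as follows. The `RateLimiter` class, its `period` parameter and the in-memory dict are assumptions made for illustration; how notify_devilry actually stores state and how `rate_limit` is configured is not shown here.

```python
import time

# Keys named above as the ones used to detect similar notifications.
RATE_LIMIT_KEYS = ("environment", "resource", "event", "severity")


class RateLimiter:
    """Allow one notification per key tuple within `period` seconds (illustrative)."""

    def __init__(self, period, clock=time.monotonic):
        self.period = period      # seconds during which similar notifications are dropped
        self.clock = clock        # injectable time source, handy for testing
        self.last_sent = {}       # key tuple -> time of the last allowed notification

    def allow(self, notification):
        """Return True if the notification should be sent, False if rate limited."""
        key = tuple(notification.get(k) for k in RATE_LIMIT_KEYS)
        now = self.clock()
        last = self.last_sent.get(key)
        if last is not None and now - last < self.period:
            return False
        self.last_sent[key] = now
        return True
```

In this sketch two notifications that differ only in `severity` are tracked separately, so a change from `critical` to `ok` for the same resource is not swallowed by the limit.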
chains:
  # All chains are processed on new notification, specific when targeted with `jump`
  chain1:
    # Chain of rules, rules are processed one by one in this list, modifications of the notifications on each rule persist
    - name: rule example 1 # required, just a name to reference the rule and describe it
      entrypoint: True # optional, catch new incoming notification if True and in first rule in chain
      match: # optional, apply this rule only if notification matches
        key1: # key name in notification to match
          in|not_in: # match type
            - value1 # values list to match
            - value2
        key2:
          ...
      set: # optional, set notification key to value
        key1: value1
        key2: value2
      jump: # optional, list of chains to process the notification, modifications of the notifications in one chain do not persist to next chain in this list
        - chain2
        - chain3
      send: # optional
        alerta: # send alert to `alerta` aliases
          - alerta_alias1
          - alerta_alias2
        telegram: # send message to `telegram` aliases
          - telegram_alias1
          - telegram_alias2
      suppress: True # optional, do nothing actually
      chain_break: True # optional, stop processing rules in this chain
Each rule must contain exactly one of set, jump, send or suppress.
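The `match` semantics and the one-action constraint can be sketched in Python as below. The function names `rule_action` and `rule_matches` are hypothetical, and two behaviours are assumptions rather than documented facts: all keys under `match` must be satisfied together, and a key missing from the notification fails an `in` test but passes a `not_in` test.

```python
# The four mutually exclusive actions a rule may carry.
ACTIONS = ("set", "jump", "send", "suppress")


def rule_action(rule):
    """Return the single action key of a rule, or raise if there is not exactly one."""
    present = [a for a in ACTIONS if a in rule]
    if len(present) != 1:
        raise ValueError(
            "rule %r must have exactly one of %s, found %s"
            % (rule.get("name"), ", ".join(ACTIONS), present)
        )
    return present[0]


def rule_matches(rule, notification):
    """Check every key under `match` against its `in` or `not_in` value list."""
    for key, condition in rule.get("match", {}).items():
        value = notification.get(key)  # None when the key is absent (assumption)
        if "in" in condition and value not in condition["in"]:
            return False
        if "not_in" in condition and value in condition["not_in"]:
            return False
    return True  # a rule without `match` applies to every notification
```

For example, a rule with `match: {severity: {in: [critical, major]}}` and a `send` block yields `"send"` from `rule_action` and matches a notification whose `severity` is `critical`, while a rule carrying both `set` and `send` is rejected.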