In this tutorial, we explain how to set up an alerting system for Graphite metrics. Graphite is a monitoring tool that runs on a local system or on cloud infrastructure. It is a powerful tool for collecting and visualizing time-series data, but having data is not enough; you also need to respond to it. An alerting system automates that response by notifying you when your metrics change.
Graphite can monitor the performance of services, applications, websites, and networks, and it makes time-series data easy to store, retrieve, share, and visualize.
The alerting tool used here is graphite-beacon. The details below are drawn from the klen/graphite-beacon project.
Features of graphite-beacon:
- It is simple and easy to install
- It has no dependency on other software such as a database
- It is asynchronous
- It supports alerting through SMTP, HipChat, Slack, PagerDuty, and HTTP handlers
- Rules are easy to configure against historical values
Pre-requisites:
- Python (2.7, 3.3, 3.4)
- tornado
- funcparserlib
- pyyaml
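Before installing, it can help to confirm that the prerequisite modules are available to the interpreter you plan to use. The following is a minimal sketch, assuming Python 3.4 or later (it relies on `importlib.util.find_spec`); the helper name `missing_modules` is hypothetical and not part of graphite-beacon.

```python
import importlib.util
import sys

# Import names of the prerequisites listed above. Note that the
# pyyaml package is imported under the name "yaml".
REQUIRED_MODULES = ["tornado", "funcparserlib", "yaml"]


def missing_modules(names=REQUIRED_MODULES):
    """Return the names that cannot be found by this interpreter."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    missing = missing_modules()
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All prerequisite modules are importable.")
```

Installing graphite-beacon with pip normally pulls in these dependencies, so the check is mostly useful when troubleshooting a broken environment.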
How to Set Up an Alerting System for Graphite Metrics
Install graphite-beacon using the pip command:
pip install graphite-beacon
Debian package
Run the following commands to append the repository to your /etc/apt/sources.list system config file:
echo "deb http://dl.bintray.com/klen/deb /" | sudo tee -a /etc/apt/sources.list
echo "deb-src http://dl.bintray.com/klen/deb /" | sudo tee -a /etc/apt/sources.list
Install the graphite-beacon package using apt-get:
sudo apt-get update
sudo apt-get install graphite-beacon
You can set options in a configuration file.
Keep the config.json file in the directory where you run the graphite-beacon command, or pass its path with the --config option.
JSON example:
// Comments are allowed here
{
    "interval": "10minute",
    "logging": "info",
    "critical_handlers": ["log"],
    "warning_handlers": ["log"],
    "normal_handlers": ["log"],
    // "graphite_url": "http://<your-graphite-url>",
    "alerts": [
        // A graphite alert - be sure to set `graphite_url` appropriately.
        {
            "name": "Memory",
            "query": "aliasByNode(collectd.*.memory.memory-free, 1)",
            "interval": "10minute",
            "format": "bytes",
            "rules": ["warning: < 500MB", "critical: < 200MB"]
        },
        // A ping alert
        {
            "name": "Site",
            "source": "url",
            "query": "http://google.com",
            "interval": "20second",
            "rules": ["critical: != 200"]
        }
    ]
}
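If you prefer to generate the configuration rather than edit it by hand, a short script can write it for you. The sketch below is one possible approach, not part of graphite-beacon: the names `CONFIG` and `write_config` are hypothetical, the `graphite_url` value is a placeholder equal to the documented default, and the Memory alert treats free memory below 200MB as critical. The output is strict JSON, so it contains no comments.

```python
import json

# Settings that mirror the JSON example above. The graphite_url value is a
# placeholder (the documented default); point it at your own Graphite server.
CONFIG = {
    "interval": "10minute",
    "logging": "info",
    "critical_handlers": ["log"],
    "warning_handlers": ["log"],
    "normal_handlers": ["log"],
    "graphite_url": "http://localhost",
    "alerts": [
        {
            "name": "Memory",
            "query": "aliasByNode(collectd.*.memory.memory-free, 1)",
            "interval": "10minute",
            "format": "bytes",
            "rules": ["warning: < 500MB", "critical: < 200MB"],
        },
        {
            "name": "Site",
            "source": "url",
            "query": "http://google.com",
            "interval": "20second",
            "rules": ["critical: != 200"],
        },
    ],
}


def write_config(path, settings):
    """Write the settings as indented JSON to the given path."""
    with open(path, "w") as handle:
        json.dump(settings, handle, indent=2)
        handle.write("\n")
```

Calling `write_config("config.json", CONFIG)` in the directory where you run graphite-beacon produces a file the tool can pick up by default.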
How to Set Up Alerts in Graphite-beacon:
graphite-beacon currently supports two types of alerts:
- Graphite alert (default) – checks Graphite metrics
- URL alert – loads a URL over HTTP and checks the response status (selected with "source": "url")
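To make the rule syntax concrete, here is a simplified illustration of how a rule such as "critical: != 200" can be read as a level, a comparison operator, and a threshold. This is a sketch for explanation only and is not graphite-beacon's own parser: it handles plain numeric thresholds only (no units such as MB, no historical values), the operator table goes beyond the operators shown in this article, and returning the first matching rule is an assumption of the sketch.

```python
import operator

# Comparison operators accepted by this sketch (an assumption; the
# article itself only shows <, > and !=).
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
    "!=": operator.ne,
}


def parse_rule(rule):
    """Split 'level: op threshold' into (level, compare function, threshold)."""
    level, _, condition = rule.partition(":")
    op_symbol, threshold = condition.split()
    return level.strip(), OPERATORS[op_symbol], float(threshold)


def triggered_level(value, rules):
    """Return the level of the first rule that matches, or 'normal'."""
    for rule in rules:
        level, compare, threshold = parse_rule(rule)
        if compare(value, threshold):
            return level
    return "normal"
```

For a URL alert, `triggered_level(404, ["critical: != 200"])` evaluates to "critical", while a 200 response leaves the alert at "normal".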
Historical Values
graphite-beacon supports “historical” values for a rule.
For example, to get a warning when CPU usage is greater than 150% of normal usage, set the rule as follows:
"warning: > historical * 1.5"
To get a warning when a memory metric falls below half of its historical value:
"warning: < historical / 2"
Historical values for each query are kept. A historical value represents the average of all values in history.
Note:
- Rules using a historical value will only work after enough values have been collected (see history_size).
- History values are kept for 1 day by default. You can change this with the history_size option.
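The behaviour described above (an average over stored values, and rules that only apply once enough values exist) can be sketched as follows. This is an illustrative model rather than graphite-beacon's implementation: the `History` class and `exceeds_historical` function are hypothetical names, and the history is sized by a count of values, whereas history_size in the configuration is a time span.

```python
from collections import deque


class History:
    """Keep the most recent max_values readings for one query."""

    def __init__(self, max_values):
        if max_values < 1:
            raise ValueError("max_values must be at least 1")
        self.values = deque(maxlen=max_values)

    def add(self, value):
        self.values.append(value)

    def is_full(self):
        return len(self.values) == self.values.maxlen

    def average(self):
        return sum(self.values) / len(self.values)


def exceeds_historical(value, history, factor=1.5):
    """Model of 'warning: > historical * 1.5'.

    Returns False until the history is full, mirroring the note that
    historical rules only work after enough values are collected.
    """
    if not history.is_full():
        return False
    return value > history.average() * factor
```

With a history of three readings 10, 20 and 30, the average is 20, so any new value above 30 would trigger the warning in this model.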
The following fragment of an alert definition sends a warning when today's new user count is less than 80% of the average over the last 10 days:
// Get average for last 10 days
"history_size": "10day",
"rules": ["warning: < historical * 0.8"]
Handlers in Graphite-beacon:
Handlers allow for notifying an external service or process of an alert firing.
Email Handler
Sends an email (enabled by default).
{
    // SMTP default options
    "smtp": {
        "from": "mail-id",
        "to": ["mention-mail-id"],      // List of email addresses to send to
        "host": "your-smtp-host",       // SMTP host
        "port": 25,                     // SMTP port
        "username": "your-mail-id",     // SMTP user (optional)
        "password": "mail-id-password", // SMTP password (optional)
        "use_tls": false,               // Use TLS?
        "html": true,                   // Send HTML emails?

        // Graphite link for emails (By default is equal to main graphite_url)
        "graphite_url": null
    }
}
HipChat Handler
Sends a message to a HipChat room.
{
    "hipchat": {
        // (optional) Custom HipChat URL
        "url": "https://HIPCHAT-URL",
        "room": "myroom",
        "key": "mykey"
    }
}
Webhook Handler (HTTP)
Triggers a webhook.
{
    "http": {
        "url": "http://YOUR-WEBHook.com",
        "params": {},     // (optional) Additional query(data) params
        "method": "GET"   // (optional) HTTP method
    }
}
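To try the webhook handler locally, you need something that accepts the request. The sketch below is a minimal receiver built on Python's standard `http.server` module, assuming the handler is left on the GET method. The parameters graphite-beacon sends are not listed in this article, so the receiver simply records whatever query parameters arrive; `parse_alert_params`, `AlertWebhook`, and `make_server` are hypothetical names.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import parse_qs, urlparse

# Query parameters of every request received so far.
received = []


def parse_alert_params(path):
    """Return the query parameters of a request path as a flat dict."""
    query = parse_qs(urlparse(path).query)
    return {key: values[0] for key, values in query.items()}


class AlertWebhook(BaseHTTPRequestHandler):
    def do_GET(self):
        # Record the parameters and acknowledge the request.
        received.append(parse_alert_params(self.path))
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.end_headers()
        self.wfile.write(b"ok\n")

    def log_message(self, format, *args):
        # Keep the console quiet; inspect `received` instead.
        pass


def make_server(port=8080):
    """Create (but do not start) a receiver bound to localhost."""
    return HTTPServer(("127.0.0.1", port), AlertWebhook)
```

Running `make_server(8080).serve_forever()` and setting the handler's "url" to `http://127.0.0.1:8080/` should let you watch alerts arrive; the port is an arbitrary choice.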
Slack Handler
Sends a message to a user or channel on Slack.
{
    "slack": {
        "webhook": "https://your-slack-url/...",
        "channel": "#yourchannel-name",   // #channel or @user (optional)
        "username": "graphite-beacon"
    }
}
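Before relying on the Slack handler, it can be useful to confirm that the webhook URL accepts messages. The following sketch builds a test request using only the standard library, assuming a Slack incoming webhook that accepts a JSON body with a `text` field; `build_slack_request` is a hypothetical helper, and whether the `channel` and `username` overrides are honoured depends on how the webhook was created.

```python
import json
import urllib.request


def build_slack_request(webhook, text, channel=None, username="graphite-beacon"):
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    payload = {"text": text, "username": username}
    if channel:
        # "#channel" or "@user", as in the handler configuration.
        payload["channel"] = channel
    return urllib.request.Request(
        webhook,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen(request, timeout=10)` sends the message; nothing is sent until you do so.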
Command Line Handler
Runs a command.
{
    "cli": {
        // Command to run (required)
        // Several variables that will be substituted by values are allowed:
        //  ${level} -- alert level
        //  ${name} -- alert name
        //  ${value} -- current metrics value
        //  ${limit_value} -- metrics limit value
        "command": "./myscript ${level} ${name} ${value} ...",

        // Whitelist of alerts that will trigger this handler (optional)
        // All alerts will trigger this handler if absent.
        "alerts_whitelist": ["..."]
    }
}
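As an illustration of what the command behind this handler might look like, here is a sketch of a small script that accepts the level, name, and value as positional arguments and prints one log line. The argument order follows the example command above; the function names and the output format are hypothetical.

```python
import sys


def format_alert(argv):
    """Build one log line from [level, name, value] arguments."""
    if len(argv) != 3:
        raise ValueError("expected arguments: LEVEL NAME VALUE")
    level, name, value = argv
    return "[{}] {} = {}".format(level.upper(), name, value)


def main(argv):
    """Print the formatted alert; return a process exit code."""
    try:
        print(format_alert(argv))
    except ValueError as error:
        print(error, file=sys.stderr)
        return 2
    return 0
```

To use it as the handler's command, save it as an executable file and end it with the usual `if __name__ == "__main__": sys.exit(main(sys.argv[1:]))` footer.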
PagerDuty Handler
Triggers a PagerDuty incident.
{
    "pagerduty": {
        "subdomain": "yoursubdomain",
        "apitoken": "apitoken",
        "service_key": "servicekey"
    }
}
Command Line Usage
$ graphite-beacon --help
Usage: graphite-beacon [OPTIONS]

Options:

  --config                  Path to an configuration file (JSON/YAML) (default config.json)
  --graphite_url            Graphite URL (default http://localhost)
  --help                    show this help information
  --pidfile                 Set pid file
  --log_file_max_size       max size of log files before rollover (default 100000000)
  --log_file_num_backups    number of log files to keep (default 10)
  --log_file_prefix=PATH    Path prefix for log files. Note that if you are running multiple tornado processes, log_file_prefix must be different for each of them (e.g. include the port number)
  --log_to_stderr           Send log output to stderr
  --logging=debug|info|warning|error|none
                            Set the Python log level. If 'none', tornado won't touch the logging configuration. (default info)
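If you start graphite-beacon from a supervisor script, the options above can be assembled programmatically. The sketch below only builds the argument list from options shown in the help output, using the `--name=value` form that the help text itself uses for `--log_file_prefix`; `build_command` is a hypothetical helper.

```python
def build_command(config_path="config.json", graphite_url=None, pidfile=None):
    """Return the graphite-beacon command line as a list of arguments."""
    command = ["graphite-beacon", "--config={}".format(config_path)]
    if graphite_url:
        command.append("--graphite_url={}".format(graphite_url))
    if pidfile:
        command.append("--pidfile={}".format(pidfile))
    return command
```

The resulting list can be handed to `subprocess.run(...)` once graphite-beacon is installed and on your PATH.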
This tutorial covered how to set up an alerting system for Graphite metrics using graphite-beacon.