This is archived documentation for InfluxData product versions that are no longer maintained. For newer documentation, see the latest InfluxData documentation.
An AlertNode can trigger an event of varying severity levels and pass the event to alert handlers. The criteria for triggering an alert are specified via lambda expressions. See AlertNode.Info, AlertNode.Warn, and AlertNode.Crit below.
Different event handlers can be configured for each AlertNode. Some handlers, such as Email, HipChat, Sensu, Slack, OpsGenie, VictorOps, PagerDuty, and Talk, have a configuration option 'global' that makes all alerts use the handler implicitly.
Available event handlers:
- log – log alert data to file.
- post – HTTP POST data to a specified URL.
- email – Send an email with alert data.
- exec – Execute a command passing alert data over STDIN.
- HipChat – Post alert message to HipChat room.
- Alerta – Post alert message to Alerta.
- Sensu – Post alert message to Sensu client.
- Slack – Post alert message to Slack channel.
- OpsGenie – Send alert to OpsGenie.
- VictorOps – Send alert to VictorOps.
- PagerDuty – Send alert to PagerDuty.
- Talk – Post alert message to Talk client.
See below for more details on configuring each handler.
Each event that gets sent to a handler contains the following alert data:
- ID – the ID of the alert, user defined.
- Message – the alert message, user defined.
- Details – the alert details, user defined HTML content.
- Time – the time the alert occurred.
- Level – one of OK, INFO, WARNING or CRITICAL.
- Data – influxql.Result containing the data that triggered the alert.
Events are sent to handlers if the alert is in a state other than 'OK' or if the alert just changed to the 'OK' state from a non-'OK' state (that is, the alert recovered). With the AlertNode.StateChangesOnly property, events are sent to handlers only if the alert changed state.
It is valid to configure multiple alert handlers, even of the same type.
Example:
stream
.groupBy('service')
|alert()
.id('kapacitor/{{ index .Tags "service" }}')
.message('{{ .ID }} is {{ .Level }} value:{{ index .Fields "value" }}')
.info(lambda: "value" > 10)
.warn(lambda: "value" > 20)
.crit(lambda: "value" > 30)
.post('http://example.com/api/alert')
.post('http://another.example.com/api/alert')
.email().to('oncall@example.com')
It is assumed that each successive level filters a subset of the previous level. As a result, the filter will only be applied if a data point passed the previous level. In the above example, if value = 15 then the INFO and WARNING expressions would be evaluated, but not the CRITICAL expression. Each expression maintains its own state.
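The cascading evaluation described above can be sketched in Python. This is an illustrative model only, not Kapacitor's implementation: the name `evaluate_level` and the list-of-predicates representation are assumptions made for the example, and the per-expression state mentioned above is not modeled.

```python
from typing import Callable, List, Tuple

# Ordered from least to most severe, mirroring .info(), .warn(), .crit().
Levels = List[Tuple[str, Callable[[float], bool]]]

LEVELS: Levels = [
    ("INFO", lambda value: value > 10),
    ("WARNING", lambda value: value > 20),
    ("CRITICAL", lambda value: value > 30),
]


def evaluate_level(value: float, levels: Levels = LEVELS):
    """Return (level, evaluated) for one data point.

    Each expression is evaluated only if the point passed the previous one,
    so evaluation stops at the first expression that returns False.
    """
    level = "OK"
    evaluated = []
    for name, predicate in levels:
        evaluated.append(name)
        if not predicate(value):
            break
        level = name
    return level, evaluated


print(evaluate_level(15))  # ('INFO', ['INFO', 'WARNING'])
```

With value = 15 the CRITICAL predicate is never called, which matches the behavior described for the example above.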
Index
Properties
- Alerta
- Crit
- Details
- Email
- Exec
- Flapping
- HipChat
- History
- Id
- Info
- Log
- Message
- OpsGenie
- PagerDuty
- Post
- Sensu
- Slack
- StateChangesOnly
- Talk
- VictorOps
- Warn
Chaining Methods
- Deadman
- Stats
Properties
Property methods modify state on the calling node. They do not add another node to the pipeline, and always return a reference to the calling node. Property methods are marked using the . operator.
Alerta
Send the alert to Alerta.
Example:
[alerta]
enabled = true
url = "https://alerta.yourdomain"
token = "9hiWoDOZ9IbmHsOTeST123ABciWTIqXQVFDo63h9"
environment = "Production"
origin = "Kapacitor"
To avoid posting a message at every alert interval, use AlertNode.StateChangesOnly so that only events where the alert changed state are sent to Alerta.
Send alerts to Alerta. The resource and event properties are required.
Example:
stream
|alert()
.alerta()
.resource('Hostname or service')
.event('Something went wrong')
Alerta also accepts optional alert information.
Example:
stream
|alert()
.alerta()
.resource('Hostname or service')
.event('Something went wrong')
.environment('Development')
.group('Dev. Servers')
NOTE: Alerta cannot be configured globally because of its required properties.
node.alerta()
Alerta Environment
Alerta environment. Can be a template and has access to the same data as the AlertNode.Details property. Default is set from the configuration.
node.alerta()
.environment(value string)
Alerta Group
Alerta group. Can be a template and has access to the same data as the AlertNode.Details property. Default: {{ .Group }}
node.alerta()
.group(value string)
Alerta Origin
Alerta origin. If empty uses the origin from the configuration.
node.alerta()
.origin(value string)
Alerta Resource
Alerta resource. Can be a template and has access to the same data as the AlertNode.Details property. Default: {{ .Name }}
node.alerta()
.resource(value string)
Alerta Token
Alerta authentication token. If empty uses the token from the configuration.
node.alerta()
.token(value string)
Alerta Value
Alerta value. Can be a template and has access to the same data as the AlertNode.Details property. Default is an empty string.
node.alerta()
.value(value string)
Crit
Filter expression for the CRITICAL alert level. An empty value indicates the level is invalid and is skipped.
node.crit(value tick.Node)
Details
Template for constructing a detailed HTML message for the alert. The same template data is available as the AlertNode.Message property, in addition to a Message field that contains the rendered Message value.
The intent is that the Message property be a single line summary while the Details property is a more detailed message possibly spanning multiple lines, and containing HTML formatting.
This template is rendered using the html/template package in Go so that safe and valid HTML can be generated.
The json method is available within the template to convert any variable to a valid JSON string.
Example:
|alert()
.id('{{ .Name }}')
.details('''
<h1>{{ .ID }}</h1>
<b>{{ .Message }}</b>
Value: {{ index .Fields "value" }}
''')
.email()
Default: {{ json . }}
node.details(value string)
Email
Email the alert data.
If the To list is empty, the To addresses from the configuration are used. The email subject is the AlertNode.Message property. The email body is the AlertNode.Details property. The emails are sent as HTML emails, so the body can contain HTML markup.
If the 'smtp' section in the configuration has the option: global = true then all alerts are sent via email without the need to explicitly state it in the TICKscript.
Example:
|alert()
.id('{{ .Name }}')
// Email subject
.message('{{ .ID }}:{{ .Level }}')
//Email body as HTML
.details('''
<h1>{{ .ID }}</h1>
<b>{{ .Message }}</b>
Value: {{ index .Fields "value" }}
''')
.email()
Send an email with custom subject and body.
Example:
[smtp]
enabled = true
host = "localhost"
port = 25
username = ""
password = ""
from = "kapacitor@example.com"
to = ["oncall@example.com"]
# Set global to true so that all alerts trigger emails.
global = true
state-changes-only = true
Example:
stream
|alert()
Send email to 'oncall@example.com' from 'kapacitor@example.com'
node.email(to ...string)
Exec
Execute a command whenever an alert is triggered and pass the alert data over STDIN in JSON format.
node.exec(executable string, args ...string)
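A command run by this handler only needs to read one JSON document from STDIN. The following is a minimal Python sketch of such a command's core logic. The lowercase JSON keys ("id", "level", "message") are an assumption about how the ID, Level, and Message alert data are encoded, and the function name `summarize_alert` and the sample data are made up for illustration.

```python
import io
import json


def summarize_alert(stream) -> str:
    """Read one JSON alert document from a file-like object and summarize it.

    The lowercase keys ("id", "level", "message") are an assumption about the
    JSON encoding of the ID, Level, and Message alert data.
    """
    alert = json.load(stream)
    return "[{}] {}: {}".format(
        alert.get("level", "UNKNOWN"),
        alert.get("id", "<no id>"),
        alert.get("message", ""),
    )


# Simulate the STDIN stream with made-up sample data.
sample = io.StringIO(
    '{"id": "kapacitor/authentication", "level": "CRITICAL", '
    '"message": "kapacitor/authentication is CRITICAL"}'
)
print(summarize_alert(sample))
```

In a real handler script, the function would be called with sys.stdin instead of the simulated stream, and the script would be passed to exec as the executable or as an argument to an interpreter.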
Flapping
Perform flap detection on the alerts. The method used is similar to the one used by Nagios: https://assets.nagios.com/downloads/nagioscore/docs/nagioscore/3/en/flapping.html
Each different alerting level is considered a different state.
The low and high thresholds are inverted thresholds of a percentage of state changes: if the percentage of state changes goes above the high threshold, the alert enters a flapping state. The alert remains in the flapping state until the percentage of state changes goes below the low threshold.
Typical values are low: 0.25 and high: 0.5. The percentage values represent the number of state changes over the total possible number of state changes. A percentage change of 0.5 means that the alert changed state in half of the recorded history and remained the same in the other half.
node.flapping(low float64, high float64)
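The threshold behavior described above is a form of hysteresis, which can be sketched in Python as follows. This is a simplified model under stated assumptions: every state change is weighted equally (the actual implementation, like Nagios, may weight changes differently), and the class name `FlapDetector` is hypothetical. The `history` argument plays the role of the AlertNode.History property.

```python
from collections import deque


class FlapDetector:
    """Hysteresis on the fraction of state changes in a bounded history."""

    def __init__(self, low: float = 0.25, high: float = 0.5, history: int = 21):
        if history < 2:
            raise ValueError("history must be at least 2")
        self.low = low
        self.high = high
        self.states = deque(maxlen=history)
        self.flapping = False

    def change_ratio(self) -> float:
        """State changes observed divided by the possible number of changes."""
        possible = self.states.maxlen - 1
        states = list(self.states)
        changes = sum(1 for a, b in zip(states, states[1:]) if a != b)
        return changes / possible

    def update(self, level: str) -> bool:
        """Record a new alert level and return whether the alert is flapping."""
        self.states.append(level)
        ratio = self.change_ratio()
        if not self.flapping and ratio > self.high:
            self.flapping = True
        elif self.flapping and ratio < self.low:
            self.flapping = False
        return self.flapping


detector = FlapDetector(low=0.25, high=0.5, history=5)
for level in ["OK", "CRITICAL", "OK", "CRITICAL"]:
    flapping = detector.update(level)
print(flapping)  # True: 3 of 4 possible changes is above the high threshold
```

Because the alert leaves the flapping state only once the ratio drops below the low threshold, a ratio that hovers between the two thresholds does not toggle the flapping state back and forth.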
HipChat
If the 'hipchat' section in the configuration has the option: global = true then all alerts are sent to HipChat without the need to explicitly state it in the TICKscript.
Example:
[hipchat]
enabled = true
url = "https://orgname.hipchat.com/v2/room"
room = "Test Room"
token = "9hiWoDOZ9IbmHsOTeST123ABciWTIqXQVFDo63h9"
global = true
state-changes-only = true
Example:
stream
|alert()
Send alert to HipChat using default room 'Test Room'.
node.hipChat()
HipChat Room
HipChat room in which to post messages. If empty uses the room from the configuration.
node.hipChat()
.room(value string)
HipChat Token
HipChat authentication token. If empty uses the token from the configuration.
node.hipChat()
.token(value string)
History
Number of previous states to remember when computing flapping levels and checking for state changes. Minimum value is 2 in order to keep track of current and previous states.
Default: 21
node.history(value int64)
Id
Template for constructing a unique ID for a given alert.
Available template data:
- Name – Measurement name.
- TaskName – The name of the task.
- Group – Concatenation of all group-by tags of the form [key=value,]+. If no groupBy is performed, this is equal to the literal 'nil'.
- Tags – Map of tags. Use '{{ index .Tags "key" }}' to get a specific tag value.
Example:
stream
|from()
.measurement('cpu')
.groupBy('cpu')
|alert()
.id('kapacitor/{{ .Name }}/{{ .Group }}')
ID: kapacitor/cpu/cpu=cpu0,
Example:
stream
|from()
.measurement('cpu')
.groupBy('service')
|alert()
.id('kapacitor/{{ index .Tags "service" }}')
ID: kapacitor/authentication
Example:
stream
|from()
.measurement('cpu')
.groupBy('service', 'host')
|alert()
.id('kapacitor/{{ index .Tags "service" }}/{{ index .Tags "host" }}')
ID: kapacitor/authentication/auth001.example.com
Default: {{ .Name }}:{{ .Group }}
node.id(value string)
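The Group value and the default ID template described above can be illustrated with a short Python sketch. The helper names `group_string` and `default_id` are hypothetical, and sorting the keys is an assumption made here to get a deterministic result; the source only specifies the form [key=value,]+.

```python
def group_string(tags: dict, group_by: list) -> str:
    """Build a Group value of the form [key=value,]+, or 'nil' without groupBy.

    Sorting the keys is an assumption made here for a deterministic result.
    """
    if not group_by:
        return "nil"
    return "".join("{}={},".format(key, tags[key]) for key in sorted(group_by))


def default_id(name: str, tags: dict, group_by: list) -> str:
    """Mimic the default ID template {{ .Name }}:{{ .Group }}."""
    return "{}:{}".format(name, group_string(tags, group_by))


print(group_string({"cpu": "cpu0"}, ["cpu"]))  # cpu=cpu0,
print(default_id("cpu", {"cpu": "cpu0"}, ["cpu"]))  # cpu:cpu=cpu0,
```

The trailing comma in the output is part of the documented form and matches the 'kapacitor/cpu/cpu=cpu0,' example above.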
Info
Filter expression for the INFO alert level. An empty value indicates the level is invalid and is skipped.
node.info(value tick.Node)
Log
Log JSON alert data to a file, one event per line. You must specify the absolute path to the log file; it is created if it does not exist. Example:
stream
|alert()
.log('/tmp/alert')
Example:
stream
|alert()
.log('/tmp/alert')
.mode(0644)
node.log(filepath string)
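Because the log contains one JSON event per line, it can be processed line by line. The following Python sketch counts events per level; it assumes each line is a JSON object with a lowercase "level" key, and the function name `count_levels` and the sample lines are made up for illustration.

```python
import json
from collections import Counter


def count_levels(lines) -> Counter:
    """Count alert events per level from an iterable of JSON lines.

    Assumes each non-empty line is one JSON object with a "level" key.
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        counts[event.get("level", "UNKNOWN")] += 1
    return counts


# Made-up sample lines standing in for the contents of a log file.
sample_lines = [
    '{"id": "cpu:nil", "level": "CRITICAL"}',
    '{"id": "cpu:nil", "level": "CRITICAL"}',
    '{"id": "cpu:nil", "level": "OK"}',
]
print(count_levels(sample_lines))
```

An open file object can be passed directly in place of the sample list, since iterating over a file yields its lines.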
Log Mode
File mode and permissions. Default: 0600
node.log(filepath string)
.mode(value int64)
Message
Template for constructing a meaningful message for the alert.
Available template data:
- ID – The ID of the alert.
- Name – Measurement name.
- TaskName – The name of the task.
- Group – Concatenation of all group-by tags of the form [key=value,]+. If no groupBy is performed, this is equal to the literal 'nil'.
- Tags – Map of tags. Use '{{ index .Tags "key" }}' to get a specific tag value.
- Level – Alert Level, one of: OK, INFO, WARNING, CRITICAL.
- Fields – Map of fields. Use '{{ index .Fields "key" }}' to get a specific field value.
- Time – The time of the point that triggered the event.
Example:
stream
|from()
.measurement('cpu')
.groupBy('service', 'host')
|alert()
.id('{{ index .Tags "service" }}/{{ index .Tags "host" }}')
.message('{{ .ID }} is {{ .Level }} value: {{ index .Fields "value" }}')
Message: authentication/auth001.example.com is CRITICAL value: 42
Default: {{ .ID }} is {{ .Level }}
node.message(value string)
OpsGenie
Send alert to OpsGenie. To use OpsGenie alerting you must first enable the 'Alert Ingestion API' in the 'Integrations' section of OpsGenie. Then place the API key from the URL into the 'opsgenie' section of the Kapacitor configuration.
Example:
[opsgenie]
enabled = true
api-key = "xxxxx"
teams = ["everyone"]
recipients = ["jim", "bob"]
With the correct configuration you can now use OpsGenie in TICKscripts.
Example:
stream
|alert()
.opsGenie()
Send alerts to OpsGenie using the teams and recipients in the configuration file.
Example:
stream
|alert()
.opsGenie()
.teams('team_rocket','team_test')
Send alerts to OpsGenie with team set to 'team_rocket' and 'team_test'
If the 'opsgenie' section in the configuration has the option: global = true then all alerts are sent to OpsGenie without the need to explicitly state it in the TICKscript.
Example:
[opsgenie]
enabled = true
api-key = "xxxxx"
recipients = ["johndoe"]
global = true
Example:
stream
|alert()
Send alert to OpsGenie using the default recipients, found in the configuration.
node.opsGenie()
OpsGenie Recipients
The list of recipients to be alerted. If empty defaults to the recipients from the configuration.
node.opsGenie()
.recipients(recipients ...string)
OpsGenie Teams
The list of teams to be alerted. If empty defaults to the teams from the configuration.
node.opsGenie()
.teams(teams ...string)
PagerDuty
Send the alert to PagerDuty. To use PagerDuty alerting you must first follow the steps to enable a new 'Generic API' service.
From https://developer.pagerduty.com/documentation/integration/events
- In your account, under the Services tab, click "Add New Service".
- Enter a name for the service and select an escalation policy. Then, select "Generic API" for the Service Type.
- Click the "Add Service" button.
- Once the service is created, you'll be taken to the service page. On this page, you'll see the "Service key", which is needed to access the API.
Place the 'service key' into the 'pagerduty' section of the Kapacitor configuration as the option 'service-key'.
Example:
[pagerduty]
enabled = true
service-key = "xxxxxxxxx"
With the correct configuration you can now use PagerDuty in TICKscripts.
Example:
stream
|alert()
.pagerDuty()
If the 'pagerduty' section in the configuration has the option: global = true then all alerts are sent to PagerDuty without the need to explicitly state it in the TICKscript.
Example:
[pagerduty]
enabled = true
service-key = "xxxxxxxxx"
global = true
Example:
stream
|alert()
Send alert to PagerDuty.
node.pagerDuty()
Post
HTTP POST JSON alert data to a specified URL.
node.post(url string)
Sensu
Send the alert to Sensu.
Example:
[sensu]
enabled = true
url = "http://sensu:3030"
source = "Kapacitor"
Example:
stream
|alert()
.sensu()
Send alerts to Sensu client.
node.sensu()
Slack
Send the alert to Slack. To allow Kapacitor to post to Slack, go to https://slack.com/services/new/incoming-webhook, create a new incoming webhook, and place the generated URL in the 'slack' configuration section.
Example:
[slack]
enabled = true
url = "https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxx"
channel = "#general"
To avoid posting a message at every alert interval, use AlertNode.StateChangesOnly so that only events where the alert changed state are posted to the channel.
Example:
stream
|alert()
.slack()
Send alerts to Slack channel in the configuration file.
Example:
stream
|alert()
.slack()
.channel('#alerts')
Send alerts to Slack channel '#alerts'
Example:
stream
|alert()
.slack()
.channel('@jsmith')
Send alert to user '@jsmith'
If the 'slack' section in the configuration has the option: global = true then all alerts are sent to Slack without the need to explicitly state it in the TICKscript.
Example:
[slack]
enabled = true
url = "https://hooks.slack.com/services/xxxxxxxxx/xxxxxxxxx/xxxxxxxxxxxxxxxxxxxxxxxx"
channel = "#general"
global = true
state-changes-only = true
Example:
stream
|alert()
Send alert to Slack using default channel '#general'.
node.slack()
Slack Channel
Slack channel in which to post messages. If empty uses the channel from the configuration.
node.slack()
.channel(value string)
StateChangesOnly
Only sends events where the state changed. Each alert level (OK, INFO, WARNING, and CRITICAL) is considered a different state.
Example:
stream
|from()
.measurement('cpu')
|window()
.period(10s)
.every(10s)
|alert()
.crit(lambda: "value" > 10)
.stateChangesOnly()
.slack()
If the "value" is greater than 10 for a total of 60s, then only two events will be sent. First, when the value crosses the threshold, and second, when it falls back into an OK state. Without stateChangesOnly, the alert would have triggered 7 times: 6 times for each 10s period where the condition was met and once more for the recovery.
node.stateChangesOnly()
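The event counts in the example above can be reproduced with a small Python model of the sending rule. This is a sketch, not Kapacitor's code: the function name `events_sent` is hypothetical, and the alert is assumed to start in the OK state.

```python
def events_sent(levels, state_changes_only=False):
    """Return the levels of the events that would reach a handler.

    An event is sent when the level is not OK, or when the alert has just
    recovered to OK. With state_changes_only, it is sent only on a change.
    The alert is assumed to start in the OK state.
    """
    sent = []
    previous = "OK"
    for level in levels:
        changed = level != previous
        if state_changes_only:
            should_send = changed
        else:
            should_send = level != "OK" or changed
        if should_send:
            sent.append(level)
        previous = level
    return sent


# Six 10s windows above the threshold, followed by two windows back to normal.
windows = ["CRITICAL"] * 6 + ["OK", "OK"]
print(len(events_sent(windows)))  # 7
print(len(events_sent(windows, state_changes_only=True)))  # 2
```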
Talk
Send the alert to Talk. To use Talk alerting you must first follow the steps to create a new incoming webhook.
- Go to the URL https://account.jianliao.com/signin.
- Sign in with your account. Under the Team tab, click "Integrations".
- Select "Customize service", then click the "Add" button for Incoming Webhook.
- After choosing the topic to connect with "xxx", click the "Confirm Add" button.
- Once the service is created, you'll see the "Generate Webhook url".
Place the 'Generate Webhook url' into the 'talk' section of the Kapacitor configuration as the option 'url'.
Example:
[talk]
enabled = true
url = "https://jianliao.com/v2/services/webhook/uuid"
author_name = "Kapacitor"
Example:
stream
|alert()
.talk()
Send alerts to Talk client.
node.talk()
VictorOps
Send alert to VictorOps. To use VictorOps alerting you must first enable the 'Alert Ingestion API' in the 'Integrations' section of VictorOps. Then place the API key from the URL into the 'victorops' section of the Kapacitor configuration.
Example:
[victorops]
enabled = true
api-key = "xxxxx"
routing-key = "everyone"
With the correct configuration you can now use VictorOps in TICKscripts.
Example:
stream
|alert()
.victorOps()
Send alerts to VictorOps using the routing key in the configuration file.
Example:
stream
|alert()
.victorOps()
.routingKey('team_rocket')
Send alerts to VictorOps with routing key 'team_rocket'
If the 'victorops' section in the configuration has the option: global = true then all alerts are sent to VictorOps without the need to explicitly state it in the TICKscript.
Example:
[victorops]
enabled = true
api-key = "xxxxx"
routing-key = "everyone"
global = true
Example:
stream
|alert()
Send alert to VictorOps using the default routing key, found in the configuration.
node.victorOps()
VictorOps RoutingKey
The routing key to use for the alert. Defaults to the value in the configuration if empty.
node.victorOps()
.routingKey(value string)
Warn
Filter expression for the WARNING alert level. An empty value indicates the level is invalid and is skipped.
node.warn(value tick.Node)
Chaining Methods
Chaining methods create a new node in the pipeline as a child of the calling node. They do not modify the calling node. Chaining methods are marked using the | operator.
Deadman
Helper function for creating an alert on low throughput, also known as a deadman's switch.
- Threshold – trigger alert if throughput drops below threshold in points/interval.
- Interval – how often to check the throughput.
- Expressions – optional list of expressions to also evaluate. Useful for time of day alerting.
Example:
var data = stream
|from()...
// Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
data
|deadman(100.0, 10s)
//Do normal processing of data
data...
The above is equivalent to this example:
var data = stream
|from()...
// Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
data
|stats(10s)
|derivative('emitted')
.unit(10s)
.nonNegative()
|alert()
.id('node \'stream0\' in task \'{{ .TaskName }}\'')
.message('{{ .ID }} is {{ if eq .Level "OK" }}alive{{ else }}dead{{ end }}: {{ index .Fields "emitted" | printf "%0.3f" }} points/10s.')
.crit(lambda: "emitted" <= 100.0)
//Do normal processing of data
data...
The id and message alert properties can be configured globally via the 'deadman' configuration section.
Since the AlertNode is the last piece, it can be further modified as normal. Example:
var data = stream
|from()...
// Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
data
|deadman(100.0, 10s)
.slack()
.channel('#dead_tasks')
//Do normal processing of data
data...
You can specify additional lambda expressions to further constrain when the deadman's switch is triggered. Example:
var data = stream
|from()...
// Trigger critical alert if the throughput drops below 100 points per 10s and checked every 10s.
// Only trigger the alert if the time of day is between 8am-5pm.
data
|deadman(100.0, 10s, lambda: hour("time") >= 8 AND hour("time") <= 17)
//Do normal processing of data
data...
node|deadman(threshold float64, interval time.Duration, expr ...tick.Node)
Returns: AlertNode
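The deadman logic described above (emitted-point statistics, a non-negative derivative per interval, and a threshold check) can be modeled in Python. This is a sketch under stated assumptions: the function name `deadman_levels` is hypothetical, the input is a list of cumulative 'emitted' counts sampled once per interval, and any additional expression is assumed to be combined with the threshold check using AND.

```python
def deadman_levels(emitted_totals, threshold=100.0, extra=None):
    """Map cumulative 'emitted' counts, sampled once per interval, to levels.

    The difference between consecutive samples is the throughput in points
    per interval; negative differences are skipped (non-negative derivative).
    The alert is CRITICAL when throughput <= threshold and, if given, the
    extra predicate (called with the sample index) is also true.
    """
    levels = []
    for index in range(1, len(emitted_totals)):
        throughput = emitted_totals[index] - emitted_totals[index - 1]
        if throughput < 0:
            continue
        dead = throughput <= threshold
        if extra is not None:
            dead = dead and extra(index)
        levels.append("CRITICAL" if dead else "OK")
    return levels


# Cumulative counts sampled every 10s: 500, 450, 100, and 0 new points.
print(deadman_levels([0, 500, 950, 1050, 1050]))
# ['OK', 'OK', 'CRITICAL', 'CRITICAL']
```

The `extra` predicate stands in for the optional expressions, such as the time-of-day check in the last example above.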
Stats
Create a new stream of data that contains the internal statistics of the node. The interval represents how often to emit the statistics based on real time. This means the interval time is independent of the times of the data points the source node is receiving.
node|stats(interval time.Duration)
Returns: StatsNode