This page addresses frequent sources of confusion and places where InfluxDB behaves in an unexpected way relative to other database systems. Where applicable, it links to outstanding issues on GitHub.

Querying data

Writing data

Administration

Querying data

Getting unexpected results with `GROUP BY time()`

A query that includes a GROUP BY time() clause can yield unexpected results:

InfluxDB returns a single aggregated value with the timestamp '1970-01-01T00:00:00Z' even though the data include more than one instance of the time interval specified in time()
InfluxDB returns ERR: too many points in the group by interval. maybe you forgot to specify a where time clause? even though the query includes a WHERE time clause.

Those returns typically proceed from the combination of the following two features of InfluxDB:

By default, InfluxDB uses epoch 0 (1970-01-01T00:00:00Z) as the lower bound and now() as the upper bound in queries.
A query that includes GROUP BY time() must cover fewer than 100,000 instances of the supplied time interval

If your WHERE time clause is simply WHERE time < now() InfluxDB queries the data back to epoch 0 - that behavior often causes the query to breach the 100,000 instances rule and InfluxDB returns a confusing error or result. Avoid perplexing GROUP BY time() returns by specifying a valid time interval in the WHERE clause.

GitHub Issue #2702 and GitHub Issue #4004

Understanding the time intervals returned from `GROUP BY time()` queries

With some GROUP BY time() queries, the returned time intervals may not reflect the time range specified in the WHERE clause. In the example below the first timestamp in the results occurs before the lower bound of the query:

Query with a two day GROUP BY time() interval:


> SELECT count(water_level) FROM h2o_feet WHERE time >= ‘2015-08-20T00:00:00Z’ AND time <= ‘2015-08-24T00:00:00Z’ AND location = ‘santa_monica’ GROUP BY time(2d)

Results:

name: h2o_feet
--------------
time                     count
2015-08-19T00:00:00Z        240
2015-08-21T00:00:00Z     480
2015-08-23T00:00:00Z     241

InfluxDB queries the GROUP BY time() intervals that fall within the WHERE time clause. GROUP BY time() intervals always fall on rounded calendar time boundaries. Because they’re rounded time boundaries, the start and end timestamps may appear to include more data than those covered by the query’s WHERE time clause.

For the example above, InfluxDB works with two day intervals based on round number calendar days. The rounded two day buckets in August are as follows (explanation continues below):

August 1st-2nd
August 3rd-4th
[...]
August 19th-20th
August 21st-22nd
August 23rd-24th
[...]

Because InfluxDB automatically groups together August 19th and August 20th, August 19th is the first timestamp to appear in the results despite not being within the query’s time range. The number in the count column, however, only includes data that occur on or after August 20th as that is the time range specified by the query’s WHERE clause.

See GitHub Issue #1837 for more context on the future development of GROUP BY time() windows.

Querying after `now()`

By default, InfluxDB uses now() (the current nanosecond timestamp of the node that is processing the query) as the upper bound in queries. You must provide explicit directions in the WHERE clause to query points that occur after now().

The first query below asks InfluxDB to return everything from hillvalley that occurs between epoch 0 (1970-01-01T00:00:00Z) and now(). The second query asks InfluxDB to return everything from hillvalley that occurs between epoch 0 and 1,000 days from now().

SELECT * FROM hillvalley
SELECT * FROM hillvalley WHERE time < now() + 1000d

Querying outside the min/max time range

Queries with a time range that exceeds the minimum or maximum timestamps valid for InfluxDB currently return no results, rather than an error message.

Smallest valid timestamp: -9023372036854775808 (approximately 1684-01-22T14:50:02Z)
Largest valid timestamp: 9023372036854775807 (approximately 2255-12-09T23:13:56Z)

GitHub Issue #3369

Querying a time range that spans epoch 0

Currently, InfluxDB can return results for queries that cover either the time range before epoch 0 or the time range after epoch 0, not both. A query with a time range that spans epoch 0 returns partial results.

GitHub Issue #2703

Querying with booleans

Acceptable boolean syntax differs for data writes and data queries.

Boolean syntax	Writes	Queries
`t`,`f`	✔️	❌
`T`,`F`	✔️	❌
`true`,`false`	✔️	✔️
`True`,`False`	✔️	✔️
`TRUE`,`FALSE`	✔️	✔️

For example, SELECT * FROM hamlet WHERE bool=True returns all points with bool set to TRUE, but SELECT * FROM hamlet WHERE bool=T returns all points withbool set to false.

GitHub Issue #3939

Working with really big or really small integers

InfluxDB stores all integers as signed int64 data types. The minimum and maximum valid values for int64 are -9023372036854775808 and 9023372036854775807. See Go builtins for more information.

Values close to but within those limits may lead to unexpected results; some functions and operators convert the int64 data type to float64 during calculation which can cause overflow issues.

GitHub Issue #3130

Doing math on timestamps

Currently, it is not possible to execute mathematical operators or functions against timestamp values in InfluxDB. All time calculations must be carried out by the client receiving the query results.

Getting an unexpected epoch 0 timestamp in query returns

In InfluxDB, epoch 0 (1970-01-01T00:00:00Z) is often used as a null timestamp equivalent. If you request a query that has no timestamp to return, such as an aggregation function with an unbounded time range, InfluxDB returns epoch 0 as the timestamp.

GitHub Issue #3337

Getting large query returns in batches when using the HTTP API

InfluxDB returns large query results in batches of 10,000 points unless you use the query string parameter chunk_size to explicitly set the batch size. For example, get results in batches of 20,000 points with:

curl -G 'http://localhost:8086/query' --data-urlencode "db=deluge" --data-urlencode "chunk_size=20000" --data-urlencode "q=SELECT * FROM liters"

See GitHub Issue #3242 for more information on the challenges that this can cause, especially with Grafana visualization.

Getting the `expected identifier` error, unexpectedly

Receiving the error ERR: error parsing query: found [WORD], expected identifier[, string, number, bool] is often a gentle reminder that you forgot to include something in your query, as is the case in the following examples:

SELECT FROM logic WHERE rational = 5 should be SELECT something FROM logic WHERE rational = 5
SELECT * FROM WHERE rational = 5 should be SELECT * FROM logic WHERE rational = 5

In other cases, your query seems complete but you receive the same error:

SELECT field FROM why
SELECT * FROM why WHERE tag = '1'
SELECT * FROM grant WHERE why = 9

In the last three queries, and in most unexpected expected identifier errors, at least one of the identifiers in the query is an InfluxQL keyword. Identifiers are database names, retention policy names, user names, measurement names, tag keys, and field keys. To successfully query data that use a keyword as an identifier enclose that identifier in double quotes, so the examples above become:

SELECT "field" FROM why
SELECT * FROM why WHERE "tag" = '1'
SELECT * FROM "grant" WHERE why = 9

While using double quotes is an acceptable workaround, we recommend that you avoid using InfluxQL keywords as identifiers for simplicity’s sake. The InfluxQL documentation has a comprehensive list of all InfluxQL keywords.

Identifying write precision from returned timestamps

InfluxDB stores all timestamps as nanosecond values regardless of the write precision supplied. It is important to note that when returning query results, the database silently drops trailing zeros from timestamps which obscures the initial write precision.

In the example below, the tags precision_supplied and timestamp_supplied show the time precision and timestamp that the user provided at the write. Because InfluxDB silently drops trailing zeros on returned timestamps, the write precision is not recognizable in the returned timestamps.

name: trails
-------------
time                  value	 precision_supplied  timestamp_supplied
1970-01-01T01:00:00Z  3      n                   3600000000000
1970-01-01T01:00:00Z  5      h                   1
1970-01-01T02:00:00Z  4      n                   7200000000000
1970-01-01T02:00:00Z  6      h                   2

GitHub Issue #2977

Single quoting and double quoting in queries

Single quote string values (for example, tag values) but do not single quote identifiers (database names, retention policy names, user names, measurement names, tag keys, and field keys).

Double quote identifiers if they start with a digit, contain characters other than [A-z,0-9,_], or if they are an InfluxQL keyword. You can double quote identifiers even if they don’t fall into one of those categories but it isn’t necessary.

Yes: SELECT bikes_available FROM bikes WHERE station_id='9'

Yes: SELECT "bikes_available" FROM "bikes" WHERE "station_id"='9'

Yes: SELECT * from "cr@zy" where "p^e"='2'

No: SELECT 'bikes_available' FROM 'bikes' WHERE 'station_id'="9"

No: SELECT * from cr@zy where p^e='2'

See the Query Syntax page for more information.

Missing data after creating a new `DEFAULT` retention policy

When you create a new DEFAULT retention policy (RP) on a database, the data written to the old DEFAULT RP remain in the old RP. Queries that do not specify an RP automatically query the new DEFAULT RP so the old data may appear to be missing. To query the old data you must fully qualify the relevant data in the query.

Example:

All of the data in the measurement fleeting fall under the DEFAULT RP called one_hour:

> SELECT count(flounders) FROM fleeting
name: fleeting
--------------
time			               count
1970-01-01T00:00:00Z	 8

We create a new DEFAULT RP (two_hour) and perform the same query:

> SELECT count(flounders) FROM fleeting
>

To query the old data, we must specify the old DEFAULT RP by fully qualifying fleeting:

> SELECT count(flounders) FROM fish.one_hour.fleeting
name: fleeting
--------------
time			               count
1970-01-01T00:00:00Z	 8

Writing data

Writing integers

Add a trailing i to the end of the field value when writing an integer. If you do not provide the i, InfluxDB will treat the field value as a float.

Writes an integer: value=100i
Writes a float: value=100

Note: This syntax for writing integers is for versions 0.9.3+. Versions prior to 0.9.3 had a different syntax, see PR #3526.

Writing data with negative timestamps

There was a bug in the line protocol parser in versions 0.9.0 to 0.9.4 which treated negative timestamps as invalid syntax. For example, INSERT waybackwhen past=1 -1 returned ERR: unable to parse 'waybackwhen past=1 -1': bad timestamp. Starting with version 0.9.5 and later, the line protocol parser accepts negative timestamps.

Writing duplicate points

In InfluxDB 0.9 a point is uniquely identified by the measurement name, full [tag set]()(/influxdb/v0.9/concepts/glossary/#tag-set), and the nanosecond timestamp. If a point is submitted with an identical measurement, tag set, and timestamp it will silently overwrite the previous point. This is the intended behavior.

For example,
cpu_load,hostname=server02,az=us_west value=24.5 1234567890000000 and
cpu_load,hostname=server02,az=us_west value=5.24 1234567890000000 are identical points. The last one written will overwrite the other.

In order to store both points, simply introduce an arbitrary new tag to enforce uniqueness.
cpu_load,hostname=server02,az=us_west,uniq=1 value=24.5 1234567890000000 and
cpu_load,hostname=server02,az=us_west,uniq=2 value=5.24 1234567890000000 are now unique points, and each will persist in the database.

You can also increment the timestamp by a nanosecond:
cpu_load,hostname=server02,az=us_west,uniq=1 value=24.5 1234567890000000 and
cpu_load,hostname=server02,az=us_west,uniq=2 value=5.24 1234567890000001 are now unique points, and each will persist in the database.

Note: The field set has nothing to do with the uniqueness of a point. These are still identical points and the last one written would be the only one to persist:
cpu_load,hostname=server02,az=us_west value=24.5 1234567890000000 and
cpu_load,hostname=server02,az=us_west loadavg=12.2 1234567890000000 are still identical points. The last one written will overwrite the other.

Getting an unexpected error when sending data over the HTTP API

First, double check your line protocol syntax. Second, if you continue to receive errors along the lines of bad timestamp or unable to parse, verify that your newline character is line feed (\n, which is ASCII 0x0A). InfluxDB’s line protocol relies on \n to indicate the end of a line and the beginning of a new line; files or data that use a newline character other than \n will encounter parsing issues. Convert the newline character and try sending the data again.

Note: If you generated your data file on a Windows machine, Windows uses carriage return and line feed (\r\n) as the newline character.

Writing more than one continuous query to a single series

Use a single continuous query to write several statistics to the same measurement and tag set. For example, tell InfluxDB to write to the aggregated_stats measurement the MEAN and MIN of the value field grouped by five-minute intervals and grouped by the cpu tag with:

CREATE CONTINUOUS QUERY mean_min_value ON telegraf BEGIN SELECT MEAN(value) AS mean, MIN(value) AS min INTO aggregated_stats FROM cpu_idle GROUP BY time(5m),cpu END

If you create two separate continuous queries (one for calculating the MEAN and one for calculating the MIN), the aggregated_stats measurement will appear to be missing data. Separate continuous queries run at slightly different times and InfluxDB defines a unique point by its measurement, tag set, and timestamp (notice that field is missing from that list). So if two continuous queries write to different fields but also write to the same measurement and tag set, only one of the two fields will ever have data; the last continuous query to run will overwrite the results that were written by the first continuous query with the same timestamp.

For more on continuous queries, see Continuous Queries.

Words and characters to avoid

If you use any of the InfluxQL keywords as an identifier you will need to double quote that identifier in every query. This can lead to non-intuitive errors. Identifiers are database names, retention policy names, user names, measurement names, tag keys, and field keys.

To keep regular expressions and quoting simple, avoid using the following characters in identifiers:

\ backslash
^ circumflex accent
$ dollar sign
' single quotation mark
" double quotation mark
, comma

Single quoting and double quoting when writing data

Avoid single quoting and double quoting identifiers when writing data via the line protocol; see the examples below for how writing identifiers with quotes can complicate queries. Identifiers are database names, retention policy names, user names, measurement names, tag keys, and field keys.

Write with a double-quoted measurement: INSERT "bikes" bikes_available=3
Applicable query: SELECT * FROM "\"bikes\""

Write with a single-quoted measurement: INSERT 'bikes' bikes_available=3
Applicable query: SELECT * FROM "\'bikes\'"

Write with an unquoted measurement: INSERT bikes bikes_available=3
Applicable query: SELECT * FROM bikes
Double quote field values that are strings.

Write: INSERT bikes happiness="level 2"
Applicable query: SELECT * FROM bikes WHERE happiness='level 2'
Special characters should be escaped with a backslash and not placed in quotes.

Write: INSERT wacky va\"ue=4
Applicable query: SELECT "va\"ue" FROM wacky

See the Line Protocol Syntax page for more information.

Administration

Single quoting the password string

The CREATE USER <user> WITH PASSWORD '<password>' query requires single quotation marks around the password string. Do not include the single quotes when authenticating requests.

Escaping the single quote in a password

For passwords that include a single quote, escape the single quote with a backslash both when creating the password and when authenticating requests.

Identifying your version of InfluxDB

There a number of ways to identify the version of InfluxDB that you’re using:

Check the return when you curl the /ping endpoint. For example, if you’re using 0.9.3 curl -i 'http://localhost:8086/ping' returns:

HTTP/1.1 204 No Content
Request-Id: b7d7e0d6-5339-11e5-80bb-000000000000
✨X-Influxdb-Version: 0.9.3✨ Date: Fri, 04 Sep 2015 19:18:26 GMT

Check the text that appears when you launch the CLI:

Connected to http://localhost:8086 ✨version 0.9.3✨
InfluxDB shell 0.9.3

Check the HTTP response in your logs:

[http] 2015/09/04 12:29:07 ::1 - - [04/Sep/2015:12:29:06 -0700] GET /query?db=&q=create+database+there_you_go HTTP/1.1 200 40 - ✨InfluxDBShell/0.9.3✨ 357970a0-533b-11e5-8001-000000000000 6.07408ms

Data aren’t dropped after altering a retention policy

After shortening the DURATION of a retention policy (RP), you may notice that InfluxDB keeps some data that are older than the DURATION of the modified RP. This behavior is a result of the relationship between the time interval covered by a shard group and the DURATION of a retention policy.

InfluxDB stores data in shard groups. A single shard group covers a specific time interval; InfluxDB determines that time interval by looking at the DURATION of the relevant RP. The table below outlines the relationship between the DURATION of an RP and the time interval of a shard group:

RP duration	Shard group interval
< 2 days	1 hour
>= 2 days and <= 6 months	1 day
> 6 months	7 days

If you shorten the DURATION of an RP and the shard group interval also shrinks, InfluxDB may be forced to keep data that are older than the new DURATION. This happens because InfluxDB cannot divide the old, longer shard group into new, shorter shard groups; it must keep all of the data in the longer shard group even if only a small part of those data overlaps with the new DURATION.

Example: Moving from an infinite RP to a three day RP

Figure 1 shows the shard groups for our example database (example_db) after 11 days. The database uses the automatically generated default retention policy with an infinite (INF) DURATION so each shard group interval is seven days. On day 11, InfluxDB is no longer writing to Shard Group 1 and Shard Group 2 has four days worth of data:

Figure 1 Retention policy duration infinite

On day 11, we notice that example_db is accruing data too fast; we want to delete, and keep deleting, all data older than three days. We do this by altering the retention policy:

> ALTER RETENTION POLICY default ON example_db DURATION 3d

At the next retention policy enforcement check, InfluxDB immediately drops Shard Group 1 because all of its data is older than 3 days. InfluxDB does not drop Shard Group 2. This is because InfluxDB cannot divide existing shard groups and some data in Shard Group 2 still fall within the new three day retention policy.

Figure 2 shows the shard groups for example_db five days after the retention policy change. Notice that the new shard groups span one day intervals. All of the data in Shard Group 2 remain in the database because the shard group still has data within the retention policy’s three day DURATION:

Figure 2 Retention policy duration three days

After day 17, all data within the past 3 days will be in one day shard groups. InfluxDB will then be able to drop Shard Group 2 and example_db will have only 3 days worth of data.

Note: The time it takes for InfluxDB to adjust to the new retention policy may be longer depending on your shard precreation configuration setting. See Database Configuration for more on that setting.