This is archived documentation for InfluxData product versions that are no longer maintained. For newer documentation, see the latest InfluxData documentation.
Every InfluxDB use case is special and your schema will reflect that uniqueness. There are, however, general guidelines to follow and pitfalls to avoid when designing your schema.
Encouraged Schema Design
In no particular order, we recommend that you:
Encode meta data in tags
Tags are indexed and fields are not indexed. This means that queries on tags are more performant than those on fields.
In general, your queries should guide what gets stored as a tag and what gets stored as a field:
- Store data in tags if they’re commonly-queried meta data
- Store data in tags if you plan to use them with
GROUP BY()
- Store data in fields if you plan to use them with an InfluxQL function
- Store data in fields if you need them to be something other than a string - tag values are always interpreted as strings
Avoid using InfluxQL Keywords as identifier names
This isn’t necessary, but it simplifies writing queries; you won’t have to wrap those identifiers in double quotes. Identifiers are are database names, retention policy names, user names, measurement names, tag keys, and field keys. See InfluxQL Keywords for words to avoid.
Note that you will also need to wrap identifiers in double quotes in queries if they contain characters other than
[A-z,_]
.
Discouraged Schema Design
In no particular order, we recommend that you:
Don’t have too many series
See Hardware Sizing Guidelines for series cardinality recommendations based on your hardware.
Tags that specify highly variable information like UUIDs, hashes, and random strings can increase your series cardinality to uncomfortable levels. If you need that information in your database, consider storing the high-cardinality data as a field rather than a tag (note that query performance will be slower).
Don’t differentiate data with measurement names
In general, taking this step will simplify your queries. InfluxDB queries merge data that fall within the same measurement; it’s better to differentiate data with tags than with detailed measurement names.
Example:
Schema 1 Schema 2 Measurement: blueberries.field-1.region-north
Measurement: blueberries
; Tags:field = 1
andregion = north
Measurement: blueberries.field-2.region-midwest
Measurement: blueberries
; Tags:field = 2
andregion = midwest
Assume that each measurement contains a single field key called
value
. The following queries calculate the average ofvalue
across all fields and all regions. Notice that, even at this small scale, this is harder to do under Schema 1.Schema 1
> SELECT mean(value) FROM /^blueberries/ name: blueberries.field-1.region-north -------------------------------------- time mean 1970-01-01T00:00:00Z 444 name: blueberries.field-2.region-midwest ---------------------------------------- time mean 1970-01-01T00:00:00Z 33766.666666666664
Then calculate the mean yourself.
Schema 2
> SELECT mean(value) FROM blueberries name: blueberries ----------------- time mean 1970-01-01T00:00:00Z 17105.333333333332
Don’t put more than one piece of information in one tag
Similar to the point above, taking this step will simplify your queries. It will reduce your need for regular expressions.
Example:
Tagset 1 Tagset 2 location = field-1.region-north
field = 1
andregion = north
location = field-2.region-north
field = 2
andregion = north
location = field-2.region-midwest
field = 2
andregion = midwest
Assume that each tag set falls in the measurement
blueberries
and is associated with a field calledvalue
. The following queries calculate the average ofvalue
for blueberries that fall in thenorth
. While both queries are relatively simple, you can imagine that the regex could get much more complicated if Schema 1 contained a more complex tag value.Schema 1
> SELECT mean(value) FROM blueberries WHERE location =~ /north/
Schema 2
> SELECT mean(value) FROM blueberries WHERE region = 'north'
Don’t use the same name for a field key and tag key
You won’t be able to query the tag key if the tag key is the same as a field key in your schema. Be sure to differentiate your tag keys and field keys.
See GitHub Issue [#4630](https://github.com/influxdata/influxdb/issues/4630) for more information.