Data Streams
Learn how to use data streams for time series data
Since version 7.9, Elasticsearch has offered data streams for time series or rotational data. A data stream uses an index template to automatically create backing indices for the data. Depending on the lifecycle configuration, data may be rotated out to separate indices, moved to snapshots, or deleted altogether. Because a data stream depends on several other resources that must exist first, setting one up takes a few more steps than creating a plain index. In order, the steps are:
Create a lifecycle policy for your data stream
Create one or more component templates for your data stream index mappings/settings. You may also use pre-configured system templates, like log settings, when applicable.
Create an index template using your component template(s)
Create your data stream - either manually, or by pushing data
Let's take a look at what this might look like for a new logging data stream. For a more detailed example, see the built-in log appender for this module.
Create a lifecycle policy
We will create a policy to:
Start with 2 shards in the "hot" phase (new data) and force rollover of old data if any shard grows to more than 1GB
Transition to the "warm" phase at 7 days, shrink to 1 shard, and make the backing index for the phase read-only
Delete the data after 60 days
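As a rough sketch, the policy above could be expressed as the body of a `PUT /_ilm/policy/<name>` request. This uses the plain Elasticsearch REST format rather than this module's own builder, and is written in Python only to keep the example self-contained. The policy name `my-logs-policy` is hypothetical, and `max_primary_shard_size` assumes Elasticsearch 7.13 or later.

```python
import json

# Hypothetical policy name; the source does not show the one it used.
POLICY_NAME = "my-logs-policy"


def build_lifecycle_policy() -> dict:
    """Return a request body for PUT /_ilm/policy/<POLICY_NAME>."""
    return {
        "policy": {
            "phases": {
                # Hot phase: roll over once the largest primary shard passes 1GB.
                "hot": {
                    "actions": {
                        "rollover": {"max_primary_shard_size": "1gb"}
                    }
                },
                # Warm phase: at 7 days, shrink to one shard and block writes.
                "warm": {
                    "min_age": "7d",
                    "actions": {
                        "shrink": {"number_of_shards": 1},
                        "readonly": {},
                    },
                },
                # Delete phase: remove the backing index at 60 days.
                "delete": {
                    "min_age": "60d",
                    "actions": {"delete": {}},
                },
            }
        }
    }


if __name__ == "__main__":
    print(f"PUT /_ilm/policy/{POLICY_NAME}")
    print(json.dumps(build_lifecycle_policy(), indent=2))
```

Note that the initial count of 2 shards is an index setting rather than a lifecycle action, so in this sketch it would live in the template settings instead of the policy.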
Create a component template
Next, we'll create a component template to:
handle custom fields and settings in our logs
attach the lifecycle policy we created to any future index template that uses this component
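One way such a component template might look, shown as the body of a `PUT /_component_template/<name>` request in the plain REST format. The template name, the policy name, and the three custom fields are all hypothetical placeholders, since the source does not list its own.

```python
import json

# Hypothetical names, chosen only for illustration.
POLICY_NAME = "my-logs-policy"
COMPONENT_NAME = "my-logs-component"


def build_component_template(policy_name: str = POLICY_NAME) -> dict:
    """Return a request body for PUT /_component_template/<COMPONENT_NAME>."""
    return {
        "template": {
            "settings": {
                "index": {
                    # New backing indices start with two primary shards.
                    "number_of_shards": 2,
                    # Ties every index built from this component to the policy.
                    "lifecycle": {"name": policy_name},
                }
            },
            "mappings": {
                "properties": {
                    # Hypothetical custom log fields.
                    "application": {"type": "keyword"},
                    "environment": {"type": "keyword"},
                    "duration_ms": {"type": "long"},
                }
            },
        }
    }


if __name__ == "__main__":
    print(f"PUT /_component_template/{COMPONENT_NAME}")
    print(json.dumps(build_component_template(), indent=2))
```

Keeping the `index.lifecycle.name` setting in the component means any index template composed from it picks up the policy without repeating the setting.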
Create an index template
Now we'll create an index template to use our component template in addition to Elasticsearch's built-in logging templates:
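A sketch of the index template as a `PUT /_index_template/<name>` request body. The pattern `my-index-*` is an assumption inferred from the stream names used later on this page, and the template and component names are hypothetical. The built-in component names `logs-mappings` and `logs-settings` are assumed here as well; they differ between Elasticsearch versions, so check what your cluster actually provides.

```python
import json

# Hypothetical names; adjust to match your own cluster.
INDEX_TEMPLATE_NAME = "my-logs-template"
CUSTOM_COMPONENT = "my-logs-component"
# Assumed built-in component templates; names vary by Elasticsearch version.
BUILT_IN_COMPONENTS = ["logs-mappings", "logs-settings"]


def build_index_template(pattern: str = "my-index-*") -> dict:
    """Return a request body for PUT /_index_template/<INDEX_TEMPLATE_NAME>."""
    return {
        "index_patterns": [pattern],
        # An empty object marks matching names as data streams, not plain indices.
        "data_stream": {},
        # Later entries take precedence, so the custom component is listed last.
        "composed_of": BUILT_IN_COMPONENTS + [CUSTOM_COMPONENT],
        # Arbitrary example value; a higher priority wins when patterns overlap.
        "priority": 200,
    }


if __name__ == "__main__":
    print(f"PUT /_index_template/{INDEX_TEMPLATE_NAME}")
    print(json.dumps(build_index_template(), indent=2))
```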
Create our data stream
Now we can create our data stream in one of two ways. We can either send data to an index matching the pattern or we can create it manually. Let's do both:
Create a data stream manually without data:
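A minimal sketch of the manual route, using the REST endpoint `PUT /_data_stream/<name>`, which takes no request body. The helper name and the `my-index-*` pattern are assumptions; the check simply mirrors the rule that the stream name must match a data-stream-enabled index template.

```python
from fnmatch import fnmatchcase


def create_data_stream_request(name: str, index_patterns: list) -> tuple:
    """Return the (method, path) pair that creates an empty data stream.

    Raises ValueError when the name matches none of the template patterns,
    because Elasticsearch would reject the request in that case.
    """
    if not any(fnmatchcase(name, pattern) for pattern in index_patterns):
        raise ValueError(f"{name!r} matches no index template pattern")
    return ("PUT", f"/_data_stream/{name}")


if __name__ == "__main__":
    # Assumes the index template was created with the pattern "my-index-*".
    method, path = create_data_stream_request("my-index-foo", ["my-index-*"])
    print(method, path)
```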
This will create the data stream and backing indices for the data stream named my-index-foo.
Create a data stream by adding data:
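A sketch of the second route: indexing a document into a name that matches the template pattern, here with `POST /<name>/_doc`. Every document in a data stream needs an `@timestamp` field; the `message` and `level` fields and the helper names are hypothetical.

```python
import json
from datetime import datetime, timezone


def build_log_document(message: str, level: str = "INFO", now=None) -> dict:
    """Build a log document; data streams require the @timestamp field."""
    timestamp = now if now is not None else datetime.now(timezone.utc)
    return {
        "@timestamp": timestamp.isoformat(),
        "message": message,  # hypothetical field
        "level": level,      # hypothetical field
    }


def index_document_request(stream_name: str, document: dict) -> tuple:
    """Return (method, path, body) for appending one document to a stream."""
    return ("POST", f"/{stream_name}/_doc", json.dumps(document))


if __name__ == "__main__":
    doc = build_log_document("Application started")
    method, path, body = index_document_request("my-index-bar", doc)
    print(method, path)
    print(body)
```

Data streams are append-only, so this sketch uses the `POST /<name>/_doc` form, which always creates a new document rather than overwriting an existing one.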
This will create the data stream and backing indices for the data stream named my-index-bar.
Data streams can be a powerful way to ensure that time series data remains relevant and purges itself when there is no longer a need for it. You can also use data streams to roll out a mapping change between versions of your application indices, since each new backing index is created from the current index template.