How tsidxWritingLevel affects storage size and performance
Today we want to have a look at an index parameter and how it affects storage size and performance. In indexes.conf.spec you will find the parameter tsidxWritingLevel. This parameter configures how Splunk creates index files over your rawdata within a bucket. The parameter was introduced in v7.2 and updated in v7.3 and v8.1. The level also determines the minimum Splunk version needed to read a bucket's index: for example, buckets created on v8.1 with level 4 cannot be read by a v8.0 system.
Overview of Splunk versions and default tsidxWritingLevel:
Splunk Version | available tsidxWritingLevel | default tsidxWritingLevel |
---|---|---|
v8.2.0 | 1,2,3,4 | 2 |
v8.1.0 | 1,2,3,4 | 1 |
v8.0.0 | 1,2,3 | 1 |
v7.3.0 | 1,2,3 | 1 |
v7.2.0 | 1,2 | 1 |
v7.1.0 | no setting, 1 assumed | 1 |
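The compatibility rule can be expressed as a small lookup. The following is a minimal Python sketch built only from the table above. The function name `can_read` is hypothetical, and the sketch assumes that the release which introduced a level is the oldest one able to read buckets written at that level.

```python
# Oldest Splunk (major, minor) release that offers each level, per the table above.
# Assumption: a release that does not offer a level cannot read its buckets.
MIN_VERSION_FOR_LEVEL = {2: (7, 2), 3: (7, 3), 4: (8, 1)}


def can_read(splunk_version: str, level: int) -> bool:
    """Return True if the given Splunk version should be able to read
    buckets written with the given tsidxWritingLevel."""
    if level not in (1, 2, 3, 4):
        raise ValueError(f"unknown tsidxWritingLevel: {level}")
    # Accept both "8.1.5" and "v8.1.5"; only major.minor matters here.
    major, minor = (int(part) for part in splunk_version.lstrip("v").split(".")[:2])
    # Level 1 is treated as readable everywhere, since it is the assumed legacy format.
    return (major, minor) >= MIN_VERSION_FOR_LEVEL.get(level, (0, 0))


print(can_read("v8.0.0", 4))  # False
print(can_read("v8.1.5", 4))  # True
```

A check like this could be run before raising the level in an environment that still contains older indexers or search heads.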
So if you want to benefit from the latest storage and performance improvements in Splunk Enterprise, you have to increase this setting. As I haven’t found any reliable numbers, besides “up to 40% reduced storage”*, on what an increase of the parameter means in the real world, I decided to test it myself.
When changing this setting, only new buckets will be created with the higher level. Old data which was produced with an older setting will not be converted.
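As an illustration, the setting is a single key in an index stanza of indexes.conf. The Python sketch below only renders that fragment as text; the helper `render_stanza` and the index name `router_logs` are hypothetical, and the output is meant to be merged into an existing index definition rather than used as a complete one.

```python
def render_stanza(index_name: str, level: int) -> str:
    """Render an indexes.conf fragment that sets tsidxWritingLevel for one index."""
    # Levels 1 to 4 are the values listed in the version overview;
    # which of them are accepted depends on the Splunk version in use.
    if level not in (1, 2, 3, 4):
        raise ValueError(f"unknown tsidxWritingLevel: {level}")
    return f"[{index_name}]\ntsidxWritingLevel = {level}\n"


# "router_logs" is a made-up index name for this example.
print(render_stanza("router_logs", 4))
```

Because only new buckets pick up the new level, a change like this shows its effect gradually as older buckets age out.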
I created a test in which I start up a single instance on AWS, feed it some logs, capture the time taken and the size of the tsidx files, and repeat this 3 times for every tsidxWritingLevel to validate the results.
test steps:
- run Splunk on an AWS instance: m5.xlarge (4 vCPU, 16GB RAM), 30GB storage, default SSD
- set tsidxWritingLevel
- ingest 950k Events (863MB of raw data) of RouterOS logfiles taken from a prod system
- measure time needed for ingest (5 second interval) and the size of buckets created
- repeat this test 3 times
- Splunk Enterprise v8.1.5
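One simple way to measure the size of the buckets created is to sum the file sizes below an index directory. The sketch below is an assumption about how such a measurement could look; it is not taken from run_test.py, the helper name `dir_size_mb` is hypothetical, and the numbers in this post may have been collected differently.

```python
import os


def dir_size_mb(path: str) -> float:
    """Return the total size in MB (1 MB = 1024 * 1024 bytes) of all
    regular files below path."""
    total_bytes = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            file_path = os.path.join(root, name)
            # Skip symlinks so that linked files are not counted twice.
            if not os.path.islink(file_path):
                total_bytes += os.path.getsize(file_path)
    return total_bytes / (1024 * 1024)
```

Calling `dir_size_mb` on the directory that holds an index's buckets after each run gives one number per run that can be compared across levels.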
run number | tsidxWritingLevel | ingest time | bucket sizeOnDisk (MB) | load avg (1, 5, 15 min) | mean 1 min load avg per level |
---|---|---|---|---|---|
1 | 1 | 95s | 519.89 | 3.95, 1.68, 1.56 | 4.05 |
2 | 1 | 100s | 519.25 | 4.23, 2.57, 1.9 | 4.05 |
3 | 1 | 95s | 525.32 | 3.97, 2.98, 2.14 | 4.05 |
1 | 2 | 100s | 475.30 | 4.19, 3.29, 2.36 | 4.47 |
2 | 2 | 100s | 475.89 | 4.61, 3.68, 2.63 | 4.47 |
3 | 2 | 95s | 472.64 | 4.62, 3.86, 2.83 | 4.47 |
1 | 3 | 105s | 461.16 | 4.17, 3.83, 2.96 | 4.29 |
2 | 3 | 100s | 452.96 | 3.9, 3.71, 3.03 | 4.29 |
3 | 3 | 95s | 450.67 | 4.79, 3.97, 3.21 | 4.29 |
1 | 4 | 105s | 403.32 | 4.35, 3.92, 3.29 | 4.07 |
2 | 4 | 100s | 413.07 | 3.85, 3.82, 3.34 | 4.07 |
3 | 4 | 100s | 407.33 | 4.01, 3.83, 3.4 | 4.07 |
average tsidx storage needed:
- tsidxWritingLevel 1: (519.89 + 519.25 + 525.32) / 3 = 521.49 MB = 100%
- tsidxWritingLevel 2: (475.30 + 475.89 + 472.64) / 3 = 474.61 MB = 91.0% (9.0% reduction)
- tsidxWritingLevel 3: (461.16 + 452.96 + 450.67) / 3 = 454.93 MB = 87.2% (12.8% reduction)
- tsidxWritingLevel 4: (403.32 + 413.07 + 407.33) / 3 = 407.91 MB = 78.2% (21.8% reduction) - winner!
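The averages and percentages can be recomputed from the per-run bucket sizes in the results table. This is a short Python sketch of that arithmetic; the variable and function names are illustrative only.

```python
from statistics import mean

# Bucket sizeOnDisk in MB for the three runs of each tsidxWritingLevel,
# copied from the event results table.
sizes_mb = {
    1: [519.89, 519.25, 525.32],
    2: [475.30, 475.89, 472.64],
    3: [461.16, 452.96, 450.67],
    4: [403.32, 413.07, 407.33],
}


def summarize(sizes):
    """Return {level: (average MB, percent of the level 1 average)}."""
    baseline = mean(sizes[1])
    return {
        level: (mean(runs), 100 * mean(runs) / baseline)
        for level, runs in sizes.items()
    }


summary = summarize(sizes_mb)
for level, (avg_mb, pct) in summary.items():
    print(f"level {level}: {avg_mb:.2f} MB = {pct:.1f}% ({100 - pct:.1f}% reduction)")
```

Swapping in the sizes from your own runs gives the same comparison for your dataset.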
More or less as expected, the highest level, tsidxWritingLevel=4, brought the largest storage savings. For this test case, roughly 22% less disk space is needed for the index files than with level 1. Note that these numbers apply only to the given dataset; your results may vary.
As we see, the ingest time seems slightly higher at the higher levels, but not noticeably so given the 5 second measurement interval. The load avg column shows the 1, 5 and 15 min values for documentation purposes. It is expected that the 5 and 15 min load rises over the course of the test series; in general, the average 1 min load is quite comparable across levels.
For metrics we created a sample CSV of 15 million events containing 15 metrics with different values and 3 static dimensions. The resulting rawdata is 810MB.
- Splunk Enterprise v8.1.5
run number | tsidxWritingLevel | ingest time | bucket sizeOnDisk (MB) | load avg (1, 5, 15 min) |
---|---|---|---|---|
1 | 1 | 95s | 205.51 | 2.57, 1.34, 0.66 |
2 | 1 | 95s | 204.38 | 2.96, 1.89, 0.95 |
1 | 2 | 95s | 203.97 | 3.61, 2.4, 1.26 |
2 | 2 | 95s | 205.30 | 3.13, 2.55, 1.47 |
1 | 3 | 95s | 204.72 | 2.82, 2.57, 1.62 |
2 | 3 | 95s | 204.36 | 2.93, 2.69, 1.8 |
1 | 4 | 95s | 204.18 | 2.75, 2.61, 1.89 |
2 | 4 | 95s | 204.53 | 2.82, 2.58, 1.97 |
As we see, there is almost no difference in the resulting bucket sizes for metrics. All results vary by less than 1%, and all runs have the same ingest time.
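The spread across the metric runs can be checked directly from the table. This minimal Python sketch computes the gap between the largest and smallest bucket size relative to the smallest one; the variable names are illustrative.

```python
# Bucket sizeOnDisk in MB for all eight metric runs (levels 1 to 4, two runs each),
# copied from the metrics results table.
metric_sizes_mb = [205.51, 204.38, 203.97, 205.30, 204.72, 204.36, 204.18, 204.53]

smallest = min(metric_sizes_mb)
largest = max(metric_sizes_mb)

# Relative spread between the largest and smallest bucket, in percent.
spread_pct = 100 * (largest - smallest) / smallest
print(f"spread: {spread_pct:.2f}% of the smallest bucket")
```

The spread between two runs of the same level (for example 205.51 MB and 204.38 MB at level 1) is of the same order, which supports reading the differences as run-to-run noise rather than an effect of the level.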
We learned that, so far, tsidxWritingLevel has no impact on the storage size of metric indexes.
Here you find a link to the git repo where the tests are documented. You can adjust the config.yml file to create your own tests with your own data.
- link: config.yml used for events
- link: logfile produced by run_test.py for event
- link: config.yml used for metrics
- link: logfile produced by run_test.py for metrics
There is a bug, documented as SPL-197930, affecting versions v8.1.1, v8.1.2, v8.1.3 (fixed in v8.1.4) and v8.2.0 (fixed in v8.2.1). Affected indexers have huge memory spikes and may crash when tsidxWritingLevel = 4 is set: see link.