Why using XML Event Logs sucks in Splunk
Yesterday I had a discussion with a colleague about whether we should switch the indexing of Windows Event Logs to XML. He mentioned he had been told that it’s faster, needs less data volume, and is language agnostic.
As I couldn’t imagine that anything with “XML” in its name could be “small” and “fast”, I decided to run a test.
Interestingly enough, there is a blog article at http://blogs.splunk.com/2014/11/04/splunk-6-2-feature-overview-xml-event-logs/ that also states you would get a data reduction.
You start collecting XML events by adding renderXml = 1 to the input stanza. When you do so, suppress_text = 1 is set automatically. Of course, you could also omit the Event Log message for your non-XML input and achieve the same volume reduction. Here I keep the Event Log message in both the XML and the non-XML scenario so that I am not comparing apples and oranges.
Index performance
The same dataset was indexed by the same forwarder on the same hardware. Let’s determine the time needed for indexing:
index=noxml OR index=xml | stats count earliest(_indextime) as ite latest(_indextime) as itl by index | eval timediff = itl-ite | convert ctime(ite) ctime(itl)
Indexing the XML took 195 seconds vs. 153 seconds for non-XML, i.e. 27.5% longer.
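One caveat with the search above: earliest() and latest() return the value from the events with the oldest and newest event timestamp (_time), which is not necessarily the smallest and largest _indextime. A variant using min() and max() is sketched below. The field names first_indexed, last_indexed, duration_s and events_per_s are just names I made up for illustration, and the sketch assumes each index was filled in one uninterrupted run.

```
index=noxml OR index=xml
| stats count min(_indextime) as first_indexed max(_indextime) as last_indexed by index
| eval duration_s = last_indexed - first_indexed
| eval events_per_s = round(count / duration_s, 1)
| table index count duration_s events_per_s
```

Comparing events_per_s instead of the raw duration also makes it obvious if the two indexes did not end up with the same number of events.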
Size
Determine the index size:
| dbinspect index=noxml index=xml | table index sizeOnDiskMB
XML needed 17.7% more storage.
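Keep in mind that dbinspect returns one row per bucket, so if an index has more than one bucket you have to sum the sizes yourself. A minimal sketch, assuming the standard dbinspect output fields sizeOnDiskMB, rawSize (in bytes) and eventCount; size_mb, raw_mb and events are illustrative names of mine:

```
| dbinspect index=noxml index=xml
| stats sum(sizeOnDiskMB) as size_mb sum(rawSize) as raw_bytes sum(eventCount) as events by index
| eval raw_mb = round(raw_bytes / 1024 / 1024, 2)
| table index events raw_mb size_mb
```

Having the raw size next to the size on disk shows whether the difference comes from the events themselves or from the index files built on top of them.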
Search performance
Running a simple search, I can’t decide whether I was more surprised that the XML search was more than 10x slower or that it returned a different result count. I ran each search three times to make sure the results are reproducible.
Non-XML, fast mode enabled:
index=noxml | stats count by EventCode
This search has completed and has returned 223 results by scanning 56,807 events in 2.923 seconds.
This search has completed and has returned 223 results by scanning 56,807 events in 2.91 seconds.
This search has completed and has returned 223 results by scanning 56,807 events in 2.888 seconds.
XML, fast mode enabled:
index=xml | stats count by EventCode
This search has completed and has returned 106 results by scanning 56,807 events in 33.822 seconds.
This search has completed and has returned 106 results by scanning 56,807 events in 33.593 seconds.
This search has completed and has returned 106 results by scanning 56,807 events in 33.309 seconds.
All of the time was spent in command.search.kv, i.e. the search-time field extraction.
I also found a lot of events where the EventID is not extracted correctly from the XML, for example:
index=xml OR index=noxml sourcetype=*application* RecordNumber=9172 | table EventCode sourcetype index _raw
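To get a feeling for how many events are affected, you can pull the EventID straight out of the raw XML with spath and compare it to the EventCode field delivered by the TA. This is only a sketch: it assumes the raw event is the complete Windows event XML with the usual Event/System/EventID layout, and xml_event_id and status are field names I made up for illustration.

```
index=xml
| spath output=xml_event_id path=Event.System.EventID
| eval status = case(isnull(EventCode), "EventCode missing", EventCode == xml_event_id, "match", true(), "mismatch")
| stats count by status
```

Anything that does not end up in the "match" row is a candidate for the difference in the result counts above.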
All tests were done with the “latest and greatest” Splunk TA Windows v4.8.3 running on Splunk Enterprise v6.5.
Summary
Never ever use XML rendering for the sake of performance or an expected data volume reduction. For now, the only valid reason seems to be overcoming language issues.