how-to timechart [possibly] better than timechart in splunk

I recently had cause to do an extensive trellised timechart for a dashboard at $CUSTOMER in Splunk.

They have a couple hundred locations reporting networked devices.

I needed to report on how many devices they’ve reported every day over the last 90 days (I would have liked to go back further…but retention is only 90 days on this data).

My initial instinct was to do this:

index=ndx sourcetype=srctp site=* ip=* earliest=-90d
| timechart limit=0 span=1d dc(ip) by site

Except…that takes well over an hour to run – so the job gets terminated at ~60 minutes.

What possible other approaches could be made?

?

Well.

Here are a few that I thought about:

  1. Use multisearch, and group 9 10d searches together.
    • I’ve done things like this before with good success. But it’s … ugly. Very, very ugly.
    • You can almost always accomplish what you want via stats, too – but it can be tricky.
  2. Pre-populate a lookup table with older data (a la option 1 above, but done “by hand”), and then just append “more recent” data onto the table in the future.
    • This would give the advantage of getting a longer history going forward
    • Ensuring “cleanliness” of the table would require some maintenance scheduled searches/reports … but it’s doable
  3. Something else … that “happens” to work like a timechart – but runs in an acceptable time frame.
  4. Try binning _time
    1. Tried – didn’t work ?

So what did I do?

I asked for ideas.

If you’re regularly (or irregularly) using Splunk, you should join the Splunk Usergroups Slack.

Go join it now, if you’re not on it already.

Don’t worry – this blog post will be here when you get back.

You’ve joined? Good good. Look me up – I’m @Warren Myers. And I love to help when I can ?.

I asked in #search-help.

And within a couple minutes, had some ideas from somebody to use the “hidden field” date_day and do a | stats dc(ip) by date_day site. Unfortunately, this data source is JSON that comes-in via the HEC.

Poo.

Lo and behold!

I can “fake” date_day by using strftime!

Specifically, here’s the eval command:

| eval date=strftime(_time,"%Y-%m-%d")

This converts from the hidden _time field (in Unix epoch format) to yyyy-mm-dd.

This is the ?!

What does this line do? It lets me stats-out by day and site (just like timechart does … but it runs way faster (Why? I Don’t Know. He’s on third. And I Don’t Give a Darn! (Oh! That’s our shortstop!)).

How much faster?

At least twice as fast! It takes ~2200 seconds to complete, but given that the timechart form was being nuked at 3600 seconds, and it was only about 70% done … this is better!

The final form for the search:

index=ndx sourcetype=srctp site=* ip=* earliest=-90d@ latest=-1d@
| table site ip _time
| eval date=strftime(_time,"%Y-%m-%d")
| stats dc(ip) as inventory by date site

I’ve got this in a daily-scheduled Report that I then draw-into Dashboard(s) as needed (no point in running more often, since it’s summary data that only “changes” (at most) once a day).

Hope this helps somebody – please leave a comment if it helps you!

from antipaucity https://antipaucity.com/2020/08/18/how-to-timechart-possibly-better-than-timechart-in-splunk/
via IFTTT