Poor Search Practices: Being Kind to Splunk

I’ve had the privilege of working heavily out of Splunk for the last 3-4 years now, and in that time there have been a lot of lessons learned. Splunk is a fantastic tool, but as with any other product, if you don’t treat it nicely things have a good chance of going sideways. We can’t reasonably expect everyone who may touch a Splunk search head to be a subject matter expert on all things SIEM, but we do want certain high points to be noted. You don’t have to know how every part of your vehicle works, but you should know that putting water in the gas tank won’t turn out well. It is in this spirit that I wanted to do a quick write-up on some common pitfalls and poor practices we may come across, and how they can be avoided.

1. Leading Wildcards

As a very quick review, a wildcard is used to represent an unknown portion of the value you’re looking for. As an example, if you were looking for all Bobs in a list of company usernames, you might search for user=Bob*. This would pull back any names that begin with Bob - “Bob”, “Bob.Jones”, “Bobbyman”, “Bobbit”, etc. When Splunk sees a trailing wildcard, it can exclude all of the other usernames that don’t begin with Bob and focus only on the ones that do. This makes for a fairly speedy search.
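To put that in query form - the index and sourcetype names below are placeholders for illustration, so swap in whatever your environment actually uses - a trailing-wildcard search might look like:

    index=wineventlog sourcetype=WinEventLog:Security user=Bob*

Because the known part of the value comes first, Splunk can focus on terms that start with “Bob” and skip everything else.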


Leading wildcards, however, represent a bigger issue. When Splunk encounters a leading wildcard, it has to check everything in line - since the value starts with an unknown, Splunk can’t exclude any usernames, because any of them could be what it’s looking for. This behavior comes down to how Splunk parses events into segments using major and minor segmenters - think spaces, colons, semicolons, periods, etc. As it works through each segment of the log, it has to stop at every one and say “Hm...it might be that one,” rather than just focusing on the Bobs.
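The leading-wildcard version of the same hunt - again with placeholder index and sourcetype names - would look like:

    index=wineventlog sourcetype=WinEventLog:Security user=*bob

With no known prefix to anchor on, Splunk can’t rule anything out up front and has to work through far more of the data than the trailing-wildcard search above.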

Try this instead:

If you absolutely have to use leading wildcards for discovery purposes - and it does happen - try to use them on a very small subset of logs first. You want to enumerate that unknown value as soon as possible so you can adjust your search to include the known value of what you’re looking for. If you need to pull a bigger set using a leading wildcard, make sure you’re being specific with the field you’re looking at; user=*Bob at least tells Splunk to wildcard-check only the "user" field values. A lonely *Bob on its own does not.
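One gentle way to work this, sketched here with made-up names and time ranges, is to run the leading wildcard over a short window, find the real value, and then pin it down:

    index=wineventlog sourcetype=WinEventLog:Security user=*bob earliest=-60m

Once that short discovery search tells you the account is actually, say, "svc.bobcat", the wildcard can go away entirely:

    index=wineventlog sourcetype=WinEventLog:Security user="svc.bobcat" earliest=-7d

The second search is the one you keep; the wildcard only exists long enough to tell you what to search for.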

2. Verbose Searches

There’s nothing inherently wrong with Verbose searches. When searching in this mode, you’re telling Splunk to pull back all the possible fields it can parse from an event and hand them to you. This makes it fantastically useful for discovery searches where you may not know what the field you’re looking for is called. As such, it’s generally used when first building search queries, before they’re whittled down to be nice and specific to what you’re looking for.


However, where this can go sideways is when Verbose Mode is called on a large set of data. Pulling back 60 days’ worth of firewall logs in Verbose Mode can put stress on the system - and on your Splunk admins. Also, if you save your alerts and dashboards in Verbose Mode, the searches will continue to pull all possible fields every time they run, which may be quite frequently depending on the case. It’s a lot of extra overhead for not a lot of benefit.

Try this instead:

Again, there’s nothing wrong with using Verbose Mode. However, you want to use it with close attention to your time window - cast a smaller net to discover the fields you’re looking for, then include them in your query. That lets you switch to a faster and more efficient mode such as Smart Mode or Fast Mode, which will return only the fields called. Additionally, when building alerts, dashboards and panels, avoid saving your searches in Verbose Mode. Dashboards refresh with some frequency since their visual panels are driven by searches, so a verbose search means Splunk has to keep running that heavy query over and over when you really only need 3-4 fields.
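As one sketch of that workflow - the index, sourcetype and field names here are illustrative, not from any real environment - do your Verbose discovery over a short window, note the handful of fields you actually care about, and then save the production version with those fields called out:

    index=pan_firewall sourcetype=pan:traffic action=blocked earliest=-24h
    | fields _time, src_ip, dest_ip, dest_port, action
    | stats count by src_ip, dest_ip, dest_port

Because the query names its fields explicitly and rolls them up with stats, it runs comfortably in Fast or Smart Mode, and the saved alert or panel never needs Verbose Mode again.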

3. Beating Around the Bush

I like to use the example of the Splunk search head being like a librarian - bear with me here. If you walked up to a librarian and said “I’m looking for a book,” and she took you as literally as possible with no sense of humor, she would inevitably bring you the entire library. There, you have your book. This represents a common mistake I see in queries, where a user searches for a single word or field. Splunk does its best work when you’re as specific as possible about what you want - the only time you may not be specific is when you’re doing initial discovery searches, and we touched on that in the section above. Without specifics, Splunk has to do much, much more work to pull what it thinks you want, and it will end up giving you far more than you need.

Try this instead:

Let’s continue the analogy. If you add specifics such as “I’m looking for a book, it’s a book on vampires, I think it’s a romance, it’s in the young adult section and I swear it’s for my kid,” that librarian has a pretty good chance of bringing you back Twilight, or at least a cart of books containing Twilight. So we can look at this as the difference between throwing a single keyword at Splunk and handing it a fully fielded query.

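As a purely illustrative sketch of that contrast - the index, sourcetype, event code and user below are placeholders, not pulled from any real environment - the lazy request is a bare keyword:

    failed

while the specific request spells out where to look and what to match:

    index=wineventlog sourcetype=WinEventLog:Security EventCode=4625 user=bob.jones

The keyword-only search forces Splunk to scan every event in the time range for the word “failed”; the fielded version lets it rule out whole indexes and sourcetypes before it ever does any heavy lifting.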

One will be incredibly inefficient at getting what you came for, one will get you to the book, and either might get you kicked out of the library. Be specific with Splunk!

4. Poor Time Picker Usage

This is another common mistake I see when users are searching. As mentioned above, Splunk likes you to be specific - the bulk of that weight is carried in how the search is written, but much of it is also carried by the time picker settings. Setting a wide time frame means that Splunk has to check much, much more data to find your results. Additionally, due to how Splunk stores its data, older data will generally take longer to pull up and sort through - like being in a warehouse and trying to get to boxes that are behind other boxes on a shelf.


I’m also going to include “All Time” searches under this section. I have yet to work with a Splunk environment in which an “All Time” search was needed - perhaps I’ve just not worked with the right size of environment for it, but in a corporate setting you generally will not need this option. Searching for anything “all time” is like standing in that warehouse and asking for everything on the shelves. For some log sources this may not be as big an issue, but if you’re searching through Windows or network traffic logs, it may very well knock over your search head.

Try this instead:

Try to aim your search as close to your desired time frame as possible. If you’re looking for a failed authentication that you think occurred last Tuesday, set your time frame to span that day, maybe stretching it a bit to cover some of Monday night and Wednesday as well. Rather than searching all failed authentications that occurred between then and now, you can speed up your search significantly by narrowing your results via the time picker to just that day, or close to it. There will be times when a larger time frame is needed - last 30 days, year to date, etc. - and in these cases I suggest crafting your query with the pertinent fields called and running it in Fast Mode. It will still be a heavy search depending on the log source - a month of network logs will be a lot bigger than a month of malware logs - but it is healthier for Splunk.
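If you’d rather pin the window in the query itself than fiddle with the time picker, SPL’s time modifiers do the same job - the index, sourcetype and user below are only for illustration:

    index=wineventlog sourcetype=WinEventLog:Security EventCode=4625 user=bob.jones earliest=-8d@d latest=-5d@d

earliest and latest accept relative values like -8d@d (eight days ago, snapped to midnight) as well as absolute timestamps, so you can carve out just the slice around that Tuesday instead of searching everything from then until now.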

5. Real Time Searches

I know this one sounds tempting. It’s a real-time search. I want to see attacks and malware events as soon as they get into Splunk, the moment they do in fact. It’s an understandable want in this day and age. In some very small environments this may be doable; again though, as mentioned above, I’ve yet to see an environment where this was a good idea. Even a perfectly crafted and efficient query will strain an environment when run in real time. Computationally, Splunk is running the same search every single second of every minute of every hour of every day, and so on. And it’s doing this at the same time as it’s running all its internal processes, logging, processing of other alerts and reports, etc. If this sounds like it’d be heavy, it is! And this is for a single real-time search, not even concurrent searches. I’ll commonly see a real-time search used within dashboard panels, usually something that’s up on the wall and meant to carry metrics or something equally pretty for people taking SOC tours. This can create a real bottleneck in your Splunk environment though, and someone once put it in the best way possible: “If you’re ever going to look away from your monitor, drink coffee or check your phone, you don’t ‘need’ real time searches.”

Try this instead:

Set your searches to run on a frequent interval. When saving your alerts and searches, you can set them to run on a schedule, either with the preset time windows or with a cron expression. Avail yourself of this option! Set them to run every 10 minutes or every 5 minutes; if you need as near to real time as possible, you could have them run every minute. While still a bit computationally heavy, a search that runs once a minute is better than one that has to run 60 times a minute.
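If you manage alerts outside the UI, the same idea lands in savedsearches.conf - the stanza name and the search string here are placeholders; the scheduling keys are the part that matters:

    # Run every 5 minutes over the most recent 5 minutes of data
    [Failed Logons - Scheduled]
    search = index=wineventlog sourcetype=WinEventLog:Security EventCode=4625 | stats count by user, src_ip
    enableSched = 1
    cron_schedule = */5 * * * *
    dispatch.earliest_time = -5m@m
    dispatch.latest_time = now

A five-minute schedule with a five-minute lookback gets you close enough to real time for almost any use case, at a small fraction of the cost of a true real-time search.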

These practices are all common Splunk pitfalls and have been the cause of many a headache. Some of these are more apparent than others, and there are some I haven’t listed here - perhaps for a follow-up post; looking at you, "index=*" - but these were the ones I wanted to get out first and foremost. As always, there may be times when you need to utilize these practices, so these are not meant to be hard rules by any means, rather best practices to keep in mind while searching.

Happy Splunking!
