Anyone in a technology organization can relate to a certain frustration. You know that adopting a particular tool or practice would help you. So you charge forward with the initiative, looking for approval. But then someone, most likely a superior, asks you to justify it. "Give me the business case for it," they say. And then, a little flummoxed, you feel the gears start turning in your head. Today, I'd like to talk about that very issue in the specific context of log analysis tools.
If you have significant operations of any kind in production, you're almost certainly generating logs. If not, you should be. You're also probably monitoring those logs in some fashion. And if you're consuming them, you're analyzing them, too. But maybe you're doing this manually, and you'd rather use a tool for log analysis. How do you justify adopting that tool? How do you justify paying for it?
ROI: The Basic Idea
To do this, you have to veer into the world of business and entrepreneurship for a moment. But don't worry -- you're not veering too far into that world. Just far enough to acquire a skill that any technologist ought to have.
I'm talking about understanding the idea of return on investment (ROI). The formal definition is a formula (net gain divided by cost), but the idea is really dead simple. If you're going to pay for something, will that something bring you as much or more value than what you paid? When the answer is "yes," then you have a justifiable decision. If the answer is "no," then you can't make a good case for the investment.
So, for log analysis tools, the question becomes a pretty straightforward one. Will your group realize enough cost savings or additional revenue generation from the tool to justify its cost?
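If it helps to see the idea written down, here is a minimal sketch of the usual ROI calculation: net gain divided by cost. The function name and the dollar figures are made up for illustration.

```python
def roi(benefit, cost):
    """Return on investment as a fraction: net gain divided by cost."""
    return (benefit - cost) / cost


# Hypothetical numbers: a tool that costs $4,000 per year
# and returns $6,000 per year in savings or revenue.
result = roi(benefit=6000, cost=4000)
print(f"ROI: {result:.0%}")
```

A result of zero or more means the tool brings back at least what you paid for it, which is the "yes" case described above.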
Employing Back-of-the-Napkin Math
When you're asked to justify purchasing a tool, you might wonder how much rigor you must bring to bear. People working with technology tend to have an appreciation for objective, empirical data.
When making a business case, if you can back it with objective, empirical data, that's great. You should absolutely do so. But that's often hard because it involves making projections and generally reasoning about the future. We humans like to believe we're good at this, but if that were true, we'd all be rich from playing the stock market.
So you need to make some assumptions and build your case on the back of those assumptions. People sometimes refer to that as "back-of-the-napkin math" and it's a perfectly fine way to build a business case, provided you highlight the assumptions that you make.
For instance, let's say that I wanted to spend $50 on a text editor. I might project that its feature set would save me 20 minutes per day of brainless typing. I'd highlight that assumption and say that, if true, the investment would pay off after less than a week, given my current salary. These are the sorts of arguments that bosses and business folks appreciate.
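Here is that text editor argument as a small sketch. The $50 price and the 20 minutes per day come from the example above. The hourly rate is purely an assumption, and that's the point: it sits in one visible place where anyone can challenge it.

```python
def payback_days(price, minutes_saved_per_day, hourly_rate):
    """Working days until the value of the time saved equals the price."""
    value_per_day = (minutes_saved_per_day / 60) * hourly_rate
    return price / value_per_day


# ASSUMPTION: a $40/hour rate, chosen only for illustration.
days = payback_days(price=50, minutes_saved_per_day=20, hourly_rate=40)
print(f"Pays for itself in about {days:.1f} working days")
```

Swap in your own rate and the conclusion updates itself, which is exactly what you want when someone questions your assumptions.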
First, the Cost of Log Analysis Tools
To make a business case and a credible projection of ROI, you need two projected pieces of data: the cost (i.e., the amount of the investment you're looking for a return on) and the savings or revenue benefit. I'll dedicate the rest of this post to talking about how log analysis tools can save companies money or even add to their bottom line. But first, let's take a look at their costs.
The most obvious cost is the sticker price of the tool. That might be an initial lump sum, but in this day and age, it's usually going to be a recurring monthly subscription cost. So when making your case, be sure to take that into account.
There's also a second, subtler cost that you should prepare yourself to address. Installing, learning, and managing the tool takes time from someone in the IT organization. You can (and should) argue that the tool winds up saving time in the end, but you also must acknowledge that it requires an investment of employee time (and thus salary).
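Both kinds of cost fit into one rough first-year number. This is only a sketch: the parameter names and every figure in the example are assumptions, not real pricing for any tool.

```python
def first_year_cost(monthly_subscription, setup_hours,
                    monthly_admin_hours, hourly_rate):
    """Sticker price plus the labor needed to install and run the tool."""
    subscription = monthly_subscription * 12
    labor_hours = setup_hours + monthly_admin_hours * 12
    return subscription + labor_hours * hourly_rate


# ASSUMPTIONS: $200/month subscription, 20 hours of setup,
# 2 hours/month of upkeep, and a $50/hour labor rate.
cost = first_year_cost(monthly_subscription=200, setup_hours=20,
                       monthly_admin_hours=2, hourly_rate=50)
print(f"Estimated first-year cost: ${cost:,}")
```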
Once you have those costs established, you can start to reason about the benefits.
Save Time Troubleshooting
Good log analysis tools offer you a lot of nice benefits. You can search your logs quickly and intuitively, visualize the data they contain, and collaborate easily. All of this amounts to a reduction in troubleshooting time.
Use this as you make your case. How often do production issues result in you combing through log files, looking for answers? What if you could significantly reduce the time that search took each time it happened? How much less time would the team spend fixing defects? And what about the effects on your users and customers? What could faster issue remediation do for your paid customer base?
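To turn those questions into a number, you might sketch the labor side like this. Every input is an estimate you'd supply yourself, and the figures below are invented for illustration. It deliberately leaves out the customer-facing benefits, which are harder to price.

```python
def annual_troubleshooting_savings(incidents_per_month, hours_per_incident,
                                   reduction, hourly_rate):
    """Labor saved per year if each log search gets shorter.

    `reduction` is the fraction of troubleshooting time you expect
    to save (0.5 means incidents take half as long).
    """
    hours_saved = incidents_per_month * 12 * hours_per_incident * reduction
    return hours_saved * hourly_rate


# ASSUMPTIONS: 4 incidents a month, 3 hours each today,
# a 50% reduction with the tool, and a $50/hour labor rate.
savings = annual_troubleshooting_savings(4, 3, 0.5, 50)
print(f"Estimated annual savings: ${savings:,.0f}")
```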
Get in Front of Issues
You definitely realize savings from reducing the cycle time of issues that your users might raise. But what about the issues you simply prevent altogether?
With a sophisticated log analysis scheme in place, you can look for telltale signs that a problem is coming. Are you experiencing traffic spikes? Are your servers getting themselves into dangerous territory? You can glean this information from trends in your log files if you know how to look.
Make the case that you could eliminate a certain percentage of production issues, and then calculate the labor savings and reputation benefits from there.
Audit Compliance
If you work in a regulated industry, you have some experience with this concern. In general, many organizations have to comply with certain standards or ways of operating.
Your log files provide a forensic, blow-by-blow replay of what your software does in production. So if you're logging heavily and have software that complies with these requirements, you stand to benefit. But you're not just going to hand terabytes of log files over to an auditor and say, "Here, go nuts." You have to filter and munge that data into a meaningful, relevant subset. And that's where a good log analysis tool comes in.
So build the case for log analysis tools as a key factor in achieving or keeping a certification that matters to your organization. Or when faced with audits, calculate the time that your team can save in preparation.
Actionable Business Intelligence
So far, I've focused mostly on realizing ROI via cost savings. Less time troubleshooting, fewer angry customers, more efficient audit compliance: all of these save the business money. But you can also use log analysis tools to make yourself some money.
Somewhere, sitting in all of the countless bytes of data your application leaves behind, are valuable insights. Maybe for some reason, you see a significant spike in new signups on Tuesday afternoons between 2:00 PM and 4:00 PM. If you see that trend taking place, you can start to do some research into why that happens and seek to replicate it throughout the week. If you have log analysis tooling in place, you'll start to see interesting trends emerge. But if you don't, you'll almost certainly miss these sorts of opportunities.
How do you quantify this, given that you don't yet know what you don't know? That's going to be pretty hard. In this case, you might want to do a trial run and see what insights you gain. You can then present those as representative.
Gain an Advantage in the Endless Security Struggle
I'll close with an issue that resonates with anyone in the industry. Look no further than the unimaginable current PR woes of Equifax to understand how important security is and how costly breaching trust with customers and users can be.
Log analysis tools won't give you a silver bullet because nothing will. But think of the benefit of having quick, actionable information about people trying to compromise your systems. With such tools, you can receive notifications when you have spikes in unauthorized access attempts. You can even cast a wider, more cautious net and make yourself aware anytime something looks out of the ordinary with the way your app is running.
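To make the idea of a spike in unauthorized access attempts concrete, here is a toy sketch of the kind of check a log analysis tool automates. It is not how any particular product works. The log format, the `LOGIN_FAILED` marker, and the "five times the median hour" rule are all assumptions made up for this example.

```python
from collections import Counter
from statistics import median


def hourly_counts(lines, marker="LOGIN_FAILED"):
    """Count lines containing `marker`, grouped by hour.

    ASSUMPTION: each line starts with a timestamp such as
    "2024-05-01T13:45:10", so the first 13 characters name the hour.
    """
    counts = Counter()
    for line in lines:
        if marker in line:
            counts[line[:13]] += 1
    return counts


def spike_hours(counts, multiplier=5):
    """Return the hours whose count exceeds `multiplier` times the median."""
    if not counts:
        return []
    cutoff = multiplier * median(counts.values())
    return sorted(hour for hour, n in counts.items() if n > cutoff)


# Fabricated sample data: two failures per hour, except a burst at 07:00.
sample = []
for hour in range(10):
    attempts = 40 if hour == 7 else 2
    sample += [f"2024-05-01T{hour:02d}:00:{i:02d} LOGIN_FAILED user=demo"
               for i in range(attempts)]

print(spike_hours(hourly_counts(sample)))
```

The median keeps one noisy hour from dragging the baseline up with it. A real tool would also handle hours with no failures at all, which this sketch simply never sees.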
In the case of security, you might not need to put together any back-of-the-napkin math at all (though you certainly could make some assumptions about savings). Often, those in a position to approve budgets have already done plenty of this math on their own and have a willingness to invest in something that could help with security.
Making Your Calculations
Estimating ROI is an inherently fuzzy business. It will never be as cut and dried as measuring your site's uptime or calculating automated test coverage. But that shouldn't stop you from making an attempt.
In my travels as an employee and later a consultant, I've always been amazed at how few people really make any semblance of a business case for technical tooling. They simply ask for it, stating that it's "helpful" or that adopting it is a "best practice." So just going through the exercise and having any business case at all puts you ahead of the curve.
And the case for log analysis tools is really not a hard one to make. You can generally get started for free. And after that, the monthly cost will be such that even a few hours per month of labor savings for anyone in the DevOps organization will easily justify the investment. So calculate the ROI, make your business case, and then make your life easier.
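If you want a starting point for that calculation, here is a minimal sketch of the break-even check behind the "few hours per month" claim. The monthly cost and hourly rate are assumptions for illustration, not real pricing.

```python
def break_even_hours(monthly_cost, hourly_rate):
    """Hours of labor the tool must save each month to cover its cost."""
    return monthly_cost / hourly_rate


# ASSUMPTIONS: a $150/month subscription and a $50/hour labor rate.
hours = break_even_hours(monthly_cost=150, hourly_rate=50)
print(f"The tool breaks even at {hours:.1f} saved hours per month")
```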