I originally posted this on the NDepend blog a little under two years ago. NDepend is a static analysis tool that plugs right into Visual Studio.
What makes a codebase acquirable?
This is the rare question that affects software developers, managers, and executives in a surprisingly similar way. And that's saying something since, by and large, people up and down the corporate pyramid don't tend to share a lot of professional overlap.
Codebase Acquirability: The Developer's Story
Early in my career, about 100 years ago, my company acquired a small mom-and-pop operation.
Mom and pop had built a niche, feature-rich accounting package that helped my employer's customers. And they built it in MS Access, of all things. (Neither mom nor pop was a software professional by trade, so this was a self-taught, learn-on-the-fly situation where, all things considered, they did something impressive.)
Of course, I wasn't part of this acquisition. My job title at the time was, to the best of my recollection, "software engineer."
So I first encountered this MS Access codebase (which is technically just a file) during a meeting with my boss and the VP of engineering. "Congratulations, Erik," they told me. "You're now the tech lead for this product."
"Here's the MDB file, so go put it in source control or whatever, and try to make it into something that looks like it actually belongs in our application portfolio." The glass-half-full interpretation of my reaction was "wow, you're putting a lot of faith in me."
Had I been part of the acquisition discussion, I'd have thrown red flags like a bullfighter. But I wasn't. We acquired the code and it worked out.
Codebase Acquirability: The Executive's Story
But you know who is part of the acquisition discussion? Executives.
And they worry just as much about the nature of acquired codebases as the prospective developers working in them. A gigantic mess of a codebase is unpleasant for the folks working in it and problematic for achieving business goals.
In more recent years as a consultant, I've helped executives face strategic decisions about codebases:
- Should I retire this or keep plugging away?
- Can we evolve this codebase to keep up with the current market?
And, yes, there were questions of what to do with inherited codebases or whether or not to inherit them.
But how does one actually make that determination? Do you send in a few software developers to size the thing up and give you anecdotal "expert" opinions?
I mean, you can, and that's better than nothing at all. But it's some kind of law of nature that we software developers hate any code that we didn't write.
So how do you really know? How do you tell if a codebase is acquirable?
First of All, What Do Acquiring Shops Do With New Codebases?
Reasoning about codebase acquirability can initially feel like boiling the ocean. You might be looking to take on a codebase with hundreds of thousands or millions of lines of code.
Even if the party from whom you were acquiring the thing were to give you endless time and access, you might be poking around for months.
And the problem itself seems a little hard to wrap your head around. Isn't this just the same as asking, "is this code 'good' for some definition of good?"
Admittedly, it's not easy to tease out a working way to evaluate acquirability. But it's not impossible, either. Consider some things that organizations will typically do when they acquire a codebase:
- Re-brand or overhaul the GUI in some fashion
- Perform various flavors of service or database migration
- Assign new people to work on it
- Merge it with another codebase
- Evolve it to keep up with market concerns
- Leave it alone and touch it as little as humanly possible
Reasoning About Acquirability
And that list lets us define some questions to ask about a new codebase. With these potential activities in mind, here are the code-specific questions each one suggests.
- Re-brand or overhaul the GUI in some fashion: how coupled are the GUI and the application code?
- Perform various flavors of service or database migration: what is the general level of coupling in this codebase?
- Assign new people to work on it: how easily grokked ("clean") is this codebase?
- Merge it with another codebase: how modular is this code, and how much does it conform to standard practices?
- Evolve it to keep up with market concerns: what kind of test coverage and quality of unit tests are there and how much tech debt is there?
- Leave it alone and touch it as little as humanly possible: how buggy does this code figure to be?
Now, this is neither an exhaustive list, nor will you probably want to do all of these things with the incoming codebase (indeed, "evolve and leave it alone" simultaneously wouldn't even be possible). But it backs us away from boiling the ocean.
It's a start.
The question now is: how do you go about answering them?
NDepend and Acquirability Out-of-the-Box
As I've said before, I have a consulting practice wherein I help executives answer questions like these. But I don't think I'd be giving away the slightest bit of tradecraft to say that I use NDepend prominently in performing codebase assessments.
Let's consider some out-of-the-box properties of taking this approach. In a post I did about analyzing the Moq codebase, I talked about the codebase's tech debt grade and the reasoning behind it.
If you're looking to see how easily a codebase will evolve, that's an excellent starting point right there on the dashboard. You can go in and see the debt and then drill into the reasoning behind it. You can also bring in test coverage data to get a comprehensive answer to the evolution question.
From there, you can take a more detailed tour through the out-of-the-box code rules. If NDepend is showing you critical rule violations galore, code smells, design issues, and regressions, you can assume that you're looking at a potentially buggy codebase and one that won't conform to standard practice.
And then, for questions of modularity and coupling, you can simply look at the codebase via detailed dependency graphs. If you're more of a numbers person, you can have that too, but when it comes to architecture, a picture is worth 1,000 words.
So with a quick install and analysis from NDepend, you can start to put together pretty compelling and visual answers to these questions.
Making the Case With CQLinq
Notice that I talked earlier about the quality of unit tests and about reliance on GUI libraries specifically. In those cases, you're asking questions that are a little more specific than the ones NDepend's default rules answer right out of the box.
But that's where CQLinq really shines. With CQLinq, you can turn the code into data, and you can answer just about any question you want.
For instance, what is a well-written unit test? Well, that's hard to say at a glance.
But figure that it probably doesn't have excessive setup and that it probably asserts something. So you can put together a CQLinq query to see what percentage of unit tests have fewer than, say, 10 lines of code and contain at least one assert.
That's not definitive, but if a low percentage of the tests meet those criteria, you might be looking at an iffy test suite, coverage notwithstanding.
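To make that a little more concrete, here's a rough sketch of what such a query might look like. I'm hedging on the details: the "Tests" naming convention, the 10-line threshold, and properties like NbLinesOfCode, MethodsCalled, and ParentType reflect my best-effort reading of the CQLinq documentation, so treat this as an illustration of the idea rather than a rule to paste in verbatim.

```
// Sketch: flag likely-iffy test methods, meaning ones that run long or never assert anything.
// The "Tests" type-name convention and the Assert-class heuristic are assumptions for illustration.
warnif count > 0
from m in Application.Methods
where m.ParentType.Name.EndsWith("Tests")                  // crude heuristic for "this is a test"
   && (m.NbLinesOfCode > 10                                 // excessive setup
       || !m.MethodsCalled.Any(c => c.ParentType.Name == "Assert"))  // no assertion at all
select new { m, m.NbLinesOfCode }
```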
You can also enlist CQLinq in this fashion to zero in on specific dependencies. See how many of your classes contain direct references to some GUI assembly. Or see what percentage of your codebase's references are to relatively common libraries and how many oddball utilities are in there.
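In the same hedged spirit, a dependency-focused query might look something like the sketch below. The IsUsing call and the System.Windows.Forms.Form type name are stand-ins; you'd point it at whatever GUI framework (WPF, WebForms, and so on) the codebase actually leans on.

```
// Sketch: find application types that reference the GUI framework directly.
// System.Windows.Forms.Form is a placeholder for whatever GUI type matters in your situation.
from t in Application.Types
where t.IsUsing("System.Windows.Forms.Form")
select new { t, t.NbLinesOfCode }
```

Compare that result against the total number of application types and you have a rough percentage of the codebase entangled with the GUI, which speaks directly to the re-branding and coupling questions from earlier.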
This isn't an exhaustive list of what you can evaluate by any stretch, but it conveys an idea of what's possible.
You Actually Can Look Before You Leap
I could write a whitepaper or even a book about evaluating a codebase's suitability for acquisition. (I might actually do that at some point, come to think of it.)
But you don't need all of that information to understand the salient point here: evaluating a codebase in this way isn't beyond you, even with relatively limited exposure to the codebase.
Is it daunting? Sure.
Is using a tool to look at hundreds of thousands of lines potentially reductionist? Of course—how else could you look at so much code so quickly?
But is it valuable? Absolutely.
Thinking back to having an MS Access database plopped in my lap, I can't help but wonder if, armed with good tooling and the ability to make a persuasive business case, I might have helped change the trajectory of that business decision in a way that benefited everyone.
If you want to take a Freakonomics-style look at your codebase by turning code into data, you can take NDepend for a 14-day spin.