Write Less, Fix Never: The Art of Highly Reliable Code

Dhruv Agarwal - Jun 17 - Dev Community

Burnt out engineer

If you're a developer tirelessly pushing out new changes, only to be dragged back by errors in your past work, this post is for you.

Over the past decade in software development, one of the key mistakes I've made, and seen others make repeatedly, is focusing on doing more work rather than ensuring that the work done (no matter how small) is robust and will keep working. The errors that come back from that work can significantly hamper productivity and motivation.

From my own share of mistakes, I’ve learned valuable lessons. Here, I’d like to share a few strategies that will not only help you ship robust software but also free you from the shackles of your past work.

Tell me more

We will talk about the top 5 strategies that worked for me:

  1. Plan for 10x
  2. Psst: Your old work got a bug and is calling you back
  3. Make the Systems Work for You, Not the Other Way Around
  4. Always Answer with a Link
  5. Understand software building is a team sport.

1. Plan for 10x

There are two types of engineers IMHO: those who hack their way through for today and those who design for the distant future. Neither approach is sustainable on its own.

Your code should be able to handle the growth your business is about to experience. However, over-designing for future challenges can lead to unnecessary complexity. There's a term for this: over-engineering.

Scaling up

Here's my practical rule of thumb: plan for 10 times the current scale or consider how much your business is expected to grow in the next 2-3 years. Ensure your plans align with your business goals.

For example, if you're a cab company designing a booking module, and today your company handles 10,000 rides a day with an expectation to reach 100,000 rides a day in 2 years, use that as your benchmark. Designing a system for 10 million rides a day when you're only doing 10,000 rides might result in an overly complex and expensive solution.
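To make the rule of thumb concrete, here is a rough back-of-the-envelope sketch of the cab example. The function names and the peak multiplier are my own assumptions for illustration; replace the multiplier with whatever your real traffic pattern looks like.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def design_target(current_per_day: int, growth_factor: int = 10) -> int:
    """Daily volume to design for: current volume times the growth factor."""
    return current_per_day * growth_factor


def average_per_second(per_day: int) -> float:
    """Average load if the daily volume were spread evenly across the day."""
    return per_day / SECONDS_PER_DAY


def peak_per_second(per_day: int, peak_multiplier: float = 5.0) -> float:
    """Rough peak load. The multiplier is an assumed ratio of peak to average."""
    return average_per_second(per_day) * peak_multiplier


current_rides = 10_000                       # rides per day today
target_rides = design_target(current_rides)  # 100_000 rides per day

print(f"Design target: {target_rides} rides/day")
print(f"Average load:  {average_per_second(target_rides):.2f} rides/sec")
print(f"Assumed peak:  {peak_per_second(target_rides):.2f} rides/sec")
```

Even at the 10x target, the average is only about one booking per second, which is a useful sanity check before reaching for a design built for 10 million rides a day.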

2. Psst: Your old work got a bug and is calling you back

Broken system

"Days and weeks of debugging can save you a few hours of writing tests" - someone wise.

Shipping code without testing all the edge cases is a spray-and-pray strategy. The simplest way to ensure your code works as expected is to add unit tests. This might sound obvious, but the importance of thorough testing cannot be overstated.

Unit tests not only act as the first line of defense against obvious errors but also serve as insurance for your code against unintended changes that could violate business requirements. That means fewer ad hoc bugs assigned to you every sprint 😉

A trick for the lazy (like me). Before you write the code:

  • Write tests covering every corner case you can think of.
  • Pretend you're trying to break someone else's system.
  • Write assert False in all the tests and run them.
  • Naturally, all tests will fail.

Now, just work towards making each test pass. This approach takes less time overall and produces robust code every time!
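Here is a minimal sketch of where that trick ends up, using a made-up `ride_fare` function (the fare rule, names, and numbers are assumptions for illustration only). Every test below started as a single `assert False, "TODO"` line and got a real body only when the matching behaviour was written.

```python
def ride_fare(distance_km: float, base: float = 50.0, per_km: float = 12.0) -> float:
    """Hypothetical fare rule: a base fare plus a per-kilometre charge."""
    if distance_km < 0:
        raise ValueError("distance_km must be non-negative")
    return round(base + distance_km * per_km, 2)


# Step 1 looked like this for every test:
#
#     def test_zero_distance_charges_only_base_fare():
#         assert False, "TODO"
#
# Step 2: replace each placeholder with a real check, one at a time.

def test_zero_distance_charges_only_base_fare():
    assert ride_fare(0) == 50.0


def test_fare_grows_with_distance():
    assert ride_fare(10) == 170.0


def test_fractional_distance_is_rounded_to_two_decimals():
    assert ride_fare(1.234) == 64.81


def test_negative_distance_is_rejected():
    try:
        ride_fare(-1)
    except ValueError:
        return
    assert False, "negative distance should raise ValueError"
```

The tests are plain functions with bare asserts, so a runner such as pytest can collect them, or you can call them directly.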

3. Make the Systems Work for You, Not the Other Way Around

Monitoring systems

One of my managers once gave me the most impactful advice: "Act, don't react." This advice came when I was constantly being tagged on different Slack channels for problems, customer complaints, and payment failures. I was just reacting to each request, having no clue what might happen next.

That's when I started asking three questions for every feature I built:

  • How will I know it's working?
  • How will I know it failed?
  • How will I know it succeeded?

I then answered these questions at every level (feature, screen, app) by sending metrics to APM tools such as Datadog or New Relic.

APM sample

After setting this up, I configured alerts to notify me if anything went wrong.

By doing this, I became aware of bugs before they escalated into major issues, preventing reactive measures, poor customer experiences, and my own uncertainty about what might come next.

Start answering these three fundamental questions every time you build something to ensure you always act instead of react.
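As a rough sketch of how the three questions can turn into metrics, the example below counts attempts, successes, and failures for a feature. `InMemoryMetrics`, `instrumented`, and `book_ride` are hypothetical names for illustration; in a real setup the `increment` call would forward to whatever client your APM tool provides, and the alert would be configured in that tool rather than in code.

```python
from collections import Counter
from functools import wraps


class InMemoryMetrics:
    """Stand-in for an APM client. It only counts events in memory."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def increment(self, name: str) -> None:
        self.counts[name] += 1


metrics = InMemoryMetrics()


def instrumented(feature: str):
    """Record attempted / succeeded / failed counters around a function."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            metrics.increment(f"{feature}.attempted")      # is it working (being used)?
            try:
                result = func(*args, **kwargs)
            except Exception:
                metrics.increment(f"{feature}.failed")     # did it fail?
                raise
            metrics.increment(f"{feature}.succeeded")      # did it succeed?
            return result
        return wrapper
    return decorator


@instrumented("booking")
def book_ride(rider_id: str, seats: int) -> str:
    """Hypothetical feature used only to demonstrate the decorator."""
    if seats < 1:
        raise ValueError("seats must be at least 1")
    return f"ride-for-{rider_id}"


def failure_rate(feature: str) -> float:
    """Share of attempts that failed. An alert could fire above a threshold."""
    attempted = metrics.counts[f"{feature}.attempted"]
    if attempted == 0:
        return 0.0
    return metrics.counts[f"{feature}.failed"] / attempted
```

A sudden drop in `booking.attempted` is as much a signal as a spike in `booking.failed`: the first suggests users can't even reach the feature, the second that they reach it and it breaks.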

4. Always Answer with a Link

Replying with documentation

Just like bad work gets you tagged on various Slack channels for fixes, great work gets you tagged for context in areas you've worked on.

This can drain your energy when you least expect it, or worse, it can make you the go-to person for the same tasks because you know the complete picture.

Keep this secret trick to yourself:
Document everything. Include the context, architecture, and business-specific decisions you made while building the feature. When someone asks about the context of an area (feature, screen, app), just send them the link to the updated document. This will save you a few hours every time.

Additionally, thorough documentation makes onboarding new team members easier and ensures that your work remains accessible and understandable over time.

5. Understand software building is a team sport.

Ted Lasso appreciation

Software engineering often emphasizes the individual contributor path. However, you can't reach the end goal alone; you only reach it with your team (and vice versa).

Understanding and adopting a process excellence mindset helps you leverage the team's collective productivity.

Confused

Sorry for that wordy statement 😄
To simplify, ensuring that reviews, deployments, and any collaborative activities involving code don't have significant wait times boosts your productivity immensely!

The best way to identify high waiting or blocked times in your team is to measure DORA metrics. You can use an open-source tool like Middleware, which provides DORA metrics out of the box.

middlewarehq / middleware

✨ Open-source DORA metrics platform for engineering teams ✨

Open-source engineering management that unlocks developer potential

Introduction

Middleware is an open-source tool designed to help engineering leaders measure and analyze the effectiveness of their teams using the DORA metrics. The DORA metrics are a set of four key values that provide insights into software delivery performance and operational efficiency.

They are:

  • Deployment Frequency: The frequency of code deployments to production or an operational environment.
  • Lead Time for Changes: The time it takes for a commit to make it into production.
  • Mean Time to Restore: The time it takes to restore service after an incident or failure.
  • Change Failure Rate: The percentage of deployments that result in failures or require remediation.
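To show how these four definitions translate into numbers, here is a simplified sketch. It is not how Middleware computes them; the `Deployment` record and its field names are assumptions, and time to restore is measured from the failed deployment's timestamp for simplicity.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class Deployment:
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def deployment_frequency(deploys: List[Deployment], days: int) -> float:
    """Deployments per day over the observed window."""
    return len(deploys) / days


def lead_time_for_changes(deploys: List[Deployment]) -> timedelta:
    """Mean time from commit to production."""
    total = sum((d.deployed_at - d.committed_at for d in deploys), timedelta())
    return total / len(deploys)


def change_failure_rate(deploys: List[Deployment]) -> float:
    """Fraction of deployments that resulted in a failure."""
    return sum(d.failed for d in deploys) / len(deploys)


def mean_time_to_restore(deploys: List[Deployment]) -> Optional[timedelta]:
    """Mean time from a failed deployment to restoration, or None if no incidents."""
    durations = [d.restored_at - d.deployed_at
                 for d in deploys if d.failed and d.restored_at is not None]
    if not durations:
        return None
    return sum(durations, timedelta()) / len(durations)
```

Long lead times with a low deployment frequency usually point at waiting (reviews, approvals, release trains) rather than at slow coding, which is exactly the blocked time worth hunting down.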

PS: I'm also co-founder of Middleware and our mission is to make engineering frictionless for engineers. Do consider giving us a star if you like what we've built!

Ship code like a boss!

Boss person

By adopting these suggestions, you can significantly reduce the time spent revisiting and fixing past work. This will not only enhance your productivity but also ensure that your focus remains on innovating and delivering new features.

Be productive, not busy! All the best 😊
