Big O Notation as a Mid-Level Developer Who Has Been Avoiding It Since Bootcamp: Arrays and Time Complexity

Thuy Doan - Oct 12 '21 - Dev Community

At the beginning of this year, I was promoted to intermediate developer 🎊

At your company, that might be an IC2 - or whichever level comes after entry-level developer but right before senior developer. In any case, I was now at a place in my career where my computer science fundamentals needed to be stronger than they were at the beginning, when I could just throw myself into building things with what I learned in full-stack JavaScript bootcamp.

I decided I needed to better understand data structures and be more comfortable with algorithms. Not because I wanted to leetcode more. I really don't want to leetcode more. But I couldn't shake the feeling that I would be better off if I understood more about why you'd choose data structure A over data structure B.

So I reached out to a friend for help and this is what I've learned 🤓

What did I know about Big O notation?

My mental model of Big O has always been this:

1) A unit of measurement
2) Related to computer science
3) That describes the complexity of things

From here, I needed to understand: why? 💭

Why must we measure the complexity of things?

As developers, we deal with data.

Sometimes not very much of it, like on a static website. Sometimes a whole lot of it. The multi-millions-of-users kind. And most of the time, that data isn't in the format we need, so we have to manipulate it. Sort it, filter it, or find something in it. Sometimes we even need to change it into an entirely different format! And how efficiently we do that matters at scale.

What's also true is that there are many ways to solve a problem. This is especially true in programming. You can then think of Big O notation as a way to describe how efficient a solution is relative to another one.

What types of Big O notation are there?

In this post, we'll focus just on the types that apply to arrays, but know that there are a number of them, which you can see below:

Graph depicting different big o notation time complexities with O(n!), O(2^n), O(n^2) as horrible; O(n log n) as bad; O(n) as fair; and O(log n) or O(1) between good and excellent

Source: Big O Cheatsheet

For arrays, you can have 2 types of time complexities (or Big O):

1) Constant time or O(1)
2) Linear time or O(n)

Operation | Time Complexity
Searching | O(n) - linear time
Access | O(1) - constant time
Insertion | O(n), unless inserting at the end of the array, which is O(1)
Removal | O(n), unless removing from the end of the array, which is O(1)

Source: Big O Notation for Arrays by KodinKevin on YouTube

With Big O, the n refers to the amount of data you are working with.
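To put that table in JavaScript terms, here's a rough sketch mapping each operation to a built-in array method. The complexities in the comments are the usual textbook ones for arrays, not a guarantee about any particular JavaScript engine:

const pokemon = ['Bulbasaur', 'Ivysaur', 'Venusaur']

pokemon[1]                  // Access by index: O(1)
pokemon.indexOf('Venusaur') // Searching an unsorted array: O(n)
pokemon.push('Mew')         // Insertion at the end: O(1)
pokemon.splice(1, 0, 'Mew') // Insertion anywhere else: O(n), later items shift
pokemon.pop()               // Removal from the end: O(1)
pokemon.splice(1, 1)        // Removal anywhere else: O(n), later items shift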

Practical Examples

Example A. Kanto Starter Pokemon

Kanto Starter Pokemon. There's a cute orange lizard, a grassy type of creature that has a flower bulb and walks on all fours, and a blue turtle that looks cheerful.

Let's say you're building a Pokemon app and you have an array of Pokemon.

const kantoStarters = ['Charmander', 'Bulbasaur', 'Squirtle']

If you know the index of Squirtle in the array, you can access it by simply doing kantoStarters[index]. If this were instead an array of all 151 Kanto Pokemon, the number of steps it takes to access a Pokemon at a known index would be the same as when there were only 3 starter Pokemon, because you can go directly to the Pokemon's index. Hence, access in an array is considered constant time - also known as O(1).

Because constant time takes the fewest steps to complete an operation, it is considered the most efficient. Check that first graph out again!

Example B. All Kanto Pokemon

Let's say instead of knowing exactly where to look for a Pokemon in an array, we have to flip through it like a clothing rack at the mall or files in a filing cabinet. In this case, it would take at worst as many steps as there are Pokemon. Remember that n in Big O notation stands for the amount of data we are working with. So if we had to look through an unordered array of all 151 Pokemon to find a Psyduck, it could take us n steps. This is called linear time - or O(n) - because given more data, we take proportionately more steps.
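If it helps to see that as code, here's a minimal sketch of the worst-case scan. The allKantoPokemon array and findPokemon function are just hypothetical names for illustration:

// A hypothetical, unordered array of all 151 Kanto Pokemon (truncated here).
const allKantoPokemon = ['Bulbasaur', 'Ivysaur', 'Venusaur', /* ...147 more... */ 'Psyduck']

// Worst case: we check every element before we find a match (or give up).
function findPokemon(pokemonList, name) {
  for (let i = 0; i < pokemonList.length; i += 1) {
    if (pokemonList[i] === name) {
      return i // found it after checking i + 1 Pokemon
    }
  }
  return -1 // checked all n Pokemon and found nothing
}

findPokemon(allKantoPokemon, 'Psyduck') // O(n): more Pokemon means more steps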

Animation of flipping back and forth through a book

At this point, we can compare the two: constant time - or O(1) - takes the same number of steps no matter the amount of data, while linear time - or O(n) - takes proportionately more steps as the data grows. So constant time is faster, or more efficient, than linear time 💨

Example C. It Depends

Once we move into inserting data into an array or removing data from it, things get a little more nuanced. Let's say we created a new type of Pikachu that wears a coloured party hat (think Nintendo 64 Super Smash Bros) and we wanted to officially recognize it as a Kanto Pokemon: Party Pikachu. If we add Party Pikachu to the end of the list of Pokemon, that only takes one step. Hence, insertion at the end of an array is constant time - or O(1). The same goes for removal.

4 versions of Pikachu wearing different coloured party hats

It's different, however, if we're trying to insert or remove an item anywhere else in the array. Why? If we added Party Pikachu to the beginning, the indices of all the Pokemon after it would have to change because the order of Pokemon is now different. This also applies if Party Pikachu were added in the middle of the list. We would have to take as many steps as there are Pokemon after it, updating each one's index. Hence, insertion or removal anywhere but the end is linear time - or O(n).

const originalKantoPokemon = ['Bulbasaur', 'Ivysaur', 'Venusaur'] // and so on
// Where Bulbasaur is index 0

const newKantoPokemon = ['Party Pikachu', 'Bulbasaur', 'Ivysaur'] // and so on
// Where Bulbasaur is now index 1
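If you want to poke at this in code, here's a rough sketch using the built-in push and unshift methods (push adds to the end of an array, unshift adds to the beginning):

const kantoPokemon = ['Bulbasaur', 'Ivysaur', 'Venusaur'] // and so on

// Adding to the end: nothing else has to move.
kantoPokemon.push('Party Pikachu') // O(1): one step

// Adding to the beginning: every existing Pokemon shifts to a new index.
kantoPokemon.unshift('Party Pikachu') // O(n): as many steps as there are Pokemon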

Career Value

You might be thinking, "That's great and all but why do I need to know this?" That's fair. I've had a successful last 4-5 years as a developer without it. Heck, I even got promoted. But there are two possible reasons:

1) You want to get hired at a company that does leetcode.

FAANG companies - Facebook, Amazon, Apple, Netflix, and Google - or similar are infamous for testing leetcode, algorithms, and data structures in their interview process. If you want to get hired by them, you need to be able to reference Big O when you write an algorithmic solution.

2) You need to come up with efficient solutions.

Even if you avoid interviewing at companies that do leetcode, you will still have to work with data. And unless you can always work with a small amount of data, the performance of the solutions you write to handle it will matter. Especially as you become a more senior engineer.

(This will become more apparent as I continue this series by moving into showing actual algorithms. Follow me and stay tuned!)

I'm personally in the second boat, but I've since been opening myself up to the idea of the first one. Let's get better first, then we'll see 🤡

Onward

I was the kind of kid who was, for all intents and purposes, intelligent but didn't identify with being good at STEM subjects despite being an honour roll student throughout my education. Heck, my favourite subject was music. But at some point, you hit a wall that makes you realize your work could go much more smoothly if you deepened your knowledge in a particular area 🚀

My goal is to be able to confidently answer why we should store data a certain way (i.e. dictionary vs. list) or traverse large amounts of data in a certain way, whether I'm being asked in an interview or simply completing a task at my current job 💃🏻

You can think of what we've discussed so far as the building blocks for choosing between multiple ways of handling data. If we know that searching through an array is linear time, and we later find out there's an alternate way of searching through data that is constant time (which is faster), we might want to use the latter. However, there are other things to weigh, like readability and maintainability. More on that another time.

I'll keep learning and be sure to share more 😬

Off to study linked lists!

Keep it candid,

Thuy 🙋🏻‍♀️

Note: This post focuses more on practical examples than it does on mathematical visuals. This is because not everyone will understand Big O with mathematical graphs. But if you are someone that will, I recommend this.
