Introduction

An old post by Gergely Orosz, author of, among other things, the Pragmatic Engineer blog, recently resonated with me quite heavily. In it, he describes the advice he would give himself when starting out as a software developer, or what he wishes he had known sooner. For me, the standout part is right at the beginning, where he states:

I wish I had read software-related books in my first few years as a developer. 
I only started doing this around year 5 of my engineering career. 

This struck me too; I thought to myself, “You know, I have never actually read a technical book…”. His reasoning for taking the time to do so is quite interesting: it gives you another perspective and exposure to things you would not otherwise come into contact with during your day-to-day work. After all, you don’t know what you don’t know. By properly reading a book, doing the exercises, and taking notes, you can gain a better understanding of the field from the perspective of someone who knows vastly more about a subject than you do. The topic I chose was algorithms and data structures.

Now, when I say I’ve never read a technical book, I have absolutely tried. But when it came to churning through algorithms and data structures, be it a course or a book, I could not seem to do it; maybe there was a lack of motivation at the time, or other factors, but either way I only ever completed around 15% of anything I tried on the subject. However, after reading around for recommendations, I came across Grokking Algorithms: An Illustrated Guide for Programmers and Other Curious People by Aditya Bhargava.

Whilst I’ll avoid an overly lengthy review, I will say that this book was incredibly good, mainly because of how digestible it is. Bhargava does a phenomenal job of using diagrams, simple explanations, and real-world examples to drive home the use cases of particular algorithms and data structures. He may well gloss over some nuances through simplification, but I would say that is perfectly acceptable if it allows the reader to easily understand difficult concepts. I came away from the book with a better grasp of how useful all of it can be and, most importantly, where to dive in deeper for the next steps. My notes and attempts at the exercises are available on GitHub.

Takeaways

Big-O Notation is important

Big-O is used to describe the worst case for time and space complexity: how an algorithm’s running time grows as its input grows, and how much memory a data structure requires to store your values.
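To make that concrete, here is a minimal sketch in Python (the language the book uses) contrasting linear search, which is O(n) because in the worst case it checks every element, with binary search, which is O(log n) because it halves a sorted list at every step:

    # Linear search checks every element, so the worst case is O(n).
    def linear_search(items, target):
        for index, item in enumerate(items):
            if item == target:
                return index
        return None

    # Binary search halves a *sorted* list each step: worst case O(log n).
    def binary_search(sorted_items, target):
        low, high = 0, len(sorted_items) - 1
        while low <= high:
            mid = (low + high) // 2
            if sorted_items[mid] == target:
                return mid
            if sorted_items[mid] < target:
                low = mid + 1
            else:
                high = mid - 1
        return None

    print(binary_search([1, 3, 5, 7, 9, 11], 9))  # 4

For a list of a million items, linear search may need a million checks, while binary search needs at most around twenty.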

At scale, these things matter. If you were creating a simple application for yourself that will only be used every two weeks, it might not matter much that it takes 30 seconds to churn through a data set. Compare this to an application used not only by you but by another 450,000 people around the world: if it took each person 30 seconds to receive a response on the data they send, they are unlikely to be very happy, especially if competing applications respond in less than a second.

Understanding this concept also helps in your own code. In the real world you can reach for a battle-tested library rather than implementing an algorithm from scratch, such as when sorting a list of names alphabetically, but knowing how fast it will run in the worst case, or the space complexity of the underlying data structure, helps you make a better choice about when to reach for it. The adage of having more tools in your tool belt certainly comes to mind here.
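In Python, for instance, that battle-tested option is the built-in sorted():

    names = ["Maggie", "Adit", "Callie", "David"]
    # CPython's sorted() uses Timsort, which is O(n log n) even in the worst case.
    print(sorted(names))  # ['Adit', 'Callie', 'David', 'Maggie']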

Fundamentals: arrays and linked lists

This is quite a short one, but something that occurred to me throughout the book was to revisit my understanding of these two relatively simple data structures, largely because they are gateways to fundamentally understanding the more complex types: a heap can be represented as an array, and a hash table is the tying of an excellent hash function to an array.
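As a small illustration of the first point, here is a sketch in Python of a binary heap living in a flat array, where parent and child relationships are recovered purely through index arithmetic (the helper names are my own):

    # A binary heap stored in a flat array: the node at index i finds its
    # relatives through index arithmetic alone, no pointers required.
    heap = [1, 3, 2, 7, 4]  # a valid min-heap

    def parent(i):
        return (i - 1) // 2

    def left_child(i):
        return 2 * i + 1

    def right_child(i):
        return 2 * i + 2

    # The children of heap[1] (value 3) are heap[3] and heap[4].
    print(heap[left_child(1)], heap[right_child(1)])  # 7 4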

Hash tables are really cool

I’ve dealt with hash tables, known to me as dictionaries in Python and maps in Go, quite a lot, but only at a very high level. My knowledge extended as far as being able to say that if you do something along the lines of hashTable["my_key"], you get a value back if the key exists within the data structure. I never truly understood what went on underneath, even wrongly assuming it was incredibly complex. In actual fact it is simple and elegant.

Hash tables are arrays paired with a hash function. This clicked for me as I was reading the chapter: the hash function returns an integer that can be used as the index at which to place the key and its corresponding value, and having a strong hash function also helps avoid collisions.
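A rough sketch of that idea in Python, using the built-in hash() as the hash function and skipping collision handling entirely:

    slots = [None] * 8  # the underlying array

    def put(key, value):
        index = hash(key) % len(slots)  # the hash maps the key to an array index
        slots[index] = (key, value)     # a real table would handle collisions here

    def get(key):
        index = hash(key) % len(slots)
        entry = slots[index]
        if entry is not None and entry[0] == key:
            return entry[1]
        return None

    put("my_key", 42)
    print(get("my_key"))  # 42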

They’re also insanely fast on average: regardless of their size, you can retrieve something in constant time, known as O(1). In the worst case, however, they are O(n) for insertions, deletions, and searches. You want to keep the load factor low, meaning the ratio of the number of elements stored to the current size of the underlying array. A low load factor gives you the average performance, which is incredibly quick; but once it grows too large, the underlying array has to be resized and every element placed back in again, with the hashes of the keys newly computed, which is why a hash table can be quite slow when under heavy load.

[2, 4, 9, <empty>]
An array of size 4 with 3 elements already in it.
Load factor = 3/4 = 0.75, so we should resize the array.
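Here is a sketch of what that resizing step might look like, continuing the toy example above: every entry has to be re-hashed into the doubled array, which is what makes the occasional insert expensive.

    def load_factor(slots):
        return sum(1 for entry in slots if entry is not None) / len(slots)

    def resize(slots):
        # Double the array, then re-insert every entry: each position depends
        # on hash(key) % len(slots), so it must be recomputed from scratch.
        # (Collisions are ignored here to keep the sketch short.)
        new_slots = [None] * (len(slots) * 2)
        for entry in slots:
            if entry is not None:
                key, value = entry
                new_slots[hash(key) % len(new_slots)] = (key, value)
        return new_slots

    slots = [("a", 2), ("b", 4), ("c", 9), None]
    if load_factor(slots) >= 0.75:  # 3/4, as in the example above
        slots = resize(slots)       # an O(n) operation
    print(len(slots))  # 8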

Do remember, though, that you probably won’t have to implement a hash table yourself, since most languages ship with one in the standard library. This means you can generally assume average-case, constant-time performance from them. Still, having a better understanding of them will most certainly help you going forward as a software engineer.

Recursion is difficult

Recursion was something I came across when I first started learning to code; it was something I sort of understood but had never used in any meaningful way. When done properly, it can produce an elegant and readable solution for another person to read. It also underpins a lot of algorithms, e.g. merge sort and quicksort, so understanding it helps you immensely going forward.

Recursion also pops up very frequently in functional programming, so it is a prerequisite if you want to delve into that paradigm. I feel that I struggled more than I should have on the exercise about recursively adding up all of the numbers in a list, but it was still really useful to try to think about things in a different way, as opposed to imperatively looping over values, which still feels more natural to me.
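For reference, the exercise in question boils down to something like this (my own attempt, not the book’s solution verbatim):

    def total(numbers):
        if not numbers:  # base case: an empty list sums to 0
            return 0
        # recursive case: the first number plus the total of the rest
        return numbers[0] + total(numbers[1:])

    print(total([1, 2, 3, 4]))  # 10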

Feynman algorithm, solve any problem!

To close it out, I will leave you with my favourite part of the book. It is rather tongue in cheek and made me chuckle upon reading it; I would draw similarities to Kelsey Hightower’s No Code repository, although the Feynman algorithm predates it by a long way.

It is rather simple: there are only 3 steps, and it can help you solve absolutely any problem, regardless of difficulty.

  1. Write down the problem you face.
  2. Think really hard about that problem.
  3. Write down the solution.

Now you can be well on your way to becoming a problem-solving guru. This amused me so much that I coded up a quick and silly repository of my own for it.