Book Club: Grokking Algorithms. 6: Dynamic programming and k-nearest neighbours

ruthmoog - Sep 13 '23 - Dev Community

Grokking Algorithms: An illustrated guide for programmers and other curious people, by Aditya Y. Bhargava


Chapter 9: Dynamic Programming

Chapter 10: K-nearest Neighbours

If you read any of these chapters, please go ahead and discuss what you thought in the comments!

📝 you can read my notes below.

📆 The next post will be on the final chapter, chapter 11. I'll aim for end of September 2023.
If you're on Storygraph and want to join me, say hi and I'll start a Buddy Read.


Dynamic Programming

Key topics covered: prioritisation, Levenshtein distance ...

Dynamic programming solves bigger problems by solving sub-problems first (not to be confused with a dynamic programming language). Always start with a grid of sub-problems × scenarios: each cell is a sub-problem, and the cells are usually filled with whatever you're trying to optimise.

The order of the rows doesn't matter, but which axis holds the rows vs the columns can. Adding another item to your grid might mean you have to recalculate at a finer granularity, e.g. sizes in 0.5 lb increments.

There is no dynamic programming solution for splitting items into fractions - you would use a greedy algorithm for that.
Dynamic programming only works when each sub-problem is discrete (doesn't depend on the other sub-problems), but sub-problems can have their own sub-problems.
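
To make the grid concrete, here's a minimal knapsack sketch in Python. The item names, weights and values are illustrative (loosely echoing the chapter's guitar/stereo/laptop example); each cell holds the best value achievable with the items considered so far at that capacity:

```python
# Minimal sketch of the dynamic programming grid for the knapsack problem.
# Rows are items, columns are capacities 0..capacity; each cell holds the
# best total value achievable so far - the quantity we're optimising.
def knapsack(items, capacity):
    """items: list of (name, weight, value) tuples; capacity: max weight."""
    grid = [[0] * (capacity + 1) for _ in items]
    for i, (name, weight, value) in enumerate(items):
        for cap in range(capacity + 1):
            best_without = grid[i - 1][cap] if i > 0 else 0
            best_with = 0
            if weight <= cap:  # the item fits: its value + best of the leftover space
                best_with = value + (grid[i - 1][cap - weight] if i > 0 else 0)
            grid[i][cap] = max(best_without, best_with)
    return grid[-1][capacity]

items = [("guitar", 1, 1500), ("stereo", 4, 3000), ("laptop", 3, 2000)]
print(knapsack(items, 4))  # 3500 - guitar + laptop beats the stereo alone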


K-nearest Neighbours

Key topics covered: feature extraction, regression, prediction, use cases, limitations, classification, normalization / normalisation, weighting, machine learning, the Pythagorean formula, OCR (optical character recognition), Naive Bayes classifier ...

The k-nearest neighbours (KNN) algorithm is simple and a good first choice if you want to classify something. It depends on feature extraction; picking the right features takes careful thought. You should choose,

  • features that correlate with what you're trying to classify or regress
  • features that don't have a bias (e.g. if you ask a group how much they like cats, you won't learn how much they like dogs)

Use the chosen features to plot items on a graph, so you can find the distance between them using the Pythagorean formula a² + b² = c², or for an x-y graph, distance = √((x₁ − x₂)² + (y₁ − y₂)²).
The same formula can be used to calculate the distance in several dimensions.
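
As a rough sketch, here's the same calculation in Python, generalised to any number of dimensions:

```python
from math import sqrt

def distance(a, b):
    """Straight-line (Euclidean) distance between two equal-length points."""
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

print(distance((1, 2), (4, 6)))                    # 5.0 - the 2D Pythagorean case
print(distance((3, 4, 4, 1, 4), (4, 3, 5, 1, 5)))  # 2.0 - e.g. five movie ratings
```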

[Image: Justin likes a movie - copyright Manning Publications, drawn by adit.io]

  • Normalisation allows you to compare ratings on the same scale, by adjusting each rater's scores so that their overall averages are comparable.

  • Weighting lets you treat particular points as more influential in your calculation. It's like giving special points an extra vote.

  • Regression is when you use the k-nearest neighbours to forecast or predict an outcome, by taking the average of the k nearest neighbours' values (see the sketch after this list).

  • Naive Bayes classifier: spam filters use this algorithm, which is similar to KNN; you train the classifier on some data (e.g. "legit email subject" = not spam / "you won $ sign up now!!1" = spam), and Naive Bayes figures out the probability that each word shows up in a spam email. Something like predicting the stock market wouldn't work well with these approaches, because there are no good features to use for regression.

  • OCR (optical character recognition) is when you take a photo of text and the computer automatically reads or digitises the text for you: 1. learn the features of text characters from a bunch of images (this is called training), then 2. when you have a new image, extract its features and see what its nearest neighbours are. OCR algorithms generally measure lines, points and curves. The same ideas are used for speech and face recognition.
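
Putting the distance formula and regression together, here's a minimal KNN regression sketch in Python; all the feature values and ratings below are invented for illustration:

```python
from math import sqrt

def distance(a, b):
    return sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# Predict a user's rating for a movie by averaging the ratings given by
# their k nearest neighbours (nearest in feature space).
def knn_predict(target, neighbours, k):
    """neighbours: list of (features, rating) pairs; returns a predicted rating."""
    nearest = sorted(neighbours, key=lambda n: distance(target, n[0]))[:k]
    return sum(rating for _, rating in nearest) / k

# Features might be (likes comedy, likes action, likes romance) on a 0-5 scale.
neighbours = [
    ((5, 1, 0), 5.0),
    ((4, 2, 1), 4.5),
    ((1, 5, 3), 2.0),
    ((2, 4, 4), 2.5),
]
print(knn_predict((5, 2, 1), neighbours, k=2))  # 4.75 - average of the two closest
```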


Key Takeaways

  1. Dynamic programming is useful when you want to optimise given a constraint (e.g. the max value of goods you can fit in a bag). Use it when,
  • the problem can be broken into smaller problems
  • the smaller problems don't depend on each other
  2. The values in the cells are (usually) what you want to optimise

  3. There's no single formula for calculating a dynamic programming solution

  4. KNN is used for classification (categorising things into groups) and regression (predicting a response).

  5. Feature extraction means converting an item into something comparable, like a number.

  6. Picking good features is important for KNN to be successful.


Discussion

  1. Can you think of any other real-world examples that might use dynamic programming to find a longest common substring? The examples in the chapter were DNA sequence alignment, word wrap in word processing, git diff, spell check, and plagiarism detection.
  2. Have you written a program using a dynamic programming formula - what did you learn when applying the formula in code? Did you use a grid?
  3. Have you worked on feature extraction? How were the features chosen? Were they the right features?