Debugging like Hawks and Tortoises

dev
debugging
ml
hawks
bugs
tortoises
Author

Patrick Deutschmann

Published

July 7, 2024

Debugging is a critical skill in programming. And yet, nobody ever explicitly taught me how to do it. The internet is full of tutorials about programming, but there are things you need to pick up as you go. Debugging is one of them. It’s to programming what seasoning is to cooking: essential, yet hard to formalise. After some time, you develop a feeling for it.

Debugging (and seasoning) are very individual tasks. What you do depends on the situation. Still, there are two patterns I’ve found myself using a lot. I call them hawk debugging and tortoise debugging.

The core of debugging is not, as the name might suggest, about removing bugs from your code. It’s about finding them like a needle in a haystack. That’s what my patterns are about.

🦅 Hawk debugging is for the best case. You use it when you already have a hunch about where the issue could be. Like a hawk on its prey, you swoop down on a specific function or class to check your assumptions. Is some parameter not what it’s supposed to be? Has this function not been called? If it works, you feel like a genius. If it doesn’t, you have to try again. At some point, you might run out of hunches. So, hawk debugging has its limits. It’s fast if it works, but it’s not guaranteed to.

🐢 Tortoise debugging to the rescue. A tortoise makes progress through persistence and thoroughness, and so do you. You’re not trying to find the problem directly anymore. You’re slowly isolating it. When hawk-debugging, you’re making educated guesses about which parts of the haystack to look at. As a tortoise, you’re chipping away at that haystack by reducing the amount of code involved. Do you think the problem is related to component A? Make sure it is. Take away all other building blocks and try reproducing the issue only with A. This won’t tell you the exact source of the problem, but it’s progress. The problem is related to component A? Dive into it. Which classes and functions inside it can you chop off and still see the issue? Next to finding the faulty piece of code, there’s another way of isolating the issue in tortoise debugging: time travelling. We’ll get to that later1.

Let’s look at some examples of hawk and tortoise debugging in action. Say you see a crash in your REST API. The consumer sends some payload, and your service returns 500. When you see the stack trace, you immediately go, “Oh damn, yep, okay.” It’s an edge case you hadn’t accounted for. So you come in like a hawk, attach a debugger to what you expect to be the offending line, and fire off the request. Sure enough, you find everything as you expect it. You quickly draft a fix PR and send it off for review. Happy ending; the hawk saved the day. You’ve reached hacker level 3000. Go ask your boss for a raise.

The next day, another issue comes around. The machine learning model you have been successfully retraining for ages suddenly gives bad results. (Yes, you’re in machine learning because you hate determinism and happiness.) No problem, unleash the hawk. If the model is bad, perhaps the training loop is faulty. Stoop, nope. The training loop looks good. Maybe it’s an issue with input feature X? You remember someone recently changed something there. You check, but it’s a miss. Perhaps the raw data? No again. Slowly, you’re getting worn out and discouraged. In situations like this, you usually think, “Why the f*** isn’t this working?” It seems like it really should, right?

No need to despair. Time for the tortoise. Let’s start reducing the size of your haystack. Usually, you would slowly isolate the problem to a specific part of your code. In the case of a machine learning model, you could try finding a faulty part of the pipeline. This will work well if you have intermediate results to compare to. Say you know precisely what the model inputs are supposed to look like. Then, you can compare the status quo to see if everything is fine. You might not have this information, though. But you do have one excellent clue: It used to work before. So, something must have changed. This information allows you to switch to the debugging dimension I mentioned before: time.

Before, you tried to find what breaks your model in the current version of the code. Let’s try finding when something changed that broke your model. For this, you need version control, which means git for most people. Prepare a minimal test case to confirm if your pipeline is broken with a certain revision. Perhaps you can already tell whether that’s the case after one training epoch. So put that in a little script and dispatch my favourite last-resort debugging tool of all time: git-bisect. Git bisect uses binary search to determine which revision introduced an issue2. If you know that build 160 was still good, but 180 is bad, it’ll first test 170. It turns out that 170 is still good, so the issue must have been introduced later. So we’re testing 175. Two more iterations of this, and we find that the problem was introduced in build 174. Now that’s a haystack of manageable size. By looking at the changes in that build, you should be able to figure out the root issue.

Which is better, then, hawks or tortoises? There’s no clear winner. As a hawk, you let your intuition guide you to the issue. As a tortoise, you use slow, detailed exploration to circle it. How you use both techniques is a question of personal preference. I usually go for some stoops of hawk debugging first. When an issue turns out to be thorny, I switch to tortoise mode. Or the other way around: I start reducing the problem space (“haystack”) to something more manageable with tortoise debugging. Then, like a hawk, I swoop down on specific parts that look like potential culprits.

There’s also a hybrid approach. Do you remember how we ensured component A was to blame in tortoise debugging? Why did we look at component A? Well, that’s where some hawk-style informed judgment came in.

Another aspect to consider is the emotional impact of each method. Maintaining a positive mindset and avoiding frustration is incredibly important. Hawk debugging can be exhausting. Everything you check that turns out not to be the source of the issue feels like a failure. Running out of ideas is disheartening. In contrast, tortoise debugging offers emotional reassurance: you keep making progress. No, you haven’t gone mad–it’s just a very obscure problem.

In principle, tortoise debugging is guaranteed to be successful. That is, of course, barring horrible things like stochasticity, side effects, and race conditions. But that’s a topic for another blog post.

🐢🐞🦅

Footnotes

  1. See what I did there?↩︎

  2. For a more in-depth explanation of git bisect, check out this Stack Overflow answer.↩︎