A Programmer's Guide To Effective Debugging

As a software developer, I can guarantee you one thing for sure: you are going to spend a great deal of time debugging code.

There are certain constants in life which are unavoidable: death, taxes, and programmers creating bugs.

Since so much of your time will be spent debugging code, it’s probably a good idea for a developer to be good at debugging, don’t ya think?

Unfortunately, many developers, even highly experienced ones, tend to, well… suck at it.

There are plenty of developers who can whip through new features and sling code around like nobody’s business, but who cleans up the mess of bugs they leave behind?

It’s one thing to know how to write good code; it’s another thing to know how to debug the ugliest code you’ve ever seen in your life that was written by the legendary Bob who built the whole first version of the application in 48 hours in his basement, but was kind of an “odd” fellow.

Fortunately, debugging, like any other skill, is something that can be learned.

If you apply the right techniques and practice, you can become great at it.

Who knows? You might even enjoy it.

The key to debugging is realizing that it’s all about mindset.

It’s about taking a systematic approach to the problem: not rushing it, and not expecting you can just find the problem, get in, and get out.

It’s about staying calm and collected: attacking the problem from a logical and analytical perspective instead of an emotional one.

In this chapter, I’m going to lay out a systematic approach to debugging, which will help you to avoid that dreaded debugger mindset and take your debugging skills to the next level.

What Is Debugging?

Before we dive deep, let’s go shallow.

What exactly is debugging?

It seems pretty obvious, right?

You open up the debugger and you “debug” problems with the code.

Ah, but that is where you are wrong.

Debugging has nothing to do with the debugger.

Debugging has everything to do with finding the source of a problem in a code base, identifying the possible causes, testing out hypotheses until the ultimate root cause is found, and then eventually eliminating that cause and ensuring that it will never happen again.

Ok, I suppose we could call that fixing bugs. Semantics.

The point is, debugging is more than fiddling around in a debugger and changing code until it seems to work.

First Rule of Debugging: Don’t Use the Debugger

Ah, what’s this you say?

A new bug for me to fix?

Oh, this is a hairy one?

Have no fear, sir. I will unleash the full power of my mental arsenal on this unholy terror.

With that mindset, you, the programmer, sit down at your desk.

You fire up the debugger.

Carefully you step through the code.

Time seems to blur, minutes turn into hours, hours into weeks.

You are an old man sitting at the keyboard, still in the same debugging session, but somehow you are “closer.”

Your children have all grown. Your wife has left you.

The only thing that remains is… the bug.

The first thing most programmers do when they want to debug an issue in the code is to fire up the good old debugger and start looking around.

Wrong.

Don’t do this.

The debugger should be your last resort.

When you fire up the debugger immediately, you are effectively saying, “I don’t have any idea what is causing the issue, but I’m just going to look around and see.”

It’s like when your car breaks down and you don’t know jack shit about cars, so you open up the hood and look for something wrong.

What are you looking for?

You don’t even know.

Don’t get me wrong.

The debugger is a wonderful and powerful tool.

Used properly, the debugger can help you solve all kinds of problems and can help you see what happens when your code is running.

It’s not, however, the place to start, and many bugs can be solved without ever touching the debugger.

You see, just like Facebook or funny YouTube cat videos, the debugger has a way of sucking you in.

Reproduce the Error

So, if you don’t simply fire up the debugger to debug a problem, what do you do?

Well, I’m glad you asked.

The first thing any sane person should do is to reproduce the bug to make sure that it is actually a bug and that you will be able to debug it.

One-hundred percent of problems that can’t be reproduced can’t be debugged.

So, if you can’t reproduce the problem, there ain’t no point in debugging it. Ya hear me?

Not only can you not debug a problem that can’t be reproduced, but also, even if you did fix it, you can’t verify it was fixed.

So, the very first thing you should do when you are trying to debug a bug is to make sure you can reproduce the bug yourself.

If you can’t, go and get help.

If a tester filed the bug, get them to reproduce it for you.

If the bug is intermittent and can’t be reliably reproduced, this means that you do not know the circumstances required to reproduce the problem.

There is no such thing as an intermittent problem.

If it is a problem, it can be reproduced; you just have to know how.

Side note on intermittent problems:

Ok, so your boss is demanding that you fix the problem.

They’ve seen it in production. Customers have seen it. It’s definitely a problem.

The “I can’t reproduce it” pushback is not working; they aren’t buying it.

What do you do?

You still can’t debug a problem that you can’t reproduce.

But what you can do is gather more evidence.

Insert logging statements in the code.

Gather as many details as possible about when the problem happens and under what conditions it happens.

Artificially recreate the environment and circumstances if you can.

Do not be tempted to throw in “fixes” for the problem that you can’t recreate.

If you don’t understand the problem enough to recreate it, you have a very, very low chance of accidentally fixing it by a guess, and you will have an extremely difficult time knowing if your fix even worked.

Find a way to reproduce the problem, even if it is only reproducible in production.
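To make "insert logging statements" a bit more concrete, here is a minimal sketch in Python using the standard `logging` module. The function `apply_discount`, its parameters, and the "negative total" symptom are all hypothetical stand-ins for whatever code you suspect; the point is only to record the inputs and the surrounding conditions every time the suspicious path runs.

```python
import logging

logger = logging.getLogger("orders")


def apply_discount(total, discount_percent, customer_id):
    """Hypothetical function suspected of misbehaving in production."""
    # Record every input so a bad result can later be tied to the
    # conditions that produced it.
    logger.debug(
        "apply_discount called: total=%r discount_percent=%r customer_id=%r",
        total, discount_percent, customer_id,
    )
    result = total - total * discount_percent / 100
    if result < 0:
        # The (assumed) symptom we are hunting: log it with full context.
        logger.warning(
            "negative total %r for customer %r (total=%r, discount_percent=%r)",
            result, customer_id, total, discount_percent,
        )
    return result
```

Once the logs show which inputs line up with the bad behavior, you have the circumstances you need to reproduce the problem on purpose.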

Sit and Think

After you can reproduce your issue, the next step is a step most software developers skip because they are so eager to solve the problem, but this step is crucial.

It’s a really simple step.

Just sit and think.

Yes, that’s right.

Think about the problem and what the possible causes could be.

Think about how the system works and what might bring about the odd behavior you’re seeing.

You are going to be in a rush to jump into the code and into the debugger and start “looking at things,” but before you start looking at things, it’s important to know what you are looking for and what things to look at.

You’ll likely come up with a few ideas or hypotheses about what might be causing the issues.

If you don’t, be patient. Keep sitting and thinking.

Stand and walk around if it helps, but before you move on, you should at least have a few ideas that you want to test.

If you absolutely can’t come up with anything, continue to resist firing up the debugger, and instead take a browse through the source code and see if you can gather a few more clues about how the system is supposed to work.

You should have at least two or three good hypotheses you can test before you move on from this step.

Test Your Hypotheses

Ok, so you’ve got some good hypotheses, right?

The flux-capacitor is connected to the thingamabob, so if the voltage coming out of the whozitswatt is below grade… THE THINGAMABOB MUST BE CONFIGURED INCORRECTLY!

Err… something like that.

Ok, let’s fire up the debugger and test our hypotheses! Yeah, man, let’s do it!

No! Wrong.

Hold up there, young buck.

We don’t need the debugger just yet.

Wait, what? How am I going to test my hypotheses if I can’t use the debugger, you ask?

Unit tests.

Yes, that’s right: unit tests.

Try to write a unit test to test your hypotheses.

If you think some part of the system isn’t working correctly, write a unit test that you think will exploit the issue.

If you are right and you’ve found the problem, you can fix it right then and there, and now you will have a unit test in place to verify the fix and ensure it never happens again.

(Still make sure you try to reproduce the actual bug, though, before you call it fixed.)

If you are wrong and the unit test you write passes as expected, you’ve just made the system a little more robust by adding another unit test to the project, and you’ve disproved one of your hypotheses.

Think of it as ratcheting down the problem space.

Every time you write a unit test and it passes, you are eliminating possibilities. You are traversing through your debugging journey by locking and closing doors behind you as soon as you find out they are dead-ends.

If you’ve ever been lost for hours or days in a debugging session, you should immediately realize how valuable this is.

One reason the debugger is so bad is that it can encourage us to revisit the same wrong corridors over and over again as we check and recheck our assumptions, either forgetting what we already looked for or not trusting that we looked hard enough.

A unit test is like climbing a mountain and putting an anchor in place that makes sure you can’t fall too far backwards.

Writing unit tests to test your hypotheses will also ensure that you aren’t haphazardly trying things and looking around.

You have to have a specific assumption you are testing when you write a unit test in order to help debug a problem.

Now, I’m a realist.

I’m pragmatic.

I know that sometimes it will be extremely difficult or impossible to write a unit test to test a hypothesis.

In this case, it’s ok to fire up the debugger, but only if you obey this one rule:

Have a specific purpose for doing it.

Know exactly what you are looking for and what you are checking when you use the debugger.

Don’t merely go in there to look around.

I know it may seem like I’m being a bit anal and pedantic about this whole thing but trust me, there is a reason for it.

I want you to become a skilled debugger, and you are only going to get that way by being deliberate about how you debug.
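Here is roughly what turning hypotheses into unit tests can look like, sketched in Python with the standard `unittest` module. The function `parse_quantity` and both hypotheses are made up for illustration; swap in whatever part of your system is under suspicion.

```python
import unittest


def parse_quantity(text):
    """Hypothetical function under suspicion: turns user input into an int."""
    return int(text.strip())


class ParseQuantityHypothesisTest(unittest.TestCase):
    # Hypothesis 1: the bug appears when the input has surrounding whitespace.
    def test_input_with_surrounding_whitespace(self):
        self.assertEqual(parse_quantity("  42\n"), 42)

    # Hypothesis 2: an empty field silently becomes 0 instead of being rejected.
    def test_empty_input_is_rejected(self):
        with self.assertRaises(ValueError):
            parse_quantity("")
```

Run it with `python -m unittest`. In this sketch both tests pass, so both hypotheses are ruled out and those two doors stay closed. Had one failed, you would have a small, repeatable reproduction to fix against.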

Check Your Assumptions

Most of the time, your hypotheses are not going to pan out.

That’s just life.

If that’s the case, the next best thing you can do is to check your assumptions about how things are working.

We typically assume that code is working a certain way or that some input or output must be some value.

Often we think, “Well, this can’t possibly be happening. I’m looking at the code right here. There is no way it could be producing this output.”

Often, we are wrong.

It happens to the best of us.

The best thing you can do with these assumptions is to check them.

And what’s the best way to check them?

Yes, that’s right. More unit tests.

Write some unit tests that check obvious things which “have to be working” along the workflow of the problem you are trying to debug.

Most of these tests should easily pass, and you’ll say, “Duh.”

But, every once in a while, you’ll write a unit test to test some obvious assumption and the results will shock you.

Remember, if the answer to your problem was obvious, it wouldn’t be a problem at all.

Once again, the pragmatist side of me has to tell you that, yes, it’s ok to open up the debugger to check your assumptions as well.

But only after you’ve tried to check the assumptions using unit tests first.

Again, it’s like climbing that mountain and putting in anchors along the way.

Avoid the debugger if you can, use it if you must, but, once again, only to validate or invalidate specific assumptions you have already formed.

Divide and Conquer

I remember working on a really hairy bug with a printer incorrectly interpreting a print file written in the PostScript printing language.

I tried everything I could think of to debug the problem.

I tested all kinds of hypotheses.

Nothing panned out.

It seemed like the bug was some kind of combination of multiple commands in the print file, and I had no idea which ones they were.

So, what did I do?

Well, I cut the print file in half.

The bug was still there.

So, I cut it in half again.

It disappeared this time.

I tried the other half. Back again.

I kept hacking away at big chunks of the print file until I got the entire file down from several thousand lines of code to just five.

The five lines of code that, in that order, produced the bug.

Easy peasy.

Sometimes when you get stuck debugging, what you need to do is figure out a way to cut the problem in half, or take as big a chunk out of it as possible.

Depending on the problem, this could look very different, but try to think of ways you can eliminate a large amount of code or remove a large amount of the system or variables and still reproduce the bug.

See if you can come up with tests which completely rule out parts of the system as being responsible for the error.

Then do it again… and again.

If you keep hacking away, you’ll likely find the critical components required to create the error, and then the problem can become relatively easy to solve.
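The cutting-in-half routine can also be automated when you have a repeatable check for the bug. What follows is a simplified Python sketch of that idea, not the procedure used on the PostScript file, which was done by hand. The `reproduces` function is a hypothetical stand-in for "send the file to the printer and look at the output", and the command names are invented.

```python
def minimize(lines, reproduces):
    """Shrink `lines` while `reproduces(lines)` stays True.

    Tries removing chunks, starting with halves and moving to smaller
    chunks, and keeps any removal that still triggers the bug.
    """
    assert reproduces(lines), "the full input must reproduce the bug"
    chunk = max(len(lines) // 2, 1)
    while chunk >= 1:
        start = 0
        removed_any = False
        while start < len(lines):
            candidate = lines[:start] + lines[start + chunk:]
            if candidate and reproduces(candidate):
                lines = candidate   # chunk was irrelevant: drop it
                removed_any = True
            else:
                start += chunk      # chunk is needed: keep it, move on
        if not removed_any:
            chunk //= 2             # nothing left to drop at this size
    return lines


def reproduces(lines):
    # Hypothetical check: here the bug needs both commands to be present.
    return "setdash" in lines and "stroke" in lines


print_file = [f"cmd{i}" for i in range(1000)]
print_file[137] = "setdash"
print_file[842] = "stroke"
print(minimize(print_file, reproduces))  # ['setdash', 'stroke']
```

Lines are only ever deleted, never reordered, so whatever survives is still in its original order. That matters when, as with the print file, the order of the commands is part of the bug.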

If You Fix It, Understand Why

I’m going to give you one final piece of advice about debugging, although I’m sure I could write a whole book on the subject.

If you fix a problem, understand why what you did fixed it.

If you don’t understand why what you did fixed the problem, you are not done yet.

You may have inadvertently caused a different problem, or, very likely, you haven’t fixed your original problem.

Problems don’t go away on their own.

If you didn’t fix the problem, I can guarantee you it’s not fixed. It’s just hiding.

But if you did fix the problem, don’t stop there. Explore a little deeper, and make sure you understand exactly what was going on that caused the problem in the first place and how what you did fixed it.

Too many software developers debug a problem by twiddling bits, the code apparently starts working, and they assume it is fixed without even knowing why.

This is a dangerous habit for many reasons.

As I mentioned above, when you randomly tweak things in the system and change bits of code here and there, you could be causing all kinds of other problems which you aren’t aware of.

But, perhaps more than that, you are training yourself to be a shitty debugger.

You are developing the habit of messing with things until they work. No technique, no rigor.

You may get lucky sometimes, but you won’t have a repeatable process or reliable skillset for debugging.

Not only should you understand what broke, why, and how you fixed it, but also you should verify the fix.

I know it seems like common knowledge, but I can’t tell you how much time is wasted by programmers “fixing a problem,” assuming the fix worked, and passing the code to QA only for QA to reproduce the problem and have it go back to the developer who has to start over at square one again.

It’s a huge waste of time that can be prevented by taking an extra five minutes to verify that what you fixed is actually fixed.

In fact, don’t just verify the fix; write a regression test for the problem so that it never happens again.

If you truly understand the problem you fixed, you should be able to write a unit test that exploits the issue, and then your fix should make that unit test pass.

Finally, look for other instances of this same class of bug.

Bugs tend to hang out together.

If you found something wrong with one assumption you made about the system or some incorrectly coded component, it’s very likely that there are other issues which are also caused by that same problem.

Again, this is why it is so critical that you understand what the real problem was and why your solution fixed it.

If you know what happened and why, you can quickly figure out if there are likely to be other issues caused by the same underlying problem.
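A regression test for a fix might look something like this in Python. The `average` function, the empty-input crash, and the choice to return `0.0` are all assumptions invented for the example; the shape is what matters: one test that fails without the fix, and one that confirms the fix didn't break what already worked.

```python
import unittest


def average(values):
    """Hypothetical function after the fix.

    The (assumed) original bug was a ZeroDivisionError on empty input.
    """
    if not values:  # the fix: handle the empty case explicitly
        return 0.0
    return sum(values) / len(values)


class AverageRegressionTest(unittest.TestCase):
    def test_empty_input_does_not_crash(self):
        # Reproduces the original bug report; fails without the fix.
        self.assertEqual(average([]), 0.0)

    def test_normal_input_is_unchanged(self):
        # Guards against the fix breaking behavior that already worked.
        self.assertEqual(average([2, 4, 6]), 4.0)
```

If you can't explain why the first test failed before your change and passes after it, you are not done yet.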

Art and Science

Remember, debugging, like software development, is part art and part science.

You can only get good at debugging by practicing.

But practicing is not enough. You have to specifically, systematically debug, not just play around in the debugger.

Hopefully, I’ve given you a good overview of how to do that; now the rest is up to you.


John Sonmez
