The Debugger Bug

Matt Schellhas
Mar 7, 2024

Part 1 of a series of bugs, all answering “What is the worst bug you ever had to track down?”

Early in my career, I worked on a pretty basic healthcare “portal”. Companies would sign up for it and their employees would get access to a whole bunch of articles about losing weight or quitting smoking or exercising better. Insurance companies would then give them discounts since websites are cheap, doctors are expensive, and these articles would (in theory) lead to healthier employees at a lower cost. You can guess how that turned out…

The bug showed up when I was doing some full stack development. Bog standard .NET shop. MSSQL for the DB. Some C# APIs feeding a vanilla JavaScript front-end (this was a long time ago now, jQuery was cutting edge). I was adding a search bar for some table of articles. Search for “Asthma”, get 94 articles about asthma. Why use Google, when you could get worse results from a proprietary health portal provided by your employer?

I had wired up the search to our article tables (yes, the summaries just sat in the relational DB; there weren’t many of them and quality was priority number 16 for this project), but the UI didn’t display anything. Oh, right. I needed to populate the test DB. Easily done. Still not displaying. Hmm. Probably something wrong with this God-forsaken grid control. Fine, fine. Let’s make sure that the search is actually working. Open up the debugger, set a breakpoint, kick it off.

The search did come back properly. Step. Step. Step. Cool, flowing back up through the API. Open up dev tools in the UI. Oh. I see, the response format isn’t exactly how the JavaScript expected it. That was easy enough to fix. Restart the debugger. Step. Step. Step. Watch the data get formatted properly. Step. Step. Okay, there’s some results in the UI. Good.

Then I stopped the debugger, ran the whole search in real time. Worked fine. Tried a few different searches, checking for corner cases. Worked fine. Awesome. I finished off some unit tests and sent it all off for review.

The bug showed up the next day. My review got approved, and I merged the code. Then I moved on to the next task. It was adding a column to the table or some nonsense. Same process. Modify the SQL query to return the additional data. Modify the API to pass that data along to the front-end. Configure the front-end to order and format the new column. Fire up the debugger. Run.

I run a search and the UI displays 8 articles, all the same. They’re just duplicates of one another. Hmm. That’s weird. Did I screw up the test data? No. Did the table somehow duplicate the result? No, it looks like it just got 8 entries with the same data. Did I break the query somehow? No, it works fine… but it produces 8 different results, and none of them match what’s in the UI!

Okay, it must be something weird in the API. Crack open the debugger. The search query returns the right results. They get translated into DTOs correctly. I return them to the UI and… they’re suddenly 8 duplicates. WTF!

So I call over a coworker. Hey can you help me look into this? They come over. I start the UI, run the search… and it works fine. Eight different results, all on-topic for my search. WTF!!!

They laugh. I swear, this was broken a second ago when I was debugging it and… oh no.

C# was not my first programming language. I’d spent years doing C and C++ (and Pascal and PHP and Perl) before I ever got paid to write a line of code. I had run into bugs in those languages where some uninitialized memory or buffer overflow only blew up when built in debug mode. Those were bad. But there was no concurrency here, so how could this even happen in a managed language?

Aside: For those who have had the pleasure of never writing C or C++, many toolchains fill freshly allocated memory with a known pattern in debug builds (MSVC’s debug runtime does this, for example), but leave it as whatever happens to be in RAM otherwise. That leads to non-deterministic bugs that accidentally work sometimes because the memory values happen to be sane. Debug builds also tend to have different memory layouts, so a buffer overflow or bad pointer math can accidentally work sometimes because it overwrote something less vital.
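To make that aside concrete, here is a rough C++ sketch of the effect. Actually reading uninitialized memory is undefined behavior, so this only simulates it: ScratchAllocator, looks_valid and demo_stale_memory are names I made up for illustration, and the 0xCD fill byte is simply the pattern MSVC’s debug heap happens to use.

```cpp
#include <array>
#include <cassert>
#include <cstring>

// Toy allocator (hypothetical) that always hands back the same 16-byte block.
// `debug_fill` mimics a debug build stamping memory with a known pattern.
class ScratchAllocator {
public:
    explicit ScratchAllocator(bool debug_fill) : debug_fill_(debug_fill) {}

    unsigned char* allocate() {
        if (debug_fill_) {
            std::memset(block_.data(), 0xCD, block_.size());
        }
        // Without the fill, the caller sees whatever the previous user left.
        return block_.data();
    }

private:
    bool debug_fill_;
    // Starts zeroed, so every read in this sketch is well defined.
    std::array<unsigned char, 16> block_{};
};

// Buggy consumer: it should write block[0] before reading it, but never does.
bool looks_valid(const unsigned char* block) {
    return block[0] == 0;
}

void demo_stale_memory() {
    ScratchAllocator release_like(false);
    unsigned char* first = release_like.allocate();
    assert(looks_valid(first));  // accidentally works: the stale byte is 0
    first[0] = 1;                // an earlier user leaves data behind
    assert(!looks_valid(release_like.allocate()));  // same bug, now it fails

    ScratchAllocator debug_like(true);
    assert(!looks_valid(debug_like.allocate()));  // pattern fill: fails every time
}
```

The bug in looks_valid never changes. Only the contents of the memory it reads do, which is why the same code can pass in one build and fail in another.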

The next three days were a cavalcade of horrors.

Shelved the code so that my coworker could try to reproduce. Nope. Works on their machine.

Is it just this search for some reason? Nope. Different searches produce more or fewer duplicates, but always the same article.

Reverted my changes. It was like an hour of work. I could just redo it. Nope. The duplication problem still exists. Which means this issue is in main, but only I can see it?

Maybe a cache problem? I wipe out my build directories, database… anything with any state and rebuild it all. Nope. Problem remains.

Coworkers can’t help. The Internet says that it is impossible. I begin to worry that this is all an elaborate prank.

In the middle of the third day, I found the problem.

I had introduced it the day before I discovered the bug. When building the search in the first place, I used the debugger to trace the data as it flowed through the system. When doing that, I set a conditional breakpoint inside one of the DTO properties so I could see when my test record was found:

id = 42

A simple enough mistake after writing a bunch of SQL, where = is a comparison. In C# it is an assignment; the condition I meant was id == 42. Visual Studio dutifully evaluated the breakpoint condition every time something accessed that property (overwriting the actual ID found by the search), and the data binding dutifully populated the result array with 8 copies of article 42. I thought nothing of the breakpoint firing, because all of the IDs were 42. That was the problem, so of course it fired!
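For anyone who wants to see the mechanism outside a debugger, here is a small C++ sketch of the same trap. It is an analogy, not the original C# code: ArticleDto, breakpoint_condition, bind_ids and demo_duplicates are hypothetical stand-ins, and the "debugger" is just a function called on every read.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the article DTO; the real class is not shown here.
struct ArticleDto {
    int id;
};

// Stand-in for the debugger evaluating the breakpoint condition.
// The intended condition was `dto.id == 42`; this one assigns instead.
bool breakpoint_condition(ArticleDto& dto) {
    return (dto.id = 42);  // overwrites id, then the nonzero 42 converts to true
}

// Stand-in for data binding reading each DTO's id while the breakpoint is set.
std::vector<int> bind_ids(std::vector<ArticleDto>& results, bool breakpoint_armed) {
    std::vector<int> shown;
    for (ArticleDto& dto : results) {
        if (breakpoint_armed) {
            breakpoint_condition(dto);  // evaluated on every access
        }
        shown.push_back(dto.id);
    }
    return shown;
}

// Usage: with the breakpoint armed, every displayed id collapses to 42.
void demo_duplicates() {
    std::vector<ArticleDto> results{{7}, {13}, {42}, {99}};
    assert((bind_ids(results, true) == std::vector<int>{42, 42, 42, 42}));
}
```

Run bind_ids with breakpoint_armed set to false and the original IDs come back untouched, which matches what my coworker saw when nothing was attached.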

You’ll all be happy to know that Visual Studio has since fixed this issue. “is true” conditional breakpoint expressions that do not return boolean values will now produce an error message when evaluated.
