OMG, You Broke Everything!!

As a new hire, you will mess up. But don’t sweat it; it's to be expected. You will learn more from a mistake because you will be forced to find a solution when someone points it out.

Indeed, your success as a new hire isn’t a question of how few mistakes you make, but of what you do to resolve those mistakes and prevent them from happening again.

I’ve learned a great deal about how to avoid common regressions or build breaks in the Office code base. It had been about 6 months since I’d caused the build lab to trip on one of my changes (and in the meantime I’d fixed numerous other build problems).

Of course, there’s always a new level of problem that you can cause. In fact, the longer you work, the more obscure the problem you cause will be (since you're so well versed in preventing more obvious problems).

Last week, I caused a problem.

The Problem

Two weeks ago, I checked in a change that looked like the following:

class Helper
#ifndef Project2
      : public INewInterface
#endif
{
      ...
};

The idea is that INewInterface is not defined in Project2, so I #ifndef it out. We build two separate lib binaries, one that is linked by Project1 and one for Project2. There are several other places that do this exact same thing with INewInterface, so at first glance, this should work.

It compiled, it linked, I shipped it.

The problem that I didn’t see is that the definition for the Helper class lives in a separate shared library, which means that we get one binary that both Project1 and Project2 link against.

Now the problem is that Project2 expects the size of the Helper class to be one thing, but in the compiled code it’s another. Miraculously (or disastrously), this links without error (for reasons I’m still not sure of).
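To make the size mismatch concrete, here is a minimal sketch of the two views of Helper placed side by side in one file. The member names, the OnEvent method, and the namespace names are all hypothetical; the real Helper and INewInterface are not shown in this post. A likely reason the link succeeds is that linkers match symbol names, and a class's layout is not part of those names, so nothing compares the two definitions.

```cpp
#include <cstddef>

// Hypothetical stand-in for the interface named in the post.
struct INewInterface {
    virtual ~INewInterface() = default;
    virtual void OnEvent() = 0;
};

// What the shared library compiles: Project2 is NOT defined, so the base
// class is present and every Helper carries a vtable pointer.
namespace shared_lib_view {
struct Helper : public INewInterface {
    void OnEvent() override {}
    int width = 0;   // hypothetical member
    int height = 0;  // hypothetical member
};
}  // namespace shared_lib_view

// What Project2 compiles from the same header: Project2 IS defined, so the
// base class (and its vtable pointer) is gone.
namespace project2_view {
struct Helper {
    int width = 0;   // same hypothetical members, now at different offsets
    int height = 0;
};
}  // namespace project2_view

constexpr std::size_t kSizeInSharedLib = sizeof(shared_lib_view::Helper);
constexpr std::size_t kSizeInProject2 = sizeof(project2_view::Helper);

// True only if both sides agree on how big a Helper is.
constexpr bool LayoutsAgree() { return kSizeInSharedLib == kSizeInProject2; }

// Fails to compile if the two views ever happened to agree.
static_assert(!LayoutsAgree(),
              "the two views of Helper disagree on sizeof(Helper)");
```

The sketch has no main; compiling it is the check. In the real build the two definitions share one name and live in different translation units, so no compiler ever sees both at once and no such assertion can fire.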

The Result

Last Friday a bunch of bleary-eyed people from Project2 came into my office bearing a Post-it note with a number on it: my change list number. It had taken them nearly two weeks to track down the problem that I caused. "TWO WEEKS!!" their eyes said.

The bug had some weird behavior: you’d step across a specific function call and all of a sudden all the member variables of Helper would get messed up. It was as if the fundamental laws of the coding universe had been thrown out the window.

In the end, someone went through all the recent changes and came across my code.

The Solution

So what did I do now?

Aside from dying a little inside since I had killed the productivity of practically an entire team, I went through my build process looking for things that could be improved:

- Headers built into shared libraries are in the same folder as those that aren’t; we should separate them out.

- Change our “build verification test” script to run changes against Project2 by default.

- Brief our team on the dangers of changing this shared code and its headers.

- Talked to those involved and asked how they traced the bug back to my change.

- Wrote a blog post detailing my harrowing experience so others could learn as well.

The end result is that our changes are now less likely to cause such a regression, and future devs who are less aware of the project dependencies are less likely to make the same mistake I did.
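One more guard that could sit alongside the steps above, offered as a sketch and not as something our team actually did: have the shared library report the sizeof(Helper) it was built with, and have each client compare that against its own view of the header. Every name here (HelperSizeAsBuilt, LayoutMatchesLibrary, the namespaces, the value member) is hypothetical.

```cpp
#include <cstddef>

// Hypothetical stand-in for the interface named in the post.
struct INewInterface {
    virtual ~INewInterface() = default;
};

namespace library {
// Helper as the shared library compiles it (Project2 not defined).
struct Helper : public INewInterface {
    int value = 0;  // hypothetical member
};

// Reports the size the library was built with. In a real build this would
// be a non-inline function defined in the library's own .cpp file, so that
// it always reflects the library's view; it is constexpr here only so the
// whole sketch fits in one file.
constexpr std::size_t HelperSizeAsBuilt() { return sizeof(Helper); }
}  // namespace library

// The header as Project1 sees it: base class present.
namespace project1_view {
struct Helper : public INewInterface {
    int value = 0;
};
}  // namespace project1_view

// The header as Project2 sees it: base class compiled out.
namespace project2_view {
struct Helper {
    int value = 0;
};
}  // namespace project2_view

// Each client would run this check (for example in a startup test) with
// its own view of Helper.
template <typename ClientHelper>
constexpr bool LayoutMatchesLibrary() {
    return sizeof(ClientHelper) == library::HelperSizeAsBuilt();
}
```

With these definitions, LayoutMatchesLibrary<project1_view::Helper>() is true and LayoutMatchesLibrary<project2_view::Helper>() is false, so the Project2 build would flag the mismatch immediately. A size comparison only catches layout changes that alter sizeof; it is a cheap tripwire, not a full ODR check.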

Things_learned_from_messing_up++;

Posted: Tuesday, January 22, 2008 1:33 AM by Chris Becker

Comments

Andy said:

Great post Chris, sharing best practices and lessons learned via the blog from the team is great! More like this one, though hopefully different bugs :)

# January 21, 2008 11:10 PM

wit said:

"...this links without error (for reasons I’m still not sure of)"

Could it be related to the following post from the VC++ team?

http://blogs.msdn.com/vcblog/archive/2007/05/17/diagnosing-hidden-odr-violations-in-visual-c-and-fixing-lnk2022.aspx

Granted, I can't see a real explanation there either, only "The linker can diagnose this in limited scenarios, especially under IJW.  If you’ve gotten the boggling “inconsistent metadata” error (LNK2022), you’ve probably fallen victim to this issue." - but it might be helpful to see another example of (probably) the same problem: ODR violation. (difference in inheritance here, in alignment there)

# February 4, 2008 5:41 AM

Chris Becker said:

Neat, thanks for the link!

# February 4, 2008 2:20 PM