Tuesday, February 02, 2010

The Code Bomb, or: The Newbie with Big Ideas

Are you considering making your first big contribution to an open source project? If so, don't make this mistake.

I've been working on open source projects for a number of years now. From time to time, we receive feedback from potential new developers that goes something like this: "Your project looks great, and I'd love to help out. BUT your code is a mess. I'll help, but only if we XXX."

In this case, "XXX" is an expensive, wide-reaching enhancement to infrastructure, something like:

  • Replace the automated build scripts
  • Refactor the code in a huge way
  • Add/remove a significant library dependency
  • Rewrite the code in another language

These are very risky infrastructure enhancements. They have a known, large up-front cost; an unknown cost in bugs introduced along the way; and an unclear benefit, because the goal is typically to do exactly what the code already does, only in a better or more maintainable way.

In many cases it's very tempting to indulge in infrastructure projects like these, and as a result, I'm guilty of implementing my fair share of them myself.

But we have to say "no" most of the time; we're here to build something great, not to shave yaks. Especially since, in most cases, people who say "I'd like to help, but..." don't really want to help anyway; they don't usually stick around long enough to fix the bugs in the new infrastructure they suggested.

UPDATE: Newbie developers tend to propose big refactors partly because it's harder to read code than to write it, but also because it's easy for the rest of us to get used to crappy code over time.

In the worst case of this I've ever experienced, a developer I'll call "John" (not his real name) insisted on a major refactoring of the code before he'd work on the project, but volunteered to do the entire thing himself. He then went dark and returned months later with huge changes to the code, touching almost every file in the system.

In the open source community, we call this a "code bomb." A code bomb is a patch that's so large that no one can review it. Here's Ben Collins-Sussman on code bombs:

One of the main community "anti-patterns" we’ve talked about is people writing "code bombs". That is, what do you do when somebody shows up to an open source project with a gigantic new feature that took months to write? Who has the time to review thousands of lines of code? What if there was a bad design decision made early in the process — does it even make sense to point it out? Dropping code-bombs on communities is rarely good for the project: the team is either forced to reject it outright, or accept it and deal with a giant opaque blob that is hard to understand, change, or maintain. It moves the project decidedly in one direction without much discussion or consensus.

When a developer (typically a newbie developer) drops a code bomb, it's perfectly fine to just say:

Sorry, we can't accept this patch because it's too large; it's a code bomb!

The onus is on the contributor to break it up into smaller reviewable patches, each of which fixes a clear bug.
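
For contributors who want a rough self-check before sending a patch, here is a minimal sketch in Python that counts the files touched and the lines added or removed in a unified diff. The function names and the thresholds (400 changed lines, 10 files) are made up for illustration; they are not taken from any real project's policy, and a small line count is no guarantee that a patch is easy to review.

```python
import re

# Hypothetical review budget; every project would pick its own limits.
MAX_CHANGED_LINES = 400
MAX_FILES = 10

# Matches a unified-diff hunk header such as "@@ -12,7 +12,9 @@".
# A missing count (e.g. "@@ -3 +3 @@") means one line.
HUNK = re.compile(r"^@@ -\d+(?:,(\d+))? \+\d+(?:,(\d+))? @@")


def patch_stats(diff_text):
    """Return (files_touched, changed_lines) for a unified diff."""
    files = 0
    changed = 0
    old_left = 0  # old-file lines still expected in the current hunk
    new_left = 0  # new-file lines still expected in the current hunk
    for line in diff_text.splitlines():
        if old_left > 0 or new_left > 0:
            # Inside a hunk: classify the line by its first character.
            if line.startswith("+"):
                new_left -= 1
                changed += 1
            elif line.startswith("-"):
                old_left -= 1
                changed += 1
            elif line.startswith("\\"):
                pass  # "\ No newline at end of file" marker
            else:
                old_left -= 1  # context line, present in both files
                new_left -= 1
            continue
        match = HUNK.match(line)
        if match:
            old_left = int(match.group(1) or 1)
            new_left = int(match.group(2) or 1)
        elif line.startswith("+++ "):
            files += 1  # one "+++ " header per file in the patch
    return files, changed


def is_code_bomb(diff_text, max_changed_lines=MAX_CHANGED_LINES,
                 max_files=MAX_FILES):
    """True if the patch exceeds the (hypothetical) review budget."""
    files, changed = patch_stats(diff_text)
    return changed > max_changed_lines or files > max_files
```

You could feed this the text output of `git diff` for your branch. Tracking the hunk line counts, rather than just looking at prefixes, keeps a removed line whose content begins with "-- " from being mistaken for a file header.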

In our case, we had to just reject "John's" patch, wholesale. Months of work were simply wasted! Don't let this happen to you!

UPDATE: Good commentary on reddit