Friday, October 29, 2010
Tales of Test-Driven Development

Inspired by a talk about Clojure given by Rich Hickey at Philly Emerging Tech earlier this year, I've been toying with building a Java library of pure (immutable) data structures, starting with the Map implementation based on Phil Bagwell's Hash Tries. Yes, I know I could probably just figure out how to use them straight out of Clojure by interoperating, but that would deprive me of an interesting coding exercise.

At any rate, I hadn't really gotten around to doing this in any great detail yet, and as it turns out, I'm glad I didn't. I was fortunate to be able to attend a refactoring and test-driven development (TDD) class taught by Bob Martin this week, and one of the code examples we ran through was the "Bowling Game" of writing an algorithm to score ten frames of bowling. Prior to developing this with a TDD approach, we identified a pretty simple object-oriented design including things like Games, Frames, Rolls, TenthFrames, etc. Yet when we actually got down to it, it turned out we just needed a pretty simple algorithm built into the single Game class--ultimately a much simpler design.

Rewinding to the pure hashmaps: I had previously been thinking about how to decompose the internal datastructure for the hash tries into classes like InternalNodes and LeafNodes that would implement a common TrieNode interface, and then mark all of those things as package private so I could hide all the messy implementation details from the client behind a PureHashMap facade. Hoo boy.

Instead, over the course of a very few hours today, I instead took the following approach: first, I used TDD to develop a brain-dead simple implementation of a PureHashMap using real HashMaps as the backing store, but cloning them on modification. Didn't even make an attempt at the Hash Trie implementation. Wouldn't work well at all on large data sets, but it did allow me to develop a set of unit tests that documented the required functional behavior of a PureHashMap, and it didn't really take long at all.

Next, I did something crazy: I completely threw away the simple implementation of PureHashMap, breaking all the tests. Then I started hacking the hash trie implementation in there until I could get all the tests to pass one by one. Now, this was ugly, cut-n-pasted, massively high cyclomatic complexity code--I shudder to think of it. But it really didn't take too long to get that working either, with the tests as a guide. Finally, as any TDD afficionado would know, once I had all my tests working again, I was able to easily but mercilessly refactor until it was all cleaned up.

The end result was something far better than I could have imagined: a relatively understandable PureHashMap hash trie implementation in a single class file with 100% unit test coverage, built in less than an afternoon. That's powerful stuff. Thanks, Uncle Bob.