Monday, August 31, 2009

After a fresh checkout, a number of tests started failing on my laptop. This is new (within the past week or so). The problem is actually fairly tiny, but is a good lesson on testing in the presence of environmental assumptions.

Apparently there are customizations which we can apply to our computer's "en-US" mapping, like choosing 24-hour clocks or military dates. Of course, I had done so when I first got my computer because I like military style date/time strings. It never mattered at all until this morning.

So look at this line:
CultureInfo culture = new CultureInfo("en-US");

When we call it, we don't get the unmodified stock version of en-US (whatever the heck that might be) but our customized one. This is generally good when we're working on our own laptop, because we get the version we've configured. In unit tests, though, it's not so good. The tester's machine was not set up quite like my machine, and this caused the failures.
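Here is a small sketch of the effect, with the user customization simulated in code rather than through the Regional Settings dialog. The one-argument CultureInfo constructor honors the user's overrides; the two-argument form lets you ask for the stock culture by passing false for its useUserOverride parameter.

```csharp
using System;
using System.Globalization;

class CultureDemo
{
    static void Main()
    {
        DateTime date = new DateTime(2009, 1, 12);

        // Second argument false: ignore the user's Regional Settings tweaks.
        CultureInfo stock = new CultureInfo("en-US", false);

        // Simulate a user who customized en-US to use military-style dates.
        CultureInfo customized = new CultureInfo("en-US");
        customized.DateTimeFormat.ShortDatePattern = "dd MMM yyyy";

        Console.WriteLine(date.ToString("d", stock));      // typically 1/12/2009
        Console.WriteLine(date.ToString("d", customized)); // 12 Jan 2009
    }
}
```

Two machines running the same test get different strings from the same DateTime, which is exactly how the failures sneak in.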

As a side note, this is evil:

Assert.That(expected == actual);

The problem is that it tells me "expected true, but received false." That is hardly enough information for me to realize that the issue was in date formatting. Instead, something like:

Assert.That(actual, Is.EqualTo(expected));

will tell me that it expected "January 12, 2009" but received "12 Jan 2009". That is far better than the former error message. Write your tests to isolate and explain errors; typing speed is far less important.


That means that we really do need to be more specific about formats because we are indirectly or accidentally testing locale-specific date-to-string conversions. These tests are indirectly testing the CultureInfo configuration of the machine on which they run.
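One way to pin the conversion down, so a test no longer depends on whichever CultureInfo the machine happens to have, is to pass both an explicit format string and an explicit culture. A minimal sketch:

```csharp
using System;
using System.Globalization;

class ExplicitFormatDemo
{
    static void Main()
    {
        DateTime date = new DateTime(2009, 1, 12);

        // Machine-dependent: uses the current culture, including any
        // user customizations. Output varies from laptop to laptop.
        string fragile = date.ToShortDateString();
        Console.WriteLine(fragile);

        // Machine-independent: format and culture are both specified,
        // so every machine produces the same string.
        string pinned = date.ToString("MMMM d, yyyy", CultureInfo.InvariantCulture);
        Console.WriteLine(pinned); // January 12, 2009
    }
}
```

A test asserting against the pinned string will pass on my customized laptop and on the stock build machine alike.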

In the test cases I'm looking over, the code should probably not be converting dates to strings, but the system under test is built that way. Perhaps the failing tests are just showing a weakness in the design. Note that it is easier to compare DateTime instances than to convert them to strings first. Regardless, dates & times are being strung together and the tests can fail on other people's machines.

What to do?
  • I could configure the tests to require my formatting, but that only works for me, not you.
  • I can certainly reconfigure my machine to use the stock formatting, and that makes the tests pass for me but not the next guy who likes "01 Jan" better than "Jan 01". It leaves the sloppiness in the tests.
  • I can take the time to make the tests specify their expected formats. This is relatively easy, as most tests are not written in such a fragile manner. It leaves the sloppiness in the code, though.
  • I can mock the CultureInfo with a nice factory, but that may be rendering the parser tests moot. They seem to be specifying that en-US does dates a predictable way (which is overspecifying the string, or underspecifying the tests depending on how you squint).
  • I can change the system under test to not string-convert dates and times. That will just push that into the UI code-behind so it might be a bit of work. It would kill the sloppiness entirely, though.
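The last option leans on the fact that comparing DateTime values never touches CultureInfo at all. A sketch, with a hypothetical NextBusinessDay method standing in for the system under test:

```csharp
using System;

class DateComparisonDemo
{
    // Hypothetical system-under-test method: skip weekends.
    public static DateTime NextBusinessDay(DateTime date)
    {
        do { date = date.AddDays(1); }
        while (date.DayOfWeek == DayOfWeek.Saturday
            || date.DayOfWeek == DayOfWeek.Sunday);
        return date;
    }

    static void Main()
    {
        // Friday, January 9, 2009; the next business day is Monday the 12th.
        DateTime result = NextBusinessDay(new DateTime(2009, 1, 9));

        // No strings, no formats, no culture: just value equality.
        Console.WriteLine(result == new DateTime(2009, 1, 12)); // True
    }
}
```

A test written this way cannot be broken by anyone's Regional Settings, which is the whole point.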


Now the interesting questions are what you would do, and what your development manager would like you to do, and what the Customer would like you to do. Discuss among yourselves.

Agile is a Management Trick

I'm planning four blogs this month to be part of a suite of "tricks". This little blog will act as the placeholder for links to all four of them.

  • Agile is a management trick to exert more control over development.

  • Agile is a customer trick to make development do what they want.

  • Agile is a ploy by QA to take control of the development process.

  • Agile is a programmer trick to manage their workload.



Historically, none is strictly true. But I think they're all true. I will welcome your feedback, as always.

Saturday, August 29, 2009

Long and Short of Naming

The point of good naming is not to make the names long but to make them clear and obvious. 

A coworker showed me some code where the variable names were descriptive, but rather similar to each other in shape and containing some of the same words. He then showed me the same code with the variable names replaced with X, Y, and Z to illustrate that (even in this case) shorter names were superior. He was right.

When I'm looking at code, I need to see its structure. When long identifiers obscure the structure, I will struggle.

I also need to see all the ways a given variable is used in the function. If I can barely tell the difference between variable names (i.e., I have to READ them all to tell them apart) then it's more work.

Is a short, distinctive, nonsensical name better than a long hard-to-distinguish name? Can that be true? In the context of a passage of code (not just an isolated name) it certainly can be true.  A short, distinct, and clear name is even better.

Good naming has never been about making names long. It has always been about making them clear.

Typically overlong names are a result of messy context. I really don't need to see ClientAddressStreetString and try to tell it from ClientAddressZipcodeString.

These names are long and similar in shape. I can read them and tell them apart, but the change is 3/4ths of the way through the pile of syllables.

If the code is complex enough it might be better if the strings were A1 and A2 so I could see each at a glance. In this case, I think I would rather see them called "zip" and "street" so that I could tell them apart easily and they would make sense in the context of ClientAddress.

When you are tempted to make overly long names, consider these concepts instead:
  1. If you add type context and are not writing a type conversion, you're probably doing something wrong. If the only form of the variable is a string, then a "string" context wart is entirely unnecessary as a suffix or prefix.
  2. If you have to add a class name to a variable as a wart (eg "ClientAddressStreet") then maybe you are doing the work in the wrong place. The name should make sense in its context, so that context warts are unneeded.
  3. A short bad name is better than a long bad name.
  4. The similarity of multiple names destroys readability. Consider reading a book where the main characters all had names that were all too similar:
    George's son Georgy turned to Geoffrey's daughter Georgina whose question about Geoffrey and Geordi had clearly aroused Geoffrey's son George's curiosity. He feared Georgia would realize that her past history with George's Georgina was no longer a secret to Geoffrey. Would Georgia maintain her silence? "Georgina," he asked, "talk to me of Geography and Geology and let's leave Georgia with Georgy to discuss choreography."
  5. If your long names start to obscure the path, we can't call them "good names".
  6. Even in the face of long bad names, abbreviations and mental mapping are not good things. They may happen to be less bad than some other bad things you could have done instead, but they're still not good names.
Any misunderstanding of "good naming" to mean "good and long" is indeed a misunderstanding. Longer names are not, pound for pound, better than shorter names in a given context.

Backsliding From Agile to Waterfall

This is the place for you to regale me with stories of backslides. I am interested in how it happened as well as your guesses as to why. Is agile too counterintuitive? Does the method underdeliver? Are managers too addicted to practices made impossible in agile? Was it too hard?

I'm hoping to learn and adjust my perceptions. You can help.

Thursday, August 27, 2009

40 Hours is Unprofessional

A recent tweet captured a sound bite from my friend and mentor Robert Martin. I don't believe it is misquoted:
"A 40 hr week is not the choice of a professional" @unclebobmartin
As a friend of UncleBob, I know what he means, but I fear that this quote is too easily hijacked. I tweet the same:
Be careful with your sound bites. It is their misinterpretation that will be implemented.
As a counter-soundbite, I offered this to the ether, in hopes of seeing some refinement of the idea in twitterspace, and especially hoping to counter a wave of 70-hour-week taskmasters.
A good point, yet how much should we take from faith, family, friends, and such life-giving pursuits? ... discussed?
Of course, Dave answers back, channeling UncleBob. This answer is still prone to abuse, but is generally better and safer. It is:
... that self improvement & learning is your responsibility, not your client/employer's.
This is clearly true. Taken correctly, it tells us that we should self-educate all of the time, and whether or not our employer provides training to us we should take matters into our own hands. After all, when a great job requiring cool skills comes up, the guy doing the hiring looks for people who have the skills already. My old statement is:
When the bus comes, it's the people waiting for the bus who get on the bus. Then the bus leaves.
There could be a lot of space devoted here to life/work balance, which is a crock. You only have a single life, and the hard part is not balance but integration. I would consult on that, but I don't have it worked out myself. What is true is that working too much is bad, and working too seldom is bad, and stagnation is bad, and neglecting your true loves is bad.

Dave explains:
RT: @dastels: @tottinge that self improvement & learning is your responsibility, not your client/employer's
Finally UncleBob makes a good quote that clarifies his intent:
Let's be clear. Your career is YOUR responsibility, not your employer's. Your employer is not your mother.
Nobody said "sustainable pace is poppycock" or anything like that. All that was said is that a person should improve. If that person's employer wants to help, that's great. If the employer does not, that's tough. The truth remains that you probably need a better you more than you need a better engineered career path, vocational guidance, or training programs or corporate carrots.

Besides, if your company trains you they are intending to make you better at working there. Your goals may align, or may be different. I think it's time to go get ready for that next bus.

Late additions:

Please read through the comments.

  • If your company wants to take some responsibility for the development of its people, then by all means take advantage of it. If they offer seminars off-hours, or brownbag seminars, or whatever ... don't pass up anything that might grow you.

  • The point of this is not that it is wrong to have company training programs, or harmful. The point is particularly that you should continue learning all of the time, and you should take this into your own hands *especially* if the company you work for does not do it.

  • In other words "my company doesn't train me" is not a valid excuse to stop and stagnate. Quoting Uncle Bob one more time: "Don't be blocked" in your career or your work.

  • Don't let your life become a never-ending stream of overtime hours. If you think that's professional development, you're wrong. Very few have risen to great authority from sweatshops. Work hard, but don't let a job ruin your life and don't let a single job ruin your career.


Does that cover it?

Tuesday, August 25, 2009

The Bottom Of The V?

It is backlog management. Everyone is talking about "value" and "maximizing the work not done" and "minimal marketable features" and "prioritizing."

I've been thinking for a while that the problem in a lot of "agile organizations" is that the "agile" ended in the programmers' bullpen. Getting agility pushed into management and Customer hands is often like pushing a rope.

A recent blog ("Platitudes of Doom") covered part of my concern.

Instead of enjoying yesterday, I spent a lot of it pondering the reasons I've seen so much dysfunction in this area.

Of course, it is partly Taylorism (hello Darrin) since a lot of people filling the Customer role aren't paid on the basis of making the product better, but on the basis of making particular external customers happy. Often this places the marketing/product people in competition with each other for development time. If A gets his changes through ahead of B then A is rewarded and B takes a hit. This means a single point of decision has the power to promote some reps and demote others, just by doing his work.

Sad story: I was setting up my "magic funnel" at a company where I worked last year. A product guy told me that I was just killing him. He told me that he would never get any developer time if we grouped all the work together and prioritized. Essentially, he was placed in charge of work that mattered less to the company, so that doing the most important work would mean he would lose his job (as indeed he did, just a few weeks before I did). The organization wasn't ready for a single funnel, because it had a competitive rewards system.

I should let Esther Derby tilt at that one for a while. It is a bad system that sounded good to some executive(s) somewhere, yet this is exactly the kind of problem we in the agile community need to address.

We talk about the agile principles, but we drive programming changes into organizations that haven't adopted agile or lean or win-win thinking or sometimes even human dignity.

"I'd love to change the world / But I don't know what to do..."

One of my brother consultants was talking about such a situation, that sometimes the best we can do is to help liberate people who get it from systems that do not.

Today I am swallowing that pill. Gerry W said that you should not solve problems for people who have no sense of humor.

I've been given all the good advice I need, perhaps. Time to deal with such limitations in a healthy way. After all, there are companies that deserve to go out of business, and they have people who deserve better.

At peace now.

Avoiding Negative Tim

It's not just a good idea for you, it's a better idea for me.

I admit to being a little run down from my last nine days of activity (college for son, travel, etc) and also have other stresses, but I was feeling the negative yesterday. I don't like that feeling. I could have been enjoying all the agile peeps at Agile 2009, but instead was fretting about vendors selling agile-in-a-box and various dysfunctions in the Customer space (see previous blog).

I admit that dysfunctions fascinate me, and I can spend too much time with the magnifying glass over my eye. CS Lewis said that he was not a pleasant man when writing "The Screwtape Letters" because he was in the head of his antihero demons for too much of the time. I think I can also spend too much time fretting about dysfunction.

Odd, that. Here I am surrounded by the people who do agile well, and fretting about those who do not.

I stayed up with Brandon, Joe, Bonnie, &all last night too late, but this morning am so aware that I'm among peers. I am hoping to give "negative tim" the slip and get into some fun conversations with functioning agilists.

Wish me luck.

Monday, August 24, 2009

Platitudes Of Doom

Here's where I suspect things go wrong. I see a combination of two ideas at the core of a lot of corporate dysfunction.

  1. I can want things faster than you can build them.

  2. I believe that you can always do more if you really try.

The former is obviously true. Wanting is cheap, instantaneous, and unencumbered. I want a car with a 500K mile all-inclusive warranty, Maserati looks and performance, moped operating costs, and a Hot Wheels sticker price. And I want it to drive itself. I am betting that the average reader had at least three improvements they would make to my wish list. None of us expect to see such a thing, of course. The engineering and design costs and material costs make this impossible.

Wanting is cheap, but invention and production are expensive. There are material costs, costs of design, costs of errors and rework, costs of research, time and materials. There are human beings who do design, and they like to be paid. It takes them longer to work out the requirements than it does for me to decide I would like a new kind of car, or desk lamp, or smart phone. Again, it is the very nature of work that invention is generally much more expensive than desire.

Now, an astute reader will see the first statement as an axiom. I should adjust my desires and decide which things I really, really want first. I should realize that all the things I want will take time, and perhaps more time than one would expect. I should not be surprised that I will want many more things before the first thing I desired is complete and delivered. I should realize that many of the things I want are beyond the capacity of the market to provide (at least for now). I should realize that I may have to sacrifice some less important desires (however achievable) to achieve some greater ones. I should expect to be disappointed if I do not adjust my expectations.

On the other hand, the less-astute reader will take this as a problem statement:
  • the team is, for some reason, not delivering the things we want fast enough. 
  • yet people can always get more done by trying harder
Therefore my desires are unmet because
  • they aren't really trying
  • they don't care enough


That sounds quite reasonable, but it isn't. The second statement (that trying harder always gets more done) is false.

That second statement rephrases technical issues as motivational issues.

It takes a problem from a context in which it can be managed (quality, technique, capacity, and priority) into a context in which it cannot be solved but can easily be worsened (motivation).

It amplifies a misunderstanding of the first statement in a particularly bad and quite synergistic way.

Inventors and designers have frequently worked themselves past a point of diminishing returns. A tired, cranky, unfulfilled, pressurized worker does not become better by trying harder. Trying harder only works a little bit, and only for a little while, and only if the developer was not really trying hard to begin with. After the frustration sets in, trying harder just makes it worse.

Sometimes you need a more orderly approach and not a more passionate one.
Sometimes you need some new skills and techniques, not a greater display of sacrificial devotion.

If a team cannot achieve more by working harder, or will actually produce less, then heaping on more pressure is a poor choice.

My colleague in practice, Darrin, brought up Taylorism and Anti-Taylorism over at Agile In A Flash. Of course, a large number of talks at Agile 2009 are on managing backlogs, improving quality, and dealing with productivity. The meme is in the air (as it has been since before Mythical Man Month). I wonder how many development problems we can solve if we can just accept the first statement as an axiom and recognize the second as a falsehood? How can we bring this about?

I suspect we will need to develop some kind of "desire management" discipline. What would its tools be?
  • Value assessment. We should get past "want" and "contract" and into issues of value. What features will most improve our standing in the market? What features will most improve our reputation? Which features will cheapen us? Which yield little or no ongoing improvement? Which features will never really be used (AKA "checklist features")? Which should we farm out? Which should we partner out? Which should we forget about? and which should we really be spending our time implementing?

  • Value Alignment. In particular, we need to be more interested in what our customers value. The backlog has to stop being a way that our sales/product/management staff compete against each other and become a way that we improve the value of our company and our product in the market. We need to want better things.

  • Cost assessment. We need to quit trying to drive estimates downward, and take them under advisement. Even today, most estimates are ambitious and not conservative (more likely to go long than fall short) and yet we try to push them down to a cost we want rather than adjusting our desires to be within our budgets. That needs a rethink.

  • Expectation management. We need to get the truth about capacity and WIP to the decision-makers in our organizations. None of our companies have unlimited capacity, and lean concepts are the only way we know to really get the best out of our people. The system has limits, and we need to respect them first and adjust them second. We cannot simply ignore them.

  • Controlled disappointment. We need to stop stringing our customers along. Sales, marketing, and developers alike have a nasty habit of wishful thinking. They put things on the backlog that are not realistically going to be completed in the next month, the next year, or the next three years. I might suggest honesty instead of delay tactics. This work is not going to be done any time soon. Will your customers be happier and respect you more if you keep telling them it will? I think they'll be disappointed to find the work won't be done, but moreso if they find out after a series of false assurances to the contrary. Controlled disappointment might be a good policy.

  • Operational cost assessment. Maybe we should not consider a $2M contract a win if it will take $1M to build the feature, and will increase operating costs by $10K per month for the indefinite future, and implementing the feature will cause us to lose $3M in opportunities as other features are delayed. ROI is not all about the money we can make on a contract, and development efforts go into the red more often than anyone wants to admit.

The secret, we're told, is to:
  • do less
  • do it better, and
  • do it more often.
That sounds easy, but it is swimming upstream in many organizations. How can we make it better?

Thursday, August 20, 2009

Going Faster

It seems obvious that you can go faster if you take longer strides and cut corners.
In software, this is very much false.
Be advised.

Tuesday, August 18, 2009

Don't Geocache Your Variables

GeoCaching is a quirky, slightly geeky, fun treasure-hunting game made possible by the ubiquity of GPS devices on the market. Essentially, person A will hide some object somewhere on the face of the earth, then publish the location as GPS coordinates. Person B will find their way to the given coordinates, take the object left by A, and replace it with another little gift/trophy for person C who might come along later. The locations may be at the bottom of a lake, top of a mountain, inside a cave, in a hollow tree, underground, in the hulk of an abandoned vehicle, etc. The game, therefore, is partly in hiding the items in fun places, and partly in finding the items.

Geocaching in "meatspace" is fine, and probably a lot of fun (I've not had the pleasure personally). Geocaching in software is wrong.

I have found classes which contain something like this:

public class Location {
    private Item XXX;
    public Item Xxx {
        get { return XXX; }
        set { XXX = value; }
    }
    // A bunch of other variables
    // A bunch of methods that deal with the other variables,
    // ... but never mention XXX or Xxx ever again.
}


Now the problem is that the Location doesn't need the Item at all, yet it is the place where some other class' method, perhaps CacheFiller.SetItem(Item x) will store item x. At some point in the program another class' method, like CacheRetriever.GetItem(), will retrieve the value.

Why is the data in Location? Because Location is somewhere in the intersection of the set of objects CacheFiller and CacheRetriever have in common. In other words, it is a convenient place.

Convenience is a pretty good reason, yes? Isn't it important to write new features quickly? You bet your bottom line it is! The code we write today is the code that's in our way tomorrow. What kind of week do we want to have next week?

If we are contract coders, and also selfish jerks, we might not care too much about the guys who have to deal with our code in the future. Even so, "the future" may be as soon as next month or next week, and short-cutting our code today will mean we (ourselves!) can do less tomorrow.

When I worked for Uncle Bob Martin the first time (Hi Bob) we traveled the world teaching object oriented design. Uncle Bob said that in a maximally cohesive class, all of the methods use all of the variables. Most classes will not be totally cohesive, but it certainly is suspicious and smelly if a class has variables that are never used by its methods, or has methods that do not use any of its own variables. Low cohesion is poor design.

That having been said, you can get an official Agile Otter Cohesive Design Waiver in two circumstances:

  1. For certain utility classes that have no variables of their own and work by using low-level API methods, provided the utility does not store the API's variables in other classes.

  2. For certain tuple-like "data truck" objects which have no behavior of their own, provided no other classes in the system are performing operations on the data (i.e., feature envy).



For many of the beany objects I encounter, I find a number of other places in the code that are performing the object's behaviors. Remember that "having no methods" is not the same as "having no behavior". Frequently these behaviors are duplicated in multiple inconvenient places (close to UI or database code, intermixed with the rendering of reports, etc). The feature envy can be solved by extracting and migrating methods, and might result in a much more cohesive class.

When I find data-less utility classes, I sometimes find that they are geocaching some variables in other non-static objects. It suggests that there are proper classes possible that would combine functions from the utility classes and parts of the data objects together in a more cohesive whole.
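A before-and-after sketch of that extract-and-migrate move, with hypothetical names. The utility class performs the data object's behavior; migrating the method makes the data object cohesive and leaves the utility with nothing left to envy.

```csharp
using System;

// Before: a beany object plus a utility class with feature envy.
class AddressBean
{
    public string Street;
    public string Zip;
}

static class AddressFormatter
{
    // This method works entirely on AddressBean's data: a migration candidate.
    public static string OneLine(AddressBean a)
    {
        return a.Street + ", " + a.Zip;
    }
}

// After: the behavior lives with the data it uses.
class Address
{
    private readonly string street;
    private readonly string zip;

    public Address(string street, string zip)
    {
        this.street = street;
        this.zip = zip;
    }

    public string OneLine()
    {
        return street + ", " + zip;
    }
}

class Demo
{
    static void Main()
    {
        Address a = new Address("10 Main St", "60601");
        Console.WriteLine(a.OneLine()); // 10 Main St, 60601
    }
}
```

After the migration, every method of Address touches every variable of Address, which is the cohesion we were after.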

Let's reiterate some examples for those who joined us late:

  1. When methods of a class touch their own instance variables, that is cohesion. In general, cohesion is good and should be maximized.

  2. When methods of one class touch the variables of other classes (even through getter/setter methods) that is coupling. In general, coupling is considered to be necessary but not so good, and is to be minimized.


Coupling is not so good because it frustrates testing and refactoring. The places where a change is made and where the change causes a failure are not very close together. It is best when cause and effect are more closely situated in time and space. The more coupled a module is, the easier it is to break and the harder the code is to separate and repurpose. For instance, the code in a triply-nested 'if' statement inside a switch/case statement in the query generation section of an Initialize method of a code-behind for a UI form is a little hard to test in Fitnesse or NUnit. Testing is very important and we should not make it hard to do important things.

Cohesion is good because a cohesive class can be tested in greater isolation from UI and network and database issues. It can provide its own evidences of activity. It can be mocked-out more easily. It can be understood in a single IDE window. This makes it easy to do important things, and there are advantages in making important things easy to do.

Geocaching variables is the practice of locating them in places remote to their origins and destinations where other classes who are aware of their secret locations can re-acquire them later.

If you geocache your variables, you are increasing coupling and minimizing cohesion. In practical terms, that means it is harder for your coworkers to be sure that they're not breaking something when they refactor code you've written. It will be harder for them to write tests, which makes it easier for them to make undetected mistakes. Mistakes prevent systems from being accepted and deployed in a timely manner. Being unable to deploy systems keeps companies from being paid, and may cause them to lose paying customers permanently. Being paid is a good thing.

That means that geocaching variables is one of the ways to make your company stop paying you. That can make your spouse sick from anxiety and cause your children to grow up poor and wretched. It may or may not cause you to be attacked by velociraptors. It may speed the coming alien attack/ice age/global warming. It could bring about an apocalypse.

Most likely, though, it will slow you down and make your coworkers ashamed of your code.

How Big is a Story Point?

Noob asks: "How much is a story point?"

Master looks on card wall, says: "We do 16 each week."

Noob asks: "But how much work is it?"

Master says: "One sixteenth of a week right now."

Noob is enlightened.

Friday, August 14, 2009

Odd Function Signature

Today I found code that looked like this:


public IList SetSomeAttribute();


Notice anything funny? A setter method that takes no parameters, so there's nothing to set, and returns a list of some sort. Hmmm... Did the author mean "GetSomeAttribute" instead?

It turns out that the code cleared an instance array, then rebuilt it from data present in an indirectly-held instance of another class, and finally returned the newly reconstructed list.

So, it seems that the method was caching some other class' data, and returning the local version. One could wonder whether that makes any sense or not, but first one has to grapple with the bizarre method interface.

We only had a little time. We fixed the query/command separation right away and renamed the method, but I think there's likely something wasteful and wrong going on in there. Maybe Monday we can sort it out.
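A sketch of roughly where we ended up, with hypothetical names: the original method both rebuilt a cached list and returned it, so we split it into a command that mutates state and a query that merely reports it.

```csharp
using System;
using System.Collections.Generic;

class AttributeHolder
{
    private readonly List<string> cache = new List<string>();
    private readonly IEnumerable<string> source;  // the indirectly-held data

    public AttributeHolder(IEnumerable<string> source)
    {
        this.source = source;
    }

    // Command: clears and rebuilds the cache. Returns nothing.
    public void RebuildCache()
    {
        cache.Clear();
        cache.AddRange(source);
    }

    // Query: returns the cached list. Mutates nothing.
    public IList<string> GetSomeAttribute()
    {
        return cache.AsReadOnly();
    }
}

class Demo
{
    static void Main()
    {
        AttributeHolder holder = new AttributeHolder(new[] { "a", "b" });
        holder.RebuildCache();
        Console.WriteLine(holder.GetSomeAttribute().Count); // 2
    }
}
```

With the command and query separated, a caller (and a test) can see at a glance which calls change state and which are safe to repeat.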

Beany Objects and Busy Code-Behinds

Yikes.

My new partner and I had a task which involved a new feature. The complication is that the current version of the feature was implemented almost entirely in the initialization method of a web page code-behind class.

Yeah. The method that we wanted to test in Fitnesse was almost entirely written into the GUI, and worked entirely in terms of the query it was building to populate the screen. This was not welcome news, because the Fitnesse test couldn't make use of the UI Screen's code.

Sadness is that this situation isn't all that uncommon.

In my last company, there was code that implemented the same feature two different places: once in the web UI and again in the report generator. Of course, changes to the one would not affect the other. The code had two different maintenance points, and each was written in an idiosyncratic way to thread the calculations into the screen or report. A customer was simply flabbergasted that the developers had written the same functionality two different ways, in two different places. I shared his surprise. If this had been a single class or method, it could have been used by the screen and the report and the unit tests and the fitnesse tests.

But now I am a little saddened. There are definitely parts of the UI that are big, busy UI code-behind methods working on big, stupid, beany classes. That means that we are very likely to have duplicated code and the usual abuses of program-craft.

We've known for years that the right place for functionality is in functions. We know that objects exist as places to hold functions that are relevant to a single area of responsibility. This is the golden rule of coupling and cohesion:
Place things that belong together together.

The second rule is like it, and applies to OO coupling/cohesion:
A class has a single responsibility. It does it all, does it well, and does it only.

And the final rule, the most practical and pragmatic of all, slams the door on so many abuses:
Code is written Once And Only Once (no duplication!).


If we violate the first two rules, we necessarily violate the final rule, or cause it to be violated by our peers. If they cannot find or cannot separate the code they want from the code we've written it into, then the easiest thing for them to do is to duplicate the code without the unrelated concerns we've woven into it.

I personally have a "moment of decision" when I find that the code I want is in lines 5-8, 12, 30-35, 40, and 78 of a 100-line function. I know that the right thing is to extract that code into a method on the appropriate class (creating such a class if it does not exist), but it sure seems easier to avoid breaking that long function by rewriting the code elsewhere. These moments of decision are warning signs.
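At that moment of decision, the "right thing" looks roughly like this (hypothetical names): the scattered lines become one intention-revealing method on the class that owns the data, where the next reader can find it and call it instead of copying it.

```csharp
using System;

// The due-date arithmetic used to be smeared across a 100-line handler.
// Gathered into a method on the class that owns the data, it can be
// called by the screen, the report, and the tests alike.
public class Invoice
{
    private readonly DateTime issued;
    private readonly int termDays;

    public Invoice(DateTime issued, int termDays)
    {
        this.issued = issued;
        this.termDays = termDays;
    }

    // Extracted from the long function's scattered lines:
    public DateTime DueDate()
    {
        return issued.AddDays(termDays);
    }
}
```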

This method broke my heart a little. Its job is to initialize the screen, but it does so in the course of a very long method that combines magic values, SQL, and object method calls, with many responsibilities woven into a tangled web of "convenient temporal coupling". I bet it wasn't that easy to write, either. You can bet it is untested. And you know that this week we're going to have to wrap tests around it, break it into pieces, reorganize those pieces into classes, and wrap those classes in tests as well. In the end, it should be better. Sadly, it is likely to stretch our feature completion date out by a few days.

If we didn't have to rewrite badly-coupled code, we could be truly done much more quickly.

We could "seem" done by ignoring the ugly code now, but that just shovels the risk and mess onto the next pair of programmers who stumble into this area. It's possible that they'll be in the mess because there were no tests to detect some breakage we caused. It will hurt them later or it will hurt our schedule now.

Leaving broken and ugly code is not only unprofessional, it is also irresponsible and cowardly. It would say that we value our own time more than our teammates' time, and more than we value the quality of our product. I'll go "late" for this reason. It has to be done, and I like to sleep at night.

Monday, August 10, 2009

Beany Classes and Busy Methods

You know you're in trouble in hardly more than a glance. All the methods are long and busy, and do their work on the data members of other classes... other classes without any behavior. Methods are placed in ersatz global functions. Why? Because that's where they're closest to the runtime. Perhaps in module initialization, perhaps one function call away from the "static main."

The solution for the code is well known. Extract method, move method, pull up method, red/green/refactor.
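In miniature, with hypothetical names, that looks like: move the computation out of the busy method and onto the beany class whose data it uses, so the class finally has behavior and the method shrinks to a delegation.

```csharp
// Formerly a beany class: all data, no behavior.
public class Account
{
    public decimal Balance;
    public decimal OverdraftLimit;

    // Moved method: the screen used to compute this inline
    // from the public fields above.
    public bool CanWithdraw(decimal amount)
    {
        return amount <= Balance + OverdraftLimit;
    }
}

// The busy code-behind method shrinks to a delegation:
public class AccountScreen
{
    public string WithdrawalMessage(Account account, decimal amount)
    {
        return account.CanWithdraw(amount) ? "OK" : "Insufficient funds";
    }
}
```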

The real problem here is that we don't know enough about prevention. Why do we keep getting saddled with busy methods running roughshod all over beany classes? Is it laziness, poor education, a lack of good examples, a confused sense of practicality? The human problem is interesting enough that I need insight from a wide range of agilists.

Thursday, August 6, 2009

Any Road To Rome is Also A Road From Rome

I'm back (with my partner) working on the same stinky stuff I blogged about earlier. That code all made it back to the main codeline, where we can refactor it, but we had to lose a lot of our refactoring in order to get it back in. It was a lesson learned: we had to abandon some changes because we let our code diverge too far from trunk. Now we know about code perturbation.

Then there were bugs in the mudball function, and bugs are always bad news. These were mostly of the "implicit" variety: code that falls through the "if" traps in a certain way, combined with default values in some of the variables at the start of the method, in between the interwoven concerns and feature-envy ...

I really want to get in there and get it under control (which is to say, wrapped in a fuzzy blanket of unit tests). Sadly, the blanket does not start out soft and fuzzy; it starts as a blanket of glass shards and barbed wire. In this case, the test doubles (testable classes) are loaded with so many details that they cloud the story they're trying to tell. We end up with a dozen lines of code to set up a test double, followed by one invocation and an assert. Those dozen lines are hard-won by tracing the function with a debugger until we find where the code turns wrong, then perhaps extracting some expressions into functions before adding more setup to the test case to clean up the flow.

This is okay. Every fat, heavily-coupled legacy routine's characterization tests require massive, intricate setup. There are parameters to construct, seams to find and exploit, prefactoring, code spelunking, and all kinds of reverse-engineering involved. The immediate result is a nearly-embarrassing set of ugly, fat tests.
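A sketch of what one of those ugly fat tests tends to look like (hypothetical names; assumes NUnit, which the surrounding code already uses): a pile of hard-won setup lines, then a single invocation and a single assert.

```csharp
using NUnit.Framework;

// Minimal hypothetical fakes. In real characterization work, each field
// set below is discovered the hard way, by tracing the code in a debugger.
public class FakePriceList
{
    public decimal BasePrice;
    public decimal Surcharge;

    public decimal PriceFor(string code)
    {
        return BasePrice + Surcharge;
    }
}

public class QuoteScreen
{
    private readonly FakePriceList prices;
    public decimal QuoteTotal;

    public QuoteScreen(FakePriceList prices)
    {
        this.prices = prices;
    }

    public void BuildQuote(string code)
    {
        QuoteTotal = prices.PriceFor(code);
    }
}

[TestFixture]
public class QuoteScreenCharacterizationTests
{
    [Test]
    public void BuildQuote_IncludesSurcharge()
    {
        // Many lines of setup...
        var prices = new FakePriceList();
        prices.BasePrice = 100m;
        prices.Surcharge = 12.5m;
        var screen = new QuoteScreen(prices);

        // ...for one invocation and one assert:
        screen.BuildQuote("WIDGET");
        Assert.That(screen.QuoteTotal, Is.EqualTo(112.5m));
    }
}
```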

We're writing the intricate, ugly tests. The class that contained the original ugly method is growing as various paragraphs of code and unclear expressions get moved into new methods. The new methods often have nothing to do with the containing class and really should be moved to more appropriate classes (e.g. "managerObject.hasExpired(session)" probably should be "session.HasExpired"), but here they are. If we were building from scratch, the current state of the code would be unacceptable.
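The move the text describes, sketched with hypothetical types: the expiry check stops interrogating the session's data from outside (feature envy) and becomes the session's own behavior.

```csharp
using System;

// Before: the manager interrogates the session's data (feature envy).
public class SessionManager
{
    public bool HasExpired(Session session)
    {
        return session.LastActivity.AddMinutes(session.TimeoutMinutes) < DateTime.Now;
    }
}

// After: the behavior lives with the data it uses, so callers can write
// session.HasExpired() instead of managerObject.hasExpired(session).
public class Session
{
    public DateTime LastActivity;
    public int TimeoutMinutes;

    public bool HasExpired()
    {
        return LastActivity.AddMinutes(TimeoutMinutes) < DateTime.Now;
    }
}
```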

At this time we should remember that every road into Rome is also a road out of Rome.

However, we are not building from scratch but scratching the code apart, so we need this intermediate state in order to get control of the code. One new method makes it possible to detect an effect on one of the related objects. Another makes it possible to fake (through an inheritance seam) a failure that throws an exception. Another reduces the number of variables the main routine needs to manage. This is reverse-engineering, and it involves some dead-end assumptions and experimentation. Each stage takes us closer to understanding the intended effects of the big honking function.
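One of those seams, the inheritance (subclass-and-override) kind, sketched with hypothetical names: the risky call is made `virtual` so a test subclass can force the failure path without a real database.

```csharp
using System;

public class OrderSaver
{
    public string Save()
    {
        try
        {
            WriteToDatabase();
            return "Saved";
        }
        catch (InvalidOperationException)
        {
            return "Save failed";
        }
    }

    // Virtual so a test subclass can override it: the inheritance seam.
    protected virtual void WriteToDatabase()
    {
        // The real database call would go here in production.
    }
}

// Test double: drives the exception-handling path deterministically.
public class FailingOrderSaver : OrderSaver
{
    protected override void WriteToDatabase()
    {
        throw new InvalidOperationException("simulated database failure");
    }
}
```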

When we have a pretty good specification via unit tests, we can focus on cleaning the code properly. Until we have the tests built up, refactoring is irresponsible and hazardous (hence some of the new bugs). We will relocate methods, probably inline a few, probably extract several more. Statements dealing with one concern will be grouped and extracted until the main function has a single level of abstraction. Flags will be eliminated. If statements will be obviated (if possible) or extracted.

Yeah, the code was in a mess, and all the work we're doing is making it even messier as it enables the work that drives us out of the mess.

We will be done when the tests are clear and tell a good story, and the code that implements the tests is not so detail-oriented and intricate. At that point, only the NDA will keep me from showing you all the cool things we've done. I expect to see that day tomorrow or the next. It's going to be glorious!