# Being Cellfish

Stuff I wished I've found in some blog (and sometimes did)

# October, 2009

 This blog has moved to blog.cellfish.se.
Posts
• #### How to burn down estimates in t-shirt sizes

Some people estimate their user stories is t-shirt sizes, i.e. each story is either small, medium or large. But how do you create a burn-down chart for these estimates in order to estimate when you will be done? I guess a very common way is to assign some kind of number to each size. But what number? I know that one person used 1-2-3 (i.e.S=1, M=2, L=3). Let's assume that you are completing tasks in more or less random order, and random size (but more big tasks than small tasks) then the burn-down might look like this after 10 iterations.

From this burn-down we can see that we will need slightly less than 21 iterations to complete all user stories if the trend is correct. But what if the size difference is wrong? What if a medium is three times as big as a small user story and large is five times the size of a small story?

This is the same question another Microsoft developer got when showing some co-workers how burn-downs could be used to estimate project completion. So this person took the actual data he had and changed the values to 1-3-5. Interestingly enough the result was almost the same. I did the same thing on my random data and the estimated number of iterations needed is about the same as you can see here.

So this made that developer think... Maybe the values representing t-shirt sizes does not matter all that much so he gave all sizes the same value; one. I.e. the burn-down basically represented the number of remaining user stories. Funny enough it turned out that the estimated number of iterations needed to complete them all was again very close the the first estimate. Using my random data you can see that it would take exactly 21 iterations to complete all user stories.

Back to that developer that inspired me to write this. He finished the project and it turns out that the actual number of iterations needed to complete all stories was also very close to all estimates made when half the stories were completed. If I remember correctly they turned out be all be within 5% of the final number. And being 5% wrong on estimates is way better than most people I know when trying to estimate in hours, days or weeks instead of looking at the velocity of the team.

Since I had the data I could not resist making a few more tweaks to the graph. Not surprisingly using 1-2-4 is very similar to 1-2-3. Same for using preferred numbers (my data using start of R5 series (1-1.6-2.5)). Made me think... What about values in totally different ballparks so I used 1-10-100 (i.e. Large being 100 times bigger than a small task). Interestingly enough the estimated number of iterations turned out to be slightly less than 20 in the burn-down. And reversing the values (using 5-3-1, i.e. small is biggest task) gives us an estimate of just over 22 iterations.

So is it safe to draw the conclusion that size of tasks does not really matter in your burn-downs? Is it safe to say that the number of user stories is enough? In my opinion; yes. Yes in theory you may complete only big or small tasks in the beginning saving the other type for last which would make your early estimates wrong if you do not take size in to account. But I don't think teams typically work in this way. There will be a mix of small and big tasks in each iteration. I would compare this to quick-sort. Quick-sort is pretty bad for the worst case scenario but pretty decent on average. I would say the same goes for using burn-downs where you ignore the size of user stories. And the benefit is that you do not have to spend time doing estimates. Just focus on completing those user stories delivering value to your customer because having no sizes is probably good enough.

• #### Never convert your story points to time!

When I read these observations on estimations I was reminded of a thing that happened to me a few years ago. The team I worked with started out doing story point estimates and then breaking everything down in the iteration into hours. After the first iteration the team know that X points equaled Y hours. Turned out that X=10Y. When planning the second iteration that knowledge was used to "verify the hour estimates" and sure enough it unveiled a few tasks the team had forgotten during planning. So it all looked like a great way for the team to make sure they didn't miss some important part when breaking down user stories into tasks during sprint planning. At the time for iteration three the story points were no longer used because the team felt it was just an annoying double bookkeeping and everybody knew that one point was ten hours anyway. From that day forward the team consistently under estimated their work. This team did not have the time to establish a track record as the team described in the link above but they also never had a chance I think. Estimating software development in time is very hard and the amount of time you spend on making the estimates does not necessarily improve the estimate with the same factor. Add to that what I wrote earlier this week indicating that the size is more or less useless and only the number of things are important.

But in the defense of time estimates, it is not really the time that is the problem, it is the lack of learning from the past and using historical data to predict the future. I was personally once part of a project involving around 20 developers for one year. Turned out the amount of work it took to complete that project was one man-month less than the initial estimate (i.e. less than 0.5% over estimated). The reason for this? The project manager and key developers had developed virtually the same application for another customer a few years earlier... But that is in my opinion a very expensive way to do estimates so I prefer spending as little time as possible on actually estimating things and more time on understanding the user stories and completing them.

• #### Adding days to an iteration (sprint)

In Scrum you never extend a sprint. Read that again. Never. So why do some teams extend their sprint a few days sometimes?

I think the most common reason is there is some functionality that is almost done and it feels better delivering "in a few days" rather than "at end of next sprint". Might even be so that there is a dependency or hard date deadline that needs to be met. I think this usually is an invalid argument. I don't think Scrum says you're forbidden to deliver things to the customer in the middle of sprints. If you really need/want to deliver before the end of the next sprint, do the planning so that you deliver something after a few days in the new sprint instead of extending the sprint.

Another reason I've seen is that the team have miscalculated the number of working days they could have anticipated (holidays, team events) and some they couldn't anticipate (power outages). The reasoning here is that "planning was for X days and since we missed Y we should extend Y working days". At first glance I think this looks OK. The team is just trying to keep the sprint length consistent. But if you look deeper I think this is also wrong. The team is not looking at the root cause and learning from the failure (i.e. make sure to take holidays and team events into account). Also this is no different than if there is a flu going so that most of the team is sick a few days during a sprint.

Changing the end date for a sprint also introduces another problem. What the team has planned outside work. Personally I tend to plan things like dentist appointments and other meetings during the sprints since I'm more flexible during the sprint. It is my choice if I have to go out for a few hours during the day and I can decide myself to make up for that time in the morning or evening. But between sprints my working day is not that flexible since it typically involves a number of planning and retrospective meetings. Even when I have several days between sprints they tend to fill up quickly with all sorts of meetings. So when the end date changes it is a high probability that some plans I've made for the beginning of the next sprint need to be changed too since it is now in a day between sprints.

So what if there is nothing outside work that needs to be rescheduled and adding another day to the sprint means the ability to deliver something much wanted to the customer. Isn't it worth it? Once? I think you should be pragmatic but in this case I think there are more pedagogic ways to handle this. Handle it as the failure it is, analyzing what went wrong and come up with a plan on how to avoid it next time. Then then take one or two days  just focusing on the thing that is so much wanted. This way you are less likely to just accept and forget the failure just because everyone agreed it was a good idea to extend the sprint. And if you find yourself often in this situation, sprints may not be suitable for you. Something like Kanban might be more appropriate for you.

• #### Software development at the fish market

The Fish market at Pike place is famous for its flying fish. I was there this weekend for the second time in a few weeks (benefit from having friends and family visiting from Sweden). But it is not the flying fish that make the show interesting I think. It is how the crew is shouting all orders out and get it repeated by the rest of the crew. Most people watching probably just think it is a funny detail in the show. Some might know that it started as a prank only. But what it really is is a great way of making sure everybody involved (the employees) know what the order is and what is going to happen; i.e. what fish will be thrown where. Made me think again about one of my old comparisons between software development and the military. The same thing is used in the army to make sure orders (and important information) are heard by everybody. The fact that an order is repeated is also not only good to make the order heard by everybody, it is also an acknowledgment of that the order has been heard. So when the order maker hears the order repeated he knows it have been heard.

So what about the title; software development at the fish market? Well actually nothing more than I think people in the software business repeat things much too rarely. Not only small things that can be quickly repeated but all kinds of feedback and instructions benefit from being repeated. Think about it. Each time you ask somebody to help you with something, does it turn out as you expected? Probably not. And you either think the other person is stupid or you blame yourself for being unclear. In my experience people are not stupid in general. It is you who are being unclear because the other person has a different set of reference points to what you want them to do. The best thing to make sure somebody understands what you mean is to ask them to repeat,in their own words, what you just said. But don't forget to do the same thing when somebody asks you to do something...

• #### Five reasons to shorten your sprints

I've been involved in a few discussion on iteration length lately and was going to write something about it, but it turns out I don't really have to... Since it would more or less be a repetition of this. So sorry, there will not be a list of five reasons to shorten your iterations here... But I would like to add one thing. Scrum is not a silver bullet. Scrum helps getting surfacing problems and if you think the length of the sprint makes something cumbersome you should shorten the sprint because the more it hurts, the more likely you are to address the real problem and fix it. Souse this rule of thumb (not only for sprint length): If it hurts, make it hurt some more. That way you have a good incentive to fix it.

• #### Coding Dojo 6

It was MineSweeper for the 6th MSFTCorpDojo today. We also applied the object calisthenics rules this time. We ended up with a lot of interesting discussions on how to design to follow the calisthenics rules and this lead us to not really driving the design using tests but rather to create tests following the design. In my opinion it was also unsatisfying that we only implemented the small building blocks we thought we needed and almost no part of the algorithm needed to solve the problem. After using object calisthenics a few times it feels like it leads to interesting design discussions at the expense of the basic TDD experience. It'll be interesting to see if we'll be able to get both in the next session.

• #### Exceptions or not?

I've always had mixed feelings for exceptions. First of all exceptions should only be used for exceptional things, not things that are expected. For example if you open a file it may fail for several reasons; it does not exists, you do not have rights to open it or it may be locked. These are all perfectly reasonable errors and throwing an exception for these failures is in my opinion not a great API design. If there are other methods to check for all these circumstances it might be OK but not optimal. A reasonable compromise is to have a TryOpen method which never throws an exception. This gives the user of the API the option of letting file open errors being expected or fatal.

Because exceptions should only be thrown for exceptional circumstances there should be no reason to catch them other in the application's main loop unless you can add information to the error in which case you want to wrap and throw a better exception. The exception to this rule is when you use an API (such as file open mentioned before) that throws for pretty common error cases you want to ignore.

Another thing I don't like with exceptions is the fact that you (in C++ and C#, Java is better here) get no help from the compiler to know what types of exceptions a function may call. So in C++ and C# you may think you catch everything you need but you really don't which leads to unwanted behavior in your application. It is just too easy to get into a situation where you think a number of lines are going to be executed but they're not for some exception.

However if you use return codes (as you would in a C program) you get your code cluttered with "if (previous called failed) handle error" which hides what the code really does. This can be hidden using macros to handle errors (ex: result = DoSomething ON_FAIL_RETURN(result); ). And if there is no error handling it is hard to know what happens when the next line is executed even though there was an error.

Using exceptions is also a kind of "I failed and I hope somebody else somewhere can handle it" while the use of return codes defines a clear contract "I failed and you who called me must handle it". So even though exceptions are convenient, return codes feel more defensive, i.e. are easier to get right resulting in less bugs but sometimes at the cost of less clear code. As usual the mix of both approaches is probably the best. But there are more alternatives. If you use adverse as described by Feathers I'm prepared to accept exceptions to a much larger degree because they're being handled immediately so it reminds of the return code approach.

• #### How many t-shirts in your next iteration?

So considering my last post on t-shirt sizes and burn-downs, how do you decide on how many t-shirts to take on in the next iteration? If you assign some value to each size I guess it is easy since you have your velocity to use. But I heard another interesting thing from a guy I worked with the other day. He had been on a team where they did not assign any number to their t-shirt sizes. They just expressed their velocity in terms of how many user stories of each size they could make in each iteration. For example they might have had a velocity of "2L, 1M, 1S" meaning that in each iteration they tried to complete two large, one medium and one small task. I like this idea since it is simple. And easy to adjust. When you realize you don't have that many large user stories left but a bunch of medium ones, just try a new mix and adjust until you know your new velocity (which might be 1L, 3M, 2S).