TL;DR

It is preferable to let the people who will do the work make the estimates. Ideally, several independent predictions should be combined to improve accuracy. Relative prediction methods such as story points and non-linear scales (e.g. the Fibonacci sequence) can be used, but they do not lead to more accurate results.

Disclaimer

A substantial amount of this post is based on the excellent and free e-book Time Predictions – Understanding and Avoiding Unrealism in Project Planning and Everyday Life by Torleif Halkjelsvik and Magne Jørgensen. I highly recommend their work to anyone who has to deal with project planning.

Practices for Estimation

In the first part of this blog post, I explained why the accuracy of estimates is limited and why adding them up is not straightforward. I then examined several human biases and heuristics that affect how we estimate tasks. Understanding them lets us avoid common planning fallacies. Check out the first part to learn more if you haven’t already done so.

Let us now take a look at some common estimation practices.

Tackling Complex Tasks

In practice, tasks are often too complex to estimate without breaking them up into smaller and simpler sub-tasks. There are two ways to go about this. First, we can list all sub-tasks and estimate each of them individually. Alternatively, we can write the sub-tasks down but estimate only the whole set. Which one is better?

Writing down a list of work items or sub-tasks does increase prediction accuracy, but estimating each sub-task individually can have negative effects. If the tasks are too small, they tend to get overestimated. If we forget some sub-tasks, then basing the overall estimate on the sum of the estimated sub-tasks decreases our prediction accuracy. It also leads to the problem, described in the first part, of summing up “most likely” estimates. Save yourself some work, and do not estimate sub-tasks.
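The problem with summing up “most likely” estimates can be illustrated with a few lines of Python. This is only a sketch: it assumes, purely for illustration, that the effort of each sub-task follows a log-normal distribution (a common model for right-skewed quantities, not something the book prescribes), and the sub-task parameters are made up.

```python
import math

def lognormal_mode(mu: float, sigma: float) -> float:
    """Most likely value (mode) of a log-normal distribution."""
    return math.exp(mu - sigma ** 2)

def lognormal_mean(mu: float, sigma: float) -> float:
    """Expected value (mean) of a log-normal distribution."""
    return math.exp(mu + sigma ** 2 / 2)

# Hypothetical sub-tasks: (mu, sigma) of the logarithm of the effort in hours.
subtasks = [(1.0, 0.5), (1.5, 0.7), (2.0, 0.6)]

# What we get when we add up the "most likely" effort of every sub-task.
sum_of_modes = sum(lognormal_mode(mu, sigma) for mu, sigma in subtasks)
# The expected total effort is the sum of the expected efforts.
sum_of_means = sum(lognormal_mean(mu, sigma) for mu, sigma in subtasks)

print(f"Sum of most likely efforts: {sum_of_modes:.1f} h")
print(f"Expected total effort:      {sum_of_means:.1f} h")
```

Under these assumptions the sum of the most likely values is always lower than the expected total, which is why the aggregate ends up over-optimistic.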

Using Analogies

Our intuition likely bases predictions on analogies it can recall. We can also use conscious analogies. This means identifying similar tasks we did in the past and looking at how much effort we needed to complete them. If we find very close analogies, this can lead to accurate predictions. If the analogies are not that close, we can take the average of several analogies. However, if we do not find any close analogies, then other methods of predicting will perform better.
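Averaging several analogies is simple enough to write down. The following is a minimal sketch, not a prescribed method: the function name `estimate_from_analogies` and the numbers are hypothetical, and deciding which past tasks count as similar is still up to you.

```python
from statistics import mean
from typing import Optional

def estimate_from_analogies(actual_hours: list[float]) -> Optional[float]:
    """Average the actual effort of past tasks judged similar to the new one.

    Returns None when there are no analogies, signalling that another
    prediction method should be used instead.
    """
    if not actual_hours:
        return None
    return mean(actual_hours)

# Hypothetical actual efforts (in hours) of three similar past tasks.
past_efforts = [12.0, 18.0, 15.0]
print(estimate_from_analogies(past_efforts))  # 15.0
```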

Making Relative Predictions

Agile methods favor relative measures for predicting time usage and complexity. Most commonly, they are called story points. The narrative here is that humans are bad at estimating absolute values (how tall is that tree?), but are good at estimating the relation between values (is that tree taller than the house next to it?). But just knowing that one task is simpler than the other is not enough; we also need to know the approximate ratio between the two tasks. Currently, it appears that relative predictions work if the tasks to be estimated do not differ too much in size. Overall, the research is still limited and does not suggest that relative predictions are more or less accurate than absolute predictions.

Setting aside prediction accuracy, relative predictions may have other benefits. For example, they translate more easily between different people performing the task. You might be more experienced than I am, and thus estimate less absolute time for any task than I do. But both of us will need more time for a complex task than for a simple one, and our relative estimates for a task will be similar.
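A toy example shows how this works. The numbers are invented, and the assumption that the newcomer is exactly twice as slow on every task is an idealization; in practice the ratios will only be roughly similar.

```python
# Hypothetical hours each person expects to need per task.
hours = {
    "experienced": {"simple task": 2.0, "complex task": 6.0},
    "newcomer": {"simple task": 4.0, "complex task": 12.0},
}

def relative_estimates(person_hours: dict[str, float], reference: str) -> dict[str, float]:
    """Express each estimate as a multiple of the reference task."""
    base = person_hours[reference]
    return {task: h / base for task, h in person_hours.items()}

for person, estimates in hours.items():
    print(person, relative_estimates(estimates, reference="simple task"))
```

The absolute estimates differ by a factor of two, but both people rate the complex task as three times the simple one.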

Using Non-Linear Scales

Relative predictions are often combined with non-linear scales, such as the first few elements of the Fibonacci sequence. This is meant to express that the accuracy of predictions is limited, and it also speeds up discussions. Your teammate cannot draw you into a lengthy discussion about whether something takes 7 or 8 points, because the scale does not allow for such subtle distinctions.

Be aware that scales have side effects. The so-called central tendency of judgment draws us to the middle of a scale, especially if we are uncertain. If the scale is non-linear, this can lead to distorted estimates: 3 is the middle of the Fibonacci numbers from 1 to 8. This means that, if we are uncertain, we will give lower estimates than when using a linear scale from 1 to 8 (where we would choose 4 or 5).

Further, while the Fibonacci numbers and T-Shirt sizes are common scales, there is no evidence that they are particularly good choices.
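To make the central tendency effect concrete, this short sketch compares the middle of the two scales mentioned above. It takes the median of the available values as the "middle", which is an assumption about how the pull toward the center plays out.

```python
from statistics import median

fibonacci_scale = [1, 2, 3, 5, 8]
linear_scale = list(range(1, 9))  # 1, 2, ..., 8

print(median(fibonacci_scale))  # 3
print(median(linear_scale))     # 4.5, i.e. between 4 and 5
```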

Observers vs Actors

Depending on how your company is structured, the people estimating the tasks may or may not be the ones who implement them. There is the idea that observers are less biased, and thus more rational at estimating, than the actors who will do the work. What is true about this is that observers are less over-optimistic. However, unless they have a lot of historical data on which to base their judgment, their estimates are also less accurate than those of the developers. Unless you have that historical data, having the actors do the estimates will give you more information.

Combining Independent Predictions

Should you bother with group estimates? You should! The median of several predictions is always at least as accurate as half of the individual predictions. For this to pay off, it is important that the predictions are made independently. Group discussions help to achieve even more accurate estimates. Whether you are using story points or absolute times, the planning poker method ensures that you take advantage of the positive effects of group estimates.
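The property of the median can be checked with a small simulation. The sketch below uses made-up, uniformly distributed numbers for the predictions and the actual effort, and verifies that in every trial at least half of the individual predictions are no closer to the actual value than the median is.

```python
import random
from statistics import median

def share_not_better_than_median(predictions: list[float], actual: float) -> float:
    """Fraction of individual predictions whose error is at least the median's error."""
    median_error = abs(median(predictions) - actual)
    not_better = [p for p in predictions if abs(p - actual) >= median_error]
    return len(not_better) / len(predictions)

random.seed(0)  # fixed seed so the run is repeatable
for _ in range(1000):
    # Hypothetical independent estimates (hours) and a hypothetical actual effort.
    predictions = [random.uniform(1, 40) for _ in range(random.randint(1, 9))]
    actual = random.uniform(1, 40)
    assert share_not_better_than_median(predictions, actual) >= 0.5

print("The median was at least as accurate as half of the predictions in every trial.")
```

Note that this says nothing about how good the median is in absolute terms; it only shows that picking the median protects you from ending up with one of the worse individual predictions.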

Planning Poker

Each participant starts with an identical deck of cards. The deck commonly uses the first numbers from the Fibonacci sequence. In principle, one could use any set of absolute or relative values, as long as the deck does not become too big.

Each round is then played like this:

  • The product owner briefly describes the task to be estimated.
  • The team is given time to discuss the task and to ask questions.
  • Everyone plays a card face down.
  • When all the cards are on the table, they are turned over.
  • If the estimates do not agree, the participants with diverging numbers justify their estimates. Then, everyone again plays a card. The process is repeated until a consensus is reached.
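The reveal step of a round can be sketched in code. This is an illustration only: the names and votes are hypothetical, the deck is an assumed set of Fibonacci numbers, and "participants with diverging numbers" is interpreted here as those holding the lowest or the highest card.

```python
from typing import Optional

DECK = [1, 2, 3, 5, 8, 13]  # assumed deck: first Fibonacci numbers

def reveal(votes: dict[str, int]) -> tuple[Optional[int], list[str]]:
    """Turn the cards over.

    Returns the agreed value (or None if there is no consensus) and the
    names of the participants holding the lowest or highest card, who are
    asked to justify their estimates.
    """
    for name, card in votes.items():
        if card not in DECK:
            raise ValueError(f"{name} played a card that is not in the deck: {card}")
    low, high = min(votes.values()), max(votes.values())
    if low == high:
        return low, []
    outliers = [name for name, card in votes.items() if card in (low, high)]
    return None, outliers

# Hypothetical votes for two consecutive rounds on the same task.
rounds = [
    {"Ana": 3, "Ben": 8, "Chris": 5},
    {"Ana": 5, "Ben": 5, "Chris": 5},
]

for number, votes in enumerate(rounds, start=1):
    agreed, outliers = reveal(votes)
    if agreed is not None:
        print(f"Round {number}: consensus on {agreed}")
        break
    print(f"Round {number}: no consensus, {', '.join(outliers)} explain their cards")
```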

Conclusion

Good practices can make the difference between unrealistic, over-optimistic estimates and ones that are accurate enough to inform decision-making.

There is another perspective to this: many things that are worth doing involve risk and uncertainty. If your work were fully predictable and risk-free, it would probably also be quite boring, and not of much value to anyone else. In this sense, I do hope you get your estimates completely wrong from time to time.

Top Recommendations

  • Try to avoid providing irrelevant information as much as possible. Anchoring effects greatly reduce the accuracy of estimations.
  • Combining several independent estimates is a good idea: use planning poker.
  • Keep in mind that people usually estimate the most likely effort. This is a lower number than the average effort and, if summed up, will lead to over-optimistic aggregate estimates.
  • Make sure that the tasks you estimate are not too large.
  • Do not plan & estimate for hours without taking breaks.

References

Sources for all claims made in the post can be found here: Time Predictions – Understanding and Avoiding Unrealism in Project Planning and Everyday Life.


Nikolaj Leischner is a Backend Developer and People Manager at MobiLab. When he’s not reading about estimations, he is building backend services for projects like Mobility inside PKM Editor.