What Can We Do To Improve Our Point Based Estimates?
Or “How do we improve our point based estimates and resultant velocity?”
Background
We focus on making story points a true measure of the relative size of the work, and velocity as the true capacity of the team to complete requirements to done.
Most organizations settle on a team-by-team Story Point based approach to estimate work. These estimates are used to understand how much the team can deliver in a Sprint / Iteration. The capacity to deliver working code, called the Team Velocity, is determined by summing each Team's estimates of the User Stories that are Done in a Sprint / Iteration.
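As a minimal sketch of the velocity calculation described above, the following sums the points of Stories that reached Done in one Sprint. The `Story` type, its field names, and the sample stories are illustrative assumptions, not part of any particular tool.

```python
from dataclasses import dataclass


@dataclass
class Story:
    title: str   # hypothetical field names, chosen for illustration
    points: int  # the Team's relative size estimate
    done: bool   # True only if the Story met the definition of Done


def sprint_velocity(stories):
    """Sum the points of Done stories; unfinished work earns no credit."""
    return sum(s.points for s in stories if s.done)


sprint = [
    Story("Login page", 5, True),
    Story("Password reset", 3, True),
    Story("Audit log export", 8, False),  # not Done, so it is excluded
]

print(sprint_velocity(sprint))  # 8
```

Note that the unfinished 8-point Story contributes nothing, which keeps velocity a measure of completed work rather than effort spent.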
We come to rely on this information for a number of uses. For example, it helps us plan and understand how Teams are doing. A number of Teams have reported issues in both calculating and using Story Points. The following approaches were pulled together from two primary sources:
- We gathered up this information from the Teams. Review of the numbers raised a number of questions. The subsequent discussion highlighted both issues and potential solutions to those issues.
- A number of Teams reported that velocity in release sprints is different from velocity in production sprints, which means that historical velocity cannot easily be used to determine a likely release date, or that there is a difference in the size of a story when you compare defects with new work. Again, subsequent discussion of the issue with Scrum Masters highlighted issues and potential solutions.
These pages are a summary of what was discovered about the approaches Teams have used to improve their estimation process.
What is the Purpose of Estimation?
We need to remind ourselves why it is we estimate. Let's face it, estimation is mostly about planning and forecasting. There is a huge amount of baggage in most organizations associated with the process of estimation. There are also legitimate questions the business has:
- When will we get this capability?
- How much will it cost to get this capability?
- Are we making progress toward delivering this capability?
- What capabilities are coming up?
From the business perspective, the main reason for estimates is to provide data to the business (through the Product Owner / Manager) to understand, manage, and forecast plans. They are trying to make sure that, for a given capacity for the organization (enterprise, program, or team), we make trade-off decisions about how to best use that capacity based on the strategy we have and the need for self-investment, and then understand progress against these decisions. A secondary reason is to help the organization understand their capacity so they do not over-commit; so they can establish a sustainable pace.
Agile requires that we provide the business (through the Product Owner / Product Manager) with good enough data, and that we work to improve the estimates when they do not provide the data required. In other words, when we say “make estimation work”, what we mean is that the business can easily plan using the estimates and velocity and can make informed business decisions. If the business cannot make it work, it is up to the team / team-of-teams to help fix the problem.
The agile approach to estimation stresses speed, full team involvement, and information that is accurate enough for the purpose intended, not pure precision. This is aimed at reducing the burden of estimation, while still providing the data needed.
What Are the Characteristics of a Good Estimate?
Estimates need to reflect reality. Some basic characteristics of what we need from estimates include:
- The estimates consistently reflect the capacity of the Team to deliver.
- The estimates consistently reflect the size of the work no matter what kind of work it is. For example, it should not matter that we are estimating a new feature or defects.
- They are truly relative in that, on a Team by Team basis, an 8 is an 8, and an 8 is about 4 times a 2 no matter the source.
- Estimation data allows the use of ranges of estimates to help understand the risk profile of the work. A simple example of this is when the Product Owner uses best, average and worst velocities to understand what is likely to happen with their plan.
It is worth saying again:
The estimates consistently reflect the capacity of the Team to deliver.
They must be based on reality.
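The best, average, and worst velocity idea mentioned above can be sketched as a simple range forecast. The backlog size and velocity history below are made-up numbers, and rounding up to whole Sprints is an assumption of this sketch.

```python
import math


def sprints_needed(backlog_points, velocity):
    """Whole Sprints needed to burn down the backlog at a given velocity."""
    if velocity <= 0:
        raise ValueError("velocity must be positive")
    return math.ceil(backlog_points / velocity)


def forecast_range(backlog_points, past_velocities):
    """Forecast using the best, average, and worst observed velocities."""
    best = max(past_velocities)
    worst = min(past_velocities)
    average = sum(past_velocities) / len(past_velocities)
    return {
        "best": sprints_needed(backlog_points, best),
        "average": sprints_needed(backlog_points, average),
        "worst": sprints_needed(backlog_points, worst),
    }


history = [18, 22, 25, 20, 15]  # illustrative velocities from past Sprints
print(forecast_range(120, history))
# {'best': 5, 'average': 6, 'worst': 8}
```

The spread between the best and worst cases gives the Product Owner a rough view of the risk in the plan.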
What is the Basis of Our Estimating Process?
The basis of the estimating process is “Team based relative size estimates” or “Story Point estimation” to estimate User Stories (the need). This means:
- We do relative size estimates (how big something is in relation to something else), not duration estimates (how long something takes).
- We use a team based approach to estimation such as planning poker or affinity (or triangulation) estimation.
- We have the people doing the work (in other words “the Team”) do the estimates.
- We use the modified Fibonacci sequence (1, 2, 3, 5, 8, 13, 20, 40, 100) for estimating.
- All data we collect can and should be used to understand what and how we are doing so that we can get better.
- The estimation process is refined. Estimations are revisited to see if they are accurate and precise enough.
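To make the team based process above concrete, here is a rough sketch of checking one round of planning poker votes against the modified Fibonacci scale. The rule used here (agreement means everyone shows the same card, otherwise the lowest and highest voters explain their thinking before a re-vote) is a common convention and an assumption of this sketch. The names are hypothetical.

```python
SCALE = [1, 2, 3, 5, 8, 13, 20, 40, 100]  # modified Fibonacci sequence


def review_round(votes):
    """Check one round of revealed cards.

    votes maps a Team member's name to the card they showed.
    Returns the agreed estimate, or None plus who should explain.
    """
    invalid = [v for v in votes.values() if v not in SCALE]
    if invalid:
        raise ValueError(f"not on the scale: {invalid}")

    low, high = min(votes.values()), max(votes.values())
    if low == high:
        return {"estimate": low, "explain": []}

    # Assumed convention: the lowest and highest voters explain, then re-vote.
    outliers = sorted(name for name, v in votes.items() if v in (low, high))
    return {"estimate": None, "explain": outliers}


print(review_round({"Ann": 3, "Raj": 8, "Li": 5}))
# {'estimate': None, 'explain': ['Ann', 'Raj']}
print(review_round({"Ann": 5, "Raj": 5, "Li": 5}))
# {'estimate': 5, 'explain': []}
```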
Relative Estimates
If two work items are the same size, should they always have the same estimate?
It is worth delving into the notion of a relative size estimate for a moment. Most traditional estimates are based on duration (“it will take about 20 hours”), and that leads to a whole bunch of problems (see What Are The Problems with Estimation? for more information). But it is worse than that. If you have two pieces of work to do, both of which you think will take about 20 hours to complete, but the first piece of work is straightforward, while the second is in an area where there are traditionally a lot of problems, should these items have the same estimate? No, you'll probably want to put some level of buffer in place to offer a more realistic estimate for the second piece of work. This is another reason to move away from pure, absolute time based estimates.
One thing that often confuses people who are used to traditional task based estimates is that the resultant estimate is a Team's view of how big this item is. When we say the estimate is a “3”, we are really saying that the Team's view of this piece of work is that it is a “3”. In particular, it is not a single Team member's view. When starting, many Teams fall into the trap of saying “There are design, implementation, and testing components of this piece of work. George will do the design so what do you think the estimate is for the design piece … Jenny will implement, so what do you think the estimate for the implementation piece is …” and then just sum up the results. In Agile, the unit of execution is the “Team” and so the estimates indicate the size of work for the Team, not the individuals on the Team. Sure, the time to “design” and “implement” is part of that. But to deliver the value represented by the Story, the Team might, for example, want to pair implementation and testing as they do the design to improve the design. Or the Team might find that testing needs an “all hands on deck” approach to assure quality. What we are estimating is the Team's ability to deliver value.
For more information on the basis of the process, see Planning Poker for Estimates from Mike Cohn. Affinity mapping is based on the Play, Pass, or Move approach.
Want to Know More?
How Can We Scale Our Estimating Approach Beyond a Team?
Often, you need to provide estimates for items of work that are beyond the size of a typical Story, say an Epic or a Feature. For example, you might want to determine an overall road-map of intent and need to line that up with a calendar. This is only realistic if you have some kind of view of how long something will take. Or you might want to determine the Cost of Delay, which includes a sizing component in it, to determine what work you should schedule first to maximize the return.
The question therefore is “How do you go about getting size information for these bigger pieces of work?”
For the sake of discussion we are going to make a couple of assumptions:
- We are going to use the basic SAFe structure of Stories (less than two weeks), Features (less than a quarter), and Epics (more than a quarter). While the discussion is related to SAFe, there is nothing SAFe specific here. The scaled estimation problem is the same.
In general, there is a different focus in the “why we estimate” question as we scale to epics and features. At the team level you are interested in the capacity to take on work. At the program level the main use is to prioritize the work. At the portfolio level it's to help with overall budgeting. Pictorially this is represented as:
(Note: Thanks to Steve Sanchez for the graphic.)
There are two basic ways to scale estimates:
- Pure Feature and Epic points which are used as an estimate: With this approach you use a process similar to the one you would use with Story Points. In other words, you would use Team based relative size estimates on Features and Epics to generate “Feature Points” or “Epic Points”. You might choose to define a new scale so that it is clear you are talking about Feature Points or Epic Points. For example, Feature Points might be 10X a Story Point, with a sequence that looks like 10, 20, 30, 50, etc. In the same way that you track Team Velocity to determine how much you can get done, you can track closure of Features and Epics to reflect how much you can get done in these scaled environments.
  - Pros: In many ways, this approach is a "fractal" of the story point approach and benefits in a similar way.
  - Cons: The main downside of this approach is that it will take time to get useful (in terms of forecasting) data. Just like a Team has to wait an Iteration (a couple of weeks) to get their first data point, Features will take a quarter to get their first real data point and Epics will probably take longer still. Another weakness is that if one class of work is significantly different from the others, it is difficult to get consistent estimates and capacity based on feature velocity.
- Summated Story Points to create Feature and Epic estimates: Based on a common understanding of the size of a Story Point, the idea here is that Features are estimated in terms of the number of Story Points it would take to complete the work, and similarly for Epics. So, for example, as we estimate a Feature we might say “this Feature is about the same amount of work as this other 60 Point Feature, so we'll call this a 60 for the estimate.”
  - Pros: One benefit of this approach is that you can start using data immediately. If, for example, you want to start managing Epics at the portfolio level now, you can quickly create estimates and understand capacity even if you do not have all the teams in place. The data you will have, while not highly accurate, will be sufficient to do capacity calculations and prioritization (see How Do I Facilitate a Prioritization Meeting? for an idea on how to get started).
  - Cons: The main downside of this approach is that you need to have a common understanding of a Story Point. Many start with “a day equals a point”, with the result that this can quickly turn into a pure duration based estimate with all the problems that entails. Another problem occurs when one Team's velocity is significantly different (i.e. orders of magnitude different) from the other Teams' velocities. While each of the Teams can work with their own estimates and velocity, when you summate, very large numbers from one Team can completely swamp the other Teams' numbers, which makes it hard to understand the capacity of the multi-Team Train.
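The "swamping" problem with the summated approach is easy to see with numbers. The sketch below uses made-up Team names and velocities: two Teams estimate on a similar scale and a third uses a scale that is far larger. It also shows a Feature sized by summing its Story Points.

```python
def feature_size(story_points):
    """Summated approach: a Feature's size is the sum of its Story Points."""
    return sum(story_points)


def share_of_total(team_velocities):
    """Fraction of the combined velocity contributed by each Team."""
    total = sum(team_velocities.values())
    return {team: round(v / total, 2) for team, v in team_velocities.items()}


# Illustrative numbers only: Team C uses a much larger point scale.
velocities = {"Team A": 30, "Team B": 25, "Team C": 900}

print(feature_size([5, 8, 13, 8, 5, 21]))  # 60
print(share_of_total(velocities))
# {'Team A': 0.03, 'Team B': 0.03, 'Team C': 0.94}
```

Team C appears to supply almost all of the Train's capacity, even though this may only reflect a different point scale, which is why a common understanding of a Story Point matters for this approach.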
In practice, the two approaches really are not that far apart from each other. What you will find is that even with the summated approach, you will end up with Features in the 10s (i.e. 10, 20, etc.) and Epics in the 100s. The hard part is getting people to really do relative size estimates that include a view of risk and complexity at all levels. As a general note, SAFe starts with the “Summated Story Point” approach, but it is not that specific over the long haul. In fact SAFe assumes a relative size approach (not duration) as it says “Start with day is a point then never look back”.
What this means is that in most places I have worked, organizations end up with the Pure Feature and Epic Points approach, where Feature Points are the Fibonacci numbers times 10 (10, 20, 30, …, 130) and Epic Points are the Fibonacci numbers times 100 (100, 200, 300, …, 1300). The other thing that organizations do is limit the highest number to 13 or 20 (no 100s, etc.), with the idea that this encourages people to split the work up if it gets to this level. So if there is a Feature Point estimate of “this is more than 130”, the discussion is “Perhaps this is an Epic? Or perhaps we need to split the work so that it will fit in a quarter.” This is a good discussion to have.
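The scaled sequences and the "more than 130" discussion above can be sketched as follows. Rounding a raw number up to the next value on the scale is an assumption of this sketch (a Team would normally pick the value directly), and the function name is hypothetical.

```python
BASE = [1, 2, 3, 5, 8, 13]                # scale capped at 13, as described above

FEATURE_SCALE = [n * 10 for n in BASE]    # 10, 20, 30, 50, 80, 130
EPIC_SCALE = [n * 100 for n in BASE]      # 100, 200, 300, 500, 800, 1300


def place_feature(raw_estimate):
    """Return the Feature Point value for a raw estimate.

    Returns None when the estimate is above the top of the scale, which
    is the cue to ask whether this is an Epic or should be split.
    """
    if raw_estimate <= 0:
        raise ValueError("estimate must be positive")
    if raw_estimate > FEATURE_SCALE[-1]:
        return None
    # Assumption: round up to the next value on the scale.
    return min(v for v in FEATURE_SCALE if v >= raw_estimate)


print(FEATURE_SCALE)       # [10, 20, 30, 50, 80, 130]
print(place_feature(60))   # 80
print(place_feature(200))  # None
```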
One final note on this. Many organizations I've worked with like to abstract estimation one step further by using t-shirt sizes for estimates. For many it is easy to say “in comparison to this small piece of work, this is a large”; it helps because, since there are no numbers, you don't think about time. Once they have the t-shirt size, organizations usually settle on a mapping between these t-shirt sizes and Feature or Epic points. The following diagram shows a sample mapping that might be put in place:
Note that the mapping and the numbers would be validated to ensure that there is in fact a meaningful mapping between a feature we call “small” and the actual story points needed to complete that feature.
One more note. Some organizations I’ve worked with step back from an either/or approach and work a both/and approach. When little is known about the Feature, say when it is still being analyzed, they use Feature Point scaling based on S, M, L t-shirt sizing. They then equate the t-shirt size to numbers, so an S might be 3 Feature Points, an M might be 8, and so on. Then, as they learn more and understand the kind of work they have, they move to a summated story point approach to Feature size; a second estimate, if you like.
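A t-shirt mapping and a basic validation of it might look like the sketch below. The S and M values come from the example above; the L value, the history data, and the check itself (average actual Story Points should rise with t-shirt size) are assumptions made for illustration.

```python
from statistics import mean

# S and M follow the example in the text; L = 13 is an assumed value.
TSHIRT_TO_FEATURE_POINTS = {"S": 3, "M": 8, "L": 13}
SIZE_ORDER = ("S", "M", "L")


def average_actuals(completed):
    """Average actual Story Points per t-shirt size.

    completed is a list of (tshirt_size, actual_story_points) pairs
    for Features that have been finished.
    """
    by_size = {}
    for size, points in completed:
        by_size.setdefault(size, []).append(points)
    return {size: mean(points) for size, points in by_size.items()}


def mapping_is_meaningful(completed):
    """True if bigger t-shirt sizes really did take more Story Points."""
    averages = average_actuals(completed)
    ordered = [averages[s] for s in SIZE_ORDER if s in averages]
    return all(a < b for a, b in zip(ordered, ordered[1:]))


# Illustrative history of completed Features.
history = [("S", 28), ("S", 35), ("M", 70), ("M", 90), ("L", 150)]

print(average_actuals(history))        # {'S': 31.5, 'M': 80, 'L': 150}
print(mapping_is_meaningful(history))  # True
```

If the check fails, for example because "small" Features routinely take more Story Points than "medium" ones, that is a signal to revisit the mapping or the sizing conversation.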
Want to Know More?
- Why the Fibonacci Sequence Works for Estimating - Weber’s Law approach from Mike Cohn
- Why Progressive Estimation Scale is So Efficient - Information theory approach from Alex Yakyma
What Estimation Practices Can We Try?
Here is a list, in no particular order, of things that someone has tried or suggested as a way of improving something about the estimation process: