## Ordering the product backlog

Several posts in this blog discuss the order in which new features should be implemented. In this post I try to summarize some of my thinking so far. The following terminology will be used in this post:

• New proposed features are described in “change requests” that are in effect small documents or records in a database describing various aspects of the proposed feature.
• To realize a change request a number of “tasks” need to be completed. Some tasks are related directly to a change request whereas other tasks are more “global” (e.g. system testing).
• Change requests are organized in an ordered list, the “product backlog”, with the change requests to be completed first highest up in the list. Change requests are always picked from the top of the list.
• For practical purposes, the tasks (task descriptors) that are derived from the change requests are also stored in the product backlog.

Where in the product backlog a new change request and its associated tasks should be put (and thus when it is going to be realized) depends on several things:

• The additional expected income we will get from the new feature. This is, among other things, a function of the customer benefit of the new feature and the certainty that we will be able to deliver the feature. We wish to deliver high-value features first, everything else being equal.
• The expected cost of developing the new feature. This depends on a large number of things such as the novelty of the technology and the skills of the developers. We want to deliver features that are inexpensive to realize first, everything else being equal.
• The level of uncertainty of successful realization or attractiveness of the new feature.
• The dependencies among the features. Several functional features may, for instance, depend on our being able to achieve sufficient performance on the given hardware platform.

Let’s consider three different development scenarios:

#### “Web site”

We own a web site to which we add features more or less continuously from a potentially long product backlog. We have a dedicated team that implements and releases new features in an ongoing process where new change requests come in regularly and new features are released incrementally as they become available. The new features are mostly independent from each other and carry low uncertainties. A faulty or useless new feature can easily be removed from the web site without affecting other features.

In this scenario change requests in the product backlog should be ordered strictly based on their estimated income / cost ratio; inexpensive features which bring in a lot of money should be realized first. Since uncertainties are assumed to be low, they can be largely ignored. Also, with low uncertainty, we don’t really need the overhead of a project organization with all the planning, tracking and risk management. A Kanban-style development process is quite sufficient. Since there are no dependencies, it is sufficient to look at each change request and compare its income / cost with that of all the other change requests. See also this post and this post.
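The ordering rule for this scenario can be sketched in a few lines of Python. The `ChangeRequest` class, its field names, and the sample figures are illustrative assumptions, not data from a real backlog.

```python
from dataclasses import dataclass


@dataclass
class ChangeRequest:
    """Hypothetical backlog item; names and numbers are made up."""
    name: str
    income: float  # estimated additional income
    cost: float    # estimated development cost (must be > 0)


def order_backlog(requests):
    """Return the requests sorted by income / cost ratio, highest first."""
    return sorted(requests, key=lambda r: r.income / r.cost, reverse=True)


backlog = [
    ChangeRequest("export to PDF", income=50.0, cost=25.0),    # ratio 2.0
    ChangeRequest("single sign-on", income=120.0, cost=30.0),  # ratio 4.0
    ChangeRequest("dark mode", income=15.0, cost=10.0),        # ratio 1.5
]

for request in order_backlog(backlog):
    print(f"{request.name}: {request.income / request.cost:.2f}")
```

Because the features are assumed to be independent, a plain sort is enough; no dependency handling is needed.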

#### “New version of an embedded system”

We develop an embedded system for, say, medical imaging modalities in a series of projects, adding new features in each project. The product has existed for several years and has a long product backlog. A new project is started based on some signal from the market or based on a predetermined schedule. A new project usually has a “theme” that binds together the change requests. The new features are mostly independent, but there may be a few features that are critical to the success of the rest. One example is performance: if we discover that the available hardware resources are not sufficient, then we need to scale back on some of the other features.

This case is similar to the first one except that instead of delivering features in a continuous stream, we deliver them in batches, produced in projects. Uncertainty is assumed to be higher, so we need to consider uncertainty-weighted expected values of the income and the cost of each feature before ordering the features by their income / cost ratio.
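One way to read “uncertainty-weighted expected values” is as probability-weighted averages over a few scenarios per feature. The sketch below assumes exactly that; the scenario representation, the feature names, and all numbers are hypothetical.

```python
def expected_value(scenarios):
    """Probability-weighted average of (probability, amount) pairs."""
    total_probability = sum(p for p, _ in scenarios)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("scenario probabilities must sum to 1")
    return sum(p * amount for p, amount in scenarios)


# Hypothetical features: (name, income scenarios, cost scenarios).
features = [
    ("faster reconstruction", [(0.5, 400.0), (0.5, 0.0)], [(0.7, 100.0), (0.3, 200.0)]),
    ("new export format", [(0.9, 100.0), (0.1, 50.0)], [(1.0, 40.0)]),
]


def ratio(feature):
    """Expected income divided by expected cost for one feature."""
    _, income_scenarios, cost_scenarios = feature
    return expected_value(income_scenarios) / expected_value(cost_scenarios)


ordered = sorted(features, key=ratio, reverse=True)
for feature in ordered:
    print(feature[0], round(ratio(feature), 2))
```

In this made-up example the feature with the larger possible income ends up second, because its income is uncertain and its cost may overrun.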

Ideally, independent high-uncertainty features should in this scenario be evaluated outside the regular product development stream in a research project, concept development project or similar, so that the few high-uncertainty features don’t stall the whole project. High-uncertainty features that are necessary for some other, low-uncertainty features, on the other hand, need to be addressed early in the project; there is no use in developing a number of new features if we discover at the end that we can’t get, for instance, real-time video performance when this is a must-have requirement. The project tasks therefore need to be ordered so that we bring down uncertainty as fast (and as inexpensively) as possible. See also this post and this post.

#### “Innovation”

We develop a totally new and innovative product based on new technology or new science in general. The product backlog only covers the features for the first project. There are several make-or-break uncertainties within the selected set of features regarding the technology, the market, or perhaps some other area. This means that there is a significant risk that the project will fail.

In this case we assume that all or most features in the product backlog are required for the product to have any value at all (as this is the first version of the product); all features depend on all other features. Selecting change requests for the project is therefore relatively straightforward in this scenario. Instead we need to focus on the order in which we realize the features within the project so as to minimize the expected project cost. We need to order the project tasks so that tasks that give a large degree of uncertainty reduction per unit of cost come first in the project; if we fail, it is better to fail early than to fail late. See also this post.

## Optimizing the value of a project using Stage-Gate

In an earlier post I wrote about risk-driven development. The idea that I proposed was to address project risks starting with the biggest risk first (to “fail early” if you have to fail). In this post I will try to elaborate on that rather vague statement and prove that a strategy along these lines indeed maximizes the value of the project.

The traditional way to evaluate projects is by using discounted cash flow valuation (DCF). The model assumes that you make a big one-off investment and then get a positive cash flow from that investment. This model works fine if we are investing in, say, a new paper machine, provided that we can foresee the future cash flow attributable to the new machine. The risk of the investment is weighted in by discounting the future cash flow with a discount factor that is a function of the risk level.

In new product development projects that use the Stage-Gate model we get to make incremental decisions and thus do the investment piecewise while learning along the way. (It’s not as easy to see how one can learn from investing in, say, a tenth of a paper machine.) If the market changes or we realize that we can’t overcome a technical challenge, then we can abort the whole project, minimizing our losses before we’ve spent all our money. This possibility to make incremental decisions increases the expected value of the project by decreasing the expected cost. This is the principle behind real options valuation.

Another way to see Stage-Gate is as a series of consecutive go/no-go experiments. Each successful experiment takes us one step closer to the full product. If the experiment fails then we abort the project. All experiments must succeed in order for the whole project to succeed.

Let’s look more closely at the stages in a Stage-Gate (project) process: We do some work in each stage and based on the results we either abort the project with probability $1 - p_1$ or continue the project with probability $p_1$.

We abort the project if a fatal (for the project) risk has been realized during the stage. The probability to abort is therefore here equal to the probability of a fatal risk being realized.

The question discussed in many of the posts on this blog is: in what order should we do the experiments in order to maximize the value of the project? For this we need to introduce the concept of a decision tree and some associated entities.

 A number of consecutive experiments represented as a decision tree.

In the decision tree we have a number of “events” depicted as circles. These represent our experiments. Each experiment has a cost of $c_i$, a probability of succeeding of $p_i$, and a probability of failing of $1 - p_i$. The cost of failing an experiment is $C_i$, where $C_i$ and $c_i > 0$. Also

$C_i = \sum_{n=1}^{i} c_n$

which means that the cost of failing the project at the i:th experiment (by failing the i:th experiment) is the accumulated cost of all experiments up to and including the i:th.

We can “unfold” the value of the project step by step. Let’s look at the value $V_1$ of the project before the first experiment. It is simply the probability-weighted average of the value of the two branches.

$V_1 = (1 - p_1)(-C_1) + p_1(-C_1 + V_2) = -C_1 + p_1 V_2$

If the first experiment fails we will have a negative value $V = -C_1 = -c_1$, the cost of the first experiment. Otherwise we get whatever comes down the other branch, which is $V_2$ less the cost $C_1$.

$V_2$, $V_3$, and $V_4$ can be written in the same format.

$V_2 = -C_2 + p_2 V_3$

$V_3 = -C_3 + p_3 V_4$

$V_4 = -C_4 + p_4 I$

$V_4$ is where it gets a little more interesting, as it is here we actually have an opportunity to get some income $I$.
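The recursion can be written down directly as a small Python function. This is a sketch of the post’s own model, with $C_i$ taken as the accumulated cost exactly as defined above; the function name and the sample income of 1000 are assumptions made for illustration.

```python
def project_value(costs, probabilities, income):
    """Value V_1 of a chain of go/no-go experiments, following the recursion.

    costs[i] is the cost c_i of experiment i, probabilities[i] its probability
    of success p_i, and income the payoff I if every experiment succeeds.
    """
    def value_from(i, accumulated):
        if i == len(costs):
            return income                # all experiments succeeded
        accumulated += costs[i]          # C_i = c_1 + ... + c_i
        # V_i = -C_i + p_i * V_(i+1)
        return -accumulated + probabilities[i] * value_from(i + 1, accumulated)

    return value_from(0, 0.0)


# Hypothetical numbers: four experiments and an income of 1000.
print(project_value([20, 30, 40, 20], [0.4, 0.6, 0.8, 0.9], 1000.0))
```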

Untangling the recursion we get

$V = V_1 = -C_1 - p_1 C_2 - p_1 p_2 C_3 - p_1 p_2 p_3 C_4 + p_1 p_2 p_3 p_4 I$

The income $I$ is multiplied by all probabilities, so for the income the order of the experiments doesn’t matter. Maximizing the value with respect to the order of the experiments is therefore equivalent to minimizing the cost (remember that all costs in the expressions here have positive values). So we need to minimize

$C = C_1 + p_1 C_2 + p_1 p_2 C_3 + p_1 p_2 p_3 C_4$

with respect to the order of the experiments with costs $c_j$ and associated probabilities for success $p_j$. It is also easy to guess from the above what the expression for the cost is with an arbitrary number of experiments. I choose intuition before induction for now though and will not try to prove it.
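For reference, the natural guess for $n$ experiments, extrapolated from the four-experiment pattern rather than proven, would be

$C = \sum_{i=1}^{n} \left( \prod_{j=1}^{i-1} p_j \right) C_i, \qquad C_i = \sum_{k=1}^{i} c_k$

where the empty product for $i = 1$ is taken to be 1. For $n = 4$ this reduces to the four-term expression for $C$ given above.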

What we want is a rule or a set of rules for sorting the experiments so as to minimize the expected cost. Let’s first assume that the order of the experiments $E_i$ as shown in the figure above minimizes the total cost $C$. Any permutation of the experiments would therefore increase the cost. From this we can deduce how the $c_i$ and $p_i$ must relate to each other.

Now trade places between the first and the second experiment. This should (by definition) give a higher expected cost. Expanding all $C_i$ into their constituents and setting up the inequality we get

$c_1 + p_1(c_1 + c_2) + p_1 p_2(c_1 + c_2 + c_3) + p_1 p_2 p_3(c_1 + c_2 + c_3 + c_4) < c_2 + p_2(c_2 + c_1) + p_2 p_1(c_2 + c_1 + c_3) + p_2 p_1 p_3(c_2 + c_1 + c_3 + c_4)$

After some juggling around we finally get

$c_1 + p_1(c_1 + c_2) < c_2 + p_2(c_1 + c_2)$

Switching any two adjacent experiments gives similar (but not entirely the same) inequalities

$c_2 + p_2(c_1 + c_2 + c_3) < c_3 + p_3(c_1 + c_2 + c_3)$

and

$c_3 + p_3(c_1 + c_2 + c_3 + c_4) < c_4 + p_4(c_1 + c_2 + c_3 + c_4)$

As long as all inequalities above are true, we will increase the cost by reversing the order of two adjacent experiments. I have not managed to prove that the pair-wise inequalities are a sufficient condition for a global minimum. Switching the first and the third experiment would for instance give the inequality

$c_1 + p_1(c_1 + c_2) + p_1 p_2(c_1 + c_2 + c_3) < c_3 + p_3(c_2 + c_3) + p_2 p_3(c_1 + c_2 + c_3)$

which doesn’t necessarily follow from the pair-wise inequalities above it. It also remains to do the math for an arbitrary number of experiments, but that seems like the easier of the two remaining issues.
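The adjacent-swap conditions all share one pattern: for neighbouring experiments $k$ and $k+1$, $c_k + p_k C_{k+1} < c_{k+1} + p_{k+1} C_{k+1}$. A minimal Python sketch of that check follows; the function name is an assumption, and passing the check says nothing about non-adjacent swaps.

```python
from itertools import accumulate


def adjacent_swaps_all_worse(costs, probabilities):
    """Check c_k + p_k * C_(k+1) < c_(k+1) + p_(k+1) * C_(k+1) for each adjacent pair.

    costs and probabilities are given in the proposed execution order.
    """
    cumulative = list(accumulate(costs))  # cumulative[k] = costs[0] + ... + costs[k]
    return all(
        costs[k] + probabilities[k] * cumulative[k + 1]
        < costs[k + 1] + probabilities[k + 1] * cumulative[k + 1]
        for k in range(len(costs) - 1)
    )


# Hypothetical three-experiment order: cheap and uncertain first.
print(adjacent_swaps_all_worse([10, 20, 30], [0.3, 0.5, 0.9]))
```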

The expressions in the inequalities are easy enough to put in a spreadsheet to get a simple tool for ordering a number of experiments though. I did just that, and the spreadsheet simulation shows that the conditions above are a predictor for a global minimum with the admittedly small number of experiments I have carried out. I therefore still dare to postulate that we wish to have a small $c_i$ in some way combined with a small $p_i$ in early experiments. Remember that $p_i$ is the probability of succeeding with the experiment. A small probability of success means a large probability of failure, which means that we should do the uncertain and cheap experiments to start with.

The spreadsheet simulation I did gives, for instance, that if we have a series of four experiments with costs 20, 30, 40, and 20 and corresponding probabilities for success of 0.4, 0.6, 0.8, and 0.9, then we should do the experiments in the order 1, 2, 4, 3, whereby we get an expected cost of 80.56. The sum of the costs of all experiments is 110, so by doing the experiments one at a time and aborting on failure we can bring down our expected cost by 27%. With many other random ways to order the experiments we will only decrease our expected cost by a few percent.
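The spreadsheet itself is not part of the post, so the following Python sketch is a stand-in rather than the original tool. It evaluates the cost expression $C = C_1 + p_1 C_2 + p_1 p_2 C_3 + p_1 p_2 p_3 C_4$, with accumulated costs $C_i$, for every possible order of the four experiments and picks the cheapest. The function and variable names are assumptions.

```python
from itertools import permutations


def expected_cost(costs, probabilities):
    """C = C_1 + p_1 C_2 + p_1 p_2 C_3 + ..., with C_i the accumulated cost."""
    total, accumulated, reach = 0.0, 0.0, 1.0
    for c, p in zip(costs, probabilities):
        accumulated += c              # C_i
        total += reach * accumulated  # weighted by p_1 * ... * p_(i-1)
        reach *= p
    return total


costs = [20, 30, 40, 20]
probabilities = [0.4, 0.6, 0.8, 0.9]


def cost_of(order):
    """Expected cost when the experiments are run in the given 0-based order."""
    return expected_cost([costs[i] for i in order], [probabilities[i] for i in order])


best_order = min(permutations(range(len(costs))), key=cost_of)
best_cost = cost_of(best_order)
saving = 1 - best_cost / sum(costs)

print([i + 1 for i in best_order], round(best_cost, 2), f"{saving:.0%}")
```

Exhaustive search over all orders reproduces the order 1, 2, 4, 3 and the expected cost of 80.56 quoted above. It is only practical for a handful of experiments, since the number of orders grows factorially.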

In conclusion: the riskier the project, the more we will gain (a) by using some kind of Stage-Gate model with a decision to continue or to abort after each experiment (or group of experiments) and (b) by ordering the experiments with those that give most uncertainty reduction for the money in the beginning.

When I started this post I was hoping that either the proof would be pretty easy (there is after all no esoteric mathematics involved) or that it would fall into a class of well-known problems such as a shortest path or a traveling salesman that already have solutions. But so far, no luck. I will keep on looking and if you, dear reader, have some ideas, please let me know. Until then, I’m going to trust my hunch and my incomplete proof.

## I wasn’t first – this time either

Having Googled around a little bit more I realize that what I wrote two posts down wasn’t exactly new thinking. Similar ideas were described by Robert G. Cooper in this article. I didn’t read the paper before I wrote my post, I swear 🙂

Even if I didn’t earn the Nobel Prize in management this time either, I’m happy to see my ideas corroborated.

## The discovery backlog

I have realized that engineers use words differently from other people. When an engineer says “problem” he or she often doesn’t mean anything negative (except in “Houston, we have a problem”). Problems are engineers’ raison d’être; engineers thrive on solving problems. When the problems get tough, the tough engineers get going.

The same goes for the word “risk”. We have “risk lists” in our projects. We do “risk mitigation”. There are entire companies filled with brilliant engineers doing nothing but “risk management”.

Using the words “problem” and “risk” in some other contexts, like with the sales team, may not always be a good idea though. The lone engineer may come across as a downer, an overly pessimistic person who’s not willing to “see the opportunities instead of the problems” (a popular cliché, at least in Sweden).

So I realize I need a better word than the “risk backlog” I just invented in my previous post. What about “discovery backlog”? We don’t have to call the items “risks”; they are just things that we currently don’t know, like whether anybody is going to buy our product or whether the quantum drive will really work as intended. Sooner or later we need to discover those things. I can’t really wrap my brain around “opportunity backlog”.

## Risk-driven development

Several project management models include provisions to manage risk. Risk is here defined as the probability of an adverse event times the quantified consequence of that adverse event. The IBM Rational Unified Process recommends addressing risk while planning the iterations of what in RUP is called the Elaboration phase. Barry Boehm’s Spiral Model is guided by risk considerations. So are the various versions of the Stage-Gate model. The Scrum literature, while mentioning risk as one of the prioritization principles for the product backlog, leaves it mostly to the judgment of the product owner to make a good prioritization.

We can intuitively understand that creating something entirely novel such as a car that runs 10 000 km without refueling is more risky than developing next year’s model of an existing car with only some cosmetic changes. The risk in new product development is usually not evenly distributed on all tasks in the development project. Developing the engine of the ultra-long-range car (ULRC) carries far more risk than developing the entertainment system or the suspension.

Risk-driven development means that we want to eliminate as much risk as we can, as fast as possible, in any way possible; we don’t want to end up having invested a large amount of money and reputation in a project that after all that investment still has a high probability of failure. We also have to take into account the opportunity cost, the gain we would have got if we had invested the money in another project.

As an illustration, assume that the biggest uncertainty in a project (like the ULRC engine) is left as the last component to be developed. We would then end up having invested a lot of money in the project while still not knowing whether the product will ever work. The cost of the risk being realized would be the opportunity cost plus the total accrued project cost up to the time of the ultimate failure.

We can also look at it from a capital budgeting point of view. When selecting investment targets, we always wish to match return and risk. For a particular level of risk we expect a certain level of (expected) return. Assuming that the income from the project is fixed (as long as it succeeds), then the risk level at which we invest our next unit of money in the project should be guiding our willingness to make that investment; the lower the risk, the more attractive the investment. I will try to elaborate on this in later posts.

When developing a ULRC it is thus probably not wise to start by specifying and designing the entertainment system or the suspension. Neither does a comprehensive and approved requirements specification help much to lower the risk in this particular case. The only novel requirement may be the 10 000 km range, and that’s easy enough to understand and to write down. Instead we should, as already hinted above, focus on designing and building prototypes of the long-range engine and its related parts.

There are of course variations to the risk-driven development theme. In some cases we need to build some low-risk parts first to be able to even start with the high-risk parts. For instance, we may need to build the rest of the powertrain or at least a test bench simulating the rest of the powertrain to be able to carry out tests with the new engine.

One framework for risk-driven development is, as mentioned in the introduction, the Stage-Gate process consisting of phases (stages) and tollgates. The tollgates are decision points at which the future execution of the project is decided based on the project’s risk level so far. If we at a certain tollgate think the risk is too high for a substantial new investment, e.g. for ramping up development or starting an expensive marketing campaign, then we need to find ways to lower the risk further before we make the additional investment. If we can’t find such ways, then we may need to abort the project altogether.

A problem with the Stage-Gate model is that it is often confused with a waterfall development model which, for example, mandates that the product requirements are developed, and preferably frozen and approved, at the beginning of the project. Indeed, in many quality management systems the tollgate criteria are defined in terms of produced documents and those criteria are the same for all projects.

The Scrum process doesn’t have formal tollgates. All development in Scrum is made in sprints (similar to iterations). The progress of the project is checked after each sprint and adjustments are made to both the plan and the process as needed. Scrum does not mandate any particular order in which the product should be developed but recommends that potentially shippable product increments are delivered as a result of each sprint. (This usually works for software but maybe not for a car.)

To conclude, here are a couple of ideas that should make the Scrum and the Stage-Gate processes more effective together:

• Rename the risk list that exists in most project models to risk backlog and think of it in the same way as the product backlog in Scrum. This implies an order in which the risks shall be addressed and should be used to plan the project (iterations, sprints, whatever). Risk-driven activities include developing functionality, interviewing customers, building prototypes, doing analyses, and so on.
• Use the risk backlog as the main input to the tollgate decision criteria in the Stage-Gate model. The tollgate criteria should be allowed to vary from project to project and should address the biggest remaining risks in the project (including risks such as there being no market for the product we are developing). The fixed lists of documents that are often used as tollgate criteria do not fit every project since they do not match the risk profile of every project. It is after all risk that we wish to assess at the tollgate, and the risk backlog, including any more detailed material on each risk, is the main indicator of project risk.
• Synchronize any gate decision with the end of a sprint and make sure that whatever is required for the gate decision is produced in the last sprint(s).
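As a rough sketch of what a risk backlog could look like in code: the `Risk` class, the idea of sorting by probability times consequence, the tollgate threshold, and all sample figures are assumptions made for illustration; the post does not prescribe a specific ordering criterion.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """Hypothetical risk backlog item."""
    description: str
    probability: float   # probability that the adverse event occurs
    consequence: float   # quantified consequence, e.g. in money

    @property
    def exposure(self):
        # Risk as defined earlier in the post: probability times consequence.
        return self.probability * self.consequence


def order_risk_backlog(risks):
    """Largest exposure first; this ordering criterion is an assumption."""
    return sorted(risks, key=lambda r: r.exposure, reverse=True)


def risks_for_tollgate(risks, threshold):
    """Risks a tollgate review might focus on (the threshold is hypothetical)."""
    return [r for r in order_risk_backlog(risks) if r.exposure > threshold]


risk_backlog = [
    Risk("supplier delivers late", probability=0.3, consequence=100.0),
    Risk("engine does not reach target range", probability=0.5, consequence=1000.0),
    Risk("no market for the product", probability=0.2, consequence=2000.0),
]

for risk in risks_for_tollgate(risk_backlog, threshold=100.0):
    print(risk.description)
```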

## Understanding and misunderstanding the Stage-Gate model

Many project management models are based on Robert Cooper’s original Stage-Gate model [1][2]. My experience is that it is often misunderstood. Two common misunderstandings are:

• That the stages in the Stage-Gate model imply a waterfall development process in which development activities are mapped onto the stages and performed in a strict sequence.
• That a Gate is some sort of project impediment or, marginally better, any old milestone.

I will below suggest alternative perspectives. But first some background.

### Risk and reward

The goal of most project management actions should be to maximize the value of the project. Net present value (NPV) is one way to measure the (expected) value of an investment such as a project [3].

The NPV of a project is the additional profit from the project as compared to other projects or other investments with the same risk level. A project with zero NPV is comparable with the “average” project or other investment with the same risk. Projects with NPVs greater than or equal to zero are thus worth pursuing. NPV can be seen as the ultimate result of the business case for the project. NPV is calculated as follows:

$\mathrm{NPV} = \sum_{t} \frac{C_t}{D_t}$

Here $C_t$ is the expected cash flow for the project at time $t$ and $D_t$ is a discount factor at time $t$. $D_0$ equals 1 and $D_t$ is always less than $D_{t+1}$. A negative cash flow is a cost, a positive cash flow is an income.

The empirical reason for discounting future cash flows is twofold: (1) one unit of money now is worth more than one unit of money tomorrow (money today can be invested to get more money tomorrow) and (2) certain (risk-free) money is worth more than uncertain (risky) money. The discount factor accounts for both these facts. A large $D_t$, due to high risk, will result in a lower NPV and therefore a less attractive project.
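A small Python sketch shows the effect of the discount factor. It assumes that each cash flow is divided by its discount factor and that $D_t = (1 + r)^t$ for a constant rate $r$, which satisfies $D_0 = 1$ and $D_t < D_{t+1}$ for $r > 0$; the rates and cash flows are made up.

```python
def npv(cash_flows, rate):
    """Net present value with D_t = (1 + rate) ** t (an assumed form of D_t).

    cash_flows[t] is the expected cash flow at time t; costs are negative.
    """
    return sum(c / (1 + rate) ** t for t, c in enumerate(cash_flows))


# Hypothetical project: invest 100 now, receive 60 in each of the next two periods.
flows = [-100.0, 60.0, 60.0]

print(round(npv(flows, 0.05), 2))  # lower-risk discount rate
print(round(npv(flows, 0.15), 2))  # higher-risk discount rate
```

With these numbers the same cash flows give a positive NPV at the lower rate and a negative NPV at the higher one, which is the sense in which higher risk makes a project less attractive.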

The NPV formula is really your best guess at any given point in time and subject to decisions and unforeseen events. It will therefore inevitably change as new facts unfold. It also assumes that you hold a whole portfolio of investments (such as projects) so that all risk that can be diversified away is eliminated. In a company setting you most likely don’t have that option. There is the department budget that you need to adhere to or run the risk of losing your bonus. And try to explain to your boss that you wish to start another 29 projects to diversify your risks. So you had better deal with that project-specific risk too! Despite its limitations, we can still learn a couple of things from the NPV formula:

• Cash sooner is better than cash later, so a short project execution time is most often a good thing. In addition to the discount factor effect, customer preferences are likely to change during a long project execution time, which may make your product obsolete even before it is released.
• Other things being equal, a risky project is worse than a risk-free project. Most managers prefer a €10 000 income with 100% certainty to a €10 000 000 income with 0.1% certainty. (And the Mediterranean debt being what it is they will soon enough prefer \$10 000 to €10 000.)
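To make the comparison in the second bullet concrete, note that the two alternatives have the same expected income (in €):

$E[\text{income}] = 1.00 \times 10\,000 = 10\,000$

$E[\text{income}] = 0.001 \times 10\,000\,000 = 10\,000$

The preference for the first alternative is therefore purely a preference for lower risk, not for a higher expected value.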

In [1] Cooper also explains:

• The higher the amount at stake, the lower the tolerable risk. We thus need to lower the project risk level at the same pace as, or faster than, we increase the amount at stake (investment). We don’t want to find out that the cold fusion drive that we bet our project on doesn’t really work after having designed the rest of the spaceship and built the first prototype. Instead we should buy an option by spending some money early on to make sure that the cold fusion drive will work as intended.
 Any old milestone. Or is it?

We thus need to control both the projected cash flow (the NPV calculation) and the risks throughout the project, as part of the project management activities, and try to get rid of the major risks as early as possible to make those later (and probably larger) investments as risk-free as possible.

The gates of the Stage-Gate model are good points in time to reassess the cash flow and the risk level, i.e. the business case. Ideally this should be done continuously but for practical reasons we have to settle for a few gates at strategic points in time.

Back to the misunderstandings at the beginning of this post:

### Stage-Gate is no waterfall

It is easy to map the phases of a Stage-Gate model onto the phases of e.g. a waterfall system development process. The first Stage-Gate phase would be mapped onto the requirements analysis phase, the next onto a design phase and so on. From this sort of mapping it is then easy to arrive at a simplistic interpretation of the gates too. A condition to pass the first gate would mean that “all the requirements are frozen” etc.

Now, since we want to get rid of risk as early as possible, focusing on collecting every single requirement at the beginning of the project may not be entirely the right thing to do. Most of the early risk does have to do with requirements, but not all of it. We may already know most of the requirements even though they aren’t written down yet, and there may instead be a major uncertainty about the feasibility of a certain technical solution (think cold fusion drive), about a new supplier partnership or some other issue not related to the requirements. These risks must then also be mitigated early on, perhaps before writing down all the requirements and freezing them.

An approach in which all requirements must be frozen at a certain gate is thus often not optimal from a risk reduction point of view.

### A Gate is not any old milestone

A Gate is an occasion when the NPV calculation and the risk estimates should be reiterated. Project cost estimates may have changed, the market demand (~ income) estimates may have changed, and the risk estimates may have changed. We may in particular not have reached the risk level we were targeting at the particular point in time. If not, then some further risk reduction activities are needed before committing more money, especially if the next phase is an expensive one.

A Gate only used as a regular milestone without reassessment of the business case including the risk level is a waste of time for the project steering committee and the other participants of the Gate meeting. Regular milestones can and should be tracked by the project manager.

* * *

The Stage-Gate model is a great tool for managing projects if used right. Used in the wrong way it adds bureaucracy without giving the expected benefits.