## Optimizing the value of a project using Stage-Gate

In an earlier post I wrote about risk-driven development. The idea that I proposed was to address project risks starting with the biggest risk first (to “fail early” if you have to fail). In this post I will try to elaborate on that rather vague statement and prove that a strategy along these lines indeed maximizes the value of the project.

The traditional way to evaluate projects is by using discounted cash flow valuation (DCF). The model assumes that you make a big one-off investment and then get a positive cash flow from that investment. This model works fine if we are investing in, say, a new paper machine, provided that we can foresee the future cash flow attributable to the new machine. The risk of the investment is weighed in by discounting the future cash flow with a discount factor that is a function of the risk level.
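To make the DCF idea concrete, here is a minimal sketch of a net present value calculation. The investment of 100, the five yearly cash flows of 30, and the two discount rates are invented for illustration; the only point is that a higher risk-adjusted discount rate lowers the value of the same cash flow.

```python
def net_present_value(investment, cash_flows, discount_rate):
    """NPV of a one-off investment followed by yearly cash flows."""
    discounted = sum(
        cash_flow / (1 + discount_rate) ** year
        for year, cash_flow in enumerate(cash_flows, start=1)
    )
    return discounted - investment


# Hypothetical numbers: invest 100 now, receive 30 per year for five years.
cash_flows = [30, 30, 30, 30, 30]
low_risk = net_present_value(100, cash_flows, 0.05)
high_risk = net_present_value(100, cash_flows, 0.15)

print(round(low_risk, 2), round(high_risk, 2))
```

With these assumed numbers the investment is clearly worthwhile at a 5% discount rate but only barely breaks even at 15%.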

In new product development projects that use the Stage-Gate model we get to make incremental decisions and thus make the investment piecewise while learning along the way. (It’s not as easy to see how one can learn from investing in, say, a tenth of a paper machine.) If the market changes or we realize that we can’t overcome a technical challenge, then we can abort the whole project, minimizing our losses before we’ve spent all our money. This possibility to make incremental decisions increases the expected value of the project by decreasing the expected cost. This is the principle behind real options valuation.

Another way to see Stage-Gate is as a series of consecutive go/no-go experiments. Each successful experiment takes us one step closer to the full product. If an experiment fails then we abort the project. All experiments must succeed in order for the whole project to succeed.

Let’s look more closely at the stages in a Stage-Gate (project) process: We do some work in each stage and based on the results we either abort the project with probability $1-{p}_{1}$ or continue the project with probability ${p}_{1}$. We abort the project if a fatal (for the project) risk has been realized during the stage. The probability to abort is therefore here equal to the probability of a fatal risk being realized.

The question discussed in many of the posts on this blog is: in what order should we do the experiments in order to maximize the value of the project? For this we need to introduce the concept of a decision tree and some associated entities.

[Figure: A number of consecutive experiments represented as a decision tree.]

In the decision tree we have a number of “events” depicted as circles. These represent our experiments. Each experiment has a cost of ${c}_{i}$, a probability of succeeding of ${p}_{i}$, and a probability of failing of $1-{p}_{i}$. The cost of failing the project at experiment $i$ is ${C}_{i}$, with ${C}_{i}>0$ and ${c}_{i}>0$. Also

${C}_{i}=\sum _{n=1}^{i}{c}_{n}$

which means that the cost of failing the project at the i:th experiment (by failing the i:th experiment) is the accumulated cost of all experiments up to and including the i:th.

We can “unfold” the value of the project step by step. Let’s look at the value ${V}_{1}$ of the project before the first experiment. It is simply the probability weighted average of the value of the two branches.

${V}_{1}=\left(1-{p}_{1}\right)\left(-{C}_{1}\right)+{p}_{1}\left(-{C}_{1}+{V}_{2}\right)=-{C}_{1}+{p}_{1}{V}_{2}$

If the first experiment fails, the value of the project is $V=-{C}_{1}=-{c}_{1}$. Otherwise we get whatever comes down the other branch, which is ${V}_{2}$ less the cost ${C}_{1}$.

${V}_{2}$, ${V}_{3}$, and ${V}_{4}$ can be written in the same format.

${V}_{2}={-C}_{2}+{p}_{2}{V}_{3}$

${V}_{3}={-C}_{3}+{p}_{3}{V}_{4}$

${V}_{4}={-C}_{4}+{p}_{4}I$

${V}_{4}$ is where it gets a little more interesting as it is here we actually have an opportunity to get some income $I$.

Untangling the recursion we get

$V={V}_{1}={-C}_{1}-{p}_{1}{C}_{2}-{p}_{1}{p}_{2}{C}_{3}-{p}_{1}{p}_{2}{p}_{3}{C}_{4}+{p}_{1}{p}_{2}{p}_{3}{p}_{4}I$

The income $I$ is multiplied by all probabilities so for the income the order of the experiments doesn’t matter. Maximizing the value with respect to the order of the experiments is therefore equivalent to minimizing the cost (remember that all costs in the expressions here have positive values). So we need to minimize

$C={C}_{1}+{p}_{1}{C}_{2}+{p}_{1}{p}_{2}{C}_{3}+{p}_{1}{p}_{2}{p}_{3}{C}_{4}$

with respect to the order of the experiments with costs ${c}_{j}$, and associated probabilities for success ${p}_{j}$. It is also from the above easy to guess what the expression for the cost is with an arbitrary number of experiments. I choose intuition before induction for now though and will not try to prove it.
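As a sanity check on the untangled recursion, the sketch below evaluates the project value in two ways for an arbitrary number of experiments: by folding $V_i = -C_i + p_i V_{i+1}$ backwards from the last experiment, and by the closed form. It follows the expressions exactly as written above, with ${C}_{i}$ being the accumulated cost. The function names, the costs, the probabilities, and the income of 500 are assumptions made up for the illustration.

```python
from itertools import accumulate


def value_recursive(costs, probs, income):
    """Fold V_i = -C_i + p_i * V_(i+1) backwards, starting from the income."""
    cumulative = list(accumulate(costs))  # C_i = c_1 + ... + c_i
    value = income
    for C_i, p_i in zip(reversed(cumulative), reversed(probs)):
        value = -C_i + p_i * value
    return value


def value_closed_form(costs, probs, income):
    """V = -(C_1 + p_1*C_2 + p_1*p_2*C_3 + ...) + p_1*...*p_n * income."""
    reach = 1.0  # probability of getting as far as the current experiment
    expected_cost = 0.0
    for C_i, p_i in zip(accumulate(costs), probs):
        expected_cost += reach * C_i
        reach *= p_i
    return -expected_cost + reach * income


# Hypothetical project with four experiments and an income of 500.
costs = [20, 30, 40, 20]
probs = [0.4, 0.6, 0.8, 0.9]
income = 500

print(round(value_recursive(costs, probs, income), 2))
print(round(value_closed_form(costs, probs, income), 2))
```

Both functions should agree for any number of experiments, which is a numerical hint (not a proof) that the guessed general expression is right.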

What we want is a rule or a set of rules for sorting the experiments so as to minimize the expected cost. Let’s first assume that the order of the experiments ${E}_{i}$ as shown in the figure above minimizes the total cost $C$. Any permutation of the experiments would therefore increase the cost. From this we can deduce how the ${c}_{i}$ and ${p}_{i}$ must relate to each other.

Now trade places between the first and the second experiment. This should (by assumption) give a higher expected cost. Expanding all ${C}_{i}$ into their constituents and setting up the inequality we get

${c}_{1}+{p}_{1}\left({c}_{1}+{c}_{2}\right)+{p}_{1}{p}_{2}\left({c}_{1}+{c}_{2}+{c}_{3}\right)+{p}_{1}{p}_{2}{p}_{3}\left({c}_{1}+{c}_{2}+{c}_{3}+{c}_{4}\right)<{c}_{2}+{p}_{2}\left({c}_{2}+{c}_{1}\right)+{p}_{2}{p}_{1}\left({c}_{2}+{c}_{1}+{c}_{3}\right)+{p}_{2}{p}_{1}{p}_{3}\left({c}_{2}+{c}_{1}+{c}_{3}+{c}_{4}\right)$

After some juggling around we finally get

${c}_{1}+{p}_{1}\left({c}_{1}+{c}_{2}\right)<{c}_{2}+{p}_{2}\left({c}_{1}+{c}_{2}\right)$

Switching any other two adjacent experiments gives similar (but not entirely the same) inequalities

${c}_{2}+{p}_{2}\left({c}_{1}+{c}_{2}+{c}_{3}\right)<{c}_{3}+{p}_{3}\left({c}_{1}+{c}_{2}+{c}_{3}\right)$

and

${c}_{3}+{p}_{3}\left({c}_{1}+{c}_{2}+{c}_{3}+{c}_{4}\right)<{c}_{4}+{p}_{4}\left({c}_{1}+{c}_{2}+{c}_{3}+{c}_{4}\right)$

As long as all inequalities above are true, we will increase the cost by reversing the order of two adjacent experiments. I have not managed to prove that the pair-wise inequalities are a sufficient condition for a global minimum. Switching the first and the third experiment would for instance give the inequality

${c}_{1}+{p}_{1}\left({c}_{1}+{c}_{2}\right)+{p}_{1}{p}_{2}\left({c}_{1}+{c}_{2}+{c}_{3}\right)<{c}_{3}+{p}_{3}\left({c}_{2}+{c}_{3}\right)+{p}_{2}{p}_{3}\left({c}_{1}+{c}_{2}+{c}_{3}\right)$

which doesn’t necessarily follow from the pair-wise inequalities above it. It also remains to do the math for an arbitrary number of experiments, but that seems like the easier of the two remaining issues.
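The pair-wise inequalities all share one pattern: for neighbours $i$ and $i+1$, they compare ${c}_{i}+{p}_{i}{C}_{i+1}$ with ${c}_{i+1}+{p}_{i+1}{C}_{i+1}$. A minimal sketch of that check for a given ordering could look as follows. The function name and the example numbers are assumptions, and the check only tests the necessary pair-wise conditions, not global optimality.

```python
from itertools import accumulate


def adjacent_swap_conditions_hold(costs, probs):
    """True if c_i + p_i*C_(i+1) < c_(i+1) + p_(i+1)*C_(i+1) for all neighbours."""
    cumulative = list(accumulate(costs))  # C_i = c_1 + ... + c_i
    for i in range(len(costs) - 1):
        C_next = cumulative[i + 1]
        lhs = costs[i] + probs[i] * C_next
        rhs = costs[i + 1] + probs[i + 1] * C_next
        if not lhs < rhs:
            return False
    return True


# Hypothetical experiments, listed in the order they would be run.
print(adjacent_swap_conditions_hold([20, 30, 20, 40], [0.4, 0.6, 0.9, 0.8]))
print(adjacent_swap_conditions_hold([20, 30, 40, 20], [0.4, 0.6, 0.8, 0.9]))
```

The first ordering satisfies every pair-wise condition; the second fails on its last pair, so swapping those two experiments would lower the expected cost.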

The expressions in the inequalities are easy enough to put in a spreadsheet to get a simple tool for ordering a number of experiments though. I did just that and the spreadsheet simulation shows that the conditions above are a predictor for a global minimum with the admittedly small number of experiments I have carried out. I therefore still dare to postulate that we wish to have a small ${c}_{i}$ in some way combined with a small ${p}_{i}$ in early experiments. Remember that ${p}_{i}$ is the probability of succeeding with the experiment. A small probability of success means a large probability of failure, so we should do the uncertain and cheap experiments first.

The spreadsheet simulation I did for instance gives that if we have a series of four experiments with costs 20, 30, 40, and 20 with the corresponding probabilities for success of 0.4, 0.6, 0.8, and 0.9, then we should order the experiments in the order 1, 2, 4, 3 whereby we get an expected cost of 80.56. The sum of the costs of all experiments is 110 so by doing the experiments one at a time and aborting on failure we can bring down our expected cost by 27%. With many other random ways to order the experiments we will only decrease our expected cost by a few percent.
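The same search can be sketched outside a spreadsheet. The snippet below is not the original spreadsheet; it is a brute-force sketch that evaluates the cost expression $C$ exactly as written above (with accumulated costs ${C}_{i}$) for every ordering of the four experiments and picks the cheapest. The function and variable names are assumptions.

```python
from itertools import accumulate, permutations


def expected_cost(costs, probs):
    """C = C_1 + p_1*C_2 + p_1*p_2*C_3 + ... with C_i the accumulated cost."""
    reach = 1.0  # probability of getting as far as the current experiment
    total = 0.0
    for C_i, p_i in zip(accumulate(costs), probs):
        total += reach * C_i
        reach *= p_i
    return total


# Experiment number -> (cost, probability of success), as in the example above.
experiments = {1: (20, 0.4), 2: (30, 0.6), 3: (40, 0.8), 4: (20, 0.9)}


def cost_of(order):
    costs = [experiments[k][0] for k in order]
    probs = [experiments[k][1] for k in order]
    return expected_cost(costs, probs)


best_order = min(permutations(experiments), key=cost_of)
best_cost = cost_of(best_order)
saving = 1 - best_cost / sum(c for c, _ in experiments.values())

print(best_order, round(best_cost, 2), f"{saving:.0%}")  # (1, 2, 4, 3) 80.56 27%
```

Brute force only scales to a handful of experiments (the number of orderings grows factorially), which is why a sorting rule would be so much nicer to have.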

In conclusion: the riskier the project, the more we will gain (a) by using some kind of Stage-Gate model with a decision to continue or to abort after each experiment (or group of experiments) and (b) by ordering the experiments with those that give most uncertainty reduction for the money in the beginning.

When I started this post I was hoping that either the proof would be pretty easy (there is after all no esoteric mathematics involved) or that it would fall into a class of well-known problems such as a shortest path or a traveling salesman that already have solutions. But so far, no luck. I will keep on looking and if you, dear reader, have some ideas, please let me know. Until then, I’m going to trust my hunch and my incomplete proof.

## Bring in the just machines please!

As hinted in an earlier post, human beings are not exactly behaving in a consistent and measurable way when it comes to acting upon risk. I usually consider evolution to be rational and therefore people to be rational in some paleolithic sense but sometimes I wonder. In a book published only (?) on the Internet, Aswath Damodaran summarized a number of interesting facts about our behavior when exposed to risk:

• Individuals are generally risk averse, i.e., they don’t act on expected returns only, and are more so when the stakes are large than when they are small.
• There are big differences in risk aversion across the population and significant differences across sub-groups.
• Risk aversion for a population varies with time.
• Individuals are far more affected by losses than equivalent gains.
• Individuals become more risk averse when they get frequent feedback on the results of their activity.
• The choices that people make (and the risk aversion they manifest) when presented with risky choices or gambles can depend upon how the choice is presented (framing).
• Individuals tend to be much more willing to take risks with what they consider “found money” than with money that they have earned (house money effect).
• There are two scenarios where risk aversion seems to decrease and even be replaced by risk seeking. One is when individuals are offered the chance of making an extremely large sum with a very small probability of success (long shot bias). The other is when individuals who have lost money are presented with choices that allow them to make their money back (break even effect).
• When faced with risky choices, whether in experiments or game shows, individuals often make mistakes in assessing the probabilities of outcomes, overestimating the likelihood of success, and this problem gets worse as the choices become more complex.

The reason I’m reading the book is that it gives an account of real options as a way of reasoning about project investment decisions, the theme of some earlier posts. I will return to real options later.

The book’s author at one point speculates whether it wouldn’t be better to have computers make our investment decisions, given the inconsistencies of human decision makers; as it says in the lyrics of the song I.G.Y. by Donald Fagen:

A just machine to make big decisions
Programmed by fellows with compassion and vision
We’ll be clean when their work is done
We’ll be eternally free yes and eternally young

## I wasn’t first – this time either

Having Googled around a little bit more I realize that what I wrote two posts down wasn’t exactly new thinking. Similar ideas were described by Robert C. Cooper in this article. I didn’t read the paper before I wrote my post, I swear.

Even if I didn’t earn the Nobel Prize in management this time either, I’m happy to see my ideas corroborated.

## The discovery backlog

I have realized that engineers use words differently from other people. When an engineer says “problem” he or she often doesn’t mean anything negative (except in “Houston, we have a problem”). Problems are engineers’ raison d’être; engineers thrive on solving problems. When the problems get tough, the tough engineers get going.

The same goes for the word “risk”. We have “risk lists” in our projects. We do “risk mitigation”. There are entire companies filled with brilliant engineers doing nothing but “risk management”.

Using the words “problem” and “risk” in some other contexts, like with the sales team, may not always be a good idea though. The lone engineer may come across as a downer, an overly pessimistic person, who’s not willing to “see the opportunities instead of the problems” (a popular cliché at least in Sweden).

So I realize I need a better word than the “risk backlog” I just invented in my previous post. What about “discovery backlog”? We don’t have to call the items “risks”; they are just things that we currently don’t know, like whether anybody is going to buy our product or whether the quantum drive will really work as intended. Sooner or later we need to discover those things. I can’t really wrap my brain around “opportunity backlog”.

## Risk-driven development

Several project management models include provisions to manage risk. Risk is here defined as the probability of an adverse event times the quantified consequence of that adverse event. The IBM Rational Unified Process recommends addressing risk while planning the iterations of what in RUP is called the Elaboration phase. Barry Boehm’s Spiral Model is guided by risk considerations. So are the various versions of the Stage-Gate model. The Scrum literature, while mentioning risk as one of the prioritization principles for the product backlog, leaves it mostly to the judgment of the product owner to make a good prioritization.

We can intuitively understand that creating something entirely novel such as a car that runs 10 000 km without refueling is more risky than developing next year’s model of an existing car with only some cosmetic changes. The risk in new product development is usually not evenly distributed on all tasks in the development project. Developing the engine of the ultra-long-range car (ULRC) carries far more risk than developing the entertainment system or the suspension.

Risk-driven development means that we want to eliminate as much risk as we can, as fast as possible, in any way possible; we don’t want to end up having invested a large amount of money and reputation in a project that after all that investment still has a high probability of failure. We also have to take into account the opportunity cost, the gain we would have got if we had invested the money in another project.

As an illustration, assume that the biggest uncertainty in a project (like the ULRC engine) is left as the last component to be developed. We would then end up having invested a lot of money in the project while still not knowing whether the product will ever work. The cost of the risk being realized would be the opportunity cost plus the total accrued project cost up to the time of the ultimate failure.

We can also look at it from a capital budgeting point of view. When selecting investment targets, we always wish to match return and risk. For a particular level of risk we expect a certain level of (expected) return. Assuming that the income from the project is fixed (as long as it succeeds), then the risk level at which we invest our next unit of money in the project should be guiding our willingness to make that investment; the lower the risk, the more attractive the investment. I will try to elaborate on this in later posts.

When developing a ULRC it is thus probably not wise to start with specifying and designing the entertainment system or the suspension. Neither does a comprehensive and approved requirements specification help much to lower the risk in this particular case. The only novel requirement may be the 10 000 km range and that’s easy enough to understand and to write down. Instead we should, as already hinted above, focus on designing and building prototypes of the long-range engine and its related parts.

There are of course variations to the risk-driven development theme. In some cases we need to build some low-risk parts first to be able to even start with the high-risk parts. For instance, we may need to build the rest of the powertrain or at least a test bench simulating the rest of the powertrain to be able to carry out tests with the new engine.

One framework for risk-driven development is, as mentioned in the introduction, the Stage-Gate process consisting of phases (stages) and tollgates. The tollgates are decision points at which the future execution of the project is decided based on the project’s risk level so far. If we at a certain tollgate think the risk is too high for a substantial new investment, e.g. for ramping up development or starting an expensive marketing campaign, then we need to find ways to lower the risk further before we make the additional investment. If we can’t find such ways, then we may need to abort the project altogether.

A problem with the Stage-Gate model is that it is often confused with a waterfall development model which, for example, mandates that the product requirements are developed and preferably frozen and approved at the beginning of the project. Indeed, in many quality management systems the tollgate criteria are defined in terms of produced documents and those criteria are the same for all projects.

The Scrum process doesn’t have formal tollgates. All development in Scrum is made in sprints (similar to iterations). The progress of the project is checked after each sprint and adjustments are made to both the plan and the process as needed. Scrum does not mandate any particular order in which the product should be developed but recommends that potentially shippable product increments are delivered as a result of each sprint. (This usually works for software but maybe not for a car.)

To conclude, here are a couple of ideas that should make the Scrum and the Stage-Gate processes more effective together:

• Rename the risk list that exists in most project models to risk backlog and think of it in the same way as about the product backlog in Scrum. This implies an order in which the risks shall be addressed and should be used to plan the project (iterations, sprints, whatever). Risk-driven activities include developing functionality, interviewing customers, building prototypes, doing analyses, and so on.
• Use the risk backlog as the main input to the tollgate decision criteria in the Stage-Gate model. The tollgate criteria should be allowed to vary from project to project and should be concerned with the biggest remaining risks in the project (including risks such as that there is no market for the product we are developing). The fixed lists of documents that are often used as tollgate criteria do not fit every project since they do not match the risk profile of every project. It is after all risk that we wish to assess at the tollgate and the risk backlog, including any more detailed material on each risk, is the main indicator of project risk.
• Synchronize any gate decision with the end of a sprint and make sure that whatever is required for the gate decision is produced in the last sprint(s).
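A risk backlog along these lines can be sketched as a list ordered by risk, using the definition from the beginning of this post (probability times quantified consequence). The class name, the example risks, and all the numbers are hypothetical, and sorting by this product is only one possible ordering rule.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    description: str
    probability: float  # probability of the adverse event
    consequence: float  # quantified consequence, in some money unit

    @property
    def exposure(self) -> float:
        """Risk as defined above: probability times consequence."""
        return self.probability * self.consequence


# Hypothetical risk backlog for the ultra-long-range car.
risk_backlog = [
    Risk("Entertainment system is delivered late", 0.2, 200),
    Risk("Engine does not reach the 10 000 km range", 0.6, 5000),
    Risk("There is no market for the product", 0.3, 8000),
]

# Address the biggest risks first.
risk_backlog.sort(key=lambda risk: risk.exposure, reverse=True)

for risk in risk_backlog:
    print(f"{risk.exposure:8.0f}  {risk.description}")
```

The top of such a list is what a tollgate decision would look at first; dependencies between items (such as needing a test bench before the engine can be tested) would still have to be handled by hand.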

## Cleaning up

I have given up my graphical editor (GMF) project a second time. The reason is that although it is rather simple to get something to work, it’s extremely difficult to get everything to work. The main reason is that the different parts that you need for creating a complete graphical editor seem to be created at different times by different people. They use the same design patterns but different class libraries. The different frameworks have concepts such as Command, Editing Domain, Undo Context, but they are not implemented with the same classes. To be able to get them to work together, a lot of “wrapping” of classes and handling of several instances of almost the same class is necessary and the end result becomes a mess. Too much of a mess to keep in memory if not working with it on a daily basis.

To clean up this blog I have made all the Eclipse EMF, GMF, and GEF posts private, i.e. invisible for the external reader. If you wish to discuss any aspects of those frameworks or give me a hint as to how to go forward with less intellectual effort, then please drop me an email.

## Running Eclipse Process Framework in Ubuntu 12.04 LTS 32 bit

I’m using the EPF to create the quality system manual of the medical device company I’m working for. While we are using Windows at the company, I also wanted to be able to use Ubuntu when working from home. Getting it to work on Ubuntu was not trivial. Plain Eclipse seems to run out of the box but EPF uses editor components that aren’t installed by default in 12.04 and the packages are also hard to find.

What worked for me was to install xulrunner-1.9.2 from the Mozilla site. Not all versions of this library will work according to the EPF docs.

I installed as instructed on the Mozilla page. Don’t forget to run:

sudo ./xulrunner --register-global

I then also added the following lines to .bashrc:

export MOZILLA_FIVE_HOME=/opt/xulrunner
export LD_LIBRARY_PATH=$MOZILLA_FIVE_HOME

and reread the file by running:

bash

I then started EPF from the terminal thus:

./epf -clean

I still get error messages about failed assertions but at least the editors in EPF now seem to work.

I also tried to run EPF on a 64 bit Ubuntu but the application wouldn’t even start so I’ll settle for running it in a virtual 32 bit machine (that runs on a 64 bit machine). (I need the 32 bit machine anyway for my Internet banking application which runs neatly on Linux but only on 32 bit machines.)

## Trust, transparency and Toyota

A recent article in The Economist [1] ascribed some of the economic and social success of the Nordic countries to a high level of trust. In the period of large-scale emigration of Swedes to America, they came to be known as “dumb Swedes” in the new country because of their high level of trust in people. Today the descendants of these dumb Swedes still have higher than average trust in their fellow citizens and tend to live in rather well-run states such as Minnesota.

It is possible to have relatively high taxes (such as in Minnesota or Sweden) if people trust that taxes are used for good purposes. Trust in the Nordic countries emanates from many sources but transparency is a major one. It is not easy to embezzle public funds when all records are public and subject to the scrutiny of the press and of curious citizens.

I claim that the same goes for corporations. With a high level of trust between employees, departments, country organizations, etc., the transaction costs can be low. Transaction costs in a corporate setting are typically different types of follow-up and reporting procedures, elaborate internal pricing schemes, and in more extreme cases, turf wars.

[Figure: A car and a process you can trust.]

A typical scenario is that a particular problem area catches the eye of a manager (or a group of managers) who doesn’t entirely trust the organization’s ability to handle the problem. They then feel the natural, and in this situation perfectly responsible, need to alleviate their uncertainty by starting to make inquiries. If the situation gets more serious the managers start requiring extra (ad-hoc) reporting on the progress of the resolution of the problem or feel that they need to put together a “tiger team” to expedite the resolution process.

Despite superficial similarities, the above behavior is the antithesis of the Toyota Production System where managers likewise come running when there is a problem. But unlike in the scene described above, they don’t come running to expedite the process; they come running to help solve the root cause of the disturbance.

The extra reporting, the extra phone calls, the extra emails, etc. are all caused by the lack of trust in the process and add little value to the actual problem resolution process. They in fact make the process less efficient. A “tiger team” furthermore masks any deficiencies in the regular process by effectively bypassing it, preventing the organization from addressing the root cause of the lack of trust.

The explicit goal of building trust has not, as far as I know, been at the top of any process improvement model. Many models do result in higher trust when successfully implemented but I believe more explicit actions can be taken to improve trust faster. Some such actions could be:

• Make all processes extremely transparent; make it easy for anybody to see the backlogs and progress of every department. This facilitates the Genchi Genbutsu, “go and see”, of the Toyota Production System, an attitude that helps managers to stay informed about what’s going on in the organization on a continuous basis.
• When the organization is more mature, make metrics about performance visible for everyone.
• Make decisions and their rationales visible.
• When communicating about your area of responsibilities, make sure that you are well read. When uncertain, state this and the reason for the uncertainty.
• State clearly who’s responsible for what. If nobody steps forward and clearly takes charge of an issue, then uncertainty thrives.

Last but not least: do a good job (and make it known to others that you did a good job)!

## Product, not project – part 2

Something caught my eye yesterday when I helped my son to get started with Code::Blocks, a light-weight integrated software development environment (IDE): in all IDEs that I’ve worked with lately (Eclipse, Visual Studio and Code::Blocks) the collection of source code and other files is collectively called a “project”. This may seem like an unimportant little observation but again I believe that using the right term is important for people’s mental models of what’s going on. A “project” is something temporary while a “product” would be something rather more persistent. I would have suggested the word “product” here instead. See also my earlier post on this topic.

## Managing products

In an earlier post I wrote about the difference between a project and a product. This distinction may seem obvious for some but considering the number of times I’ve found myself discussing its implications I’ve come to the conclusion that it may not be all that obvious.

Many traditional development process descriptions start with a requirements specification of some kind and then go on describing the creation of the rest of the development artifacts, all the way to a verified, validated and released product. In contrast to such one-off, linear processes, most system development organizations are almost completely occupied with continuously upgrading and correcting existing products based on a steady stream of internal ideas and new requirements and wants from customers, distributors and other external stakeholders. The upgrades are typically indicated by the version number of the product. (I’m for instance writing this on a computer running Ubuntu 12.04 which is an upgrade from Ubuntu 11.10 and so on.)

For such more or less continuous product development clearly something else is needed than a one-off development process to guide the engineering efforts. We need to plan several (upgrade) projects ahead to secure the necessary resources and for communicating with the market. We also need to continuously decide exactly what new features and corrections to add to the product at each upgrade.

Enter product management (and say hello to the product manager).

A product is according to Wikipedia “anything that can be offered to a market that might satisfy a want or need”. To maximize profits we want to make sure that a product matches the “wants or needs” as well as possible at the right price. Not only do we need to react to the feedback from existing customers and other stakeholders, we also need to pro-actively add innovative (and some not so innovative) new features so as to maximize profits, market share or whatever our goal might be for the moment.

Handling the short-term requests from existing customers, sales, etc., while actively managing the product’s features in the medium and long term, is the purpose of product management and the ultimate responsibility of the product manager.

To plan and communicate the overall contents and timing of the major upgrades of the product, the product manager creates and maintains a product plan (aka product road-map) that in turn is based on both ad-hoc input from the market and thorough analyses of the targeted market segments, societal trends, the competition, available and future technology, partners etc (see e.g. [1]). All this makes the product manager one of the most important roles in the company, if not the most important role, and product management perhaps the most important process in the company.

Since we need to manage each product over its entire life-cycle, product management is not (and I’m sorry for repeating myself) a one-shot project activity but a recurring line activity. The figure below gives a simplified view of how product management can be integrated into a project-oriented organization.

[Figure: Continuous product management.]

Each new suggested feature, whether a new function requested by a customer or a feature suggested internally by the project manager, system architect or somebody else, is described in a change request.

New change requests are regularly evaluated by a change control board (CCB) with respect to cost / benefit and their consistency with the overall product plan. If the benefit exceeds the cost and the suggested new feature is in line with the product plan, then the change request is accepted for implementation and at some point scheduled into a project. Otherwise the change request is rejected.

The CCB is typically chaired by the product manager and has the major stakeholders of the change request such as the project managers of all ongoing projects and the line managers supplying the resources as members. The CCB is moderated and administered by a configuration manager.

While I can’t see any real alternatives to the above process and I have implemented it in several organizations, there are several challenges associated with doing so. I will return to these in future posts. A nice thing with the above process is that it plays very well with Scrum and other agile methods. This too may be the topic of future posts.