Is estimation a scam? Can a non-expert estimate as well as an expert? Help me investigate these questions!

How do you prove that estimation is a scam? Well, it’s not easy, especially when luminaries in the Agile community argue that you can honestly estimate a project for which you have no team available and only a fuzzy understanding of the Requirements or the Technology involved, or both!

The fact is that many projects are indeed estimated before a team is even set up, but that is ok because we are armed with “expert” estimates! So I decided to test that particular aspect of the estimation “game”: can experts estimate a software project better than non-experts?

To be able to answer this question I have to design an experiment. It should avoid bias and focus only on the estimation process and its outcome.

It should focus on what makes an estimate useful for a particular project go/no go decision: its accuracy. The known parameters should be a description of the problem (the problem has to be relatively common, or estimation would not be possible anyway) and the technology choices made for that particular project. And the actual length of the project must be independent of the estimates.

The experiment

So here’s my first try at defining the experiment:

  • Collect information on a project that has already ended (i.e. the length can no longer be affected)
  • Ask 10 people with expertise in that area (problem and technology) to estimate the length of the project given a team size
  • Ask 10 people without expertise in that area (problem and/or technology) but with expertise in software development in general to estimate the length of the project given a team size

Here’s my expectation (or Hypothesis 1): I believe that the accuracy of estimation will not correlate with expertise in the technology and problem domains. If I am correct, this will invalidate the concept that expert estimates are somehow “better” (more accurate in the context of this experiment). The obvious conclusion should then be that the accuracy of estimates in this type of context (very common when bidding for a project) is significantly affected by variables other than expertise. If I am not correct, and the experts can indeed deliver reliably better estimates than the non-experts, this should mean that expertise is an advantage when bidding on this type of project.
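One possible way to make Hypothesis 1 testable is to compute each estimate's absolute relative error and compare the two groups with a permutation test. The sketch below is only an illustration: the function names, the choice of a permutation test, and every number in it are my assumptions, not collected data.

```python
import random
from statistics import mean


def abs_relative_error(estimate, actual):
    """Absolute deviation of an estimate, as a fraction of the actual length."""
    return abs(estimate - actual) / actual


def permutation_p_value(errors_a, errors_b, rounds=10_000, seed=0):
    """Two-sided permutation test on the difference in mean error.

    Returns the share of random regroupings whose difference in means is
    at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(mean(errors_a) - mean(errors_b))
    pooled = list(errors_a) + list(errors_b)
    n_a = len(errors_a)
    hits = 0
    for _ in range(rounds):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        # Small tolerance so float rounding does not hide exact ties.
        if diff >= observed - 1e-12:
            hits += 1
    return hits / rounds


# Placeholder numbers, NOT real data: actual length in weeks, ten estimates per group.
ACTUAL_WEEKS = 40
expert_estimates = [30, 36, 28, 44, 52, 35, 40, 26, 48, 33]
non_expert_estimates = [24, 38, 60, 31, 45, 29, 55, 34, 42, 27]

expert_errors = [abs_relative_error(e, ACTUAL_WEEKS) for e in expert_estimates]
non_expert_errors = [abs_relative_error(e, ACTUAL_WEEKS) for e in non_expert_estimates]

print(f"mean expert error:     {mean(expert_errors):.2f}")
print(f"mean non-expert error: {mean(non_expert_errors):.2f}")
print(f"p-value: {permutation_p_value(expert_errors, non_expert_errors):.3f}")
```

A large p-value would mean the data cannot distinguish the two groups. With only 10 people per group, that is weak evidence either way.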

Hypothesis 2 focuses on the usefulness of this type of estimate, i.e. its accuracy and therefore its relevance for go/no go decisions. In this case I don’t have a clear line between good and bad deviations. I’m just exploring the actual deviation, and I hope to run other experiments later on to explore the impact and meaningfulness of these deviations.
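As a rough sketch of what “exploring the actual deviation” could look like, the snippet below summarizes a set of estimates against the known project length. The function names and the 25% tolerance are arbitrary assumptions for illustration (there is no agreed good/bad line), and the numbers in the usage line are placeholders, not data.

```python
from statistics import median


def signed_relative_error(estimate, actual):
    """Positive means overestimate, negative means underestimate."""
    return (estimate - actual) / actual


def summarize_deviation(estimates, actual, tolerance=0.25):
    """Summarize how far a set of estimates falls from the actual length.

    `tolerance` is a hypothetical cut-off (as a fraction of the actual
    length) used only to count how many estimates land "close enough".
    """
    errors = [signed_relative_error(e, actual) for e in estimates]
    within = sum(1 for err in errors if abs(err) <= tolerance)
    return {
        "median_signed_error": median(errors),
        "median_abs_error": median([abs(err) for err in errors]),
        "share_within_tolerance": within / len(errors),
    }


# Placeholder estimates (weeks) for a project that actually took 40 weeks.
print(summarize_deviation([30, 40, 50, 80], actual=40))
```

The signed error shows whether a group tends to under- or overestimate, while the absolute error and the share within tolerance speak to how usable the estimates would be for a go/no go decision.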

Now I need your help

I need you to help me design/improve this experiment, so that we can further the investigation into Estimates and their alternative, #NoEstimates. Help me out by answering these questions:

  • Can you identify in the experiment description bias for or against expert estimation?
  • Do you have other hypotheses that we could test with a similar experiment design?

Let me know what you think about this experiment. Once the experiment has been reviewed by enough people and we have improved its design enough, I’ll start preparing the data collection. For now, help me fine-tune this experiment.
Photo credit: CC BY NC ND by Alvin K @ flickr

16 thoughts on “Is estimation a scam? Can a non-expert estimate as well as an expert? Help me investigate these questions!”

  1. Dear Vasco!

    I still remember your session at the SUGLC about (no)estimates. And to be quite frank: I still think you start with the wrong assumption.

    Estimates give you so much more than an (expert?) estimation of project effort/time – they show you the knowledge distribution in the team at the moment they estimate. Maybe you know of better ways to do that, but estimations/“grooming meetings” provide “young” (not that literate in agile) teams with the right tool to pair the right people on specific problems and thereby share knowledge much faster.

    I would have loved to talk about that topic at ABKON but didn’t get the chance, so I guess we’ll have to discuss this topic at length at some other point in time.

  2. Maybe I have the wrong understanding of “no estimates,” but I think it applies to steering a project that is already underway, and not to projecting the likely duration of a new project. If my understanding is correct, then the fundamental assumptions behind the proposed experiment are invalid.

    Secondly, you write that the experiment should avoid bias, but you also begin by asking, “How do you prove that estimation is a scam?” This indicates you have a conclusion in mind before you begin. Also, the word “scam” is value-loaded – the underlying assumption is that people who use estimation are trying to trick others on purpose. Therefore, you will be unable to create an unbiased experiment.

    If a company has a history of doing similar types of work, then they will be able to estimate the duration and cost of any new project that is similar to past projects based on their own historical data. Smaller projects will be subject to smaller estimation error than larger projects. That may be sufficient for purposes of planning and portfolio management.

    If a company carries out fundamentally new or creative types of projects, then any estimates for project duration and cost are likely to have a large margin of error. In this case it might be advisable to break larger projects into a series of smaller ones and to adjust plans based on feedback from each small project.

    I’m not sure why an “experiment” would be helpful in any of these situations.

  3. Hi, Vasco,

    Dave has a good point. An experiment should be trying to disprove the hypothesis, not prove it. Also, I think your selection of n=20 is likely too low for any statistically significant results. I think the experiment design is flawed, and cannot test your hypothesis.

    More to the point of your proposed experiment, I think it misses several key points of estimation. It presumes that the output is a single point defined by date and team size. It assumes that the estimation done by uninvolved individuals will be similar to the estimation done by people trying to create a successful project given the circumstances in which they find themselves. It ignores the fact that the manner of proceeding during the project is related to the estimation.

    Each of these issues could be a discussion in itself. It’s far too much to handle in this little comment box, though.

  4. @Dave Do you have specific objections to the experiment design (not my personal views – which are, and should be transparent to anyone)?

    An experiment *can* be free from bias *even* when you are not. That’s the point of creating experiments.

    Let me know if you have improvements to the experiment.

  5. @Markus I understand your point that Estimates have the potential to create a conversation that goes beyond the estimation itself. Actually, even I have written about it right here on the blog:

    But those potential side-benefits will have to wait for other experiments. I’m focusing on expert vs. non-expert estimates here.

  6. @George You make one good point: the involvement in a project changes your view of the estimate. I agree that this is a possibility; however, I’m not satisfied with your conclusion that this invalidates the experiment. In fact we have Mike Cohn himself arguing for this approach in his blog. Check it yourself:

    He is arguing for the validity of estimations in situations where you have no team, a vague understanding of requirements and technology and you may never work on that project (bidding as contractor).

    So, as far as I can see, at least my experiment design gives the same motivation to both experts and non-experts. Neither should be at an advantage or disadvantage, right?

  7. “Do you have specific objections to the experiment design (not my personal views…”

    The stated purpose of the experiment is to prove estimation is a “scam.” That is both your personal view and the goal of the experiment.

    At the moment I can’t see how to separate the two. Can you re-state the parameters of the experiment without introducing your own bias?

    Your subsequent discussion of the question of expert vs. non-expert estimation is different from the originally-stated purpose of the experiment. It is completely different from the idea of a “scam.”

    IMO the question of expert vs. non-expert estimation is not mysterious or interesting. I need no experiment to tell me that I am not equipped to estimate how long a session of reconstructive surgery will take, or that a surgeon cannot estimate how long a software development project will take. It is a waste of time and effort to try and perfect the design of such an experiment.

    If you want to prove estimation is a “scam,” you need a legal case and not a scientific experiment. The word “scam” implies ill intent, whether the resulting estimate is accurate or not.

  8. Hi Vasco,

    I think you raise some good points and your experiment is a sound one. It looks to me as though you are controlling the variables by using completed projects: regardless of how long they took to deliver, that must be the right length. Secondly, your article raises the fact that these estimates are often provided before a team is even formed, so assumptions about the knowledge of the team are largely irrelevant. Also, by gathering estimates from those responsible for providing estimates now, you essentially create a control group as a baseline. I guess for certain projects there might even be data available that experts provided in advance of the project rather than after the event.

    There may be other experiments to uncover who actually provides initial estimates on projects, how far in advance, whether this was based on a certain implementation etc.

    The outcome from this experiment should be an unbiased correlation or not between the groups. Either way this strikes me as valid.

  9. @George your comment on Cohn’s article is not really pertinent. Yes, he deducts 40% from the team velocity, except he does not mention the *existence* of a team. So, 40% of an imaginary number is *still* an imaginary number.
    Your comment on the experiment and the project length: I understand your point, however in a bidding situation experts and non-experts would be in the exact same situation. We are not trying to see which one is *more* accurate, we are just comparing relative accuracy for both groups, therefore not affecting either group negatively.

  10. @Dave the bias of the researcher is *always* a factor, but it should be accounted for in the design of the experiment. This same experiment could be run by a pro-estimates person and the results (assuming we’ve controlled for the right variables) would be the same.

    Your comment on the non-relevance of the expert v non-expert experiment does not reflect my view. So I’m not in agreement with you there. In my mind (and in estimation literature) the expert vs non-expert is very much an assumption worth testing. However, we will also account for other variables and publish the results so that you can look at it later and investigate other possible correlating variables 😉

  11. @Jevers
    Re: your comment on the knowledge of the team being irrelevant. This is not my assumption, this is what companies bidding on a project do in practice. I’m just investigating the results of doing estimations when you don’t know what team will be available.

  12. The experiment you imagined is a good starting point, but there are some factors that could influence the results indirectly.
    One aspect is the team size, because this highly influences the numbers people will give out. I observed that a small team produces a more restrained and cautious set of estimates, and on the other side, when provided with larger teams, they tend to be quite optimistic, disregarding the problems and risks that come up with larger teams.
    The number of people you considered seems too low for a meaningful result. Out of 10 people, you will have a hard time eliminating the outliers and still having some statistically relevant results. I believe that a much larger set is needed.
    The estimate you are trying to obtain is far-fetched. In practice, in the real world, a person is rarely asked to estimate a whole project. The more common approach is to do a bottom-up estimation, aggregating multiple smaller estimates. I do not say that top-down estimates do not exist, but many sources indicate they are the most dangerous ones from a project management point of view.
    At the same time, the experiment does not specify what type of estimates, or what estimation technique, will be used. I believe the experiment you devised targets analogy-based estimates, because I believe this is the type of estimate people usually produce when asked for estimates.
    In the end, if you perform the experiment as it is now, you will get some information on the power of subjective perception on project estimates.
    Another issue I see is the naming and title. You mention that you investigate expert vs non-expert. When can one claim expertise in a field? This is highly subjective and could skew the results of the experiment.
    Given the small sample set, what correlation factor do you expect to find? How small or large should this value be? The value of the correlation factor should be identified before the experiment is started, because otherwise one might attempt to set this level so that the experiment has the desired outcome.
    I believe you want to publish it, and this is why the experiment needs to be completely defined before starting it, to ensure your personal biases will not affect the experiment.

  13. I respect your view regarding expert vs. non-expert estimation. Nevertheless, please understand if I do not wait for the results of a study before declining to ask a surgeon for a software estimate.

  14. Vasco, I agree with George’s comments:
    – 10 + 10 is too small a sample to draw any conclusion.
    – the actual length of a project is a statistical variable also, not only the estimate itself. If you run the same project twice (that’s assuming that you could reset people to a certain state in time) you would get different results depending on what season you run the project, because flu is more frequent during certain periods of the year.

    One way to take these into account would be to extend the experiment and maybe use data from 100 completed projects and for each project ask estimates from a different set of 10+10 people.

    Another sensitive point in the experiment is the definition of experts. I understand the idea behind it, but how to translate that into an expert selection process is not very clear to me. For example, detailed expertise in the business and processes of a company comes from working closely with or within that company. That makes it difficult to have sample experts that have not been connected to the original project. If you find experts from the competition, that would introduce all sorts of biases. If you try to anonymize the project information then you lose detailed information that experts might otherwise use for their estimates.
