History is the Key to Estimation Success

Source: Shutterstock

Posted: March 14, 2016 | By: Kate Armel

Understanding and Assessing Tradeoffs

An old project management maxim succinctly summarizes the choices facing software development organizations: “You can have it fast, cheap, or good. Pick two.” Given that estimates (and therefore, commitments) are made early in the project lifecycle, when uncertainty is high and the range of possible solutions is still wide, how do we select plans with a high probability of success? A thorough understanding of management tradeoffs can help. The idea behind the infamous Project Management Triangle is simple but powerful: the tradeoffs between software schedule, effort or cost, and quality are both real and unforgiving. Thanks to the work of pioneers like Fred Brooks, most software professionals now accept the existence and validity of these tradeoffs, but, as Brooks himself once ruefully observed, quoting famous maxims is no substitute for managing by them.


With so many unknowns out there, why don’t we make better use of what we do know? Most software “failures” are attributable to the human penchant for unfounded optimism. Under pressure to win business, organizations blithely set aside carefully constructed estimates and ignore sober risk assessments in favor of plans that just happen to match what the company needs to bid to secure new business. Lured by the siren song of the latest tools and methods, it becomes all too easy to elevate future hopes over past experience. This behavior is hardly unique to software development. Recently, two economists (Carmen Reinhart and Kenneth Rogoff) cited this tendency toward unfounded optimism as one of the primary causes of the 2008 global financial crisis. Their exhaustive study of events leading up to the crash provides powerful evidence that optimism caused both banks and regulators to dismiss centuries-old banking practices. They dubbed this phenomenon the “This Time Is Different” mentality [4]. Citing an extensive database of information gleaned from eight centuries of sovereign financial crises, bank panics, and government defaults, Reinhart and Rogoff illustrate a pattern that should be depressingly familiar to software professionals: without constant reminders of past experience, our natural optimism bias makes us prone to underestimate risk and overestimate the likelihood of positive outcomes.

The best counter to unfounded optimism is the sobering voice of history, preferably supported by ample empirical evidence. This is where a large historical database can provide valuable perspective on current events. Software development is full of complex, nonlinear tradeoffs between time, effort, and quality. Because these relationships are nonlinear, a 20% reduction in schedule or effort can have vastly different effects at different points along the size spectrum. We know this, but the human mind is poorly equipped to reason about non-intuitive exponential relationships on the fly.
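As a rough illustration of how steep these nonlinear relationships can be, the sketch below assumes Putnam's software equation (the model behind QSM's estimation work), under which effort varies with the inverse fourth power of schedule when size and productivity are held fixed. The specific exponent is a modeling assumption, not a figure quoted in this article:

```python
# Sketch of the nonlinear schedule/effort tradeoff, assuming Putnam's
# software equation: Size ~ Productivity * Effort**(1/3) * Time**(4/3).
# Holding size and productivity constant implies Effort scales as Time**-4.

def effort_multiplier(schedule_fraction: float) -> float:
    """Effort multiplier implied by compressing schedule to the given fraction
    of its nominal value (e.g. 0.80 = a 20% schedule cut)."""
    return schedule_fraction ** -4

# A 20% schedule cut roughly multiplies effort by 2.44 under this model
print(round(effort_multiplier(0.80), 2))  # 2.44
```

Even a modest compression more than doubles effort under this assumption, which is exactly the kind of exponential penalty intuition tends to miss.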

Without historical data, estimators must rely on experience or expert judgment when assessing the potential effects of small changes to effort, schedule, or scope on an estimate. They can guess what effect such changes might have, but they cannot empirically prove that a change of the same magnitude may be beneficial in one case but disastrous in another. The presence of an empirical baseline removes much of the uncertainty and subjectivity from the evaluation of management metrics, allowing the estimator to leverage tradeoffs and negotiate more achievable (hence, less risky) project outcomes. One of the most powerful of these project levers is staffing. A recent study of projects from the QSM database [5] used 1,060 IT projects completed between 2005 and 2011 to show that small changes to a project’s team size or schedule dramatically affect the final cost and quality. To demonstrate the power of the time/effort tradeoff, projects were divided into two “staffing bins”:

  • Projects that used small teams of 4 or fewer FTE staff
  • Projects that used large teams of 5 or more FTE staff


The staffing bins span the median team size of 4.6, producing roughly equal samples covering the same project size range with no overlap in team size. Median team size was 8.5 for the large team projects and 2.1 for the small team projects, making the ratio of large median to small median staff approximately 4 to 1. The wide range of staffing strategies is a vivid reminder that team size is highly variable, even for projects of the same size. It stands to reason that managers who add or remove staff from a project need to understand the implications of such decisions.

Regression trends were run through each sample to determine the average Construct & Test effort, schedule, and quality at various points along the size axis.  For very small projects (defined as 5000 new and modified source lines of code), using large teams was somewhat effective in reducing schedule. The average reduction was 24% (slightly over a month), but this improved schedule performance carried a hefty price tag: project effort/cost tripled and defect density more than doubled.  

For larger projects (defined as 50,000 new and modified source lines of code), the large team strategy shaved only 6% (about 12 days) off the schedule but effort/cost quadrupled and defect density tripled.


The relative magnitude of tradeoffs between team size and schedule, effort, and quality is easily visible: large teams achieve only modest schedule compression while causing dramatic increases in effort and defect density.
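One way to make these tradeoffs concrete is to normalize the study's multipliers into "extra effort per percentage point of schedule saved." The multipliers below are the figures quoted above; the derived metric itself is just an illustrative way of comparing the two size points:

```python
# Tabulating the large-team vs. small-team tradeoffs reported in the study.
# schedule_cut: fraction of schedule saved by using a large team
# effort_mult / defect_mult: how much effort and defect density grew
tradeoffs = {
    "5K ESLOC":  {"schedule_cut": 0.24, "effort_mult": 3.0, "defect_mult": 2.0},
    "50K ESLOC": {"schedule_cut": 0.06, "effort_mult": 4.0, "defect_mult": 3.0},
}

for size, t in tradeoffs.items():
    # Extra effort incurred per percentage point of schedule compression
    extra_effort_per_pct = (t["effort_mult"] - 1) / (t["schedule_cut"] * 100)
    print(f"{size}: {extra_effort_per_pct:.2f}x extra effort per 1% schedule saved")
```

The marginal price of compression is several times worse for the larger projects: roughly 0.50x extra effort per point saved versus about 0.08x for the small ones.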


What else can the data tell us about the relationship between team size and other software management metrics? A 2010 study by QSM consultant and metrics analyst Paul Below found an interesting relationship between team size and conventional productivity (defined as effective SLOC per unit of construct and test effort) [6]. To make this relationship easier to visualize, Paul stratified a large sample of recently completed IT projects into four size quartiles or bins, then broke each size bin into sub-quartiles based on team size. The resulting observations held true across the entire size spectrum:

  • In general, productivity increased with project size
  • Within any given size bin, productivity decreased as team size went up
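A stratification like the one described above can be sketched with pandas. The column names and generated values below are hypothetical, chosen purely to show the quartile-within-quartile binning, not to reproduce the QSM sample:

```python
import numpy as np
import pandas as pd

# Hypothetical project sample: sizes, staffing, and effort drawn at random
rng = np.random.default_rng(0)
n = 400
df = pd.DataFrame({
    "esloc": rng.lognormal(9.0, 1.0, n),       # effective SLOC (hypothetical)
    "team_size": rng.lognormal(1.5, 0.7, n),   # FTE staff (hypothetical)
    "effort_pm": rng.lognormal(3.0, 0.8, n),   # person-months (hypothetical)
})
df["productivity"] = df["esloc"] / df["effort_pm"]  # SLOC per person-month

# Step 1: split the sample into four project-size quartiles
df["size_q"] = pd.qcut(df["esloc"], 4, labels=[1, 2, 3, 4])

# Step 2: within each size bin, split into team-size sub-quartiles
df["staff_q"] = df.groupby("size_q", observed=True)["team_size"].transform(
    lambda s: pd.qcut(s, 4, labels=[1, 2, 3, 4])
)

# Median productivity for each (size quartile, staffing sub-quartile) cell
table = df.pivot_table(index="size_q", columns="staff_q",
                       values="productivity", aggfunc="median", observed=True)
print(table.round(0))
```

Reading the resulting 4x4 table down a column or across a row is the tabular equivalent of comparing box plots in the study's chart.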

To see the relationship between average productivity and project size, compare any four staffing quartiles of the same color in the graph below from left to right as size (bottom or horizontal axis) increases:


As the quartiles increase in size (bottom axis), average productivity (expressed as SLOC per person-month of effort on the left-hand axis) rises. The slope is reversed for projects of the same size (i.e., within a given size quartile). To see this, compare the four differently colored box plots in the second size quartile, highlighted in blue. The size and staffing vs. productivity relationships hold true regardless of which productivity measure is used: SLOC per person-month, function points per person-month, and QSM’s Productivity Index (PI) all increase as project size goes up but decrease as team size relative to project size increases. The implication, that optimal team size is not independent of project scope, should not surprise anyone who has ever worked on a project that was over- or under-staffed, but the ability to demonstrate these intuitively sensible relationships between scope and team size with real data is a valuable negotiation tool.
