17  Models

A common tool for improving decision making is the use of models.

17.1 What is a model?

A model is a mathematical construct representing the process of interest, made up of a set of variables and the mathematical relationships between them. A model can be used to predict future or unseen outcomes.

An example of a model we discussed in Chapter 12 involved predicting the level of dropout in a school using variables such as attendance rates, family socio-economic status, the school’s average SAT score, and the degree of parental involvement in the child’s schooling. This set of variables could be input into the model to produce the predicted level of dropout as output.

A model is typically developed using a set of observations for which we know both the input and output variables. (This is often called the training data set.) We then use the model to predict the outcome for a new observation where we know the input variables but not the outcome of interest.
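To make this train-then-predict workflow concrete, here is a minimal sketch in Python using the school dropout example above. The data, variable scales, and the choice of a logistic regression (via scikit-learn) are all hypothetical illustrations, not the model from Chapter 12.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: one row per past student whose outcome we know.
# Columns: attendance rate, family socio-economic index,
# school average SAT (hundreds of points), parental involvement score.
X_train = np.array([
    [0.95,  1.2, 11.5, 0.8],
    [0.60, -0.5,  9.8, 0.3],
    [0.80,  0.1, 10.5, 0.6],
    [0.50, -1.0,  9.4, 0.2],
])
y_train = np.array([0, 1, 0, 1])  # 1 = dropped out, 0 = completed school

# Develop the model on observations with known inputs and outputs
model = LogisticRegression()
model.fit(X_train, y_train)

# Predict for a new student whose outcome is not yet known
X_new = np.array([[0.70, 0.0, 10.0, 0.4]])
print(model.predict_proba(X_new)[:, 1])  # predicted probability of dropout
```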

17.2 Intuition versus statistical prediction

There is a considerable literature showing that models outperform expert judgement.

17.2.1 An illustration: predicting the quality of Bordeaux wines

Wine has been produced in the Bordeaux region of France since the Romans planted the first vineyards there almost 2000 years ago. Today, Bordeaux is famous for its red wine, and it’s the source of some of the most expensive wine in the world.

Yet the quality and price of that wine vary considerably from year to year. So each year the same questions are asked: will this vintage be a classic, like the great 1961? Or will it be more like the disappointing vintage of 1960?

To answer those questions, four months after a new Bordeaux vintage is barrelled, the experts take their first taste. This early in its life, the immature wine tastes like swill. The wine is still over a year away from being bottled, and it’s years away from its prime.

Despite the difficulty in determining the quality of a wine when it is so young, the experts use these first tastes to predict how good this wine will be when it matures. The wineries hang on these first expert predictions of quality. The predictions appear in wine guides and drive early demand. They influence the price of the early wine sales.

The economist Orley Ashenfelter (2008) proposed an alternative way to predict wine quality: a simple statistical model. There were only three inputs to his model - the temperature during the summer growing season, the previous winter’s rainfall, and the harvest rainfall. Ashenfelter circulated the predictions derived from his model in a newsletter to a small circle of wine lovers.
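A minimal sketch of a model in this spirit, assuming some historical vintage data, is an ordinary least squares regression of log vintage price on the three weather inputs. The numbers below are invented for illustration; Ashenfelter’s actual specification and coefficients are in Ashenfelter (2008).

```python
import numpy as np

# Hypothetical data: one row per past vintage.
# Columns: growing-season temperature (degrees C),
# winter rainfall (mm), harvest rainfall (mm).
weather = np.array([
    [17.1, 600, 160],
    [16.7, 690,  80],
    [17.3, 500, 130],
    [16.3, 420, 110],
    [17.6, 580, 190],
])
log_price = np.array([-0.99, -0.45, -0.81, -1.51, -0.60])  # invented log relative prices

# Fit ordinary least squares: log_price ~ intercept + the three weather inputs
X = np.column_stack([np.ones(len(weather)), weather])
coef, *_ = np.linalg.lstsq(X, log_price, rcond=None)

# Predict a new vintage from its weather alone,
# months before any expert has tasted the wine
new_vintage = np.array([1.0, 17.0, 550, 120])  # intercept term plus the three inputs
print(new_vintage @ coef)
```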

You can see in this story two contrasting ways of informing or making a decision - expert or human judgement in the first case and a model in the second. Which approach worked better?

Ashenfelter’s model could predict more of the variation in vintage price than the expert judgements (Ashenfelter and Jones 2013). This is despite the fact that those expert judgements affected the price. When Ashenfelter added weather information to the expert predictions, he improved them. To top it off, he didn’t even need to taste the wine. He could make predictions months before the experts.

17.2.2 Evidence

The story of Orley Ashenfelter’s prediction of wine quality is not an isolated example.

In 1954, the experimental psychologist Paul Meehl (2013) published Clinical versus Statistical Prediction. Meehl catalogued twenty empirical competitions between statistical methods and clinical judgement, involving prediction tasks such as academic results, psychiatric prognosis after electroshock therapy, and parole violation. The results were consistently either a victory for the statistical method or a tie with the clinical decision maker. In only one study could Meehl generously give a point to the experts.

Similarly, Grove et al. (2000) looked at 136 studies in medicine and psychiatry in which models had been compared to expert judgement. In 63 of these studies, the model was superior. In 65 there was a tie. This left 8 studies, out of 136, in which the expert was the better option.

This does not, however, mean that statistical methods are perfect. They have flaws. Models can be biased. There are many circumstances where they should not replace human decision making. Ultimately, decision making methods should be tested. Compare their accuracy. Examine the errors they make and the costs of those errors. And choose based on the empirical evidence that you can generate.

17.3 Why might some models outperform?

Many of the decision making problems that we discussed in the first module are eliminated by the use of models. By formalising what information is used and how it is incorporated, a model removes the heuristics that can lead to human error as factors. A model’s confidence intervals can also be calculated and calibrated.

I will now illustrate this as it relates to the problem of noise that I discussed in Chapter 16.

17.3.1 Noise

One of the major reasons that models can outperform is the noise in human decision making. Models, in contrast to humans, are typically consistent, returning the same decision each time they are given the same inputs.

An interesting implication of this difference in consistency is that models designed to predict the decisions of human decision makers typically outperform the very decision makers whose judgements were used to develop the model (for example, Goldberg 1970).

When developing a predictive model, you would normally develop it using the outcome of interest. For example, if seeking to predict whether a loan applicant will default, you would develop a model using the long-term outcomes of past borrowers and their characteristics. In a technique called bootstrapping (not to be confused with the statistical term bootstrapping), you don’t use the loan outcomes to develop the model, but rather the loan assessors’ historic predictions of whether each borrower would default. Building the model on predicted rather than actual default means that there will be errors in the data on which you are building it. Despite this, bootstrapping can be effective, largely due to the removal of noise. For example, in one study, models developed from the decisions of clinical psychologists tended to outperform most of those same psychologists in differentiating psychotic from neurotic patients.
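To make the distinction concrete, here is a minimal sketch with invented loan data. The only difference between the two approaches is the target the model is fit to - the realised defaults in the standard case, the assessors’ historic predictions in the bootstrapping case. The feature names and values are hypothetical.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical applicant characteristics:
# income ($10,000s), debt-to-income ratio, years of credit history.
X = np.array([
    [5.2, 0.40,  6],
    [3.1, 0.65,  2],
    [7.8, 0.20, 12],
    [4.5, 0.55,  4],
])
defaulted     = np.array([0, 1, 0, 1])  # actual long-term outcomes
assessor_said = np.array([0, 1, 0, 0])  # the assessors' (noisy) historic predictions

# Standard approach: fit the model to the outcome of interest
outcome_model = LogisticRegression().fit(X, defaulted)

# Bootstrapping approach: fit the model to the assessors' predictions.
# The target inherits the assessors' errors, but the fitted model applies
# their implicit policy identically every time, stripping out the noise.
bootstrap_model = LogisticRegression().fit(X, assessor_said)

applicant = np.array([[4.0, 0.50, 3]])
print(outcome_model.predict(applicant), bootstrap_model.predict(applicant))
```

The second model is trained on a noisier, more error-prone target, yet the literature above suggests it can still predict the true outcomes better than the individual assessors, because it applies their implicit policy without their day-to-day inconsistency.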