A System That Creates Noise: what can go wrong in human judgement?

research by Syntia

Wherever you see human judgements, you are likely to find noise. To improve the quality of our judgements, we need to overcome both noise and bias in our decisions.

The shooting range is a metaphor for what can go wrong in human judgement, especially in the diverse decisions that people make on behalf of organisations. In these situations we find two types of error. Some judgements are biased; they are systematically off target. Other judgements are noisy, as people who are expected to agree end up at very different points around the target. Both private and public organisations are urged to conduct noise audits and to undertake, with unprecedented seriousness, stronger efforts to reduce noise. If they do so, organisations could reduce widespread unfairness and reduce costs in many areas.
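The two kinds of error can be separated with simple arithmetic when the true value is known. The sketch below is illustrative only: the function name and the sample numbers are assumptions, not data from the book. It treats bias as the average error and noise as the spread of errors around that average, so that the mean squared error equals bias squared plus noise squared.

```python
from statistics import mean, pstdev

def bias_and_noise(judgements, true_value):
    """Split the errors of a set of judgements into bias and noise.

    bias  = average error (how far off target the group is as a whole)
    noise = standard deviation of the errors (how scattered the group is)
    """
    errors = [j - true_value for j in judgements]
    bias = mean(errors)
    noise = pstdev(errors)  # population standard deviation of the errors
    mse = mean([e * e for e in errors])
    return bias, noise, mse

# Hypothetical example: four people estimate a quantity whose true value is 10.
bias, noise, mse = bias_and_noise([12, 8, 10, 14], true_value=10)

# Overall error combines both components: MSE = bias^2 + noise^2.
print(bias, round(noise, 3), mse, round(bias**2 + noise**2, 3))
```

In this made-up case the group is slightly biased but noticeably noisy, so most of the overall error comes from noise rather than bias.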

To understand error in judgement, we must understand both bias and noise, and treat reducing them as a work in progress and a collective endeavour.

Sometimes noise is the more important problem. But in public conversations about human error, and in organisations all over the world, noise is rarely recognised. Here are a few examples of the alarming amount of noise in situations that matter:

Asylum decisions are as noisy as roulette. Whether an asylum seeker will be admitted into the United States is something like a lottery. A study of cases that were randomly allotted to different judges found that one judge admitted 5% of applicants, while another admitted 88%.

Personnel decisions are noisy. Interviewers of job candidates make widely different assessments of the same people. Performance ratings of the same employees are also highly variable and depend more on the person doing the assessment than on the performance being assessed.

Decisions to grant patents are noisy. Whether the patent office grants or rejects a patent is significantly related to which examiner is assigned the application. This variability is troublesome from the standpoint of equity.

Insurance companies are noisy. Underwriters have the task of setting insurance premiums for potential clients, and claims adjusters must judge the value of claims. You might predict that these tasks would be simple and mechanical, and that different professionals would come up with roughly the same amounts. Conducting a noise audit puts that prediction to the test, and the results can dismay the organisation’s leadership.

These examples involve studies of a large number of people making a large number of judgements. But many important judgements are singular rather than repeated: how to handle an apparently unique business opportunity, whether to launch a new product, how to deal with a pandemic, whether to hire someone who doesn’t meet the standard profile.

In the 1970s the universal enthusiasm for judicial discretion started to collapse for one simple reason: startling evidence of noise. In 1973, a famous judge, Marvin Frankel, drew public attention to the problem. He became a legal scholar whose views helped to establish sentencing guidelines for the federal courts. Before he became a judge, Frankel was a defender of freedom of speech and a human rights advocate who helped found the Lawyers’ Committee for Human Rights.

Frankel could be fierce, and with respect to noise in the criminal justice system, he was outraged. Here is how he describes his motivation.

If a federal bank robbery defendant was convicted, he or she could receive a maximum of 25 years. That meant anything from 0 to 25 years. And where the number was set, I soon realised, depended less on the case or the individual defendant than on the individual judge, i.e. on the views, predilections, and biases of the judge. So the same defendant in the same case could get widely different sentences depending on which judge got the case.

Writing in the early 1970s, he did not go quite so far as to defend what he called ‘displacement of people by machines.’ But startlingly, he came close. He believed that ‘the rule of law calls for a body of impersonal rules, applicable across the board, binding on judges as well as everyone else.’

Frankel’s book became one of the most influential in the entire history of criminal law, not only in the United States but also throughout the world.

Critics argued that the price of reducing noise was to make decisions unacceptably mechanical. Yale law professor Kate Stith and federal judge Jose Cabranes wrote that ‘the need is not for blindness, but for insight, for equity,’ which ‘can only occur in a judgement that takes account of the complexities of the individual case.’ This objection led to vigorous challenges to the guidelines, some of them based on law, others based on policy. These challenges failed until 2005, when the Supreme Court struck the guidelines down and they became merely advisory.

Harvard law professor Crystal Yang investigated the effects of changing the guidelines from mandatory to advisory, using a large data set of actual sentences involving four thousand criminal defendants. Her central finding is that, by multiple measures, inter-judge disparities doubled after the guidelines became advisory. She writes that her “findings raise large equity concerns because the identity of the assigned sentencing judge contributes significantly to the disparate treatment of similar offenders convicted of the same crimes.”

In 2005, after Frankel’s death, the mandatory guidelines were set aside, marking a return to law without order.

Judging is difficult because of complexity and uncertainty, and in most situations where judgements are made by a large group of professionals, some disagreement is unavoidable.

However, efforts at noise reduction often raise objections and run into serious difficulties. Those issues must be addressed when adopting any method to reduce noise, or the fight against noise will fail.

While few people object to the principle of judicial discretion, almost everyone disapproves of the magnitude of the disparities it produces.

Many professionals in any large company are authorised to make judgements that bind the company. For instance, an insurance company employs numerous underwriters who quote premiums for financial risks. It also employs many claims adjusters who forecast the cost of future claims and negotiate with claimants if disputes arise.

When a quote is requested, anyone who happens to be available may be assigned to prepare it. Yet the exact value of the quote has significant consequences for the company. A high premium is advantageous if the quote is accepted, but such a premium risks losing business to a competitor. A low premium is more likely to be accepted, but is less advantageous to the company. For any risk, there is a Goldilocks price that is just right, and there is a good chance that the average judgement of a large group of professionals is not too far from it. Prices that are higher or lower than this are costly, and that is how variability in noisy judgements hurts the bottom line.
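One way to put a number on the variability between quotes is to take every pair of professionals who priced the same case and express the gap between their quotes as a fraction of the pair’s average. The sketch below is a minimal illustration under that assumption; the function names and the sample premiums are hypothetical and do not come from any real audit.

```python
from itertools import combinations
from statistics import median

def relative_difference(a, b):
    """Gap between two quotes as a fraction of their average."""
    return abs(a - b) / ((a + b) / 2)

def noise_index(quotes):
    """Median relative difference over all pairs of quotes for one case."""
    diffs = [relative_difference(a, b) for a, b in combinations(quotes, 2)]
    return median(diffs)

# Hypothetical premiums quoted by five underwriters for the same case.
quotes = [9500, 16700, 12000, 11000, 14500]
print(f"noise index: {noise_index(quotes):.0%}")
```

An index of zero would mean every professional quoted the same price; the larger the index, the more the outcome depends on who happened to be assigned the case.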

The adjuster’s early estimate of a claim matters because it sets an implicit goal for the adjuster in future negotiations with the claimant. The insurance company is also legally obligated to reserve the predicted cost of each claim. The adjuster’s judgement is consequential for the company and even more consequential for the claimant, who receives the settlement.

The word lottery emphasises the role of chance in the selection of one underwriter or adjuster. In the normal operation of the company a single professional is assigned to a case, and no one can ever know what would have happened if another colleague had been selected instead.

Judgement lotteries allocate nothing but uncertainty. There is no justification for a system in which the outcome depends on the identity of the person randomly chosen to make a professional judgement.

Find a way to handle apparently unique decisions, improve the quality of your judgements, and overcome bias and noise.

References: Noise: A Flaw in Human Judgment, Daniel Kahneman, Olivier Sibony, Cass R. Sunstein.