Judges already use algorithms to justify doing what they want

When Northwestern University graduate student Sino Esthappan started researching how algorithms decide who stays in jail, he was expecting “a story about people versus technology.” On one side were the human judges, whom Esthappan interviewed extensively. On the other were the risk assessment algorithms used in hundreds of U.S. counties to gauge how dangerous it would be to grant defendants bail. What he found was more complex, and it suggested that these tools may be obscuring larger problems with the bail system itself.

Algorithmic risk assessments aim to calculate the risk that a defendant, if released, will fail to return to court or, worse, harm others. By comparing the backgrounds of criminal defendants to a large database of past cases, they are supposed to help judges gauge how risky it would be to release someone from jail. Along with other algorithm-based tools, they are playing an increasingly large role in an often overburdened criminal justice system. And in theory, they should help reduce bias from human judges.

But Esthappan’s work, published in the journal Social Problems, found that judges did not adopt or reject the recommendations of these algorithms wholesale. Instead, they reported using them selectively, motivated by deeply human factors to accept or ignore what the scores told them.

Pretrial risk assessment tools estimate the likelihood that accused people will return for their court dates if they are released from jail. The tools take details supplied by pretrial officers, including information such as criminal history and family profile. They compare this information against a database holding hundreds of thousands of previous case records, looking at how defendants with similar backgrounds behaved. They then produce an assessment, which might take the form of a “low,” “medium,” or “high” risk label or a number on a scale. Judges are given the scores for use in pretrial hearings: brief meetings held shortly after a defendant’s arrest that determine whether (and under what conditions) the defendant will be released.
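
To make that comparison step concrete, here is a minimal, purely illustrative Python sketch of the kind of lookup such a tool performs. The field names, similarity measure, and risk thresholds below are hypothetical assumptions for illustration only, not details of any actual risk assessment product.

```python
# Toy illustration (not a real tool): turn a defendant's background into a
# "low"/"medium"/"high" label by comparing it against past case outcomes.
from dataclasses import dataclass

@dataclass
class Defendant:
    prior_convictions: int        # details a pretrial officer might supply
    prior_failures_to_appear: int
    age: int
    pending_charges: int

# Hypothetical historical records: (defendant profile, returned_to_court)
HISTORICAL_CASES = [
    (Defendant(0, 0, 34, 0), True),
    (Defendant(3, 1, 22, 1), False),
    (Defendant(1, 0, 45, 0), True),
    # ...a real database would hold hundreds of thousands of past cases
]

def similarity(a: Defendant, b: Defendant) -> float:
    """Crude closeness measure between two defendant profiles."""
    return 1.0 / (1.0
                  + abs(a.prior_convictions - b.prior_convictions)
                  + abs(a.prior_failures_to_appear - b.prior_failures_to_appear)
                  + abs(a.age - b.age) / 10
                  + abs(a.pending_charges - b.pending_charges))

def risk_label(defendant: Defendant) -> str:
    """Estimate failure-to-appear risk from how similar past defendants fared."""
    weights = [similarity(defendant, past) for past, _ in HISTORICAL_CASES]
    failure_weight = sum(w for w, (_, returned) in zip(weights, HISTORICAL_CASES)
                         if not returned)
    failure_rate = failure_weight / sum(weights)  # weighted share of non-returns
    if failure_rate < 0.2:
        return "low"
    if failure_rate < 0.5:
        return "medium"
    return "high"

print(risk_label(Defendant(prior_convictions=2, prior_failures_to_appear=1,
                           age=27, pending_charges=0)))
```

Real tools use far richer statistical models trained on vastly more records, but the basic shape is the same one judges encounter at a hearing: a defendant profile goes in, it is compared against past outcomes, and a coarse risk label comes out.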

As with other algorithmic criminal justice tools, supporters position them as neutral, data-driven correctives to human whims and biases. Opponents raise issues such as the risk of racial profiling. “Because many of these tools are based on criminal history, the implication is that criminal history is also racially coded based on law enforcement surveillance practices,” Esthappan says. “So there is already an argument that these tools reproduce past biases and encode them into the future.”

It’s also unclear how well they work. A 2016 ProPublica investigation found that a risk score algorithm used in Broward County, Florida, was “extremely unreliable in predicting violent crime.” Only 20 percent of the people the algorithm predicted would commit violent crimes actually did so within two years of their arrest. The program was also more likely to label Black defendants, compared with white defendants, as future criminals or higher risk, ProPublica found.

Both fears and promises about algorithms in the courtroom assume judges use them consistently

Yet University of Pennsylvania criminology professor Richard Berk argues that human decision-makers may be just as flawed. “These criminal justice systems are made up of human institutions and people who are all flawed, and not surprisingly, they don’t do a very good job of describing or predicting people’s behavior,” Berk says. “So the bar is really quite low, and the question is: Can algorithms raise the bar? And if accurate information is provided, the answer is yes.”

But both the fears and the promises about algorithms in the courtroom assume that judges use them consistently. Esthappan’s work shows that this is a flawed assumption at best.

Esthappan interviewed 27 judges in four criminal courts in different parts of the country for a year between 2022 and 2023, asking questions such as: “When do you find risk scores more or less useful?” and “How and with whom do you discuss risk scores at pretrial hearings?” He also analyzed local news and case files, observed 50 hours of bond court, and interviewed others who work in the judicial system to help contextualize the findings.

Judges told Esthappan that they used the algorithmic tools to quickly process low-risk cases, relying on the automated scores even when they were unsure of their legitimacy. They were generally wary of following low risk scores for defendants accused of crimes such as sexual assault and intimate partner violence, sometimes because they believed the algorithms under- or over-weighted various risk factors, and sometimes because their own reputations were at stake. Conversely, some described using the scores to explain why they had made an unpopular decision, believing that the risk scores lent it authoritative weight.

“Many judges used their own moral views on particular charges as criteria to decide when risk scores were or were not legitimate in the eyes of the law.”

The interviews revealed recurring patterns in judges’ decisions to use risk assessment scores, often based on defendants’ criminal history or social background. Some judges believed the systems underestimated the importance of some red flags, such as extensive juvenile records or certain types of weapons charges, or overemphasized factors such as past criminal records or low education levels. “Many judges used their own moral views on particular charges as criteria for deciding when risk scores were or were not legitimate in the eyes of the law,” Esthappan writes.

Some judges also said they used the scores as a matter of efficiency. These pretrial hearings are brief (often less than five minutes) and require rapid decisions based on limited information. The algorithmic score provides at least one more factor to consider.

But judges were also acutely aware of how a decision would reflect on them, and according to Esthappan, that was a huge factor in whether they trusted the risk scores. When judges saw a charge they believed stemmed from poverty or addiction rather than posing a threat to public safety, they often deferred to the risk scores, seeing little risk to their own reputations if they got it wrong, and viewing their role, as one judge described it, as calling “balls and strikes” rather than being a “social engineer.”

Judges were more likely to be skeptical of risk scores for high-profile charges that carry a certain moral weight, such as rape or domestic violence, Esthappan found. This was partly because they saw problems with how the system weighted information for certain crimes; in cases of intimate partner violence, for example, they believed that even defendants without a long criminal history could be dangerous. But they were also aware that the stakes, for themselves and for others, were higher. “Your worst nightmare is that you let someone get out on a lower bond and then they go and hurt someone,” one judge said. “So all of us, when I see these stories on the news, I think it could be any of us.”

There are also costs to keeping a genuinely low-risk defendant in jail. It keeps people who are unlikely to harm anyone away from work, school, or family before they have been convicted of a crime. But doing so carries little reputational risk for judges, and adding a risk score does not change that calculus.

The deciding factor for judges was often not whether the algorithm seemed reliable but whether it would help them justify a decision they wanted to make. Judges who release a defendant based on a low risk score, for example, “may shift some of the responsibility from themselves to the score,” Esthappan said. If the alleged victim “wants someone to be put in jail,” one subject said, “what you do as a judge is say, ‘We’re guided by a risk assessment that scores the likelihood of the defendant appearing in court and of being re-arrested. And based on the law and that score, my job is to set a bond that protects others in society.’”

“In practice, risk scores expand the discretion of judges, who use them strategically to justify criminal sanctions.”

Esthappan’s work pokes holes in the idea that algorithmic tools result in fairer, more consistent decisions. If judges are choosing when to trust the scores based on factors like reputational risk, the tools may not be reducing human bias, Esthappan said; they may actually be legitimizing that bias and making it harder to detect. “While policymakers tout their ability to restrict judicial discretion, in practice risk scores expand the discretion of judges, who use them strategically to justify criminal sanctions,” Esthappan writes in the study.

Risk assessments are “a technocratic plaything of policymakers and academics,” says Megan Stevenson, an economist and criminal justice expert at the University of Virginia School of Law. They seem like an attractive way to “remove randomness and uncertainty from this process,” she says, but studies of their effects show that they generally don’t have a big impact on outcomes.

A bigger problem is that judges are forced to work with extremely limited time and information. Berk, the University of Pennsylvania professor, says collecting more and better information could help algorithms make better assessments. But this would require time and resources that court systems may not have.

But when Esthappan interviewed public defenders, they raised an even more fundamental question: should pretrial detention in its current form exist at all? Judges don’t just work with incomplete data; often relying heavily on guesswork, they decide on someone’s freedom before that person even has a chance to fight the charges. “In this context, I think it makes sense for judges to rely on a risk assessment tool because they have very limited information,” Esthappan tells The Verge. “But on the other hand, I find it a bit of a distraction.”

Algorithmic tools aim to solve a real problem with imperfect human decision-making. “The question is, is that really the problem?” Esthappan tells The Verge. “Are the judges being biased, or is there something more structurally problematic about the way we listen to people at the preliminary hearing?” His answer: “There is a problem that cannot necessarily be fixed by risk assessments, but it gets into a deeper cultural issue in the criminal courts.”