Algorithms in the Criminal Justice System
Criminal justice algorithms, sometimes called “risk assessments” or “evidence-based methods,” are controversial tools that purport to predict the future behavior of defendants and incarcerated persons. These proprietary techniques are used to set bail, determine sentences, and even contribute to determinations of guilt or innocence. Yet the inner workings of these tools are largely hidden from public view.
Many “risk assessment” algorithms take into account personal characteristics like age, sex, geography, family background, and employment status. As a result, two people accused of the same crime may receive sharply different bail or sentencing outcomes based on inputs that are beyond their control, yet they have no way of assessing or challenging the results.
As criminal justice algorithms have come into greater use at the federal and state levels, they have also come under greater scrutiny. Many criminal justice experts have denounced “risk assessment” tools as opaque, unreliable, and unconstitutional. The Supreme Court is also considering whether to take a case on the use of a secretive technique to predict possible recidivism.
"Risk assessment" tools are algorithms that use socioeconomic status, family background, neighborhood crime, employment status, and other factors to reach a supposed prediction of an individual's criminal risk, either on a scale from “low” to “high” or with specific percentages. See Wisconsin’s COMPAS risk assessment questionnaire, from ProPublica. In 2014, then-U.S. Attorney General Eric Holder called for the U.S. Sentencing Commission to study the use of algorithms in courts, concerned that the scores may be a source of bias. At the same time, the Justice Department expressed concern about the use of factors such as education levels, employment history, family circumstances, and demographic information. While the Sentencing Commission has studied the recidivism risk for federal offenders, it has not commissioned a study of risk scores.
Criminal justice algorithms are used across the country, but the specific tools differ by state or even county. In addition, because such algorithms are proprietary, they are not subject to state or federal open government laws. Jurisdictions have generally used one of three main systems, or adapted their own version of each: Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), Public Safety Assessment (PSA) and Level of Service Inventory Revised (LSI-R). COMPAS, created by the for-profit company Northpointe, assesses variables under five main areas: criminal involvement, relationships/lifestyles, personality/attitudes, family, and social exclusion. The LSI-R, developed by Canadian company Multi-Health Systems, also pulls information from a wide set of factors, ranging from criminal history to personality patterns. Using a narrower set of parameters, the Public Safety Assessment, developed by the Laura and John Arnold Foundation, only considers variables that relate to a defendant’s age and criminal history.
A 2016 investigation by ProPublica tested the COMPAS system adopted by the state of Florida using the same benchmark as COMPAS: the likelihood of re-offending within two years. ProPublica found that the formula was particularly likely to flag black defendants as future criminals, labeling them as such at almost twice the rate of white defendants. In addition, white defendants were labeled as low risk more often than black defendants. The investigators also found that the scores were unreliable in forecasting violent crime: only 20 percent of the people predicted to commit violent crimes actually went on to do so. When a full range of crimes was considered, including misdemeanors, the scores were more accurate but still far from precise: 61 percent of the defendants deemed likely to reoffend were arrested for a subsequent crime within two years.
According to ProPublica, some miscalculations of risk stemmed from inaccurate inputs (for example, failing to include a person’s prison record from another state), while other results were attributed to the way factors are weighted (for example, someone who has molested a child may be categorized as low risk because he has a job, while someone who was convicted of public intoxication would be considered high risk because he is homeless).
Prediction Fails Differently for Black Defendants
|Outcome|White|African American|
|---|---|---|
|Labeled Higher Risk, But Didn't Re-Offend|23.5%|44.9%|
|Labeled Lower Risk, Yet Did Re-Offend|47.7%|28.0%|
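The percentages in the table above, and the 61 percent figure in the preceding text, are different kinds of error measures. A minimal sketch of how such rates could be computed from outcome counts follows. It assumes the table rows are rates conditioned on what actually happened (a false positive rate and a false negative rate), while the 61 percent figure is the share of flagged defendants who were later arrested. The class name, function names, and all counts below are hypothetical and chosen only to make the arithmetic easy to follow; they are not ProPublica's data or code.

```python
from dataclasses import dataclass


@dataclass
class GroupOutcomes:
    """Outcome counts for one group. All values used below are hypothetical."""
    high_risk_reoffended: int    # labeled higher risk, did re-offend
    high_risk_no_reoffense: int  # labeled higher risk, did not re-offend
    low_risk_reoffended: int     # labeled lower risk, did re-offend
    low_risk_no_reoffense: int   # labeled lower risk, did not re-offend


def false_positive_rate(g: GroupOutcomes) -> float:
    # Among people who did NOT re-offend, the share labeled higher risk.
    did_not_reoffend = g.high_risk_no_reoffense + g.low_risk_no_reoffense
    return g.high_risk_no_reoffense / did_not_reoffend


def false_negative_rate(g: GroupOutcomes) -> float:
    # Among people who DID re-offend, the share labeled lower risk.
    reoffended = g.high_risk_reoffended + g.low_risk_reoffended
    return g.low_risk_reoffended / reoffended


def share_of_flagged_who_reoffended(g: GroupOutcomes) -> float:
    # Among people labeled higher risk, the share who did re-offend.
    flagged = g.high_risk_reoffended + g.high_risk_no_reoffense
    return g.high_risk_reoffended / flagged


# Hypothetical counts for two illustrative groups (not real data).
group_a = GroupOutcomes(300, 200, 200, 800)
group_b = GroupOutcomes(400, 400, 100, 600)

for name, g in (("group_a", group_a), ("group_b", group_b)):
    print(
        name,
        f"FPR={false_positive_rate(g):.1%}",
        f"FNR={false_negative_rate(g):.1%}",
        f"flagged who re-offended={share_of_flagged_who_reoffended(g):.1%}",
    )
```

With these made-up counts, the two groups have fairly similar shares of flagged people who re-offend (60 percent and 50 percent), yet group_b's false positive rate is twice group_a's (40 percent versus 20 percent). This is the kind of gap the table describes: a score can look comparably accurate for two groups by one measure while its mistakes fall very differently on each.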