Algorithms in the Criminal Justice System: Pre-Trial Risk Assessment Tools

Summary

Artificial intelligence is used widely throughout the criminal justice system. The most common tools are "pretrial risk assessment" algorithms, which are used in nearly every state. Criminal justice algorithms—sometimes called “risk assessments” or “evidence-based methods”—are controversial tools that purport to predict the future behavior of defendants and incarcerated persons. The tools vary, but most use “actuarial assessments” to estimate (1) the likelihood that the defendant will re-offend before trial (“recidivism risk”) and (2) the likelihood that the defendant will fail to appear at trial (“FTA”).

These proprietary techniques are used to set bail, determine sentences, and even contribute to determinations about guilt or innocence. Yet the inner workings of these tools are largely hidden from public view.

Many “risk assessment” algorithms take into account personal characteristics like age, sex, geography, family background, and employment status. As a result, two people accused of the same crime may receive sharply different bail or sentencing outcomes based on inputs that are beyond their control, and they have no way of assessing or challenging the results.

As criminal justice algorithms have come into greater use at the federal and state levels, they have also come under greater scrutiny. Many criminal justice experts have denounced “risk assessment” tools as opaque, unreliable, and unconstitutional.

Background

"Risk assessment" tools are algorithms that use socioeconomic status, family background, neighborhood crime, employment status, and other factors to reach a supposed prediction of an individual's criminal risk, either on a scale from “low” to “high” or with specific percentages. See Wisconsin’s COMPAS risk assessment questionnaire, from ProPublica. In 2014, then-U.S. Attorney General Eric Holder called for the U.S. Sentencing Commission to study the use of algorithms in courts, concerned that the scores may be a source of bias. At the same time, the Justice Department expressed concern about the use of factors such as education levels, employment history, family circumstances, and demographic information. While the Sentencing Commission has studied the recidivism risk for federal offenders, it has not commissioned a study of risk scores.

Criminal justice algorithms are used across the country, but the specific tools differ by state and even by county. In addition, because such algorithms are proprietary, they are not subject to state or federal open government laws. Jurisdictions have generally used one of three main systems, or adapted their own version of one: the Correctional Offender Management Profiling for Alternative Sanctions (COMPAS), the Public Safety Assessment (PSA), and the Level of Service Inventory-Revised (LSI-R). COMPAS, created by the for-profit company Northpointe, assesses variables in five main areas: criminal involvement, relationships/lifestyles, personality/attitudes, family, and social exclusion. The LSI-R, developed by the Canadian company Multi-Health Systems, also pulls information from a wide set of factors, ranging from criminal history to personality patterns. The Public Safety Assessment, developed by the Laura and John Arnold Foundation, uses a narrower set of parameters, considering only variables related to a defendant's age and criminal history.
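
To make the mechanics concrete, here is a minimal sketch of how a points-based actuarial tool in the PSA's style (age and criminal-history inputs only) might turn a defendant's record into a raw score and a risk band. The factors, weights, and cutoffs are hypothetical illustrations, not the actual items or weights of the PSA or any other tool discussed here.

    # Hypothetical points-based pretrial risk score. The factors, weights,
    # and cutoffs are invented for illustration; they are NOT the actual
    # items or weights of the PSA, COMPAS, or the LSI-R.
    from dataclasses import dataclass

    @dataclass
    class Defendant:
        age: int
        prior_convictions: int
        prior_failures_to_appear: int
        pending_charge_at_arrest: bool

    def raw_score(d: Defendant) -> int:
        """Sum integer points per factor; more points = higher assessed risk."""
        points = 0
        if d.age < 23:                                   # youth adds points
            points += 2
        points += min(d.prior_convictions, 3)            # capped contribution
        points += 2 * min(d.prior_failures_to_appear, 2)
        if d.pending_charge_at_arrest:
            points += 1
        return points

    def risk_band(points: int) -> str:
        """Collapse the raw score into the low/medium/high band a judge sees."""
        if points <= 2:
            return "low"
        return "medium" if points <= 5 else "high"

    d = Defendant(age=21, prior_convictions=1,
                  prior_failures_to_appear=0, pending_charge_at_arrest=True)
    print(raw_score(d), risk_band(raw_score(d)))         # -> 4 medium

In practice, the tools differ mainly in which factors enter this sum and how the cutoffs are validated; the judge typically sees only the band, not the arithmetic.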

A 2016 investigation by ProPublica tested the COMPAS system adopted by Broward County, Florida, using the same benchmark COMPAS uses: the likelihood of re-offending within two years. ProPublica found that the formula was particularly likely to falsely flag black defendants as future criminals, wrongly labeling them this way at almost twice the rate of white defendants, while white defendants were mislabeled as low risk more often than black defendants. The investigators also found that the scores were unreliable in forecasting violent crime: only 20 percent of the people predicted to commit violent crimes actually went on to do so. When the full range of crimes was considered, including misdemeanors, accuracy improved but remained modest: 61 percent of those deemed likely to reoffend were arrested for a subsequent crime within two years. According to ProPublica, some miscalculations of risk stemmed from inaccurate inputs (for example, failing to include a prison record from another state), while others were attributed to the way factors are weighed (for example, someone who has molested a child may be categorized as low risk because he has a job, while someone convicted of public intoxication may be considered high risk because he is homeless).

Prediction Fails Differently for Black Defendants

                                              WHITE    AFRICAN-AMERICAN
Labeled Higher Risk, But Didn't Re-Offend     23.5%    44.9%
Labeled Lower Risk, Yet Did Re-Offend         47.7%    28.0%

Source: ProPublica
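
The table's percentages are error rates conditioned on the observed outcome. Below is a minimal sketch of the calculation behind the first row (the rate at which non-re-offenders were labeled higher risk, per group), using invented records rather than ProPublica's Broward County dataset:

    from collections import defaultdict

    # Each record: (race, labeled_higher_risk, reoffended_within_two_years).
    # These six records are invented solely to exercise the calculation.
    records = [
        ("white", True, False), ("white", False, False), ("white", False, True),
        ("black", True, False), ("black", True, False), ("black", False, True),
    ]

    def false_positive_rates(records):
        """Per group: share of non-re-offenders who were labeled higher risk."""
        tallies = defaultdict(lambda: [0, 0])  # group -> [flagged, non-re-offenders]
        for group, flagged, reoffended in records:
            if not reoffended:                 # condition on "did not re-offend"
                tallies[group][1] += 1
                if flagged:
                    tallies[group][0] += 1
        return {g: flagged / total for g, (flagged, total) in tallies.items() if total}

    print(false_positive_rates(records))       # -> {'white': 0.5, 'black': 1.0}

The second row of the table is the mirror-image calculation: among defendants who did re-offend, the share labeled lower risk.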

COMPAS is one of the most widely used algorithms in the country. Northpointe published a validation study of the system in 2009, but it did not include an assessment of predictive accuracy by ethnicity. The study referenced an earlier evaluation of COMPAS’s accuracy by ethnicity, which reported weaker accuracy for African-American men, but claimed the small sample size rendered that finding unreliable. Northpointe has not shared how its calculations are made, but has stated that its future crime formula includes factors such as education levels and whether a defendant has a job. Many jurisdictions have adopted COMPAS, and other "risk assessment" methods generally, without first testing their validity.
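
A validation study of the kind described above typically reports a ranking statistic such as AUC, and a subgroup analysis simply repeats that computation per ethnicity. Here is a minimal sketch, assuming scikit-learn is available and using fabricated scores and outcomes:

    # Subgroup validity check: compute AUC (how well risk scores rank
    # re-offenders above non-re-offenders) separately per group.
    # Scores, outcomes, and groups are fabricated for illustration.
    from sklearn.metrics import roc_auc_score

    scores     = [3, 7, 5, 9, 2, 8, 4, 6]   # decile-style risk scores
    reoffended = [0, 1, 0, 1, 0, 0, 1, 1]   # observed two-year outcome
    groups     = ["a", "a", "a", "a", "b", "b", "b", "b"]

    for g in sorted(set(groups)):
        idx = [i for i, gi in enumerate(groups) if gi == g]
        auc = roc_auc_score([reoffended[i] for i in idx],
                            [scores[i] for i in idx])
        print(g, round(auc, 2))              # -> a 1.0, b 0.5 on this toy data

An AUC of 0.5 is no better than chance; a materially lower AUC for one subgroup is the kind of result the referenced study described as weaker accuracy for African-American men.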

Defense advocates are calling for more transparent methods because they are unable to challenge the validity of the results at sentencing hearings. Professor Danielle Citron argues that because the public has no opportunity to identify problems with troubled systems, it cannot present those complaints to government officials, and government actors in turn lack the information needed to change policy.

Unanswered Questions

  • How much should judges rely on these algorithms?
  • Some argue that "risk assessment" should be limited to probation hearings or pre-trial release and not used in sentencing at all. In fact, the COMPAS system was created not for use in sentencing but to aid probation officers in determining which defendants would succeed in specific types of treatment. Others caution against overreliance at sentencing, which may be a natural tendency when judges are given data that appears to rest on concrete, reliable calculations. At least one judge has set aside an agreed-upon plea deal and given a defendant more jail time because of the defendant's high "risk assessment" score: Judge Babler in Wisconsin overturned a plea deal agreed to by the prosecution and defense (one year in county jail with follow-up supervision) and imposed two years in state prison and three years of supervision after seeing that the defendant scored high risk for future violent crime and medium risk for general recidivism.

    Professor Sonja Starr argues that "risk assessment" results identify who has the highest risk of recidivism, but the question most relevant to judges is whose risk of recidivism would be reduced most by incarceration. Considering risk in the abstract at sentencing therefore may not advance the goal of deterrence. In addition, a recidivism score predicts risk within a particular window (e.g., two years) from the time of release or the start of probation. It conveys no information about how much crime a person would commit if given one length of incarceration rather than another (e.g., two years rather than five). Starr rejects as an oversimplification the assumption that incarcerating those deemed riskiest will prevent the most crime: this view considers neither the crimes committed by other individuals nor the possibility that incarceration makes an already risky person more dangerous by increasing their risk of recidivism.

  • What factors should be considered?
  • Factors such as demographics, socioeconomic background, and family characteristics may serve as proxies for race. Because these variables are highly correlated with race, they are likely to have a racially disparate impact. In addition, because of de facto segregation and higher crime rates in urban neighborhoods, including neighborhood crime rates will further compound the inequality. As a public policy matter, Starr argues that "risk assessment" factors based on demographics, socioeconomic background, and family characteristics may not serve the intended goal of reducing incarceration: because mass incarceration already has a racially disparate impact, "risk assessment" algorithms produce higher risk estimates, all other things equal, for subgroups whose members are already disproportionately incarcerated.

    Another arguable flaw in the input questions is that considering employment history and financial resources results in extra, unequal punishment of the poor, which may violate the Equal Protection Clause under the precedent of Bearden v. Georgia, in which the Supreme Court rejected Georgia's argument that poverty was a recidivism factor justifying additional incapacitation. To avoid perpetuating a racially disparate impact, advocates argue for a narrow range of questions, such as questions based strictly on past or present criminal behavior, or for an individualized assessment of a defendant's conduct, mental state, and attitudes.

  • Do proprietary algorithms violate a defendant's right to due process?
  • Since the specific formulas used to determine "risk assessment" scores are proprietary, defendants are unable to challenge the validity of the results, which may violate their right to due process. The use of COMPAS in sentencing has been challenged in Loomis v. Wisconsin on two due process grounds. First, Loomis argues that the proprietary nature of COMPAS prevents defendants from challenging the assessment's scientific validity; the state does not dispute that the process is secret and non-transparent, but contends that Loomis fails to show that a COMPAS assessment contains or produces inaccurate information. Second, Loomis argues that the algorithm is unconstitutional in the way it considers gender: COMPAS uses a separate scale for women and men, so, all other factors being equal, assessment results will differ based on gender alone.

    'Risk Assessment' Information (state-by-state)

    The following table is based on a survey of state practices performed by EPIC in September 2019.

    STATE | TYPE/SCOPE OF USE (if known) | VALIDITY STUDY?
    Alabama | VPRAI / Jefferson County | Yes
    Alaska | State created / Statewide | Yes
    Arizona | PSA / Statewide; VPRAI / 2 county superior courts | Unknown
    Arkansas | State created / Statewide | Yes
    California (sample risk assessment documents from San Francisco and Napa County) | PSA / 3 counties; PRRS II / 2 counties | In Progress
    Colorado (sample risk assessment documents) | CPAT / Statewide; ODARA for DV / Statewide | In Progress
    Connecticut | State created / Statewide | Yes
    Delaware | State created (DELPAT) / Statewide | Yes
    Florida | PSA / Volusia County; COMPAS for sentencing / Statewide; state-created FPRAI being piloted / 6 counties | Yes
    Georgia | State created / Some counties | Unknown
    Hawaii | PSA / Statewide; ORAS-PAT / Statewide | Yes
    Idaho | State created / Statewide | In Progress (transparency law passed*)
    Illinois | PSA / 3 counties; VPRAI/RVRA / Most courts | Yes
    Indiana (sample risk assessment documents) | Mandatory use of IRAS and IYAS / Statewide | Yes
    Iowa | PSA / 4 counties via pilot program; IRR | Yes
    Kansas | State created / Johnson County | Unknown
    Kentucky | PSA / Statewide | Yes
    Louisiana | PSA / Orleans Parish | Yes
    Maine | ODARA (sex offenders) / Statewide; 2019 task force for expansion | Yes
    Maryland | State created / Most counties | Yes
    Massachusetts | Currently under debate; not yet in use | N/A
    Michigan | COMPAS for sentencing / Statewide | Yes
    Minnesota | MNPAT / Statewide | In Progress
    Mississippi | State created / Statewide | Unknown
    Missouri | State created / Statewide; separate statewide systems for juvenile and sex offenders; Oregon Public Safety Checklist used for sentencing | Yes
    Montana | PSA / 5 counties | Yes
    Nebraska (sample assessments) | Site of federal PSRAT piloting/testing | N/A
    Nevada | State created / Statewide (adopted Mar. 2019 by the Nevada Supreme Court) | Yes
    New Hampshire | Currently being studied for deployment | N/A
    New Jersey | PSA / Statewide | Yes
    New Mexico | PSA / 1 county; ODARA for DV | Yes
    New York | City created / Citywide (NYC); State created / Statewide for parole | Yes
    North Carolina | PSA / 1 county; another statewide tool in development | Yes
    Ohio | PSA / 1 county; ORAS-PAT / Statewide | Yes
    Oklahoma | ORAS for pretrial services program + LSI-R / Statewide | Yes
    Oregon (sample assessments) | Public Safety Checklist | Yes
    Pennsylvania | PSA / Statewide; State created / 1 county | Yes
    Rhode Island | PSA / Statewide | Yes
    South Carolina | State created / Cash bail use | Unknown
    South Dakota | PSA / 2 counties | Yes
    Tennessee | State created / One judicial district (test) | In Progress
    Texas (sample assessments) | PSA / Harris County; PRAISTX (derivative of ORAS) / Statewide parole board | Yes
    Utah | PSA / Statewide | Yes
    Vermont | Statutory authorization for risk and needs assessment | Unknown
    Virginia | VPRAI (revised by Luminosity) / Statewide; Oregon Public Safety Checklist used for sentencing | Yes
    Washington | PSA / 1 county | Yes
    West Virginia | LS/CMI | Yes
    Wisconsin (sample assessment documents) | PSA / 2 counties; COMPAS / Statewide | Yes
    Wyoming | COMPAS for prisoners / Statewide | Unknown
    Federal | PTRA | Yes

    * Bill enacted Mar. 2019: requires transparency, notification, and explainability.
    ** There is no official compendium of risk assessments used by states.

    Abbreviations Key:

    DV - Domestic Violence
    COMPAS - Correctional Offender Management Profiling for Alternative Sanctions
    PSA - Public Safety Assessment
    PTRA - Pretrial Risk Assessment Instrument
    CPAT - Colorado Pretrial Assessment Tool
    PRRS - Pretrial Release Risk Scale
    DELPAT - Delaware Pretrial Assessment Tool
    ODARA - Ontario Domestic Assault Risk Assessment Tool
    MNPAT - Minnesota Pretrial Assessment Tool
    ORAS - Ohio Risk Assessment System
    LS/CMI - Level of Service/Case Management Inventory
    PRAISTX - Pretrial Risk Assessment Information System
    VPRAI - Virginia Pretrial Risk Assessment Instrument
    IRAS - Indiana Risk Assessment System
    IYAS - Indiana Youth Assessment System

    EPIC's Interest

    EPIC has a strong interest in open government. Public disclosure of this information improves government oversight and accountability. It also helps ensure that the public is fully informed about the activities of government. EPIC routinely files lawsuits to force disclosure of agency records that impact critical privacy interests.

    EPIC also has a strong interest in algorithmic transparency. Secrecy of the algorithms used to determine guilt or innocence undermines faith in the criminal justice system. In support of algorithmic transparency, EPIC submitted FOIA requests to six states to obtain the source code of "TrueAllele," a software product used in DNA forensic analysis. According to news reports, law enforcement officials use TrueAllele test results to establish guilt, but individuals accused of crimes are denied access to the source code that produces the results.

    The Universal Guidelines for Artificial Intelligence, grounded in a human rights framework, set forth twelve principles intended to guide the design, development, and deployment of AI, as well as frameworks for policy and legislation. Broadly, the guidelines address the rights and obligations of: (1) fairness, accountability, and transparency; (2) autonomy and human determination; (3) data accuracy and quality; (4) safety and security; and (5) minimization of scope. These principles can also guide the use of algorithms in the pre-trial risk context.

    The very first principle, transparency, is seldom honored with pre-trial risk assessments. One of the primary criticisms of these tools is that they are proprietary, developed by technology companies that refuse to disclose the inner workings of the “black box.” Companies have asserted trade secret and other intellectual property defenses against demands to reveal the underlying logic of the systems. In March 2019, Idaho became the first state to enact a law specifically promoting transparency, accountability, and explainability in pre-trial risk assessment tools. The law bars trade secrecy and IP defenses, requires public availability of ‘all documents, data, records, and information used by the builder to build or validate the pretrial risk assessment tool,’ and empowers defendants to review all calculations and data that went into their risk score.

    EPIC FOI Documents

    EPIC obtained the following documents concerning criminal justice algorithms through state freedom of information requests.

    Resources

    Notable Cases

    • EPIC v. DOJ (Suit for records of Criminal Justice Algorithms by the Federal Government)
    • EPIC v. CBP (Suit for records of secret analytical tools used to assign risk assessments to travelers)
    • EPIC v. DHS (Suit for records of DHS program that predicts crime risk based on “physiological and behavioral signals”)
    • United States v. Booker, 125 S. Ct. 738 (2005)
    • Mistretta v. United States, 109 S. Ct. 647 (1989)
    • State v. Loomis, No. 16-6387 (U.S.) (Wisconsin case in which the defendant unsuccessfully petitioned the U.S. Supreme Court for certiorari)
    • Iowa v. Guise, 921 N.W.2d 235 (2016).
    • Doe v. Sex Offender Registry Board, 466 Mass. 594, 999 N.E.2d 478 (2013) (holding that the Sex Offender Registry Board arbitrarily ignored scientific evidence that female offenders generally pose a much lower risk of re-offense; the Board was empowered to consider any useful information, including scientific evidence introduced by the offender, in arriving at a classification decision, and authoritative evidence was introduced suggesting that established "risk assessment" guidelines, developed from studies of male offenders, could not accurately predict the recidivism risk of a female offender, and that such risk could not be evaluated without examining the effect of gender)
    • Malenchik v. State, No. 79A02-0902-CR-133 (Ind. Ct. App. June 5, 2009) (holding that it was not improper for the trial court to take into consideration a defendant’s LSI-R score at sentencing)
    • In re CDK, 64 S.W.3d 679 (Tex. App. 2002) (holding that admitting an assessment report on a father's sexual deviancy as expert witness testimony was an abuse of discretion because the plaintiff did not show how the formulas were derived or whether they had ever been subjected to analysis or testing)

    Documents and Reports

    • Sample COMPAS risk assessment questionnaire - Wisconsin's 137-question risk assessment
    • Sample sentencing reports judges receive that include risk assessment results
    • Jennifer Elek, Roger Warren & Pamela Casey, Using Risk and Needs Assessment Information at Sentencing: Observations from Ten Jurisdictions, National Center for State Courts’ Center for Sentencing Initiatives
    • Tara Agense & Shelley Curran, The California Risk Assessment Pilot Project: The Use of Risk and Needs Assessment Information in Adult Felony Probation Sentencing and Violation Proceedings, Judicial Council of California Operations and Programs Division Criminal Justice Services (December 2015)
