Data Scientist Dug Through 2.2M of Virginia’s Sentencing Records to Determine if Income or Race Was a Dominant Factor: Guess What He Uncovered?

It’s a deeply held belief that race plays a major role in criminal sentencing outcomes. A data scientist’s probe into Virginia’s 2.2 million sentencing records has reinforced that belief.

Lawyer turned data scientist David Colarusso said he and his colleagues got into a heated debate about which is a bigger issue in the U.S. criminal justice system: bias against defendants of color or bias against defendants of lower socioeconomic class. The key to answering this question came in the form of a tweet from data analyst Ben Schoenfeld.

According to Mic, Colarusso then ran a regression analysis on more than a million sentencing records from the Virginia Criminal Courts between 2006 and 2010. The answer that would put an end to the dispute with his colleagues wasn’t all that surprising.

The former lawyer’s analysis revealed that a defendant’s race has a much stronger correlation with negative sentencing outcomes than income does. For example, Colarusso found that a Black man would have to make nearly $900,000 a year to receive the same treatment as his white counterpart.

While socioeconomic class was a factor, it didn’t have as much of an impact, Mic reports.

Colarusso’s study comes on the heels of a similar analysis conducted by Manhattan-based news organization ProPublica. According to Atlanta Black Star, investigative reporters at the publication examined an algorithm designed to predict the future criminality of offenders. They found that Black defendants were almost twice as likely as white defendants to be labeled likely future criminals.

Racial disparities in defendant risk scores were also discovered: Black offenders who didn’t go on to commit new crimes were mislabeled as high-risk 45 percent of the time, compared with 23 percent of the time for white offenders.

Northpointe Inc., the research firm that developed the algorithm, disputed ProPublica’s findings.

“Northpointe does not agree that the results of your analysis, or the claims being made based upon that analysis, are correct or that they accurately reflect the outcomes from the application of the model,” the company wrote in a letter.

Mic reports that Colarusso hopes to use this newfound evidence of racial bias in the criminal justice system to diagnose the root of the problem.

“One thing important to recognize is that this doesn’t necessarily mean this system has people acting with malice in it,” Colarusso told Mic over the phone. “So you really need to drill down to see where the disparities are coming from.”

He plans to apply the same analysis to different points of a case (arraignment, plea bargaining, acquittal, etc.) to identify the specific policies and processes that disproportionately affect minorities.

“It’s time we stop pretending race isn’t a major driver of disparities in our criminal justice system,” Colarusso wrote. “Once you recognize the bias in a system, you have a choice: You can do something to push back, or you can accept the status quo.”

