How software can be racist (and what you can do to stop it)

Glenn Stovall - Jun 16 '20 - Dev Community

Quick top-hat: This article originally appeared on glennstovall.com. You can also listen to this essay on the Production Ready podcast.

Software is powerful and can shift the balance of power in society. Look, for example, at how social media has changed the way we talk about politics and everything else. If politics is the negotiation of power in society, then software is inherently political. If software is powerful, then we, as people who make software, have power. Software can be racist, and as software developers, I believe we all have a responsibility to use our power to prevent racist software from existing.

If this topic makes you uncomfortable, I ask that you don’t stop just yet. Pause and ask yourself why you feel that way. We all have decisions we have to make, and these topics are unavoidable. Get comfortable being uncomfortable.

Part of why conversations about race are challenging is that we aren’t working from a shared vocabulary. So, I want to clarify a few terms upfront. I’m borrowing my terminology here from White Fragility: Why It’s So Hard for White People to Talk About Racism.

The difference between prejudice, discrimination, and racism

Prejudice consists of thoughts, feelings, and generalizations based on little to no experience and then projected onto a group of people. Everyone has prejudices, whether it’s against people of a certain race or gender, where they are from, or where they went to school. If you deny that prejudices exist in others or yourself, then you are powerless to correct them. Prejudice is internal.

Discrimination is an external action based on prejudice, including ignoring, exclusion, threats, ridicule, slander, and violence.

What distinguishes racism from discrimination is that it is backed by legal authority and institutional control. It comes from a system of power that acts independently and above the actions of any individual, company, or piece of technology.

Racism is hard to define and hard to talk about. It isn’t just people in white hoods committing acts of violence. It’s complicated, and it’s vague. You may imagine a “racist” as someone who openly does and says things targeting people. But racism doesn’t have to be overt. It can be subtle. It can happen unintentionally. That’s why it’s easy for people in places of privilege to ignore. Like prejudice, if we don’t do work to acknowledge, identify, and talk about racism, we can’t do anything to stop it. “The greatest trick the Devil ever pulled was convincing the world he didn’t exist.”

Any act of discrimination that props up this system and maintains it as the status quo is de facto racist. Understand that when you work on software, it could be racist, and you have a responsibility to prevent that from happening.

How can technology be racist?

When we build software, we make conscious and implicit choices about how we imagine the end-user, what data we will use, and how the algorithms work. Since everyone has unconscious biases and prejudice, there is a possibility they influence said decisions. Coding biases into software is known as “algorithmic bias.” This bias is most apparent when we look at attempts to use machine learning classification algorithms on other people, such as facial recognition.

When engineers on the Google Photos team built a system to tag photos automatically, it tagged black people’s faces as ‘Gorilla’ or ‘Chimp.’ Google’s response was to remove those words as valid tags instead of fixing the root cause.

In her TEDx talk, Joy Buolamwini speaks about the discrimination she faced using facial recognition software. It couldn’t recognize her at all. Joy ran into this problem with multiple systems. She discovered they were all based on the same open-source libraries and data sets. I’ll give the developers that built these tools the benefit of the doubt and assume they didn’t mean to create software that ignores black people. But it doesn’t matter. In this situation, the outcome is more important than the intent.

Software can also enforce existing discrimination policies. If your company has a history of hiring people of a particular race or gender, and then uses their resumes as training data for software that judges incoming candidates, you’ll end up with a system that has the same prejudices. Amazon did exactly this when it built a recruiting system that discriminated against women in the hiring process; the company reportedly scrapped the tool after the bias came to light.

Police departments have used a software tool called COMPAS that claims to identify suspects and predict recidivism rates. Studies show that it’s not only more likely to be incorrect about black suspects but also more likely to be used against them. Here there is discrimination not only in the software itself but also in how the customers of the software use it.
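
As a rough sketch of how a gap like this can be measured, the snippet below computes a false positive rate per group: among people who did not reoffend, how often were they still flagged as high risk? The records, the group labels, and the function name are all made up for illustration. This is not COMPAS data, and it is not the method of any particular study.

```python
from collections import defaultdict


def false_positive_rate_by_group(records):
    """Return {group: false positive rate}.

    Each record is (group, predicted_high_risk, reoffended).
    The rate is: flagged-but-did-not-reoffend / all who did not reoffend.
    """
    false_positives = defaultdict(int)
    negatives = defaultdict(int)
    for group, predicted_high_risk, reoffended in records:
        if not reoffended:
            negatives[group] += 1
            if predicted_high_risk:
                false_positives[group] += 1
    return {g: false_positives[g] / n for g, n in negatives.items() if n > 0}


# Hypothetical records, invented for this example only.
records = [
    ("group_a", True, False),
    ("group_a", True, False),
    ("group_a", False, False),
    ("group_a", True, True),
    ("group_b", True, False),
    ("group_b", False, False),
    ("group_b", False, False),
    ("group_b", False, False),
    ("group_b", True, True),
]

rates = false_positive_rate_by_group(records)
for group, rate in sorted(rates.items()):
    print(f"{group}: {rate:.2f}")
```

A single overall accuracy number would hide this difference, which is why the errors are split by group before being compared.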

Racism in software can also take the form of digital redlining. You abstract away discrimination by making decisions based on zip code. Delivery services such as Uber Eats and DoorDash are less likely to service predominantly black neighborhoods. Those neighborhoods, on average, have fewer Pokestops in Pokemon Go. The data doesn’t have to include specific data points about race to disproportionately affect people of color. Machine learning is all about finding patterns, after all.

See how it works? When the data doesn’t explicitly involve race, everyone who built these systems can either remain ignorant or claim plausible deniability about how their software might affect minorities. We have to assume that what we build could potentially have negative consequences.
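
One way to test that assumption is to join a rule’s decisions with demographic data after the fact, even when the rule itself never looks at race. The sketch below is a toy: the zip codes, order volumes, group labels, and the threshold are all hypothetical and stand in for whatever a real service would use.

```python
# Hypothetical zip codes and the majority group living in each one.
ZIP_MAJORITY_GROUP = {
    "00001": "group_a",
    "00002": "group_a",
    "00003": "group_b",
    "00004": "group_b",
}

# Hypothetical historical order volume per zip code.
ORDER_VOLUME = {"00001": 120, "00002": 95, "00003": 40, "00004": 110}


def serves(zip_code, min_orders=90):
    """A 'race-blind' rule: serve a zip code only if past volume is high enough."""
    return ORDER_VOLUME[zip_code] >= min_orders


def service_rate_by_group():
    """Share of each group's zip codes that the rule decides to serve."""
    served = {}
    total = {}
    for zip_code, group in ZIP_MAJORITY_GROUP.items():
        total[group] = total.get(group, 0) + 1
        served[group] = served.get(group, 0) + int(serves(zip_code))
    return {group: served[group] / total[group] for group in total}


rates = service_rate_by_group()
print(rates)
```

The rule only ever sees order volume, yet the two groups end up with different service rates, because zip code carries information about who lives there.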

Software can’t abstract away the humans behind it

We can’t hold on to a romantic view that technology is purely logical and immune to humanity’s shortcomings. We can’t assume that since computers don’t think and feel, they can’t be biased. Nor can we assume that being “data-driven” will overcome our biases. Our biases shape the design of software, and that software repeats our mistakes. Data can never be entirely objective. Our biases, which, it bears repeating, everyone has, will seep in through the decisions about what data we collect, what data we ignore, and how we interpret it.

Technology is a reflection of who builds it.

A 2011 study by the National Institute of Standards and Technology shows that facial recognition software built by Asian software companies was more likely to be accurate at identifying Asian faces. Who creates the software has an impact on how it works.

More diverse teams mean fewer blind spots and fewer errors like the ones above making it to production or, worse, the news. Diversity is a competitive advantage. Creating a more diverse team isn’t about doing it to be “fair”; it is about bringing more information and insight to the table than homogeneous teams can.

As Rachel Goodman, a staff attorney for the ACLU’s racial justice program, told Fast Company: “Many of the ill effects are not intentional. It comes from people designing technology in closed rooms in close conversations and not thinking of the real world.”

Do you remember Tay? Microsoft’s attempt at an AI-powered Twitter bot? In 2016, Microsoft launched a Twitter bot that would model its behavior based on the tweets of people that interacted with it.

I bet you can guess where this story is going.

“Tay is designed to engage and entertain people where they connect with each other online through casual and playful conversation,” Microsoft said. “The more you chat with Tay, the smarter she gets.”

You can’t invite a group of people to do anything on the internet without attracting trolls. In less than a day, Tay was spouting rhetoric about how “Hitler was right” and “9/11 was an inside job.” Microsoft had to shut the project down in short order.

This story is more a warning about giving trolls an opportunity than anything else, but the main lesson stays the same: artificial intelligence only knows what we tell it, and only acts based on our instructions.

As software developers, what can we do?

Systemic racism is complicated and insidious. It doesn’t come from one place. It comes from everywhere. It happens automatically through systems put in place before we were born, and it continues to spread through American society through hateful acts and the maintenance of an inequitable status quo.

As software developers, we can shape the tools that directly impact people’s lives, and we can influence whether they get built at all. This power means that we have a choice to make: Are you going to take action, or are you going to be complicit?

There are no two ways about this decision, nor is there any avoiding it. The work you do has an impact on the world, full stop. The nature of that impact is up to you.

Outside of your job, there are actions that anyone can take, regardless of job title. Get out in the streets. Donate your time or money. Vote.

As a software developer, you can advocate for change at your job. We need to be more skeptical of our data sources and cynical about the use cases and effects of what we build. So you should ask yourself:

  • Are the data sets you are using accurate? How can you be sure?
  • In the worst-case scenario, are there ways our software could disproportionately impact a specific group of people? (This includes accessibility, by the way.)
  • Do we hear a diverse set of voices, both before and after building our product?
  • If you are a manager, are you hiring a diverse team? And if not, why not?
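
The second question above can be turned into a first-pass check. The sketch below compares each group’s selection rate to the highest group’s rate and flags any group that falls under 80 percent of it, a threshold borrowed from the “four-fifths” rule of thumb used in US employment guidelines. The outcome counts and function names are hypothetical, and passing this check does not prove a system is fair. It only catches the most obvious gaps.

```python
def selection_rates(outcomes):
    """outcomes maps group -> (selected, total); returns group -> rate."""
    return {group: selected / total for group, (selected, total) in outcomes.items()}


def impact_ratios(outcomes):
    """Each group's selection rate divided by the highest group's rate."""
    rates = selection_rates(outcomes)
    best = max(rates.values())
    if best == 0:
        raise ValueError("no group has any positive outcomes to compare against")
    return {group: rate / best for group, rate in rates.items()}


def flagged_groups(outcomes, threshold=0.8):
    """Groups whose impact ratio falls below the threshold."""
    ratios = impact_ratios(outcomes)
    return sorted(group for group, ratio in ratios.items() if ratio < threshold)


# Hypothetical counts, invented for illustration: (selected, total applicants).
outcomes = {
    "group_a": (50, 100),
    "group_b": (30, 100),
    "group_c": (45, 100),
}

flagged = flagged_groups(outcomes)
print(flagged)
```

Running a check like this on every release, rather than once, makes it harder for a disparity to slip in unnoticed when the data or the model changes.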

These are the types of questions we all need to be asking more. Technology is not amoral, and it is not apolitical. Despite all of these stories, I still believe that technology’s influence can be used for positive change. But that can only happen if we collectively start doing business differently than we have up to this point. So I’d like to take this opportunity to ask you:

What are you going to do differently?
