How Can Minimal Data Science Changes Help Combat Racism?

Bias in data science exists. Let’s understand how some minimal changes can transform it.

Data scientists are data stewards. We collect data, store it, transform it, visualize it, and ultimately influence how it is used. In our data-driven world, we have taken on the responsibility to use data to tell stories and make a difference. Because of that responsibility, it is not enough for non-Black data scientists to simply not be racist. It is not enough to sit behind a computer screen writing code, or to be angry about the deaths of George Floyd, Breonna Taylor, Ahmaud Arbery, and many other Black people, and then do nothing. It is not enough to recognize the racial bias that continues to exist in the United States without actively doing anything to counter it.

As Angela Davis said, “In a racist society, it’s not enough to be non-racist, we must be anti-racist.” It is one thing to recognize in principle that racism is wrong. Anti-racists are those who do something about it.

As non-Black data scientists, we must be anti-racist data scientists. We must take responsibility for our power and privilege. We must learn how data and algorithms have been used to perpetuate racism and find ways to eliminate racist decisions and algorithms in our work. We need to be aware of the lack of diversity in our field (only about 3% of data scientists identify as Black) and contribute to ways to change this. Being an anti-racist data scientist is not a one-time decision, and it is never fully achieved. It is a daily commitment to building a better future.


Here are 5 steps we can take to combat racism:

Step 1: Educate ourselves about becoming antiracist

To become an anti-racist data scientist, you first need to take steps to become an anti-racist individual. Being anti-racist looks different for white people and for people of color. As the toolkit from the National Museum of African American History and Culture explains, for white people, anti-racism evolves as their racial identity develops: they must acknowledge and understand their privilege, work to change their internalized racism, and interrupt racism when they see it. For people of color, it means recognizing how racism has been internalized and whether it has been applied to other people of color. The six responsibilities that individuals can take on in the process of becoming anti-racist are a good starting point: reading, remorse, memory, risk, rejection, and relationship building.

Start with the many resource lists currently available online. White readers who are beginning to recognize their privilege should, in particular, read and reflect on their own before relying on Black, Indigenous, and people of color (BIPOC) friends for reading recommendations or difficult conversations. Consider reaching out to a white friend who is also on this journey for those conversations instead.


Step 2: Learn about how data and algorithms have been used to perpetuate racism

Data scientists set out to answer questions and solve problems, and we hope that our use of data has a positive effect. However, history has repeatedly shown that good intentions are not enough. Data and algorithms have been used to perpetuate racism and racist societal structures. We must educate ourselves about these realities and the uneven impact they have had on Black lives. Any reading list on this topic is a starting point and is not exhaustive. We need to learn from, and contribute to, the research and reporting on these issues. There is also a long history of data-driven discrimination based on race, gender, sexuality, and other demographics.


Step 3: Eliminate racist decisions and algorithms in our work

As anti-racist data scientists, we must commit to taking action every day in our work to eliminate racist decisions and algorithms.

Start with the data you have. Review the data and always reach out to subject-matter experts to better understand:

• How was the data obtained?

• For whom was the data obtained?

• By whom was the data obtained?

• Was permission granted to obtain the data?

• Would individuals be comfortable if they knew this data was being obtained?

• Would individuals be comfortable if they knew how this data was being stored or shared?

• To what end was the data obtained?

• How might this data be biased?
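
One small, concrete way to start on the last question is to compare how each group is represented in the data with how it is represented in the population the model will affect. The sketch below is only an illustration under stated assumptions: the function name representation_gaps, the field name "group", the toy records, and the reference shares are all hypothetical, and real reference shares should come from a source vetted with subject-matter experts.

```python
from collections import Counter


def representation_gaps(records, group_key, reference_shares):
    """Return (share in data - reference share) for each reference group.

    A negative gap means the group is under-represented in the data
    relative to the reference population.
    """
    counts = Counter(record[group_key] for record in records)
    total = sum(counts.values())
    gaps = {}
    for group, reference_share in reference_shares.items():
        data_share = counts.get(group, 0) / total if total else 0.0
        gaps[group] = data_share - reference_share
    return gaps


# Hypothetical toy data, invented purely for illustration.
records = [{"group": "A"}] * 8 + [{"group": "B"}] * 2
reference_shares = {"A": 0.6, "B": 0.4}  # assumed population shares

gaps = representation_gaps(records, "group", reference_shares)
for group, gap in gaps.items():
    print(f"{group}: {gap:+.2f}")
```

A gap like this does not prove that a model will be biased, and a balanced dataset does not prove that it will be fair. It is one prompt for the conversation with subject-matter experts described above.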


When you’re building a model, think like an adversary:

• How could this system be gamed?

• How could it be used to harm people, especially those in BIPOC communities?

• What could be the unintended consequences of this model?

• As the model “learns” from new data, how might this new data introduce new biases?


When you’re communicating the results of the model:

• Is the model communicated such that the community who contributed the data can view and understand the results?

• Have you communicated how the model was tested to uncover racial bias?


Learn the Technical Details:

There is a growing body of research on technical approaches to addressing racial bias in data science in a way that respects fairness. It is not acceptable to claim “fairness through unawareness” simply because race is not included as a variable in the algorithm. Just because an algorithm doesn’t consider race as a predictor doesn’t mean it’s bias-free, because other variables can act as proxies for race. Instead, data scientists must explicitly examine how sensitive the algorithm is to race. This research covers algorithmic fairness, including the concepts of demographic parity, equalized odds, and predictive rate parity, and the tools you can use to mitigate bias during pre-processing, training, and post-processing.
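
To make the three fairness notions above more concrete, here is a minimal sketch of the per-group quantities they compare for a binary classifier. Demographic parity compares selection rates, equalized odds compares true-positive and false-positive rates, and predictive rate parity compares precision. The function name group_rates and the toy labels, predictions, and groups are hypothetical and invented for illustration. Dedicated fairness toolkits offer more complete implementations.

```python
def group_rates(y_true, y_pred, groups):
    """Per-group selection rate, TPR, FPR, and precision for 0/1 labels."""

    def ratio(numerator, denominator):
        # None signals that the rate is undefined for this group.
        return numerator / denominator if denominator else None

    stats = {}
    for g in sorted(set(groups)):
        idx = [i for i, value in enumerate(groups) if value == g]
        tp = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 1)
        fp = sum(1 for i in idx if y_true[i] == 0 and y_pred[i] == 1)
        fn = sum(1 for i in idx if y_true[i] == 1 and y_pred[i] == 0)
        tn = sum(1 for i in idx if y_true[i] == 0 and y_pred[i] == 0)
        stats[g] = {
            "selection_rate": ratio(tp + fp, len(idx)),  # demographic parity
            "tpr": ratio(tp, tp + fn),                   # equalized odds
            "fpr": ratio(fp, fp + tn),                   # equalized odds
            "precision": ratio(tp, tp + fp),             # predictive rate parity
        }
    return stats


# Hypothetical toy example: two groups of four people each.
groups = ["A", "A", "A", "A", "B", "B", "B", "B"]
y_true = [1, 1, 0, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0]

stats = group_rates(y_true, y_pred, groups)
for g, rates in stats.items():
    print(g, rates)
```

In this toy data, group A is selected three times as often as group B (0.75 versus 0.25) even though both groups have the same share of true positives, so demographic parity and equalized odds are both violated. These criteria generally cannot all be satisfied at once, so choosing which one to prioritize is a judgment call that should involve the affected community.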


Step 4: Commit to increasing diversity in the data science field

The 2020 Harnham US Data and Analytics Report found that only 3% of Data and Analytics professionals identified as Black, and even fewer in leadership positions. This is unacceptable, particularly as we (non-Black data scientists) continue to use data collected from and write algorithms that impact Black communities.

To push the organizations we work for and the data science community-at-large to change, we must commit to:

• Confronting our own unconscious biases and how they manifest themselves in the workplace to make our field a more inclusive space

• Inventorying our internal company practices and making changes to advance equity, diversity, and inclusion at all levels of our organizations

• Reviewing and updating our hiring processes so they don’t reflect unconscious biases of the individuals/teams responsible for hiring

• Demanding representation on executive leadership teams, boards, and expert panels

• Developing leadership pathways to support emerging leaders from historically underrepresented backgrounds


Step 5: Contribute financially to Black-led and community-driven organizations committed to data awareness and increasing diversity in data science

It’s no secret that data science is a lucrative field, with an average annual salary of about $100,000. None of us was born a data scientist, and many of us probably entered this field thanks to a solid educational experience. As anti-racist data scientists, we must recognize that we live in a racist society where educational opportunities are not evenly distributed. Because data science affects everyone, we must commit to using the financial resources we receive from our work to help increase diversity in the data science workforce (and make this lucrative field more accessible), support educational experiences, and enable data awareness for everyone.

Our job as non-Black data scientists is not only to be aware of racism but also to become anti-racist data scientists and act accordingly. We cannot wait idly while the decisions we make as data stewards continue to cause irreparable harm to Black communities. We do this knowing that the work will not end as long as racism continues.