By Jonathan Woods
Sept 7, 2022
Understanding and mitigating bias is a vital issue in applying AI systems, particularly for businesses. It’s also a complex topic. It not only demands proficiency with technical bias correction techniques, but also requires knowledge of AI-specific governance practices and thoughtful consideration about what fairness means in any given context. Left unaddressed, bias in AI systems can reinforce historical discrimination embedded in datasets, unjustly harm users, and damage an organization’s reputation ― all unacceptable outcomes.
To help Canadian small and medium businesses tackle this challenge, the Vector Institute, with funding support from the National Research Council of Canada Industrial Research Assistance Program (NRC IRAP), created Bias in AI, an online, industry-agnostic program designed to help business professionals develop the tools and skills needed to identify and reduce bias in natural language processing (NLP) and computer vision (CV) applications.
Eighteen participants from six companies attended the Bias in AI program over five weeks between February and April 2022. The interactive program featured workshops and tutorials led by experts from the Vector community, and assignments in which those participants:
- Explored AI tools and best practices
- Covered NLP and CV fundamentals
- Trained a GPT-2/BERT text generation model, analyzing bias factors
- Developed a CV classification model to understand bias-related influences
- Applied techniques to reduce bias in both NLP and CV models
Finally, each company applied their learnings to their own business in a final project that was presented for evaluation at the end of the program.
The program is tailored to data science teams within small and medium businesses because this group, while increasingly using AI, can find it hard to keep up with the research and best practices for mitigating bias. Melissa Valdez, the Course Manager, said, “We know that startups, scaleups, and small and medium businesses don’t necessarily have the resources to focus on this at the level that they need to.”
Dan Adamson, CEO of Armilla ― a company that joined the Winter 2022 Bias in AI cohort ― appreciates that the program delves into the complexity of the problem. “We think it’s really important that every data scientist has this underlying training as a base, because the topic is way more complicated than they probably learned coming out of computer science class or their core data science class,” Adamson said.
Adamson singled out as a key strength the program’s inclusion of non-technical elements that are crucial to managing bias. “I think the course had a few dimensions that it tackled bias from, which is really important because it’s not just a technical problem,” he said. “There needs to be a governance aspect to it, and you need to be able to understand the business and ethical principles behind it as well. The course did a very good job of outlining each of those in an integrated fashion.”
Armilla helps data scientists perform quality assurance on machine-learning models (which includes looking for biases) through their platform and offers auditing services to help maintain trust in those models. Through this course, the Armilla team worked on a project focused on human resources data and systems – something the company gets many inquiries about – that asked whether they could detect if promotions in one dataset were being distributed fairly between men and women. The results “brought up both ethical and technical issues” around fairness in hiring and promotion systems, Adamson said, which required experimentation with correction techniques, including model retraining.
“It was an interesting case study,” Adamson said. “We wanted to make sure that our data scientists stayed current with the state-of-the-art, and we used [the project] as an exercise for our team to get on the same page.”
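For readers who want a concrete picture of what a promotion-fairness check can look like, here is a minimal sketch in Python. It is not Armilla's method, which the article does not describe. It assumes a simple list of (group, promoted) records, and the function names and toy data are made up for illustration. It compares promotion rates across groups and reports the ratio of the lowest rate to the highest, a measure often called the disparate impact ratio.

```python
from collections import defaultdict


def promotion_rates(records):
    """Return {group: promoted / total} for an iterable of (group, promoted) pairs."""
    totals = defaultdict(int)
    promoted = defaultdict(int)
    for group, was_promoted in records:
        totals[group] += 1
        promoted[group] += int(was_promoted)
    return {group: promoted[group] / totals[group] for group in totals}


def disparate_impact_ratio(rates):
    """Lowest group rate divided by the highest group rate (1.0 means parity)."""
    highest = max(rates.values())
    if highest == 0:
        return 1.0  # nobody was promoted in any group, so the rates are equal
    return min(rates.values()) / highest


# Hypothetical toy data, invented for this example only.
records = (
    [("women", True)] * 2 + [("women", False)] * 8
    + [("men", True)] * 4 + [("men", False)] * 6
)

rates = promotion_rates(records)
ratio = disparate_impact_ratio(rates)
print(rates, ratio)
```

A ratio well below 1.0 is a prompt for further investigation, not a verdict. One commonly cited rule of thumb, the "four-fifths rule," treats ratios under 0.8 as worth a closer look, but what counts as fair depends on the context, as the program itself stresses.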
EthicalAI, an AI consulting company that helps businesses use AI responsibly, took a different approach to correcting bias in their course project: they created synthetic data to make more balanced datasets. To do this, they experimented with multi-modal generative models – unsupervised AI models that can generate new data similar to that of an original dataset – to create new variations of existing facial images. Their idea was that a set of new variations could include the characteristics of race, sex, or other protected variables that may be underrepresented in the original dataset. These variations would be inserted into the original dataset, which could then be used to train a model downstream in a way that would be less likely to inappropriately favour one type of face over another.
Thor Jonsson, an EthicalAI co-founder, said, “I can use the generative model to make this equal sample of different looks of people. So essentially, it’s injecting new data into the dataset that allows us to create an equal balance of people of different ages, ethnicities, genders, and so on.”
The value, Jonsson explains, is that “it’s not contingent on me labeling pictures of a man or a woman. It’s more about just asking ‘What are the different ways faces can be,’ and can we make sure that those rare faces, according to whatever dataset someone has, are represented? It’s really about creating an automated way of making sure that nobody is ever affected from being in the minority.”
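One way to picture finding "rare" examples without group labels is sketched below in Python. This is an illustrative assumption, not EthicalAI's implementation, which the article does not detail. It assumes each image has already been mapped to a numeric embedding (the tuples here are invented), scores each point by its average distance to its nearest neighbours, and picks the most isolated points as candidate seeds from which a generative model could produce new variations. The names `rarity_scores` and `pick_seeds` are hypothetical.

```python
import math


def rarity_scores(points, k=2):
    """Mean distance from each point to its k nearest neighbours (larger means rarer)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def pick_seeds(points, n_seeds, k=2):
    """Indices of the n_seeds rarest points, rarest first."""
    scores = rarity_scores(points, k)
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:n_seeds]


# Hypothetical 2D embeddings: four clustered points and one isolated point.
embeddings = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]

seeds = pick_seeds(embeddings, n_seeds=1)
print(seeds)
```

In this toy data the isolated point at index 4 is selected. A real pipeline would work with high-dimensional embeddings and would need care to tell rare-but-valid examples apart from noise or mislabeled data.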
The team at EthicalAI is now expanding on the work from the course, going beyond typical 2D images to see how the mitigation work applies to 3D pictures of people. They’ve also expressed an interest in developing AI products based on the project.
Jonsson, an AI researcher himself, credits the program for enabling the team to further explore the cutting edge. “The course really advanced our understanding of this niche in the research community,” Jonsson said. “This course was all about getting off the ground running in terms of understanding these newest and greatest recent ideas that have emerged to tackle these kinds of problems.”
For other small and medium-sized companies using AI, Jonsson has a message: “Anyone who’s using AI today and is not doing bias mitigation needs to be doing it.”
Adamson echoes the sentiment: “We’d definitely encourage every organization to start sending their folks to get to that level. The more we can push industry in this direction, the better everyone is served,” he said.
“And kudos to Vector for making it available.”
To learn about the next Bias in AI course, speak with your NRC IRAP Industrial Technology Advisor (ITA), visit the program website, or sign up for Vector’s newsletter.