CIBC Analytics Day Recap: Understanding and Operationalizing Trustworthy AI
June 1, 2022
By Jonathan Woods
Trustworthy AI is an evolving but crucial topic for organizations using AI today. On May 29, CIBC hosted a fireside chat called AI Research: Trustworthy AI & MLOps to delve into this subject. The chat featured three speakers.
The event was part of CIBC’s Analytics Day series, an internal forum in which various employees in data and analytics roles come together to share best practices and key learnings, showcase analytical work and projects, and learn in an interactive environment. With 70 CIBC team members in attendance, this vibrant discussion touched on general notions of Trustworthy AI, approaches to common challenges, and how MLOps and operationalization fit in. Here are some of the insights shared in the conversation:
What is Trustworthy AI?
AI systems can introduce new and unique risks for companies that use them. Left unchecked, AI-specific challenges related to bias, privacy, safety, explainability, and robustness can harm companies, customers, and society at large. Trustworthy AI refers to the collection of principles and practices that are meant to account for these new risks and guide the ethical development and management of AI systems.
“Broadly speaking, Trustworthy AI is AI that’s developed in a manner according to principles that foster trust among all the stakeholders involved,” said Anne. “However, there’s no one, standard, universally-accepted definition of Trustworthy AI.”
While best practices exist, it falls primarily to the organizations using AI to determine exactly what satisfies the notion of Trustworthy AI, which risks it seeks to mitigate, and how to operationalize it so that AI doesn’t harm stakeholders.
Trust is necessary for AI innovation
AI progress depends on gaining and maintaining confidence among stakeholders that AI-specific risks will be mitigated and monitored to avoid unintended outcomes. Anne explained, “If AI is not developed in a trustworthy manner, if people don’t trust it, then it loses its credibility. You’re actually inhibiting innovation. Trustworthy AI is innovation. If you’re not building AI in a trustworthy manner, executives might stop endorsing it. Clients will lose trust and move to another company, churn out.”
This attention to numerous stakeholders is essential. For innovation to continue, everyone affected by an AI system must be able to trust it. Anne elaborated: “It’s not just trust toward the clients, but also data scientists have to trust the AI, regulators have to trust the AI, the business team that is using the AI system has to trust it, and also the executives that are endorsing it.”
The “socio-technical” challenge of Trustworthy AI
Trustworthy AI is not a strictly technical issue. Pandya emphasized that it involves “ethical requirements” based on human values in addition to the data sets, architectures, and processes that AI teams work with. These values require social negotiation, and can’t always be unilaterally defined by any one team or company. Because of this, determining the technical guardrails for each AI use case requires rigorous thinking about these values, a diversity of perspectives and input, and a continuous revisiting of assumptions.
A prime example of this challenge is fairness, a complex concept in AI with several competing definitions. Generally, fairness involves ensuring that AI systems don’t produce inappropriately biased or discriminatory outcomes for individuals or groups. The speakers agreed that fairness is likely the most difficult Trustworthy AI concept to tackle because many different yet valid approaches to it exist, and because those approaches are typically negotiated in a broader social context, beyond any one team or company.
Anne said, “The toughest [Trustworthy AI issue] in terms of the most diverse perspectives and the most debates is probably around fairness, because that’s so social. There are social aspects and technical aspects to fairness. There’s going to be a lot of debates about what the right way to look at fairness for [any given] use case is, and what the right definition and metric is.”
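To make the debate concrete, here is a minimal sketch, using made-up predictions and group labels rather than anything discussed at the event, of two common fairness definitions applied to the same model outputs. Demographic parity compares approval rates across groups, while equal opportunity compares true positive rates; the two metrics can disagree about the same predictions, which is exactly the kind of choice a team has to negotiate for each use case.

```python
# Two common fairness definitions evaluated on the same (made-up) predictions.
# 1 = approved, 0 = declined; labels are the "true" outcomes.

def selection_rate(preds):
    return sum(preds) / len(preds)

def true_positive_rate(preds, labels):
    positives = [p for p, y in zip(preds, labels) if y == 1]
    return sum(positives) / len(positives)

group_a_preds, group_a_labels = [1, 1, 0, 1, 0, 1, 0, 1], [1, 1, 0, 1, 0, 1, 1, 1]
group_b_preds, group_b_labels = [1, 0, 0, 1, 0, 1, 0, 0], [1, 0, 0, 1, 0, 1, 1, 0]

# Demographic parity: are the two groups approved at similar rates?
dp_gap = abs(selection_rate(group_a_preds) - selection_rate(group_b_preds))

# Equal opportunity: among truly positive cases, are approval rates similar?
eo_gap = abs(true_positive_rate(group_a_preds, group_a_labels)
             - true_positive_rate(group_b_preds, group_b_labels))

print(f"Demographic parity gap: {dp_gap:.2f}")   # ~0.25
print(f"Equal opportunity gap:  {eo_gap:.2f}")   # ~0.08
```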
How financial institutions stack up
Both speakers ranked financial institutions as very advanced in certain aspects of Trustworthy AI. “Accountability, responsibility, and privacy are probably well ahead in the banks,” Anne said. “Banks have [emphasized] many of these principles maybe since the first bank started in Italy.” He continued: “In terms of innovation, after tech companies, it’s the financial sector that innovates the most in terms of AI.” Similarly, Pandya noted that “privacy and security have long been the backbone of anything involving data in the finance industry.”
Anne then enumerated the concepts that teams at CIBC focus on. “Privacy is the main one,” he said, referencing privacy design work being done with differential privacy, federated learning, and k-anonymity. But also, he said, “there’s security and safety, there’s fairness, there’s transparency and explainability, there’s robustness and reliability, and accountability and responsibility.”
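As a generic illustration of one of those techniques, and not CIBC’s implementation, the sketch below shows the Laplace mechanism at the heart of differential privacy: calibrated random noise is added to an aggregate count so that no single customer’s record meaningfully changes the released number. The epsilon value and record count here are assumptions.

```python
import numpy as np

def private_count(n_records, epsilon):
    """Release a count (sensitivity 1) under epsilon-differential privacy."""
    noise = np.random.laplace(loc=0.0, scale=1.0 / epsilon)
    return n_records + noise

# Hypothetical usage: publish how many accounts a model flagged, plus noise
print(private_count(n_records=420, epsilon=0.5))
```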
Trust and third-party vendors
Products with AI inside present a tricky problem for organizations procuring products and services from third-party vendors: how can buyers know these products measure up on key trust dimensions like fairness, privacy, and robustness? Exacerbating the challenge, vendors may resist disclosing pertinent details if they feel that intellectual property is at risk.
During procurement, clear communication with vendors about expectations is paramount. Pandya said, “You have to share what the principles of Trustworthy AI are for an organization.” Anne agreed. “We [have to] communicate our standards for Trustworthy AI, our minimum set of expectations for fairness, for explainability. And they would also have to do the same,” he said.
However, a problem remains, according to Anne: “We don’t have access to their evaluation data or the data that they trained their algorithm on.” This means that due diligence must include explicit questions about that data. He explained, “So we ask them, ‘Is your data representative of the population? What fairness performance metrics are you using? Why are you using those? How often are you monitoring your model for these metrics? What do you do once your fairness metric passes a certain acceptable threshold?’ In asking these questions we’re trying to assess whether their fairness assessment process is similar to our assessment process ― if it’s acceptable to us ― because without access to the data, we can’t do these calculations ourselves. We have to rely on them.”
Also, Pandya expressed skepticism about the merit of concerns regarding IP: “I think a lot of companies have an idea of what IP is that is not differentiating at all. When I think of the advances in machine learning and deep learning specifically, everything is open source and very few organizations outside academia are truly developing novel algorithms or methods. So I think I would first go back and challenge them: ‘Is this really about IP or are you just not confident whether your systems are trustworthy or not?’”
MLOps is key to realizing Trustworthy AI
MLOps ― a set of practices for the efficient end-to-end development of reliable, scalable, and reproducible machine learning solutions ― “is the bridge from [Trustworthy AI] principles to practice,” in Pandya’s words. Anne agreed: “MLOps makes Trustworthy AI possible.” In fact, he went so far as to say that “the only way to operationalize Trustworthy AI at scale is through MLOps.”
There are two main reasons why.
First, MLOps, being concerned with the entire machine learning pipeline, provides more opportunities to address trust issues and measure outcomes than a focus on a model or dataset alone. Anne explained: “A pipeline consists of multiple components. So you have a training pipeline, and it consists of data extraction, data cleaning, training, and validation as well. Say now you want to incorporate Trustworthy AI principles into a pipeline. You’d have some component in the pipeline that’s now doing some fairness assessment on the data, using a fairness metric as a constraint. Then you can think of explainability as well. You can design a component for explainability, and put that in your pipeline.”
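Here is a minimal sketch of that idea, with hypothetical function names, thresholds, and data rather than CIBC’s actual pipeline: fairness is implemented as its own component, a gate that runs after validation and blocks the model from being promoted when a demographic parity gap exceeds an agreed threshold.

```python
# Fairness as a pipeline component: a gate run after validation that blocks
# promotion when the demographic parity gap exceeds an agreed threshold.
# All names, thresholds, and data below are hypothetical.

def selection_rate(preds):
    return sum(preds) / len(preds)

def fairness_gate(preds_by_group, max_gap=0.10):
    """Return a report; the pipeline promotes the model only if it passes."""
    rates = {g: selection_rate(p) for g, p in preds_by_group.items()}
    gap = max(rates.values()) - min(rates.values())
    return {"metric": "demographic_parity_gap", "value": round(gap, 3),
            "threshold": max_gap, "passed": gap <= max_gap}

# Stand-in validation predictions, split by a protected attribute
validation_preds = {
    "group_a": [1, 0, 1, 1, 0, 1],
    "group_b": [1, 0, 0, 1, 0, 0],
}

report = fairness_gate(validation_preds)
print(report)
if not report["passed"]:
    raise SystemExit("Model blocked: fairness constraint not met")
```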
Second, Pandya noted, MLOps introduces a new, crucial development practice. He explained, “In DevOps, there’s focus around continuous integration and continuous deployment. One key additional part for MLOps is continuous monitoring and testing.” Managing issues of uncertainty, data drift, and unintended consequences – all important for maintaining trust – can only be achieved if teams are keeping track of systems and their outcomes.
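One common way to implement that kind of monitoring, sketched below with assumed data and thresholds rather than details from the talk, is the population stability index: compare a feature’s live distribution against its training-time baseline and raise an alert when the drift score crosses a rule-of-thumb threshold.

```python
import numpy as np

def population_stability_index(baseline, live, bins=10):
    """Drift score between a training-time baseline and live production data."""
    edges = np.histogram_bin_edges(baseline, bins=bins)
    base_pct = np.histogram(baseline, bins=edges)[0] / len(baseline)
    live_pct = np.histogram(live, bins=edges)[0] / len(live)
    base_pct = np.clip(base_pct, 1e-6, None)   # avoid log(0)
    live_pct = np.clip(live_pct, 1e-6, None)
    return float(np.sum((live_pct - base_pct) * np.log(live_pct / base_pct)))

# Stand-in data: the live feature has drifted away from its training distribution
rng = np.random.default_rng(0)
training_feature = rng.normal(0.0, 1.0, size=5_000)
live_feature = rng.normal(0.4, 1.2, size=5_000)

score = population_stability_index(training_feature, live_feature)
print(f"PSI = {score:.3f}")
if score > 0.2:   # a commonly used rule-of-thumb threshold for significant drift
    print("Alert: significant drift detected; review or retrain the model")
```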
As a final note on the event, all participants praised the collaboration between Vector and CIBC, which is enabled through the bank’s role as a Vector founding sponsor. Speakers highlighted work in Vector’s Privacy Enhancing Techniques Bootcamp and the Trustworthy AI collaborative and multidisciplinary thought leadership project. Ali Taiyeb, Director of Industry Innovation at Vector, also pointed to CIBC’s participation in the Dataset Shift Project, which yielded insights that are now being further explored and developed within CIBC. Taiyeb also discussed Vector’s Forecasting Project, outcomes from which Andrew Brown, CIBC’s Executive Director of Alternate Solutions Group and Simplii Financial, noted will be the subject of a future Analytics Edge session. “It’s been great working with Vector,” Brown said. “We’ll share our learnings from what Vector taught us about advanced techniques for forecasting, and that should be useful for our teams at CIBC.”
Moderator Ozge Yeloglu closed out the event by articulating the mutual appreciation felt by project participants from CIBC and Vector: “It’s always a pleasure coming up with new ideas, and not necessarily research for the sake of research, but how we can actually apply these things in our businesses,” she said. “It’s very valuable to us.”