Vector researchers help institutions ensure privacy and confidentiality when sharing ML models

By Ian Gormely
October 19, 2021

Vector Institute staff and researchers are helping institutions such as banks and hospitals collaborate with a new system that allows them to jointly work with machine learning (ML) models while offering guarantees of privacy and confidentiality. Called Confidential and Private Collaborative Learning, or CaPC, the method is one of several privacy-enhancing technologies (PETs) being showcased for industry, health, and government partners by Vector researchers and the AI Engineering team.

CaPC lets organizations collaborate without revealing their inputs, models, or training data to one another. There are existing systems for sharing ML models between organizations, some of which promise privacy or confidentiality, but Vector researcher Adam Dziedzic, who co-created the system, says theirs is “the only one to do both.”

Detailed in the paper “CaPC Learning: Confidential and Private Collaborative Learning” by Dziedzic, Vector researchers Christopher A. Choquette-Choo and Natalie Dullerud, and Faculty Member Nicolas Papernot, along with their colleagues Yunxiang Zhang, Somesh Jha, and Xiao Wang, the system combines tools from cryptography and privacy research. It enables participants to collaborate without explicitly joining their training sets or training a central model. In this way, a model’s accuracy and fairness can be improved while still protecting the confidentiality of the data and the privacy of the person to whom it belongs. CaPC was designed with hospitals and banks in mind, since the health and finance sectors are subject to strict privacy and confidentiality regulations, but it has the potential to be extended to other industries. For now, the system remains a proof of concept. “We wanted to show that it is possible,” says Dziedzic. “Now we’re working on extending its capacity to be able to apply this in the real world.”
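
At a high level, the two families of tools divide the work: cryptographic techniques such as secure multi-party computation and homomorphic encryption keep each query and model confidential while the answering parties run inference, and differential privacy protects the individuals behind the training data by aggregating the answering parties’ predicted labels with calibrated noise, in the style of the PATE framework that the CaPC paper builds on, before a result is released. The sketch below illustrates only that final noisy-aggregation step; the function name, vote matrix, and noise scale are illustrative assumptions rather than the paper’s implementation, and the cryptographic machinery is omitted entirely.

```python
import numpy as np

def noisy_label_aggregation(votes, sigma=1.0, rng=None):
    """Hypothetical sketch of PATE-style noisy argmax.

    `votes` is a (num_parties, num_classes) one-hot matrix: row i is
    answering party i's predicted label for a single query. Gaussian
    noise is added to the per-class vote counts before the argmax, so
    the released label does not reveal any single party's answer.
    """
    rng = np.random.default_rng() if rng is None else rng
    counts = votes.sum(axis=0)                      # per-class vote counts
    noisy_counts = counts + rng.normal(0.0, sigma, size=counts.shape)
    return int(np.argmax(noisy_counts))

# Illustrative example: five answering parties vote on a 3-class query.
votes = np.array([
    [1, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [1, 0, 0],
    [0, 0, 1],
])
label = noisy_label_aggregation(votes, sigma=1.0)
print(label)  # most often 0, since class 0 has the most votes
```

Adding the noise before the argmax means the released label barely depends on any single party’s vote, which is what makes a differential privacy guarantee possible for this step.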

To help make that transition, CaPC will be part of the industry toolkit made available at Vector’s upcoming PETs Bootcamp, along with a larger suite of implementations and demos. The three-day bootcamp will also feature demonstrations of Federated Learning, Differential Privacy, and Homomorphic Encryption, with the goal of helping industry, health, and public service partners explore and build basic PET prototypes for potential deployment in their organizations. “We’re trying to bridge the gap between leading research and industry applications by making it easy to implement these state-of-the-art privacy-enhancing techniques,” says Deval Pandya, Vector’s Director of AI Engineering.

“An event like this not only allows us to showcase our system,” says Dziedzic, who will be presenting his team’s demo, “it also lets us receive feedback about the specific needs of industry. This kind of collaboration helps us to improve the system and hopefully see it in practice.”

Learn more about Confidential and Private Collaborative Learning here

Learn more about Vector’s work with industry here
