Vector community explores data privacy research at Machine Learning Privacy and Security Workshop

By Natasha Ali

The recent Machine Learning Security and Privacy workshop brought together a number of distinguished Vector Institute Faculty Members, Faculty Affiliates and Postdoctoral Fellows. Held at the Vector Institute office in July, the event saw researchers discussing emerging trends and research findings in the field of machine learning security and privacy, as rapid developments prompt calls for immediate measures to regulate how models are trained and to protect private information and user data.

Reducing machine learning models’ susceptibility to attacks

Vector Faculty Member Nicholas Papernot presented a unique take on machine learning model theft and appropriate defense mechanisms. As an assistant professor in the Department of Electrical and Computer Engineering at the University of Toronto and a Canada CIFAR AI Chair, Papernot has made significant contributions to the field of machine learning security, pioneering the development of algorithms that facilitated machine learning privacy research.

He focused on prediction-based machine learning models, which are particularly vulnerable to adversarial attacks – attacks that aim to misuse machine learning models in a way that results in consequences unintended by the model owner.

Model extraction, a common type of adversarial attack, enables attackers to mimic and thus “steal” victim machine learning models, bypassing the costly process of dataset curation. By gaining access to model outputs and their prediction processes, an attacker can reproduce models.
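As a rough illustration of the idea, the sketch below queries a toy "victim" classifier and builds a copy purely from its answers. Everything in it is an assumption made for illustration: the victim is a hypothetical two-dimensional linear rule, the function names are invented, and the surrogate is a simple 1-nearest-neighbour lookup rather than any method discussed in the talk.

```python
import random


def victim_predict(x):
    """Hypothetical stand-in for a victim model's prediction API."""
    # Secret decision rule that the attacker cannot inspect directly.
    return 1 if 0.8 * x[0] - 0.5 * x[1] + 0.1 > 0 else 0


def extract_surrogate(query_fn, n_queries, rng):
    """Send random inputs to the victim and record (input, predicted label) pairs."""
    queries = [(rng.uniform(-1, 1), rng.uniform(-1, 1)) for _ in range(n_queries)]
    return [(x, query_fn(x)) for x in queries]


def surrogate_predict(stolen, x):
    """Answer with the label of the closest recorded query (1-nearest neighbour)."""
    nearest = min(
        stolen,
        key=lambda pair: (pair[0][0] - x[0]) ** 2 + (pair[0][1] - x[1]) ** 2,
    )
    return nearest[1]


def agreement(stolen, query_fn, n_test, rng):
    """Fraction of fresh inputs on which the surrogate matches the victim."""
    hits = 0
    for _ in range(n_test):
        x = (rng.uniform(-1, 1), rng.uniform(-1, 1))
        hits += surrogate_predict(stolen, x) == query_fn(x)
    return hits / n_test
```

The attacker never labels any data: the victim's own predictions serve as the training labels, which is why access to outputs alone can be enough to copy a model's behaviour.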

Beyond model extraction, machine learning comes with a broad variety of adversarial risks. Attackers can also modify a machine learning model’s training processes such that it falsely interprets malicious information as benign. In forcing models to learn harmful data, attackers can disrupt the chain of command in deep neural networks and transfer falsified training data from one model to another, creating a domino effect of malicious activity.

Papernot pointed out that common defenses against model extraction attacks involve blocking the attacker’s access to the original outputs to disrupt the training of the attacker model. For example, the owner of a victim model can identify whether its model was stolen by embedding “watermarks” into the training data. Unfortunately, this inevitably comes at the cost of model utility. Alternatively, one can use membership inference techniques to compare the suspected machine learning model against the original model to determine if training data and policies were stolen.

However, these approaches are all reactive: they check whether a model was stolen after the fact. Instead, Papernot suggested that one can take preemptive measures to shift the cost-benefit trade-off of stealing a machine learning model against the attacker, thus disincentivizing model theft.

To fend off model extraction attacks before they succeed, Papernot proposed a proactive mechanism that increases the computational costs of model theft without compromising the original models’ functionality or output.

Using an elaborate detection scheme, the victim model can generate puzzles of various difficulty levels that the attacker has to solve to gain access to the model output. As the first defense system to preserve the original model’s accuracy, this method enforces high computational costs for model extraction, thereby deterring attacks.
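The article does not describe how the puzzles are constructed. One common way to impose a tunable compute cost is a hash-based proof-of-work puzzle, sketched below as an assumption rather than as the mechanism presented at the workshop. The function names, the suspicion score, and the bit ranges are all hypothetical.

```python
import hashlib
import secrets


def new_challenge() -> bytes:
    """Fresh random challenge issued by the model owner for one query."""
    return secrets.token_bytes(16)


def difficulty_for(suspicion: float, base_bits: int = 8, max_extra_bits: int = 12) -> int:
    """Map a hypothetical suspicion score in [0, 1] to a puzzle difficulty in bits."""
    suspicion = min(1.0, max(0.0, suspicion))
    return base_bits + round(suspicion * max_extra_bits)


def is_valid(challenge: bytes, nonce: int, bits: int) -> bool:
    """A nonce solves the puzzle if the SHA-256 digest starts with `bits` zero bits."""
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") >> (256 - bits) == 0


def solve(challenge: bytes, bits: int) -> int:
    """Brute-force search; expected work doubles with every extra bit."""
    nonce = 0
    while not is_valid(challenge, nonce, bits):
        nonce += 1
    return nonce
```

In a scheme like this, checking a solution costs the model owner a single hash, while finding one costs the querier about 2^bits hashes on average. Ordinary users would see easy puzzles, and queries that look like extraction attempts would see hard ones, without the returned predictions being altered.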

“If you calibrate the puzzle, then the attacker has to spend a lot more compute power and has to wait for longer in order to steal your machine learning model.”

Nicholas Papernot

Vector Faculty Member, Canada CIFAR AI Chair

Papernot believes that this groundbreaking proactive defense mechanism can reduce data theft preemptively by restricting access to machine learning models before malicious attackers can extract them.

Developing data poisoning algorithms

Expanding on data manipulation attacks, Vector Faculty Member and Canada CIFAR AI Chair Yaoliang Yu led an informative discussion on data poisoning in neural network models.

In these attacks, a malicious party can inject “poisonous” data into the training datasets of machine learning models. Since modern machine learning techniques rely on large amounts of training data, data poisoning attacks pose a serious threat that compromises the validity of model outputs and leaves them vulnerable to subsequent attacks.
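A toy example can make the effect concrete. The sketch below is not one of the attacks from Yu's papers. It assumes a one-dimensional nearest-centroid classifier and a crude poisoning strategy (injecting far-away points labeled as class 0), chosen only because the damage is easy to see. All names and numbers are hypothetical.

```python
import random


def train(data):
    """Fit a nearest-centroid classifier: one mean value per class label."""
    model = {}
    for label in (0, 1):
        values = [x for x, y in data if y == label]
        model[label] = sum(values) / len(values)
    return model


def predict(model, x):
    """Assign x to the class whose centroid is closest."""
    return min(model, key=lambda label: abs(x - model[label]))


def accuracy(model, data):
    """Fraction of (x, label) pairs the model classifies correctly."""
    return sum(predict(model, x) == y for x, y in data) / len(data)


def sample(n, rng):
    """Clean data: class 0 lies in [-2, -0.5], class 1 lies in [0.5, 2]."""
    data = [(rng.uniform(-2.0, -0.5), 0) for _ in range(n)]
    data += [(rng.uniform(0.5, 2.0), 1) for _ in range(n)]
    return data


def poison(n_points, value=10.0, label=0):
    """Injected points: extreme values labeled as class 0 drag its centroid away."""
    return [(value, label)] * n_points
```

Training on `sample(100, rng) + poison(50)` instead of the clean sample alone pulls the class 0 centroid past the class 1 centroid, so clean test points from class 0 end up misclassified even though the test data itself is untouched.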

Yu’s research focuses on designing new data poisoning algorithms in order to assess their immediate impact on model accuracy and classify them based on their attack strategies. His two recent papers highlighted indiscriminate attacks on image classification, where attackers can obtain training data and corrupt the process of image labeling, yielding inaccurate classifications.

In collaboration with fellow researchers at the University of Waterloo, Yu developed data poisoning attacks that generate poisoned data points more efficiently and effectively than before. Unlike previous data poisoning methods, the proposed Total Gradient Descent Ascent (TGDA) attack produces thousands of poisoned data points all at once, boosting the speed and performance of the adversarial attack.

These elaborate methods enable researchers to examine the extent of interference in poisoning attacks, making it faster and easier to characterize different types of malicious activities and study their impact on model training and model output.

Combining public and private data training to enhance privacy

Shifting gears to differential privacy, Vector Faculty Member and Canada CIFAR AI Chair Gautam Kamath discussed the pitfalls associated with using sensitive and personal information to train machine learning models.

“The Vector Institute has so much expertise in ML security and privacy! It’s fantastic to have the opportunity to bring everyone together and see everyone’s perspectives on how we tackle the biggest challenges in the field.”

Gautam Kamath

Vector Faculty Member and Canada CIFAR AI Chair

Given the continuous development of large language models that are trained on publicly available data, he expressed concern that “machine learning models are memorizing things in their training datasets which may be things we don’t want to reveal.” Large language models are generally pre-trained on public datasets which are readily available online, such as downloadable texts, Wikipedia articles, and blog posts. However, a problem arises when large language models are prompted into reproducing copyrighted material and sensitive data without proper regulation or consent.

This is where differential privacy comes in. Regularly used as an approach to protect user datasets, it enables public sharing of group data and information patterns, while maintaining the privacy of individual user data points and identifiers. As a valuable data analysis mechanism, differential privacy allows researchers and businesses to collect group data and dissect a machine learning model’s utility without compromising sensitive information about the users.
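A minimal sketch of one classic differentially private release, the Laplace mechanism applied to a counting query, may help show how group statistics can be shared while individual records stay hidden. This is a textbook construction rather than anything specific to the talk, and the function names and parameters are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    """Laplace(0, scale) sample: the difference of two exponential samples."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Release a noisy count of the records matching `predicate`.

    Adding or removing one person changes a count by at most 1 (sensitivity 1),
    so Laplace noise with scale 1/epsilon gives epsilon-differential privacy.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)
```

For example, `private_count(ages, lambda a: a >= 40, epsilon=1.0, rng=random.Random())` returns a value close to the true count. A smaller `epsilon` adds more noise and gives stronger privacy, at the cost of a less accurate answer.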

Kamath’s proposed approach involves two main steps: pretraining using public datasets, followed by fine-tuning to narrow the process down to task-specific smaller datasets.

“I’m distinguishing between two types of data,” he said. “One is public data which is large and diverse, which we’re going to use for pretraining, and the other one is smaller, private, and more focused on the downstream tasks.”
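The article does not say which algorithm is used for the private fine-tuning step. A common choice in the literature is differentially private stochastic gradient descent (DP-SGD), and the sketch below shows its core gradient computation under that assumption: each private example's gradient is clipped, the clipped gradients are summed, Gaussian noise is added, and the result is averaged. Function and parameter names are hypothetical.

```python
import math
import random


def clip(grad, clip_norm):
    """Scale a gradient down so its L2 norm is at most `clip_norm`."""
    norm = math.sqrt(sum(g * g for g in grad))
    scale = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [g * scale for g in grad]


def private_mean_gradient(per_example_grads, clip_norm, noise_multiplier, rng):
    """Clip each example's gradient, sum, add Gaussian noise, then average."""
    clipped = [clip(g, clip_norm) for g in per_example_grads]
    dim = len(clipped[0])
    summed = [sum(g[i] for g in clipped) for i in range(dim)]
    # Noise standard deviation is proportional to the clipping bound, which
    # caps how much any single example can influence the update.
    sigma = noise_multiplier * clip_norm
    noisy = [s + rng.gauss(0.0, sigma) for s in summed]
    return [v / len(clipped) for v in noisy]
```

In the two-step recipe described above, the public pretraining phase would use ordinary gradients, and only the fine-tuning phase on the smaller private dataset would use a noisy, clipped update of this kind.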

In an ideal situation, public and private data go hand-in-hand such that public data is used to improve private machine learning accuracy. The caveat, he noted, is that not all publicly available data is appropriate; some datasets may have been plagiarized from external sources while others may be unsuitable for the specific purposes of machine training.

Kamath concluded that differential privacy guarantees machine learning models won’t depend on sensitive data from a single individual in their training, but rather on aggregate patterns contained in large-scale datasets from a collection of users. This approach polishes the output data as a whole, keeping machine learning models from memorizing and sharing private individual data.
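For readers who want the formal statement behind this guarantee, the standard definition of differential privacy is given below. The notation is not from the talk: M is the randomized training or analysis procedure, D and D' are any two datasets that differ in one individual's data, S is any set of possible outputs, and ε and δ are the privacy parameters.

```latex
\Pr[M(D) \in S] \;\le\; e^{\varepsilon} \, \Pr[M(D') \in S] + \delta
```

When ε and δ are small, the output distribution barely changes whether or not any one person's data is included, which is the sense in which the model cannot depend heavily on a single individual.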

Additional highlights from the event included Vector Faculty Member Shai Ben-David’s discussion on the intersection of public and private data training in estimation tasks, Vector Affiliate Reza Samavi’s talk on the robustness of deep neural networks to adversarially modified data, and Vector Postdoctoral Fellow Masoumeh Shafieinejad’s unique take on the applications of differential privacy and data regulation in industry, academia, and healthcare settings.

As machine learning models continue to advance, more data analysis and computational processes are required to successfully train them. With the frequent deployment of training datasets in machine learning algorithms and the rise in large language model applications, the event speakers emphasized an urgent need for intellectual property protection and privacy regulation, as valuable information becomes more susceptible to theft and malware attacks.

Want to learn more about the Vector Institute’s current research initiatives in machine learning privacy? Click here for the full playlist of talks.
