Vector researcher Will Grathwohl wants to lower the barriers to entry to AI
December 17, 2019
Photo: Will Grathwohl (far left) with fellow Vector Institute researchers Jesse Bettencourt, Yulia Rubanova, and Ricky Chen.
By Ian Gormely
Artificial intelligence is a transformative technology. Yet, much like the Internet before web browsers, it remains inaccessible to many people. The web’s true potential wasn’t realized until the barriers to entry were lowered to the point that “anyone with a laptop had the potential to build the next Facebook,” says Will Grathwohl, a Vector researcher and graduate student at the University of Toronto. “I think we should put AI into people’s hands. The people who have the best ideas for how to apply something are usually not the people that created the thing. But right now, it’s just not like that at all.”
Grathwohl was part of a robust contingent of Vector-affiliated researchers who attended this year’s International Conference on Learning Representations (ICLR) in New Orleans. In total, 12 posters from Vector Faculty Members were accepted to the conference, with Grathwohl giving an oral presentation of the paper “FFJORD: Free-Form Continuous Dynamics for Scalable Reversible Generative Models,” which he co-authored with Ricky T. Q. Chen, Jesse Bettencourt, Ilya Sutskever, and Vector Faculty Member David Duvenaud.
FFJORD, an acronym for Free-Form Jacobian Of Reversible Dynamics, is a small but important step in Grathwohl’s quest to lower the barriers to entry to AI. There have been tremendous breakthroughs in the field, particularly around the use of machine learning, over the past five years. But those breakthroughs still require vast amounts of hand-labelled data – say, pictures of cats that are identified as such – and computing power, neither of which comes cheap. “To me, the most interesting approach to making that amount of data smaller is finding ways to use the massive amounts of unlabelled data that are out there,” says the 27-year-old. “One way to do that that’s become popular is to look into generative models.”
Grathwohl’s paper looks specifically at normalizing flows, a class of generative models that have become popular in the machine learning community for their ability to both generate samples and compute exact likelihoods. Building them, though, requires placing heavy restrictions on the neural networks that can be used. FFJORD applies the idea of continuous time as a workaround to build better, less restrictive normalizing flows.
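To see where those restrictions come from: a flow computes a data point’s likelihood through the change-of-variables formula, which only works if the network is invertible and the log-determinant of its Jacobian is cheap to evaluate. Here is a minimal sketch of that bookkeeping in PyTorch, using a toy elementwise affine map; the names (AffineFlow, log_prob) are illustrative, not drawn from the FFJORD code:

```python
import torch

class AffineFlow(torch.nn.Module):
    """Toy elementwise affine flow: invertible because exp(log_scale) > 0."""
    def __init__(self, dim):
        super().__init__()
        self.log_scale = torch.nn.Parameter(torch.zeros(dim))
        self.shift = torch.nn.Parameter(torch.zeros(dim))

    def forward(self, x):
        z = x * torch.exp(self.log_scale) + self.shift
        # For an elementwise map the Jacobian is diagonal, so
        # log|det J| is just the sum of the log-scales.
        log_det = self.log_scale.sum()
        return z, log_det

def log_prob(x, flow, base):
    # Change of variables: log p(x) = log p_base(f(x)) + log|det Jf(x)|
    z, log_det = flow(x)
    return base.log_prob(z).sum(-1) + log_det

flow = AffineFlow(dim=2)
base = torch.distributions.Normal(0.0, 1.0)
x = torch.randn(5, 2)
print(log_prob(x, flow, base))  # exact log-likelihood of each sample
```

An elementwise map keeps the Jacobian diagonal, so its log-determinant is a simple sum; more expressive architectures must be carefully designed to keep that term tractable, which is precisely the restriction FFJORD works around.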
It builds on an idea first put forth by Grathwohl’s advisor, Vector Faculty Member David Duvenaud, in the paper “Neural Ordinary Differential Equations,” which won Duvenaud and his co-authors Ricky T. Q. Chen, Yulia Rubanova, and Jesse Bettencourt the Best Paper Award at last year’s NeurIPS conference. “David’s paper presented the idea of having a neural network parameterize a continuous time dynamic process. And that opened up a whole new paradigm to think about things that involve machine learning in neural networks,” says Grathwohl. Leveraging Duvenaud’s idea of switching from discrete time – data sampled at regular intervals – to continuous time – data sampled at any point in the flow – allows normalizing flow-based generative models to be built in a much simpler and more expressive way.
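To make the switch concrete: in a continuous-time flow, the log-density evolves along the ODE trajectory according to d log p(z(t))/dt = -tr(∂f/∂z), so the likelihood can be recovered by integrating the trace of the Jacobian alongside the state. Below is a rough, purely illustrative sketch in PyTorch using a fixed-step Euler loop and an exact trace; FFJORD itself pairs an adaptive ODE solver with a stochastic (Hutchinson) trace estimator to scale to high dimensions:

```python
import torch

# Dynamics network f: any architecture works, since only tr(df/dz) is needed.
f = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.Tanh(),
                        torch.nn.Linear(64, 2))

def dynamics(z):
    """Return dz/dt and the exact trace of df/dz (one grad call per dim)."""
    z = z.detach().requires_grad_(True)
    dz = f(z)
    trace = torch.zeros(z.shape[0])
    for i in range(z.shape[1]):
        trace += torch.autograd.grad(dz[:, i].sum(), z, retain_graph=True)[0][:, i]
    return dz.detach(), trace

def flow_log_prob(x, steps=100, dt=0.01):
    # Integrate x forward to the base distribution with Euler steps,
    # accumulating the density change along the way:
    #   log p(x) = log p_base(z(T)) + integral of tr(df/dz) dt.
    z, delta_logp = x, torch.zeros(x.shape[0])
    for _ in range(steps):
        dz, trace = dynamics(z)
        z = z + dt * dz
        delta_logp = delta_logp + dt * trace
    base = torch.distributions.Normal(0.0, 1.0)
    return base.log_prob(z).sum(-1) + delta_logp

print(flow_log_prob(torch.randn(5, 2)))  # approximate log-likelihoods
```

Because the trace replaces the full Jacobian log-determinant, the dynamics network f can take any form, which is the “free-form” part of FFJORD’s name.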
After finishing his undergrad in 2014, Grathwohl spent several years bouncing around the tech industry, first as an entrepreneur, developing content moderation software, and later using machine learning for product indexing at a startup. Eventually, he became frustrated with the lack of creativity. Yet out of that milieu came the inspiration for his return to school. “My job was building infrastructure to collect data and figuring out how to do it as cheaply as possible,” he says. “We had to build more classifiers to serve more industries and more customers. Every single one of those was a constant cost of time and money. I realized we need to make these things work better with less data.”
FFJORD does not solve that problem, but it is a step in the right direction. “Better models that can solve this less-labelled-data problem will be a key piece,” he says, noting that down the road, normalizing flows could also help in modelling environments, an important aspect of genetics research and robotics. “Any improvement in unsupervised generative models will help us in the semi-supervised learning setting.”