Associate Professor, Department of Computer Science, University of Toronto
Canada CIFAR Artificial Intelligence Chair
Roger is an Associate Professor of Computer Science at the University of Toronto, focusing on machine learning. Previously, he was a postdoc at the University of Toronto, after receiving a Ph.D. at MIT under Bill Freeman and Josh Tenenbaum. Before that, Roger did his undergraduate degree in symbolic systems and an MS in computer science at Stanford University. Roger is a co-creator of Metacademy, a website that uses a dependency graph of concepts to help you formulate personalized learning plans for machine learning and related topics.
Research Interests
AI Alignment
Understanding Deep Learning
Efficient Second-Order Approximations for Neural Nets
Highlights
Alfred P. Sloan Research Fellow in Computer Science
Connaught New Researcher Award
Canada Research Chair in Probabilistic Inference and Deep Learning
Canada CIFAR AI Chair
Publications
Flipout: Efficient Pseudo-Independent Weight Perturbations on Mini-Batches
Y. Wen, P. Vicol, J. Ba, D. Tran, and R. Grosse
International Conference on Learning Representations (ICLR) 2018
Understanding Short-Horizon Bias in Stochastic Meta-Optimization
Y. Wu, M. Ren, R. Liao, and R. Grosse
International Conference on Learning Representations (ICLR) 2018
Scalable trust-region method for deep reinforcement learning using Kronecker-factored approximation
Y. Wu, E. Mansimov, R. B. Grosse, S. Liao, and J. Ba
Advances in Neural Information Processing Systems (NIPS) 2017
The reversible residual network: Backpropagation without storing activations
A. N. Gomez, M. Ren, R. Urtasun, and R. B. Grosse
Advances in Neural Information Processing Systems (NIPS) 2017
LIME: Learning inductive bias for primitives of mathematical reasoning
Y. Wu, M. N. Rabe, W. Li, J. Ba, R. B. Grosse, and C. Szegedy
2021
Differentiable annealed importance sampling and the perils of gradient noise
G. Zhang, K. Hsu, J. Li, C. Finn, and R. B. Grosse
2021
On Monotonic Linear Interpolation of Neural Network Parameters
J. R. Lucas, J. Bae, M. R. Zhang, S. Fort, R. Zemel, and R. B. Grosse
2021
REFACTOR: Learning to Extract Theorems from Proofs
J. P. Zhou, Y. Wu, Q. Li, and R. B. Grosse
2021
Improving Mutual Information Estimation with Annealed and Energy-Based Bounds
R. Brekelmans, S. Huang, M. Ghassemi, G. Ver Steeg, R. B. Grosse, and A. Makhzani
2021
Studying large language model generalization with influence functions
R. Grosse, J. Bae, C. Anil, N. Elhage, A. Tamkin, A. Tajdini, B. Steiner, D. Li, E. Durmus, E. Perez, E. Hubinger, K. Lukošiūtė, K. Nguyen, N. Joseph, S. McCandlish, J. Kaplan, and S. R. Bowman
2023
Many-shot jailbreaking
C. Anil, E. Durmus, N. Panickssery, M. Sharma, J. Benton, S. Kundu, J. Batson, M. Tong, J. Mu, D. Ford, F. Mosconi, R. Agrawal, R. Schaeffer, N. Bashkansky, S. Svenningsen, M. Lambert, A. Radhakrishnan, C. Denison, E. Hubinger, Y. Bai, T. Bricken, T. Maxwell, N. Schiefer, J. Sully, A. Tamkin, T. Lanham, K. Nguyen, T. Korbak, J. Kaplan, D. Ganguli, S. Bowman, E. Perez, R. B. Grosse, and D. K. Duvenaud
2024
Connecting the dots: LLMs can infer and verbalize latent structure from disparate training data
J. Treutlein, D. Choi, J. Betley, S. Marks, C. Anil, R. B. Grosse, and O. Evans
2024
Training data attribution via approximate unrolling
J. Bae, W. Lin, J. Lorraine, and R. B. Grosse
2024
Measuring stochastic data complexity with Boltzmann influence functions