Endless Summer School: NeurIPS Highlights

February 16 @ 10:00 am - 12:00 pm

This session summarizes some of the papers, workshops, and tutorials Vector researchers presented at the 35th annual Conference on Neural Information Processing Systems (NeurIPS), the premier machine learning conference. This year, Vector researchers presented more than 50 papers and were selected for 10 workshops, 4 spotlights, and 1 tutorial across different fields of AI research, including deep learning, reinforcement learning, computer vision, and responsible AI.

 

Register

Featuring:

Jimmy Ba

How does a Neural Network’s Architecture Impact its Robustness to Noisy Labels?

Jingling Li, Mozhi Zhang, Keyulu Xu, John P Dickerson, Jimmy Ba
Noisy labels are inevitable in large real-world datasets. In this work, we explore an area understudied by previous works — how the network’s architecture impacts its robustness to noisy labels. We provide a formal framework connecting the robustness of a network to the alignments between its architecture and target/noise functions. Our framework measures a network’s robustness via the predictive power in its representations — the test performance of a linear model trained on the learned representations using a small set of clean labels. We hypothesize that a network is more robust to noisy labels if its architecture is more aligned with the target function than the noise. To support our hypothesis, we provide both theoretical and empirical evidence across various neural network architectures and different domains. We also find that when the network is well-aligned with the target function, its predictive power in representations could improve upon state-of-the-art (SOTA) noisy-label-training methods in terms of test accuracy and even outperform sophisticated methods that use clean labels.
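The robustness measure described above, the test performance of a linear model trained on the network's learned representations with a small set of clean labels, can be sketched in a few lines. The following is a minimal illustration rather than the authors' code; `extract_features` is a hypothetical stand-in for the frozen network's representation layer.

```python
# Minimal sketch of the linear-probe robustness measure described in the abstract.
# `extract_features` is a hypothetical stand-in for the frozen network's
# representation layer, not part of any released code.
from sklearn.linear_model import LogisticRegression

def representation_probe(extract_features, clean_x, clean_y, test_x, test_y):
    """Fit a linear model on learned representations using a small clean-label
    set and return its test accuracy, used here as a proxy for robustness."""
    probe = LogisticRegression(max_iter=1000)
    probe.fit(extract_features(clean_x), clean_y)
    return probe.score(extract_features(test_x), test_y)  # higher => more robust
```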

 

Yaoliang Yu and Guojun Zhang

Quantifying and Improving Transferability in Domain Generalization
Guojun Zhang, Han Zhao, Yaoliang Yu, Pascal Poupart
When transferring a predictor from the lab to the real world, there are always discrepancies between the data in the lab and data in the wild.  In this paper, we quantify the transferability of data features and describe a new algorithm to compute transferable features.  This work advances the state of the art in data analysis when there is a need to transfer a predictor trained in some domain (e.g., client A) to a new domain (e.g., client B).
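As a toy illustration of the lab-to-wild discrepancy the abstract refers to (and not the paper's transferability measure or algorithm), one can compare a predictor's accuracy on the domain it was trained on against a shifted target domain; the synthetic domains below are assumptions made only for this sketch.

```python
# Toy sketch of the lab-vs-wild gap: train on a "lab" domain (client A) and
# evaluate on a "wild" domain (client B) whose labelling function differs.
# The synthetic data are assumptions for illustration; this is not the
# transferability measure proposed in the paper.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_domain(n, w):
    x = rng.standard_normal((n, 10))
    y = (x @ w > 0).astype(int)        # labels depend on a domain-specific w
    return x, y

w_a = rng.standard_normal(10)
w_b = w_a + rng.standard_normal(10)    # client B's relationship is shifted
xa, ya = make_domain(2000, w_a)        # lab data (client A)
xb, yb = make_domain(2000, w_b)        # wild data (client B)

clf = LogisticRegression(max_iter=1000).fit(xa, ya)
print("accuracy on A:", round(clf.score(xa, ya), 3))
print("accuracy on B:", round(clf.score(xb, yb), 3))  # the gap reflects the domain shift
```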

 

Mihai Nica

The Future is Log-Gaussian: ResNets and Their Infinite-Depth-and-Width Limit at Initialization
Mufan (Bill) Li, Mihai Nica, Daniel M. Roy
The infinite-width limit theory dramatically expanded our understanding of neural networks. Real-life networks are, however, too deep: their behaviour deviates from the predictions of infinite-width theory. We study networks with residual connections in the infinite-depth-and-width limit and show remarkable agreement between theoretical predictions and empirical measurements in real networks.
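A quick way to build intuition for the log-Gaussian claim is a toy Monte Carlo check on a small residual MLP at initialization; the architecture and scalings below are simplified assumptions, not the exact networks analysed in the paper.

```python
# Toy Monte Carlo check (simplified, not the paper's exact setting): sample a
# small residual MLP at initialization many times and look at log ||output||.
# Under the infinite-depth-and-width picture described above, this quantity
# should be approximately Gaussian, i.e. the output norm roughly log-Gaussian.
import numpy as np

def log_output_norm(depth=64, width=64, rng=None):
    rng = np.random.default_rng() if rng is None else rng
    x = rng.standard_normal(width)
    for _ in range(depth):
        w = rng.standard_normal((width, width)) / np.sqrt(width)
        x = x + w @ np.maximum(x, 0.0) / np.sqrt(depth)   # scaled residual branch
    return np.log(np.linalg.norm(x))

rng = np.random.default_rng(0)
samples = np.array([log_output_norm(rng=rng) for _ in range(300)])
print("mean of log-norm:", samples.mean(), "std:", samples.std())
```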

 

Gennady Pekhimenko, Max Ryabinin, and Eduard Gorbunov

Moshpit SGD: Communication-Efficient Decentralized Training on Heterogeneous Unreliable Devices
Max Ryabinin, Eduard Gorbunov, Vsevolod Plokhotnyuk, Gennady Pekhimenko
Training of deep neural networks is often accelerated by combining the power of multiple servers with distributed algorithms. Unfortunately, communication-efficient versions of these algorithms frequently require reliable high-speed connections that are usually available only in dedicated clusters. This work proposes Moshpit All-Reduce, a fault-tolerant, scalable algorithm for decentralized averaging that maintains favorable convergence properties compared to regular distributed approaches. We show that Moshpit SGD, a distributed optimization method based on this algorithm, has both strong theoretical guarantees and high practical efficiency. In particular, we demonstrate gains of 1.3-1.5x in large-scale deep learning experiments such as ImageNet classification with ResNet-50 or ALBERT-large pretraining on BookCorpus.
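The core averaging idea can be illustrated with a few lines of simulation: in each round, peers are shuffled into small groups and each group replaces its members' values with the group mean, so every peer drifts toward the global average without a single all-to-all step. This is a simplified sketch of decentralized group averaging, not the authors' Moshpit All-Reduce implementation; the group size and round count are arbitrary assumptions.

```python
# Simplified simulation of iterative group averaging (not the authors' code):
# repeated rounds of in-group averaging over random groups drive all peers
# toward the global mean, which is preserved exactly at every round.
import numpy as np

def group_average_rounds(values, group_size=4, rounds=5, seed=0):
    rng = np.random.default_rng(seed)
    vals = np.asarray(values, dtype=float).copy()
    for _ in range(rounds):
        order = rng.permutation(len(vals))
        for start in range(0, len(vals), group_size):
            idx = order[start:start + group_size]
            vals[idx] = vals[idx].mean()       # in-group "all-reduce"
    return vals

peers = np.random.default_rng(1).normal(size=64)          # each peer's local value
print("global mean:", peers.mean())
print("after 5 rounds:", group_average_rounds(peers)[:4])  # close to the global mean
```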

 

 

Register

 

This event is open to Vector Sponsors, Vector Researchers, and invited health partners only. Any registrant who is found not to be a Vector Sponsor, Vector Researcher, or invited health partner will be asked to provide verification and, if unable to do so, will not be able to attend the event. Please contact events@vectorinstitute.ai with any questions.

Virtual

Organizer

Vector Institute Professional Development