This event has passed.
Endless Summer School: Large Language Models
March 1 @ 10:00 am - 12:00 pm
This Endless Summer School session features a keynote from Google Research Scientist and incoming Vector Institute Faculty Member Wenhu Chen about his work in knowledge-grounded natural language processing (NLP). The event will also feature a talk from Dr. Alistair Johnson of SickKids on his work using large language models for the deidentification of data, and a talk from Zining Zhu on his work building interpretable NLP models.
Building Semi-Parametric Language Models to Integrate World Knowledge by Wenhu Chen
Large pre-trained language models have been shown to store factual knowledge in their parameters and achieve state-of-the-art results when fine-tuned on downstream NLP tasks. However, the sheer parameter count of models like GPT-3 greatly hinders their downstream applications due to enormous computation and memory costs. In this talk, I will discuss the recent trend of dramatically decreasing model size by building semi-parametric language models, in which world knowledge is disentangled and stored as semi-parameters. Semi-parameters have the following benefits: 1) they are stored in CPU RAM, decreasing the GPU RAM footprint; 2) they are activated very sparsely, incurring negligible computation cost; 3) each semi-parameter represents a specific piece of knowledge, greatly enhancing model interpretability.
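As a rough illustration of the sparse-activation idea in the abstract (not the speaker's actual method), the sketch below keeps a small key-value knowledge table in ordinary host memory and reads only the top-k entries for a given query. The table contents, the toy vectors, and the names `MEMORY`, `retrieve`, and `softmax` are all assumptions made for this example.

```python
import heapq
import math

# Hypothetical knowledge memory: each entry pairs a key vector with one fact.
# In the setting described above, a table like this would live in CPU RAM.
MEMORY = [
    ([1.0, 0.0, 0.0], "Paris is the capital of France."),
    ([0.0, 1.0, 0.0], "Water boils at 100 degrees Celsius at sea level."),
    ([0.0, 0.0, 1.0], "The Pacific is the largest ocean."),
    ([0.7, 0.7, 0.0], "France uses the euro."),
]


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query, memory, k=2):
    """Return the k (score, fact) pairs whose keys best match the query.

    Only these k entries are passed on; the rest of the table is ignored,
    which is the sparse activation the abstract refers to.
    """
    scored = ((dot(query, key), fact) for key, fact in memory)
    return heapq.nlargest(k, scored)


def softmax(scores):
    """Normalize scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# A toy query vector standing in for a model's hidden state (assumption).
query = [0.9, 0.1, 0.0]
top = retrieve(query, MEMORY, k=2)
weights = softmax([score for score, _ in top])

for (score, fact), weight in zip(top, weights):
    print(f"{weight:.2f}  {fact}")
```

Because each retrieved entry is a readable fact, the selected entries also show which pieces of knowledge influenced a prediction, which is the interpretability benefit listed above.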
Wenhu Chen is a research scientist at Google Research and an incoming Faculty Member at the Vector Institute and Assistant Professor at the University of Waterloo. His research interests include natural language processing, deep learning, knowledge representation and reasoning. Specifically, he aims to develop models that can ground and reason over external world knowledge to understand human language and communicate with humans. He is also interested in multi-modal problems like visual question answering and captioning. He publishes in and serves on program committees for ACL, NAACL, EMNLP, ICLR, NeurIPS, and other venues. He received the WACV Best Student Paper Honorable Mention in 2021 and an outstanding dissertation award from UCSB in June 2021. He serves on the Senior Program Committee for AAAI 2022.
Deidentification of free-text clinical notes by Dr. Alistair Johnson
The ability of caregivers and investigators to share patient data is fundamental to many areas of clinical practice and biomedical research. Prior to data sharing, it is often necessary to remove patient identifiers such as names, a process known as deidentification. In recent years, advances in machine learning methods have led to rapid performance improvements in natural language processing tasks, in particular with the advent of large-scale pretrained language models. Here we describe an approach for deidentification of clinical notes based on a bidirectional transformer model. Current challenges unique to deidentification are highlighted, including the absence of clear annotation guidelines, lack of portability of models, and paucity of training data.
Alistair Johnson is a Scientist at the Hospital for Sick Children. He received his Bachelor of Biomedical and Electrical Engineering at McMaster University and successfully read for a DPhil at the University of Oxford. Dr. Johnson’s work focuses on overcoming barriers to data sharing in healthcare, and his work on MIMIC-III demonstrates the immense potential of publicly available data. Dr. Johnson’s current research focuses on the development of new database structures tailored for healthcare, algorithms for deidentification of rich clinical data, and tools for assessing data quality.
Incorporating probing for developing large language models by Zining Zhu
Recently, large language models based on deep neural networks have demonstrated impressive performance on a wide variety of tasks, but how they are able to do so remains relatively unknown. As people try to inquire into the intrinsic mechanisms of large language models, a convenient method, probing, has gained increasing popularity. Probing is a flexible framework that has revealed many aspects of large language models. This talk reviews some interesting findings from probing and discusses the potential for using these findings to develop next-generation foundational language models.
Zining Zhu is a PhD student at the University of Toronto. His long-term goal is to build trustworthy NLP systems that deploy reliably across domains. Zining is interested in understanding the mechanisms and abilities of neural NLP systems to encode and use knowledge with language as a medium.
This event is open to Vector Sponsors, Vector Researchers, and invited health partners only. Any registrant found not to be a Vector Sponsor, Vector Researcher, or invited health partner will be asked to provide verification and, if unable to do so, will not be able to attend the event. Please contact firstname.lastname@example.org with any questions.