NATURAL LANGUAGE PROCESSING (NLP) SYMPOSIUM
September 16, 2020 @ 10:00 am - 1:45 pm
Held Virtually – event open to Vector researchers and industry sponsors only
September 16: AGENDA
Opening Remarks
10:00 am – 10:10 am

MC: Sedef Akinli Kocak
Project Manager, Industry Innovation, Vector Institute

Cameron Schuler
Chief Commercialization Officer and VP, Industry Innovation, Vector Institute
Keynote Presentation: AI at Work – How AI is transforming knowledge work
10:10 am – 10:40 am
In this talk, I will provide an overview of how AI in general and NLP in particular are transforming knowledge work and will share case studies from the Legal and News industries. I will also discuss our collaboration with the Vector Institute and other partner organizations on deep learning and the impact of deep learning on how Thomson Reuters develops AI-powered applications. I will conclude the talk with lessons learned from over 25 years of building machine learning and NLP applications and share my perspective on some of the opportunities that lie ahead.

Khalid Al-Kofahi
Senior Vice President and Head of AI Personal Investments, Fidelity, former VP, Research and Development and Head of Center for AI and Cognitive Computing, Thomson Reuters
Keynote Presentation: Say ‘ah’: Speech and language in medicine
10:40 am – 11:00 am
The healthcare industry is rapidly acquiring data, but organizing and using those data remains a challenge. Speech and language data, in particular, have several peculiarities that must be overcome to realize their potential benefit and to make healthcare more efficient, more affordable, and less prone to error. I will present a few use cases of NLP in healthcare and highlight a few of the current hurdles to operating this exciting technology in practice.

Frank Rudzicz
Associate Scientist, International Centre for Surgical Safety, Li Ka Shing Institute, St. Michael’s Hospital, Associate Professor, Department of Computer Science, University of Toronto, Director of AI, Surgical Safety Technologies Inc., Co-Founder, WinterLight Labs Inc., Faculty Member, Vector Institute, Canada CIFAR Artificial Intelligence Chair
Project/Research Presentation
11:00 am – 11:30 am
Project Presentation: An Experimental Evaluation of Large NLP Models in the Biomedical Domain
With the growing amount of text in health data, there have been rapid advances in large pre-trained models that can be applied to a wide variety of biomedical tasks with minimal task-specific modifications. Because the cost of these models makes technical replication challenging, in this project we present experiments replicating BioBERT, along with further pre-training and fine-tuning in the biomedical domain. We also investigate the effectiveness of domain-specific and domain-agnostic pre-trained models across downstream biomedical NLP tasks. Our findings confirm that pre-trained models can be impactful in some downstream NLP tasks (QA and NER) in the biomedical domain.

Faiza Khattak Khan
Data Scientist, Manulife
Project Presentation: An Investigation of Transformer based model in Legal Texts
In this presentation, we give an overview of the results of experiments we conducted to customize state-of-the-art contextualized transformer-based language models to the specific characteristics of legal domain texts.

Shohreh Shaghaghian
Research Scientist at Center for AI and Cognitive Computing, Thomson Reuters
Student Presentation: Examining the rhetorical capacities of neural language models
Language models are key components for building neural NLP systems in discourse contexts. How should we choose between BERT, GPT-2, and other language models? We measure their abilities to understand rhetorical signals in texts, using a specially designed method called RST-probe.

Zining Zhu
PhD Student, University of Toronto
Panel Discussion: Business Impacts: What is the main risk to NLP or from NLP in the next 5 years?
Moderator:

Frank Rudzicz
Associate Scientist, International Centre for Surgical Safety, Li Ka Shing Institute, St. Michael’s Hospital, Associate Professor, Department of Computer Science, University of Toronto, Director of AI, Surgical Safety Technologies Inc., Co-Founder, WinterLight Labs Inc., Faculty Member, Vector Institute, Canada CIFAR Artificial Intelligence Chair
Panelists:

Stephany Lapierre
CEO, tealbook

Jimoh Ovbiagele
Co-founder/CTO, ROSS Intelligence

Yevgeniy Vahlis
Head of Artificial Intelligence Capabilities, Bank of Montreal

Andrew Brown
Senior Director of Data Science and AI Research, CIBC
Networking and Poster Session
12 noon – 12:30 pm
Poster #1: Application of NLP in Emergency Medical Services | Amrit Sehdev | Queen’s University
Poster #2: Modelling Sentence Pairs via Reinforcement Learning: An Actor-Critic Approach to Learn the Irrelevant Words | Mahtab Ahmed | University of Western Ontario
Poster #3: SentenceMIM: A Latent Variable Language Model | Micha Livne | University of Toronto
Poster #4: Training without training data: Improving the generalizability of automated medical abbreviation disambiguation | Marta Skreta | University of Toronto
Poster #5: Explainability for deep learning text classifiers | Diana Lucaci | University of Ottawa
Poster #6: Identifying Clinical Terms in Medical Text Using Ontology-Guided Machine Learning | Aryan Arbabi | University of Toronto
Poster #7: Sharing is Caring: Exploring machine learning methods to facilitate medical imaging exchange using metadata only | Joanna Pineda | University of Toronto
Poster #8: GOBO: Quantizing Attention-Based NLP Models for Low Latency and Energy Efficient Inference | Ali Hadi Zadeh | University of Toronto
Poster #9: Improved knowledge distillation by utilizing backward pass knowledge in neural networks | Aref Jafari | University of Waterloo
Poster #10: How Nouns Surface as Verbs: Inference and Generation in Word Class Conversion | Lei Yu | University of Toronto
Poster #11: Informal Natural Language Processing: The Case of Slang | Zhewei Sun | University of Toronto
Poster #12: Applications of the Chinese Remainder Theorem in Word Embedding Compression and Arithmetic | Patricia Thaine | University of Toronto
Poster #13: Predicting change in Major Depressive Disorder symptoms based on topic modelling features from psychiatric notes: An exploratory analysis | Marta Maslej | Centre for Addiction and Mental Health
Poster #14: Non-Pharmaceutical Intervention Discovery with Topic Modeling | Jonathan Smith | Layer 6
Poster #15: Hurtful words: quantifying biases in clinical contextual word embeddings | Haoran Zhang | University of Toronto
Poster #16: Domain Specific Fine-tuning of Denoising Sequence-to-Sequence Models for Natural Language Summarization | Matt Kalebic | PwC/Deloitte
Poster #17: An Experimental Evaluation of Large NLP Models in the Biomedical Domain | Faiza Khattak Khan | Data Scientist, Manulife
Poster #18: An Investigation of Transformer based model in Legal Texts | Shohreh Shaghaghian | Research Scientist at Center for AI and Cognitive Computing, Thomson Reuters
Poster #19: Examining the rhetorical capacities of neural language models | Zining Zhu | PhD Student, University of Toronto
Poster #20: Using natural language processing to predict splenomegaly from >100,000 structured CT reports | Karen Batch | Queen’s University
Networking and Break
12:30 pm – 12:45 pm
Concurrent Workshops
12:45 pm – 1:45 pm
WS1: How to use Fairseq (Facebook AI Research Sequence-to-Sequence Toolkit) for your own NLP application
The Fairseq library is a sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks. In this short workshop, we will walk participants through the basics of the Fairseq library. We will dive into its codebase and learn how to modify existing modules to create and keep track of new applications. The purpose of this workshop is to provide learning through demonstration and hands-on experience.
Level of workshop: Intermediate/Advanced

Joey Cheng
Machine Learning Research Scientist, Layer 6

Gary Huang
Machine Learning Research Scientist, Layer 6

Felipe Perez
Senior Machine Learning Research Scientist, Layer 6
WS2: Question Answering Systems in Responding to COVID-19 Open Research Dataset Challenge
In response to the COVID-19 pandemic, the White House and a coalition of leading research groups have prepared the COVID-19 Open Research Dataset (CORD-19). In this workshop, we demonstrate three Question Answering systems that were submitted to the Kaggle COVID-19 Open Research Dataset Challenge and that could help the medical community develop answers to high-priority scientific questions. The competition was launched to give the machine learning research community a chance to apply advanced NLP methods to build QA systems that find scientific answers to questions related to COVID-19. To that end, a collection of scholarly studies on the coronavirus group (referred to as CORD-19) has been provided through a collaboration between research institutes such as Microsoft Research, the Allen Institute for AI, and the National Library of Medicine at the NIH.
Level of workshop: Intermediate

Rohan Bhambhoria
PhD Researcher, Queen’s University

Luna Feng
Research Scientist, Thomson Reuters

Hillary Ngai
Graduate Researcher, Vector Institute, University of Toronto

Yoona Park
Masters Student, Vector Institute, University of Toronto

Mah Parsa
Post-doc Researcher, Vector Institute, University of Toronto
Event to conclude
1:45 pm