The Genesis of Manulife’s NLP Academy

November 10, 2020

By Jonathan Woods

In early 2020, Manulife data scientists from across the globe attended the firm’s first NLP Academy. That Academy ― a four-day deep dive into natural language processing (NLP) ― focused on topic modelling, a technique that automates document classification according to topic. For a major insurance company with copious amounts of unstructured text ― from customer call transcripts to benefit submissions and insurance applications ― automating the accurate reading, analysis, and sorting of documents is incredibly valuable.

The NLP Academy is part of Manulife’s Advanced Analytics function, which was established in 2016 as a strategic effort at business transformation. That function’s plan included the creation of a suite of expert teams dubbed Centres of Excellence (CoE), which are tasked with understanding new research and leading technical development within the company.

The NLP Centre of Excellence, the team that ran the NLP Academy, arose out of Manulife’s experience with the Vector Institute. As a Vector sponsor, Manulife scientists get to engage in Vector-hosted industry AI projects, working alongside top AI researchers on business-relevant experiments. One such project was Vector’s Recreation of Large-Scale Pre-trained Language Models, or, as it’s more commonly known, the NLP Project.

Mateusz Ujma, Manulife’s Assistant Vice President of Research and Development in Analytics & AI, says of the experience: “After we engaged in the NLP Project with Vector, it sparked more interest in NLP as a skill that all our data scientists should have, and a few months later, we founded the NLP CoE.”

The NLP Project brought Manulife data scientists, Vector researchers, and other sponsor companies together to reproduce BioBERT, an advanced language representation model trained to understand general language along with specific medical terms and their contexts. The model was also fine-tuned to perform language-related tasks such as named entity recognition (extracting biomedical named entities such as genes, proteins, and diseases), relation extraction (determining whether two words are related, given the text’s context), question answering (finding answers to specific questions in a body of text), and text summarization (both extracting sentences from a document to create a summary and having a model describe a document’s text in its own words).

Working on advanced NLP modelling with Vector familiarized the team with new developments worth capitalizing on. They have since done so by training the BERT model on real contact centre data to help identify customer pain points.

Part of Vector’s role in the NLP Project was identifying what cutting-edge research would be most impactful for industry participants. Ujma explains, “Given there’s hundreds of new NLP-related papers every year, it’s difficult to process all that great research, but Vector acted as a filter and only brought the most relevant research.”

Eugene Wen, Vice President of Group Advanced Analytics, explains how Vector’s role as a filter of state-of-the-art AI publications supports Manulife’s broader ambition: “We aim to become a leading organization in applying advanced analytics in businesses. The NLP Academy is an important step in this journey. Vector’s contribution is very well reflected in the success of our internal academy.”

The Vector sponsorship, which Wen considers a “strategic investment,” has also paid off in the form of talent acquisition. On the BioBERT project, Shobhit Jain, Lead Data Scientist at Manulife and a participant in the Vector NLP Project, worked alongside Vector researcher Faiza Khattak, then a post-doc at Western University. Khattak was interested in industry, and through interactions on the project and at a Vector event honouring Turing Award winner Geoffrey Hinton, the Manulife team became familiar with her and soon hired her to the Centre of Excellence.

Jain says that the NLP Project is an excellent place to find experts in the technique: “Everyone who was there was interested in NLP, and because I had worked with some of them on that project, I knew how good they are.”

Khattak, for her part, finds Manulife’s approach of internally curating, teaching, and applying new research through the NLP Academy appealing. “Academia and industry are totally opposite in many cases,” she says. “What Manulife is doing is bridging the gap and at the same time helping its workers stay connected to the research. Their knowledge doesn’t become obsolete or outdated.”

To further bridge that gap, more AI-related programs in Manulife’s suite of academies are now underway. For instance, an Executive AI Academy is currently getting Manulife’s 1500 or so VPs and AVPs around the globe up to speed on AI, enabling them to identify problems suitable for AI solutions and engage various analytical teams to deliver business value to the front lines.

All of this groundwork is preparing Manulife to get value from AI at scale. According to Wen, the plan is to leverage Manulife’s AI talent and new research to “embed AI and advanced analytics into every line of our business.”

The ultimate goal: “to enable the company to generate new value and better service to our customer.”

