Computer Vision Technical Report details insights from industry-academic collaborative project

May 26, 2022

Vector’s Industry Innovation team has released the Computer Vision: Applications in Manufacturing, Surgery, Traffic, Satellites, and Unlabelled Data Recognition Technical Report. It details experiments and insights from the computer vision (CV) project, a multi-phase industry-academic collaboration focused on recent advances in CV, one of the largest and fastest-growing areas of AI.

The project is the latest example of Vector bridging the gap between academia and industry, a key part of Vector’s Three Year Strategic Plan. With advances in AI proliferating at an increasing rate, Vector’s Industry Innovation team engages in collaborative projects with corporate partners to deepen the understanding of cutting-edge AI techniques, accelerate their adoption, and enhance the skills of AI practitioners to help realize the societal and economic potential of AI. Past projects include work on natural language processing and dataset shift.

The CV project brought together 15 Vector researchers and 14 technical professionals from eight industry sponsors, including EY, Intact, Linamar, PwC, RBC, Scotiabank, and Thales. Together they explored novel applications of computer vision methods to help sponsor companies apply the latest CV techniques to their own use cases while allowing researchers to assess how those methods worked in the real world.

Divided into three working groups, project participants designed and performed experiments using three CV approaches: anomaly and semantic segmentation, two-stream neural networks, and transfer learning. These approaches were applied in the following five use cases:

  • Anomaly detection in manufacturing

Participants explored the use of autoencoders trained on the MVTec Anomaly Detection dataset to optimize anomaly detection on the manufacturing line. 
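As a rough illustration of how autoencoder-based anomaly detection is commonly scored (not necessarily the method used in the report), the sketch below flags an image as anomalous when its reconstruction error is well above the errors seen on defect-free samples. Everything here is an assumption for illustration: the `reconstruct` callable stands in for a trained autoencoder and is mocked by a function that returns the mean defect-free image, the four-pixel images are made-up values, and the `margin` heuristic for the threshold is hypothetical.

```python
from statistics import mean


def reconstruction_error(image, reconstruction):
    """Mean squared per-pixel difference between two flat images of equal size."""
    return mean((p - q) ** 2 for p, q in zip(image, reconstruction))


def fit_threshold(normal_scores, margin=3.0):
    """Hypothetical heuristic: set the cutoff at a multiple of the worst defect-free score."""
    return max(normal_scores) * margin


def is_anomalous(image, reconstruct, threshold):
    """Flag an image whose reconstruction error exceeds the threshold."""
    return reconstruction_error(image, reconstruct(image)) > threshold


# Made-up defect-free samples (flattened 2x2 grayscale images).
normal_images = [
    [0.50, 0.52, 0.48, 0.50],
    [0.50, 0.48, 0.52, 0.50],
]

# Stand-in for a trained autoencoder: always returns the mean defect-free image.
template = [mean(pixels) for pixels in zip(*normal_images)]


def reconstruct(image):
    return template


normal_scores = [reconstruction_error(img, reconstruct(img)) for img in normal_images]
threshold = fit_threshold(normal_scores)

defective_image = [0.50, 0.50, 1.00, 0.50]  # one bright pixel simulates a defect
defect_flag = is_anomalous(defective_image, reconstruct, threshold)
normal_flag = is_anomalous(normal_images[0], reconstruct, threshold)
print(defect_flag, normal_flag)
```

The idea behind this setup is that a model trained only on defect-free parts reconstructs normal patterns well and defects poorly, so a large reconstruction error suggests an anomaly.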

  • Semantic segmentation in aerial and road obstacle imagery

Participants applied semantic segmentation techniques to two image sources: satellite imagery and dashcam footage. Semantic segmentation techniques involve labeling each pixel in an image with a class and grouping classified pixels to identify objects.
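The two steps described above, per-pixel labeling followed by grouping, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the project's implementation: the per-class scores are made-up values standing in for a segmentation model's output, the class names are hypothetical, and grouping uses a simple 4-connected breadth-first search.

```python
from collections import deque

BACKGROUND, BUILDING = 0, 1  # hypothetical class indices


def label_pixels(scores):
    """Assign each pixel the index of its highest-scoring class."""
    return [
        [max(range(len(pixel)), key=pixel.__getitem__) for pixel in row]
        for row in scores
    ]


def group_pixels(labels, target_class):
    """Group 4-connected pixels of target_class into objects (lists of (row, col))."""
    rows, cols = len(labels), len(labels[0])
    seen = set()
    objects = []
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != target_class or (r, c) in seen:
                continue
            # Breadth-first search collects one connected region.
            queue = deque([(r, c)])
            seen.add((r, c))
            component = []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and labels[ny][nx] == target_class and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            objects.append(component)
    return objects


# Made-up per-pixel class scores: [background score, building score].
BG = [0.9, 0.1]
BLD = [0.2, 0.8]
scores = [
    [BLD, BLD, BG, BG],
    [BLD, BG, BG, BLD],
    [BG, BG, BG, BLD],
]

labels = label_pixels(scores)
objects = group_pixels(labels, BUILDING)
print(labels)
print(len(objects), sorted(len(obj) for obj in objects))
```

In practice the scores would come from a trained segmentation network, and the grouping step is what turns a class map into distinct objects such as individual building footprints.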

  • Automated traffic incident detection with two-stream neural networks

Participants applied two-stream neural networks to dashcam footage to detect frames containing hazards, localize those hazards, and classify them by hazard type. 

  • Identifying clinically relevant features of interest in cholecystectomy (gallbladder surgery) procedures

Participants applied semantic and instance segmentation techniques to enable real-time identification of specific anatomical regions (e.g., the common bile duct, hepatic artery, and portal vein) that are ‘no-go zones’ for surgeons performing laparoscopic cholecystectomy (the surgical removal of the gallbladder). 

  • Transfer learning for efficient video classification and detection

Participants studied the efficacy of transfer learning for detecting and classifying actions in videos that contain few or zero annotations. 

Researchers and sponsors have already seen positive results. Linamar’s work paved the way for an automated parts defect detection system, while Thales was able to work on obstacle detection that parallels work being done on its autonomous trains.

Notably, two use cases have been presented by project participants at the Location Intelligence and Knowledge Extraction 2022 Canada Conference (LIKE ME), specifically “A Comparative Study of Semantic Segmentation Models for Building Footprint Extraction Using Satellite Imagery” and “Automated Traffic Incident Detection with Two-stream Neural Networks.” The former was nominated for the “Best Paper” award. Full descriptions of the technical implementations and results of each use case are provided in the report. The project toolkit includes various datasets and useful image and video tools, such as data augmentation and visualization utilities, provided by the Vector AI Engineering team. The project code is available in the Computer Vision Project Repo.
