Computer Vision Technical Report details insights from industry-academic collaborative project
May 26, 2022
Vector’s Industry Innovation team has released the Computer Vision: Applications in Manufacturing, Surgery, Traffic, Satellites, and Unlabelled Data Recognition Technical Report. It details experiments and insights from the computer vision (CV) project, a multi-phase industry-academic collaboration focused on recent advances in CV, one of the largest and fastest-growing areas of AI.
The project is the latest example of Vector bridging the gap between academia and industry, a key part of Vector’s Three Year Strategic Plan. With advances in AI proliferating at an increasing rate, Vector’s Industry Innovation team engages in collaborative projects with corporate partners to deepen the understanding of cutting-edge AI techniques, accelerate their adoption, and enhance the skills of AI practitioners to help realize the societal and economic potential of AI. Past projects include work on natural language processing and dataset shift.
The CV project brought together 15 Vector researchers and 14 technical professionals from industry sponsors EY, Intact, Linamar, PwC, RBC, Scotiabank, and Thales. Together, they explored novel applications of CV methods, helping sponsor companies apply the latest techniques to their own use cases while allowing researchers to assess how those methods perform in the real world.
Divided into three working groups, project participants designed and performed experiments using three CV approaches: anomaly detection and semantic segmentation, two-stream neural networks, and transfer learning. These approaches were applied in the following five use cases (brief, illustrative code sketches of each technique follow the list):
Participants explored the use of autoencoders trained on the MVTec Anomaly Detection dataset to optimize anomaly detection on the manufacturing line.
Participants applied semantic segmentation techniques to two image sources: satellite imagery and dashcam footage. Semantic segmentation involves labeling each pixel in an image with a class and grouping classified pixels to identify objects.
Participants applied two-stream neural networks to dashcam footage to detect frames containing hazards, localize those hazards, and classify them by hazard type.
Participants applied semantic and instance segmentation techniques to enable real-time identification of specific anatomical regions (e.g., the common bile duct, hepatic artery, and portal vein) that are ‘no-go zones’ for surgeons performing laparoscopic cholecystectomy (the surgical removal of the gallbladder).
Participants studied the efficacy of transfer learning for detecting and classifying actions in videos that contain few or no annotations.
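For the manufacturing use case, the general pattern is reconstruction-based anomaly detection: an autoencoder is trained only on defect-free images, so defective parts reconstruct poorly and receive a high reconstruction-error score. The sketch below illustrates that idea in PyTorch; the architecture, optimizer settings, and data loader are illustrative assumptions, not the project’s actual configuration.

```python
# Minimal sketch of reconstruction-based anomaly detection (illustrative only;
# not the project's actual model or training setup).
import torch
import torch.nn as nn

class ConvAutoencoder(nn.Module):
    """Small convolutional autoencoder trained on defect-free images only."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3, stride=2, padding=1, output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 3, 3, stride=2, padding=1, output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

def anomaly_score(model, image):
    """Mean reconstruction error for one image; high values suggest a defect."""
    model.eval()
    with torch.no_grad():
        recon = model(image.unsqueeze(0))
    return torch.mean((recon - image.unsqueeze(0)) ** 2).item()

model = ConvAutoencoder()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()
# Training loop sketch; `train_loader` (batches of defect-free images) is assumed.
# for images in train_loader:
#     optimizer.zero_grad()
#     loss = criterion(model(images), images)
#     loss.backward()
#     optimizer.step()
```

At inference time, images whose score exceeds a threshold calibrated on held-out defect-free samples would be flagged for inspection.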
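For the satellite and dashcam segmentation use cases, per-pixel classification can be demonstrated with an off-the-shelf model. Below is a minimal sketch using torchvision’s pretrained DeepLabV3; the project’s own models, training data, and class sets differ, and the image path is a placeholder.

```python
# Minimal semantic segmentation sketch with a pretrained DeepLabV3 model.
import torch
from torchvision import models, transforms
from PIL import Image

model = models.segmentation.deeplabv3_resnet50(weights="DEFAULT").eval()

preprocess = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],
                         std=[0.229, 0.224, 0.225]),
])

image = Image.open("frame.jpg").convert("RGB")   # placeholder path
batch = preprocess(image).unsqueeze(0)

with torch.no_grad():
    logits = model(batch)["out"]          # [1, num_classes, H, W]
mask = logits.argmax(dim=1).squeeze(0)    # per-pixel class indices
print(mask.shape, mask.unique())          # grouping classified pixels yields object regions
```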
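Two-stream networks, used in the traffic-hazard use case, pair an appearance branch that sees an RGB frame with a motion branch that sees stacked optical-flow fields, then fuse the two for classification. The sketch below shows the pattern with small illustrative branches; the layer sizes, number of flow channels, and class count are assumptions, and hazard localization is not shown.

```python
# Minimal two-stream classifier sketch: appearance (RGB) + motion (optical flow).
import torch
import torch.nn as nn

class TwoStreamClassifier(nn.Module):
    def __init__(self, num_classes: int, flow_channels: int = 20):
        super().__init__()
        def branch(in_ch):
            return nn.Sequential(
                nn.Conv2d(in_ch, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            )
        self.spatial = branch(3)               # appearance from a single RGB frame
        self.temporal = branch(flow_channels)  # motion from stacked flow fields (e.g., 10 x/y pairs)
        self.head = nn.Linear(64 + 64, num_classes)

    def forward(self, rgb, flow):
        fused = torch.cat([self.spatial(rgb), self.temporal(flow)], dim=1)
        return self.head(fused)

model = TwoStreamClassifier(num_classes=4)   # e.g., no-hazard plus three hazard types (assumed)
rgb = torch.randn(2, 3, 224, 224)            # batch of RGB frames
flow = torch.randn(2, 20, 224, 224)          # batch of stacked optical-flow fields
print(model(rgb, flow).shape)                # -> torch.Size([2, 4])
```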
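The surgical use case combines semantic segmentation with instance segmentation, which separates individual object instances rather than only assigning per-pixel classes. A minimal sketch using torchvision’s pretrained Mask R-CNN appears below; the actual application would require a model fine-tuned on annotated laparoscopic video, and the image path and score threshold here are placeholders.

```python
# Minimal instance segmentation sketch with a pretrained Mask R-CNN.
import torch
from torchvision import models, transforms
from PIL import Image

model = models.detection.maskrcnn_resnet50_fpn(weights="DEFAULT").eval()

image = Image.open("frame.jpg").convert("RGB")   # placeholder path
tensor = transforms.ToTensor()(image)

with torch.no_grad():
    output = model([tensor])[0]   # dict with 'boxes', 'labels', 'scores', 'masks'

keep = output["scores"] > 0.5            # keep confident detections (threshold assumed)
masks = output["masks"][keep] > 0.5      # binary per-instance masks
print(masks.shape)                       # [num_instances, 1, H, W]
```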
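For the action-recognition use case, transfer learning typically means reusing a backbone pretrained on a large labeled action dataset and training only a small head on the scarce target annotations. The sketch below freezes a Kinetics-pretrained video backbone from torchvision and fits a new classification head; the class count and dummy clips are illustrative assumptions.

```python
# Minimal transfer learning sketch for video action classification.
import torch
import torch.nn as nn
from torchvision.models.video import r3d_18

backbone = r3d_18(weights="DEFAULT")      # 3D ResNet pretrained on Kinetics-400
for param in backbone.parameters():
    param.requires_grad = False           # freeze pretrained features

num_target_classes = 5                    # assumed number of target actions
backbone.fc = nn.Linear(backbone.fc.in_features, num_target_classes)  # new trainable head

optimizer = torch.optim.Adam(backbone.fc.parameters(), lr=1e-3)
criterion = nn.CrossEntropyLoss()

clips = torch.randn(2, 3, 16, 112, 112)   # dummy batch: [batch, channels, frames, H, W]
labels = torch.tensor([0, 3])             # dummy labels for the few annotated clips
loss = criterion(backbone(clips), labels)
loss.backward()
optimizer.step()
```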
Researchers and sponsors have already seen positive results. Linamar’s work paved the way for an automated parts defect detection system, while Thales advanced obstacle detection work that parallels development on its autonomous trains.
Notably, project participants presented two use cases at the Location Intelligence and Knowledge Extraction 2022 Canada Conference (LIKE ME): “A Comparative Study of Semantic Segmentation Models for Building Footprint Extraction Using Satellite Imagery,” which was nominated for the Best Paper award, and “Automated Traffic Incident Detection with Two-stream Neural Networks.” Full descriptions of the technical implementations and results of each use case are provided in the report. The project toolkit includes various datasets and useful image/video tools, such as data augmentation and visualization utilities, provided by the Vector AI Engineering team. The project code is available in the Computer Vision Project Repo.