The New Cartography of the Invisible
January 22, 2026
By John Knechtel
From the telescope to the balance sheet: this recap of the Foundation Models for Science Workshop shows how scientists can help businesses solve their most stubborn data problems and build AI that actually works.
On a brisk Tuesday in Toronto, as the November sun filtered down from skylights far above, a packed room at MaRS gathered for the Vector Institute’s Foundation Models for Science Workshop. The astrophysicists, biologists, and chemists in the audience were there to tackle a concrete problem: how to make machines understand phenomena that even humans have only recently learned to see.
The public story of AI has, of late, focused on models, like GPT-4, that are trained on the entirety of the open internet to best mimic human language. But in the labs where drugs are designed and in the observatories that map the universe, the internet is effectively a desert. You can’t scrape your way to the chemical signature of a failing liver or the distance to a galaxy thirteen billion light-years away. Those truths are uncovered slowly, at the cost of millions of dollars and decades of work.
Fittingly, the day was focused on building AI that works even when data is scarce, stakes are high, and sources are disconnected. In what follows, we'll look at three recurring challenges in this landscape: the annotation bottleneck, the trust gap, and the lab-to-production leap.
Nolan Koblischke, a doctoral student in Astrophysics at the University of Toronto, spends his days looking for a galactic needle in a haystack: rare celestial phenomena such as gravitational lenses—moments where the gravity of a massive foreground galaxy acts as a cosmic magnifying glass, warping the light of a more distant object into a luminous ring. But in astronomy, the core problem is not a lack of data, but an overwhelming excess of the wrong kind of data. These lenses are the “rare jewels” of cosmology, appearing in perhaps one out of every ten thousand telescope images.
Traditionally, finding them has meant a Herculean amount of manual work. Teams of researchers sit in front of screens, scrolling through millions of near-identical smudges, labelling them one by one. For executives, the pattern will be familiar: you have a huge asset—an image archive, a fleet of sensors, a decade of scans—but it’s effectively unusable because no one has had the time or budget to label it at scale.
Koblischke’s team decided to remove that bottleneck. They fed 140 million galaxy images to GPT-4o-mini and asked it for simple textual descriptions. Then they used a technique called contrastive learning to build a bridge between the world of pixels and the world of language. Instead of treating images and text as separate silos, they trained a foundation model (AION-1) to place them in a shared semantic space where similar images end up close together, words and phrases that describe those images are pulled into the same neighbourhood, and unrelated concepts are pushed far apart.
Simply put, they taught the model a bilingual dictionary that maps pictures to prose. In this space, an image of a swirling galaxy and the word “spiral” sit side by side; an image of a smooth, featureless blob and the phrase “elliptical galaxy” form another cluster.
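To make the mechanics concrete, here is a minimal sketch of the contrastive objective behind this kind of image–text alignment. The encoders, embedding size, and temperature are illustrative assumptions, not the AION-1 team's actual code.

```python
# A minimal sketch of CLIP-style contrastive alignment between galaxy images
# and their text descriptions. Dimensions and names are illustrative only.
import torch
import torch.nn.functional as F

def contrastive_loss(image_emb: torch.Tensor, text_emb: torch.Tensor, temperature: float = 0.07):
    """Pull matching image/text pairs together, push mismatched pairs apart."""
    # Normalize so the dot product becomes a cosine similarity.
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)

    # Pairwise similarities: logits[i, j] compares image i with caption j.
    logits = image_emb @ text_emb.T / temperature

    # The "correct" caption for image i is caption i (the diagonal).
    targets = torch.arange(len(image_emb), device=image_emb.device)

    # Symmetric cross-entropy: match images to captions and captions to images.
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.T, targets)
    return (loss_i2t + loss_t2i) / 2

# Random embeddings standing in for the outputs of trained encoders.
batch = 32
img = torch.randn(batch, 512)   # image encoder output
txt = torch.randn(batch, 512)   # text encoder output on generated captions
print(contrastive_loss(img, txt))
```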
The payoff is what AI researchers call “zero-shot” search. Because the model understands the relationship between visual patterns and language, astronomers no longer have to predefine every category they care about. A researcher can type a brand-new query like “distorted blue ring” or “thin crescent arc near bright core,” and the system instantly surfaces the images that live closest to that description in its library—even if no one ever trained a classifier for that exact phrase.
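In practice, zero-shot search then reduces to a nearest-neighbour lookup in that shared space. The sketch below assumes the image archive has already been embedded once; the stand-in text encoder and the query string are placeholders, not the system described in the talk.

```python
# A minimal sketch of zero-shot retrieval over a pre-embedded image archive.
import numpy as np

def embed_text(query: str, dim: int = 512) -> np.ndarray:
    """Stand-in for a trained text encoder; returns a unit-length vector."""
    rng = np.random.default_rng(abs(hash(query)) % (2**32))
    v = rng.standard_normal(dim)
    return v / np.linalg.norm(v)

# Pretend these are embeddings of millions of telescope images, computed once.
num_images, dim = 100_000, 512
image_embeddings = np.random.standard_normal((num_images, dim))
image_embeddings /= np.linalg.norm(image_embeddings, axis=1, keepdims=True)

# A brand-new query that no classifier was ever trained for.
query_vec = embed_text("distorted blue ring around a bright core")

# Cosine-similarity search: the nearest images are the candidate lenses.
scores = image_embeddings @ query_vec
top_k = np.argsort(scores)[::-1][:10]
print("Top candidate image ids:", top_k)
```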
For corporations, this represents a fundamental shift. Instead of investing years in manual labelling and rigid taxonomies, you get an open-ended, natural language interface to your visual data and turn your archive from a static record into an interactive, conversational asset that specialists can interrogate in their own words.
The same pattern extends to any industry drowning in images but starved for structured labels. In healthcare, radiologists could query millions of scans for phrases like “small, spiculated nodule in upper right lung” without pre-annotating every case, because the model uses a shared image–text space to find close matches. In manufacturing, engineers could search for “hairline crack near weld” or “misaligned component on conveyor two” and have the system scan across product lines and plants, simply by describing the defect they care about.
Executives can take a straightforward lesson from this work: your unlabelled visual data is one of your largest underused assets. Instead of pouring resources into exhaustive annotation projects, you can use foundation models to align images and language in a single semantic space, give domain experts a search box that speaks their vocabulary, and unlock zero-shot use cases—new queries, new products, new defect types—without rebuilding your AI stack each time.
In other words, the strategic opportunity isn’t just to see more in your images; it’s to let your organization talk to them. The winners won’t be the firms that collect the most pictures, but the ones that build systems where an expert can type a few words and have the entire visual history of the business respond.

If astronomers are mapping the far reaches of space, the pharmaceutical industry is mapping the microscopic interior of the human body where a single wrong turn can quietly erase billions in R&D. Justin Donnelly (PhD, Chemical Biology) of Axiom Bio reminded the room that the average cost of bringing a single drug to market is about $2.5 billion—and one of the biggest, most persistent drivers of that cost is drug-induced liver injury. It’s a toxic reaction that often stays invisible until a drug is already deep into clinical trials, when it’s most expensive and reputationally damaging to fail.
For executives, the core problem is familiar: critical decisions made on thin data. In this case, we have clear toxicity labels for only about 2,000 compounds. That’s nowhere near enough to reliably de-risk an entire discovery portfolio if we treat the AI as a traditional “big data” engine.
Donnelly’s approach reframes the question. Instead of asking the model to make a simple yes/no call on toxicity, he teaches the AI to explain itself in biological terms. His team trains a large, powerful AI model on abundant experimental data—about 116,000 compounds tested in lab assays—to predict a set of fundamental biological features from each molecule’s structure. In plain terms, the AI learns to answer: “Given this molecule, what is it likely to do to cells and tissues?” before it ever pronounces a verdict on safety.
Crucially, this is only the first step. Those predicted features are then fed into a second, more transparent model that behaves less like a black box and more like a senior toxicologist. When a Pfizer drug called Lexipeptide began to show toxicity in development, this two-step system didn’t just raise a red flag; it produced a quantified post-mortem. It attributed the risk primarily to mitochondrial stress and high drug concentration, clearly surfacing the specific levers chemists could have adjusted earlier. Just as important, when the model is uncertain, it says so explicitly: “I don’t have enough data on this mechanism; run this experiment next.”
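The two-step pattern can be sketched in a few lines of scikit-learn. Everything here is a stand-in: the molecular descriptors, the four named biological features, and the random data are assumptions used only to show the shape of the pipeline, not Axiom Bio's model.

```python
# A minimal sketch of the two-step pattern: a flexible featurizer trained on
# abundant assay data, followed by a transparent model trained on the small
# labelled toxicity set. All data and feature names below are placeholders.
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Step 1: map molecular descriptors to predicted biological features
# (e.g., mitochondrial stress, liver concentration), using the large dataset.
X_molecules = rng.standard_normal((1000, 64))   # stand-in descriptors
bio_features = rng.standard_normal((1000, 4))   # stand-in assay readouts
featurizer = MLPRegressor(hidden_layer_sizes=(128,), max_iter=300).fit(X_molecules, bio_features)

# Step 2: a transparent model maps biological features to a toxicity verdict,
# trained on the small labelled set (about 2,000 compounds in the talk).
feature_names = ["mitochondrial_stress", "liver_concentration",
                 "reactive_metabolites", "bile_transport"]
X_small = rng.standard_normal((200, 4))
y_toxic = (X_small[:, 0] + 0.5 * X_small[:, 1] + rng.normal(0, 0.5, 200) > 0).astype(int)
explainer = LogisticRegression().fit(X_small, y_toxic)

# For a new compound, report not just a risk score but which features drive it.
new_compound = rng.standard_normal((1, 64))
predicted_bio = featurizer.predict(new_compound)
risk = explainer.predict_proba(predicted_bio)[0, 1]
contributions = explainer.coef_[0] * predicted_bio[0]
print(f"Predicted toxicity risk: {risk:.2f}")
for name, c in sorted(zip(feature_names, contributions), key=lambda t: -abs(t[1])):
    print(f"  {name}: {c:+.2f}")
```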
This turns AI from a mysterious oracle into a decision companion with an audit trail. Instead of a binary score that can't be defended in a governance committee, leaders get a quantified breakdown of the mechanisms driving a risk, an explicit statement of where the model is uncertain, and a concrete recommendation for which experiment to run next.
For high-stakes sectors beyond pharma, the pattern is the same. In financial services, banks are beginning to use similarly transparent models to move beyond one-line credit scores. Rather than a generic rejection, a system can specify that a loan risk is, say, 60% driven by market volatility and 40% by debt-to-income ratios, enabling more nuanced and compliant lending decisions. In infrastructure and energy, models that explain whether a predicted failure is driven by material fatigue or environmental stress would allow operators to target maintenance with surgical precision rather than relying on broad, conservative schedules.
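The percentage breakdown in the lending example is just normalized attribution. As a toy illustration with assumed numbers, per-factor contributions from a transparent model can be converted into the kind of statement a credit officer can act on:

```python
# Toy illustration: turn per-factor contributions (e.g., from a linear risk
# model) into a percentage breakdown. The numbers here are assumed.
contributions = {"market_volatility": 1.2, "debt_to_income": 0.8}
total = sum(abs(v) for v in contributions.values())
for factor, value in contributions.items():
    print(f"{factor}: {100 * abs(value) / total:.0f}% of predicted risk")
# market_volatility: 60% of predicted risk
# debt_to_income: 40% of predicted risk
```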
Executives can take a direct lesson: in environments where a single miscalculation can lead to systemic collapse, you cannot afford AI that only predicts; you need AI that explains. The winning implementations will be those that pair powerful predictive models with a transparent layer of reasoning, state their uncertainty openly, and point to the next experiment or data source that would reduce it.
In other words, the competitive advantage does not come from being the first to deploy a powerful model. It comes from being the first to deploy one that your scientists, risk officers, and regulators can interrogate, trust, and act on.

If astronomy and pharma deal with galaxies and molecules, Vector Faculty Member Anna Goldenberg's work focuses on something closer to home: the continuous signals of the human body. A Senior Scientist in the Genetics and Genome Biology program at SickKids Research Institute, Goldenberg challenges a common assumption in healthcare and beyond, namely that every disease, use case, or environment is so different that you need to start from scratch each time. Her data suggests the opposite: under the surface, the human body behaves in surprisingly consistent ways.
Her team studies what they call “physiological states”—stable patterns in data like heart rate, sleep, and movement that persist over time. What’s striking is not just that these states exist, but that they transfer. In one project, their models learned states from wearable data on pregnant women, including periods labelled as “feeling in control” or “sleep problems.” Those same states then transferred with unexpected accuracy to a completely different group: patients with Crohn’s disease.
For executives, the message is important: you may not need a bespoke model for every subpopulation or condition. If you can learn robust, interpretable states in one context, you can often reuse them in another.
Technically, Goldenberg’s team borrows an idea from Large Language Models (LLMs) but applies it to time-series signals instead of words. Rather than predicting the next word in a sentence, the model is fed “sentences” of physiological data over time: sequences of heart rate, sleep, and movement readings. A foundation model for time-series data learns the “grammar of human physiology”; it discovers recurring “phrases” of stress, recovery, stability, and deterioration.
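One way to picture this is a next-token model over discretized physiological windows. The sketch below is a toy version of that idea; the vocabulary of "states", the GRU architecture, and the random data are assumptions, not Goldenberg's actual model.

```python
# A minimal sketch of "next word, but for physiology": discretize windows of
# wearable signals into tokens and train a small model to predict the next one.
import torch
import torch.nn as nn

VOCAB = 64      # number of discrete physiological "states" (tokens), assumed
SEQ_LEN = 24    # e.g., 24 hourly windows of heart rate / sleep / movement

class NextStateModel(nn.Module):
    def __init__(self, vocab=VOCAB, dim=128):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.rnn = nn.GRU(dim, dim, batch_first=True)
        self.head = nn.Linear(dim, vocab)

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return self.head(h)     # logits for the next token at each step

model = NextStateModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

# Random stand-in data: batches of tokenized physiological sequences.
tokens = torch.randint(0, VOCAB, (32, SEQ_LEN))
inputs, targets = tokens[:, :-1], tokens[:, 1:]

for _ in range(10):             # tiny training loop, for illustration only
    logits = model(inputs)
    loss = loss_fn(logits.reshape(-1, VOCAB), targets.reshape(-1))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
print("final loss:", loss.item())
```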
Goldenberg relates an industrial analogy: a technician knows that a vibrating turbine in a jet engine and a vibrating pump on a factory floor may suffer from the same underlying physics, even if they sit in different systems. Once you understand the pattern of wear and tear in one environment, you can spot it in another. Similarly, in energy and utilities, grid components, transformers, and substations emit time-series data that reflect stress, overload, and degradation; learned states like stable operation, incipient overload, or thermal stress can be reused across regions and asset types.
The executive takeaway is clear: don’t treat every dataset, site, or product line as a greenfield AI problem. Invest in foundation models for your key time-series signals—sensor outputs, wearables, machine logs, transaction streams—and focus on discovering reusable states such as health vs. stress, normal vs. anomalous, and stable vs. drifting. Design your AI strategy so each new deployment can inherit what was learned before, and treat operational data like a language to be modelled, not just a set of isolated metrics. The real measure of success is not only accuracy in one pilot, but how easily those learned states travel across business units, assets, and conditions.
In other words, the real value isn’t merely predicting a single outcome more precisely. It’s building an AI layer that understands the universal dynamics of your systems—so that when a new product, patient group, or asset class comes online, the model already speaks its language well enough to help from day one.

As the sun set over University Avenue, the day’s participants emerged better equipped to work toward a more disciplined and honest form of AI, one that shows the annotation bottleneck, the trust gap, and the lab-to-production leap are not insurmountable.
The path forward is a kind of disciplined parsimony: using physics to guide the machine’s imagination, causality to ground its predictions, and explainability to ensure that, when the model speaks, it has something meaningful to say. For industry—life sciences with limited data, manufacturers of high-stakes products, and financial institutions under scrutiny—the message was clear: the goal is no longer to build a machine that can see the stars, but one that understands why they shine.