AI trust and safety research
Advancing the AI trust and safety dialogue
Vector’s world-class researchers are addressing both near-term and existential risks across privacy and security, algorithmic fairness, model robustness, AI alignment, and AGI governance to ensure AI systems are designed to achieve desired outcomes and to mitigate catastrophic risk.
In addition to advancing the public dialogue about AI trust and safety, we educate researchers through an AI Safety Reading Group led by Toryn Klassen, Sheila McIlraith, and Michael Zhang, and convene an AI trust and safety working group composed of the world’s leading researchers in this field.
Explore our research areas, publications, and researcher profiles
Nicolas Papernot focuses on privacy-preserving techniques in deep learning and on advancing more secure and trusted machine learning models.
Sheila McIlraith’s research addresses human-compatible sequential decision making in AI, with a focus on safety, alignment, and fairness.
Jeff Clune’s AI safety research focuses on regulatory recommendations and on improving the interpretability of agents, so we can know what they plan to do and why, and prevent unsafe behaviors.
David Duvenaud’s research focuses on AGI governance, evaluation, and mitigating catastrophic risks from future systems.
Gillian Hadfield researches AI governance, working at the intersection of technical and regulatory infrastructures to ensure AI systems promote human well-being.
Roger Grosse’s research examines training dynamics in deep learning. He applies this expertise to AI alignment, working to ensure that progress in AI is aligned with human values.
Explore Vector Research Insights
Benchmarking xAI’s Grok-1
ICLR 2021: Researchers adopt teaching tricks to train neural networks like they’re students
New Vector Faculty Member Jeff Clune’s quest to create open-ended AI systems
Want to stay up-to-date on Vector’s work in AI trust and safety?
Join Vector’s mailing list here: