AI trust and safety research

Explore our research areas, publications, and researcher profiles

Nicolas Papernot focuses on privacy-preserving techniques in deep learning and on advancing more secure and trusted machine learning models.

Sheila McIlraith’s research addresses human-compatible AI sequential decision making, with a focus on safety, alignment, and fairness.

Jeff Clune’s AI safety research focuses on regulatory recommendations and on improving the interpretability of agents, so we can know what they plan to do and why, and prevent unsafe behaviors.

David Duvenaud’s research focuses on AGI governance, evaluation, and mitigating catastrophic risks from future systems.

Gillian Hadfield researches AI governance, working at the intersection of technical and regulatory infrastructures to ensure AI systems promote human well-being.

Roger Grosse’s research examines training dynamics in deep learning. He is applying this expertise to AI alignment, to help ensure that AI progress aligns with human values.

Who are we working with in AI trust and safety?

Explore Vector Research Insights

Benchmarking xAI’s Grok-1

ICLR 2021: Researchers adopt teaching tricks to train neural networks like they’re students

New Vector Faculty Member Jeff Clune’s quest to create open-ended AI systems

Want to stay up-to-date on Vector’s work in AI trust and safety?

Join Vector’s mailing list.