AI Safety Researcher
I work on detecting deceptive behavior in frontier models, scalable oversight, and the moral status of digital minds. I was an Astra Fellow (Jan–Apr) working with Apollo Research on scheming-detection evaluations. I'm now extending that work through a FIG Extension Fellowship on persona dynamics.
Featured
FIG Extension Fellowship
Studying persona dynamics in frontier models.
Projects
Elsewhere