Publications

We identify the common workflow for mechanistic interpretability work, and automate its “systematic ablations” step with a new …

The key idea behind causal scrubbing is to test interpretability hypotheses via behavior-preserving resampling ablations. We apply this …

Recent Blogs

Use case: You have to run your program on a remote server. However, your favourite editor with your favourite configuration isn’t …

Let’s say we believe in consequentialism and utilitarianism. Roughly, we hold that the morality of an action depends only on its …

Attention! For those of you who do not understand Catalan, there is an English translation below! La meva germana i jo vam començar a … (My sister and I started to …)

This is a writeup of the problems my team solved in the Murcia contest, which we entered as preparation for SWERC. Those …

The information in this post is taken from the sources listed in the Sources section at the end of it. If you don’t care about …