d_model

look inside the model

April 2025

Inside the CodeBot

A Gentle Introduction to How LLMs Understand Nullability

March 2025

How Language Models Understand Nullability

Sanchez-Stern and Tondwalkar, 2025

We study how models represent the nullability of program values. We measure how well models of various sizes, at various training checkpoints, complete programs that use nullable values, and then extract an internal representation of nullability.

September 2024

Steering Characters with Interpretability

We think you can make better characters with steering vectors. Try it out in our notebook, or check out the example screenshots in the post below.