dmodel

look inside the model

About

d_model is a fundamental AI research lab that partners with frontier labs to turn their models into capable interpretability and alignment researchers. Alongside these partnerships, we aim to use the agents we build to conduct independent research.

Alignment is an unsolved problem with enormous stakes. We think that interpretability is key to making models safer and more steerable.

We build RL environments for open-ended interpretability tasks. These environments teach agents how to do cutting-edge research, not how to reward-hack.

You've already used LLMs trained on our environments. Help us train the next generation.

Hiring and internship inquiries are welcome at careers@dmodel.ai

Blog

Inside the CodeBot: A Gentle Introduction to How LLMs Understand Nullability

How Language Models Understand Nullability

We study how models represent the nullability of program values. We measure how well models of various sizes, at various training checkpoints, complete programs that use nullable values, and then extract an internal representation of nullability.

Steering Characters with Interpretability

Team

anish co-founder
dmoon co-founder
alex research
adam research
aditya research
ajit research
arjun research
autumn research
bishka research
david research
felix research
jeff research
ritesh research
shaurya research
tyra research
yueru research
alexana operations
curry operations
dalton supercomputing