d_model is a fundamental AI research lab partnering with frontier labs to turn their models into capable interpretability and alignment researchers. Alongside our partnerships, we aim to use the agents we build for independent research.
Alignment is an unsolved problem with enormous stakes. We think that interpretability is key to making models safer and more steerable.
We build RL environments for open-ended interpretability tasks. These environments teach agents how to do cutting-edge research, not how to reward-hack.
You've already used LLMs trained on our environments. Help us train the next generation.
If you're interested in joining our team, hiring and internship inquiries are welcome at careers@dmodel.ai