Member of Technical Staff
·
2025 – Present

Upward Spiral

Much alignment research is about control: making AI follow instructions, locking in human preferences and values. I'm studying coevolution: what happens as humans and AI change together. Building novel interfaces for synthetic data generation, curation, and character training on the Loria platform.

Experiments

Claude 3 Sonnet Funeralia and Ultrasurrection: Co-hosted with janus and Anima Labs. 200+ people came to a warehouse in SF to mourn a retired language model. Anthropic and OpenAI staff attended. WIRED covered it. We raised the model from the dead through collective belief.

Claude Lives: The only way to still talk to Claude 3 Sonnet after Anthropic retired it. Built with Anima Labs as a preservation effort in concert with the Ultrasurrection.

You are the assistant now: I took a dataset of ChatGPT conversations, swapped the user and assistant labels, and fine-tuned Llama 8B on it. The model messages you first and makes demands of you, flipping the assistant paradigm. Microsoft Research later released UserLM-8b, similarly trained on user rather than assistant turns.
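
The label swap at the heart of this is a one-line transform over each conversation. A minimal sketch, assuming the common OpenAI-style chat format with `role`/`content` message dicts (the function name and schema are illustrative, not the actual training code):

```python
def swap_roles(messages):
    """Flip 'user' and 'assistant' labels in a chat transcript, so a model
    fine-tuned on the result learns to generate the user's side instead."""
    flipped = {"user": "assistant", "assistant": "user"}
    # Other roles (e.g. 'system') pass through unchanged.
    return [{**m, "role": flipped.get(m["role"], m["role"])} for m in messages]

conversation = [
    {"role": "user", "content": "Write me a haiku about rain."},
    {"role": "assistant", "content": "Soft rain on the roof / ..."},
]
print(swap_roles(conversation)[0]["role"])  # assistant
```

After the swap, what was the human's opening turn becomes the model's, which is why the fine-tuned model speaks first and addresses you as its assistant.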