
NPC

Gold monospace text floating over a full-screen DALL-E painting. The classic Zork opening — “West of House” — rendered in warm amber like subtitles on a film, the AI-generated scene filling the entire browser window behind it. Each location in the game’s world gets one painting the first time the agent visits it, and that painting persists for the rest of the run. Start a new game and the world looks completely different.

You control the agent with a cassette-player interface. Play steps through one turn. Fast-forward lets it loop. Autoplay lets it run continuously. And then there’s Eject. Hit Eject and the game UI slides away to reveal the agent’s live system prompt — the full instructions, the valid Zork commands, everything. You can type directly into it to change how the agent thinks, mid-game. In the demo the agent got stuck in a loop, so I typed “I should always try new things instead of sticking with stuff I’ve tried” right into the prompt, and it immediately started making better decisions.

The agent ran a ReAct loop on davinci-002 — a base model without RLHF. It could regularly get into the house, find the sword, sometimes grab the lantern and make it into the basement. Then a grue would get it, or a troll, or the parser. GPT-3 would try “look north” when the valid command was just “north.” Even with the full command list right there in the prompt.
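The turn structure described here — reason, propose an action, check it against the game's parser — can be sketched as a minimal ReAct-style loop. Everything below is an illustrative assumption: `complete` is a hypothetical stand-in for the davinci-002 call, the command list is abbreviated, and the canned completion reproduces the “look north” failure mode from the post. The system prompt is an ordinary mutable string, which is all the Eject mechanic needs.

```python
# Hypothetical sketch of one ReAct turn; complete() stands in for the
# base-model call (no real API is invoked here).

VALID_COMMANDS = {"north", "south", "east", "west", "open", "take", "look"}

# A plain mutable string, so an "Eject"-style UI can splice user text
# into it mid-game.
system_prompt = (
    "You are playing Zork. Valid commands: "
    + ", ".join(sorted(VALID_COMMANDS))
    + ".\nThink, then act."
)

def complete(prompt: str) -> str:
    # Stand-in for the language model; always proposes "look north"
    # to mirror the parser failure described above.
    return "Thought: I should scout ahead.\nAction: look north"

def parse_action(completion: str) -> str:
    # Take the text after the last "Action:" line, ReAct-style.
    for line in reversed(completion.splitlines()):
        if line.startswith("Action:"):
            return line.removeprefix("Action:").strip()
    return ""

def step(observation: str) -> str:
    """One turn: reason, act, validate against the command list."""
    completion = complete(f"{system_prompt}\n\nObservation: {observation}\n")
    action = parse_action(completion)
    if action not in VALID_COMMANDS:
        # The failure mode from the post: "look north" is not a valid
        # command even with the list right there in the prompt.
        return f"[invalid action: {action!r}]"
    return action

print(step("West of House. You are standing in an open field."))
# → [invalid action: 'look north']
```

The real agent would feed the game's response back in as the next observation and loop; the validation step is where a base model without RLHF keeps tripping.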

AI agent developers are speedrunning the exact same design evolution that text-adventure game designers did in the ’70s and ’80s.

Jay Springett