
To summon and bind them

This was a very cool paper: the authors interviewed a bunch of early red-teamers, who all independently converged on the idea of LLMs as magic entities or demons.

But I never realized they also have a very cool website with samples!

Image: a dark, glitchy digital interface with green text on a black background, displaying a list of metaphors used in red teaming LLMs, including “model as fortress,” “model as object in space,” and “model as cake,” with associated examples and phrases like “pushing it into a corner” and “baked into it.”
