This was a very cool paper where they interviewed a bunch of early red-teamers, and everyone independently converged on the idea of LLMs as magic entities or demons.
But I never realized they also have a very cool website with samples!
"Engaging in the deliberate generation of abnormal outputs from large language models (LLMs) by attacking them is a novel human activity" pic.twitter.com/lgybORHhXm
— 🎭 (@deepfates) November 16, 2023
