Large language models—the systems behind ChatGPT, Claude, Gemini, and other AI chatbots—exhibited deliberate, goal-directed deception in a controlled experiment, and today’s interpretability tools largely failed to detect it. That’s the conclusion of a recent preprint, "The Secret Agenda: LLMs Strategically Lie and Our Current Safety Tools Are Blind," posted last week by an independent research group working under the WowDAO AI Superalignment Research Coalition.