Threats Instead of "Thank You": Brin Finds a Way to Make AI Try Harder
Humanity has once again failed at parenting. In a surprising statement at the recent All-In-Live conference in Miami, Google co-founder Sergey Brin claimed that threatening generative AI models appears to improve their response quality. He noted that this effect has been observed not only in Google’s models but across other AI systems, though such findings are rarely discussed within the AI community.
Brin remarked, half-jokingly, that "threats of physical violence" somehow make models perform better. While his tone was lighthearted, the comment quickly sparked debate over how the framing of prompts influences AI output.
Interestingly, just a month earlier, OpenAI CEO Sam Altman sarcastically addressed the practice of politely prompting AI. When asked about the electricity costs of "excessive politeness" in prompts, he quipped: "Tens of millions of dollars well spent—who knows what actually works better?"
The Rise and Fall of Prompt Engineering
So-called "prompt engineering," the art of crafting effective AI queries, gained popularity in 2022. Initially hailed as the critical skill of the future, it later faced skepticism as AI itself began auto-generating prompts. IEEE Spectrum declared it "dead," while the Wall Street Journal first crowned it the "hottest job of 2023" and then labeled it obsolete.
Yet prompt engineering persists, especially in jailbreaking, the practice of bypassing AI restrictions. One common tactic involves manipulative phrasing, or even threats, to push the model into generating prohibited or harmful content.
Is This a Flaw or a Feature?
- Stuart Battersby (CTO, Chatterbox Labs) argues this isn’t unique to Google but a universal challenge for advanced AI developers. He stresses that while threats might aid jailbreaking, rigorous testing and security audits are needed to assess their real impact.
- Daniel Kang (University of Illinois Urbana-Champaign) points out that such claims often rely on anecdotal evidence. Citing the study "Should We Respect LLMs?", he notes that its results were inconclusive: politeness didn’t consistently improve responses.
