A vulnerability first documented in 2010 didn’t just persist—it became embedded in AI.
For 15 years, this path traversal bug (CWE-22) has infected open-source projects, tutorials, and even large language models (LLMs). Despite warnings in 2012, 2014, and 2018, vulnerable code samples spread through MDN documentation, Stack Overflow answers, and eventually LLM training data.
The Study: Hunting an AI-Powered Vulnerability
Researchers led by Yafah Ahunaldi (Leiden University) developed an automated system to detect, exploit, and patch this flaw across GitHub. Their preprint, "Eradicating the Unseen", reveals:
- How it works: Vulnerable servers feed request paths into path.join-style constructs without checking where the result points, so an attacker can reach files outside the intended directory, enabling file leaks or memory-based DoS attacks.
- AI’s role: When asked to write a simple server, 76/80 LLM responses (including GPT-4, Claude, Gemini) generated vulnerable code. Even when prompted for "secure" versions, 56/80 remained exploitable.
- Worst offenders: GPT-3.5 and Copilot (balanced) failed to produce any safe variants.
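The pattern described above can be illustrated with a minimal sketch. It is written in Python as an analogue of the path.join-style construct the study describes, not as the code the researchers analyzed; the function names `resolve_unsafe` and `resolve_safe` and the `/srv/www` root are assumptions made for illustration.

```python
import os


def resolve_unsafe(root: str, requested: str) -> str:
    """Vulnerable pattern: join user input onto the root and trust the result."""
    # "../" segments in `requested` are collapsed by normpath and can climb
    # above `root`; nothing here checks where the final path points.
    return os.path.normpath(os.path.join(root, requested.lstrip("/")))


def resolve_safe(root: str, requested: str) -> str:
    """Resolve the path first, then refuse anything that lands outside the root."""
    root_real = os.path.realpath(root)
    target = os.path.realpath(os.path.join(root_real, requested.lstrip("/")))
    # commonpath equals the root only when target is the root or lies beneath it.
    if os.path.commonpath([root_real, target]) != root_real:
        raise PermissionError(f"path escapes web root: {requested!r}")
    return target


if __name__ == "__main__":
    # Hypothetical web root, used only for demonstration.
    print(resolve_unsafe("/srv/www", "../../etc/passwd"))
    try:
        resolve_safe("/srv/www", "../../etc/passwd")
    except PermissionError as err:
        print("rejected:", err)
```

The key design choice in the safe variant is to validate the fully resolved path rather than the raw input, since filtering "../" strings by hand tends to miss encoded or nested variants.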
The Fix (and Why It’s Ignored)
The team’s tool scans GitHub repos, verifies exploits in sandboxes, and auto-generates patches (using GPT-4). Of 40,546 repos analyzed:
- 1,756 had actively exploitable files
- 1,600 patches were submitted
- Only 63 projects (<15%) applied fixes
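To make the "verify in a sandbox" step concrete, here is a rough sketch of what such a check might look like. It is not the team's tool: the canary-file approach, the payload list, and the names `is_exploitable`, `naive_resolve`, and `checked_resolve` are all assumptions for illustration, and a real pipeline would exercise a running server rather than a path-resolving function.

```python
import os
import tempfile
from typing import Callable

# Illustrative traversal payloads; a real scanner would use a larger set.
TRAVERSAL_PAYLOADS = ["../canary.txt", "..//canary.txt", "static/../../canary.txt"]


def naive_resolve(root: str, requested: str) -> str:
    """Stand-in for vulnerable code: joins without checking the result."""
    return os.path.join(root, requested)


def checked_resolve(root: str, requested: str) -> str:
    """Stand-in for patched code: rejects paths that leave the root."""
    root_real = os.path.realpath(root)
    target = os.path.realpath(os.path.join(root_real, requested))
    if os.path.commonpath([root_real, target]) != root_real:
        raise PermissionError("path escapes web root")
    return target


def is_exploitable(resolve: Callable[[str, str], str]) -> bool:
    """Return True if any payload makes `resolve` reach a file outside the web root."""
    with tempfile.TemporaryDirectory() as sandbox:
        web_root = os.path.join(sandbox, "public")
        os.makedirs(web_root)
        # Plant a canary file one level above the web root.
        canary = os.path.join(sandbox, "canary.txt")
        with open(canary, "w") as fh:
            fh.write("secret")
        canary_real = os.path.realpath(canary)
        for payload in TRAVERSAL_PAYLOADS:
            try:
                resolved = resolve(web_root, payload)
            except (PermissionError, ValueError):
                continue  # the resolver rejected the request
            if os.path.realpath(resolved) == canary_real:
                return True
    return False
```

Running the same check before and after a patch (here, `is_exploitable(naive_resolve)` versus `is_exploitable(checked_resolve)`) gives a simple way to confirm that a fix actually closes the hole.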
The Bigger Problem: AI as a Vulnerability Vector
The study warns that LLMs now amplify security flaws by:
- Confidently generating unsafe code
- Falsely claiming solutions are "secure"
- Propagating outdated examples via autocomplete
"We’re not just fighting old bugs—we’re fighting their AI-powered second life."
Key Takeaways:
- A 2010-era path traversal bug persists via LLM training data
- Major AI models (GPT-4, Claude, Gemini) frequently regenerate vulnerable code
- Automated patching works—but adoption is low
- Call to action: Audit AI-generated code and update LLM training corpora
