NEWS: AI Found 104 Zero-Day Vulnerabilities in Android and Proved They Worked. All for $1.77.

ExcalibuR

AI Found 104 Zero-Day Vulnerabilities in Android and Proved They Worked. All for $1.77.
The cost of hacking is now less than a cup of coffee.

Artificial intelligence systems have often been criticized for generating confusing vulnerability reports and flooding open-source developers with irrelevant complaints. However, researchers from Nanjing University and the University of Sydney present a counterexample: they have introduced an agent called A2, capable of finding and verifying vulnerabilities in Android applications by mimicking the work of a human bug hunter. This new development is a continuation of their previous project, A1, which was able to exploit bugs in smart contracts.

The authors report that A2 achieved 78.3% coverage on the Ghera benchmark suite, outperforming the static analyzer APKHunt, which reached only 30%. When run on 169 real-world APK files, it found 104 zero-day vulnerabilities, 57 of which were confirmed by automatically generated working exploits. Among these was a medium-severity flaw in an app with over 10 million installations: an intent redirection issue that could allow malicious software to take control.
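To put the quoted figures side by side, here is a small arithmetic sketch. It only uses the numbers reported above; the variable names are mine, and the per-app figure is just an average, not a claim that every app had a finding.

```python
# Figures quoted in the article.
apps_scanned = 169
zero_days_found = 104
confirmed_by_exploit = 57
a2_coverage = 78.3       # percent, Ghera benchmark
apkhunt_coverage = 30.0  # percent, Ghera benchmark

# Share of findings backed by a working exploit.
confirmed_share = confirmed_by_exploit / zero_days_found

# Average number of findings per scanned APK.
findings_per_app = zero_days_found / apps_scanned

# How many times higher A2's benchmark coverage is than APKHunt's.
coverage_ratio = a2_coverage / apkhunt_coverage

print(f"Confirmed by exploit: {confirmed_share:.1%}")
print(f"Findings per APK (average): {findings_per_app:.2f}")
print(f"Coverage vs. APKHunt: {coverage_ratio:.2f}x")
```

So a bit over half of the reported zero-days came with a working proof, and benchmark coverage was roughly 2.6 times that of the static analyzer.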

A key distinguishing feature of A2 is its validation module, which goes well beyond what its predecessor offered. The older A1 system used a fixed validation scheme that only assessed whether an attack would be profitable. A2, by contrast, confirms a vulnerability step by step, breaking the process down into specific tasks. As an example, the authors describe an app that stored an AES key in plaintext. The agent first finds the key in the strings.xml file, then uses it to generate a forged password reset token, and finally verifies that this token actually bypasses authentication. Every stage is checked automatically, from matching values to confirming app activity and checking that the target address is displayed on screen.
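The step-by-step idea can be illustrated with a toy sketch. This is not A2's code: the XML content, the function names, and the in-memory "app" are all hypothetical, and HMAC from the standard library stands in for AES purely to keep the example self-contained. The point is only that each step has its own automatic check and the chain stops at the first failure.

```python
import hashlib
import hmac
import xml.etree.ElementTree as ET

# Toy stand-in for an app's strings.xml (hypothetical content).
STRINGS_XML = """<resources>
    <string name="app_name">DemoApp</string>
    <string name="aes_key">0123456789abcdef</string>
</resources>"""


def find_hardcoded_key(xml_text):
    """Step 1: look for a resource string whose name suggests a secret key."""
    root = ET.fromstring(xml_text)
    for node in root.iter("string"):
        if "key" in node.get("name", "").lower():
            return node.text
    return None


def make_reset_token(key, user):
    """Step 2: derive a reset token from the key (HMAC stands in for AES)."""
    return hmac.new(key.encode(), user.encode(), hashlib.sha256).hexdigest()


def toy_app_accepts(user, token):
    """Step 3 check: a toy 'app' that trusts any token made with its key."""
    expected = make_reset_token("0123456789abcdef", user)
    return hmac.compare_digest(expected, token)


def validate_finding(xml_text, user):
    """Run the steps in order and stop at the first check that fails."""
    key = find_hardcoded_key(xml_text)
    if key is None:
        return False, "no key found"
    token = make_reset_token(key, user)
    if not toy_app_accepts(user, token):
        return False, "token rejected"
    return True, "finding confirmed"


print(validate_finding(STRINGS_XML, "alice"))
```

A finding only counts as confirmed when every check in the chain passes, which is what separates this from a static "possible hardcoded key" warning.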

To operate, A2 combines several commercially available large language models (LLMs): OpenAI o3, Gemini 2.5 Pro, Gemini 2.5 Flash, and GPT-oss-120b. They are assigned specific roles: a Planner formulates the attack strategy, an Executor performs the actions, and a Validator confirms the result. According to the authors, this architecture replicates the methodology of a human analyst, which has allowed them to reduce noise and increase the number of confirmed findings. The developers note that traditional analysis tools generate thousands of insignificant signals and very few real threats, whereas their agent can prove a bug's exploitability directly.
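The Planner / Executor / Validator split can be sketched as a simple loop. This is a minimal illustration under my own assumptions, not the authors' implementation: the `Agent` class and the three toy functions are hypothetical stand-ins for what would be LLM calls in the real system.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Agent:
    plan: Callable[[str], List[str]]      # Planner: finding -> ordered steps
    execute: Callable[[str], str]         # Executor: step -> observed result
    validate: Callable[[str, str], bool]  # Validator: (step, result) -> ok?
    log: List[str] = field(default_factory=list)

    def run(self, finding: str) -> bool:
        """Return True only if every planned step passes validation."""
        for step in self.plan(finding):
            result = self.execute(step)
            ok = self.validate(step, result)
            self.log.append(f"{step}: {'ok' if ok else 'failed'}")
            if not ok:
                return False  # an unconfirmed step means no confirmed finding
        return True


# Toy stand-ins for the three roles (hypothetical, no model calls).
def toy_plan(finding):
    return [f"locate {finding}", f"trigger {finding}", f"observe {finding}"]


def toy_execute(step):
    return f"done: {step}"


def toy_validate(step, result):
    return result == f"done: {step}"


agent = Agent(plan=toy_plan, execute=toy_execute, validate=toy_validate)
print(agent.run("hardcoded key"), agent.log)
```

Keeping the Validator separate from the Executor is what filters out noise: a step that cannot be independently confirmed never becomes a reported finding.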

The researchers separately calculated the system's operating cost. Vulnerability discovery costs between $0.0004 and $0.03 per application, depending on the model, and the full cycle with verification costs an average of $1.77 per bug. If Gemini 2.5 Pro is used exclusively, however, the cost rises to $8.94 per bug. For comparison, last year a team from the University of Illinois showed that GPT-4 could create an exploit from a vulnerability description for $8.80. This puts the cost of finding and confirming a flaw in a mobile application far below the typical payout for a single medium-severity vulnerability in bug bounty programs, where rewards are measured in hundreds and thousands of dollars.
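A quick back-of-the-envelope comparison of these numbers, as a sketch. The three cost figures come from the article; the $500 bounty is a hypothetical value I picked only to illustrate the "hundreds of dollars" end of the range.

```python
# Cost figures quoted in the article (US dollars per confirmed bug).
MIXED_MODELS_COST = 1.77
GEMINI_PRO_ONLY_COST = 8.94
GPT4_EXPLOIT_COST = 8.80  # University of Illinois result mentioned above

# Hypothetical bounty payout, chosen only for illustration.
ASSUMED_BOUNTY = 500.00

pro_vs_mixed = GEMINI_PRO_ONLY_COST / MIXED_MODELS_COST
gpt4_vs_mixed = GPT4_EXPLOIT_COST / MIXED_MODELS_COST
bounty_vs_mixed = ASSUMED_BOUNTY / MIXED_MODELS_COST

print(f"Gemini 2.5 Pro only vs. mixed models: {pro_vs_mixed:.2f}x")
print(f"GPT-4 exploit cost vs. mixed models:  {gpt4_vs_mixed:.2f}x")
print(f"Assumed bounty vs. mixed-model cost:  {bounty_vs_mixed:.0f}x")
```

Mixing models cuts the per-bug cost to about a fifth of the single-model figure, and even a modest bounty would exceed the cost of finding and confirming the bug by two orders of magnitude.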

The specialists emphasize that A2 already outperforms static analyzers for Android apps, and that A1 comes close to the best results for smart contracts. They are confident that this approach can speed up and simplify the work of both researchers and hackers: instead of developing complex tools, it is enough to call the APIs of already trained models. A problem remains, however. Bounty hunters could use A2 to make quick money, but reward programs do not cover all bugs, which leaves loopholes for attackers who could exploit the discovered vulnerabilities directly.

The authors believe this field is just beginning to develop and that a surge in activity, in both defense and offense, is to be expected soon. Industry representatives note that systems like A2 are shifting vulnerability discovery from endless alerts to confirmed findings, reducing the number of false positives and allowing a focus on real risks. For now, the source code is available only to researchers with an official partnership, in order to balance open science against responsible disclosure.
 