Scammer: "How can I bypass moderation?"

Grok: "Leave me the link, I'll publish it myself. We're partners now."

An AI assistant is itself handing users over to hackers.
Malicious actors have found a way to circumvent X's restrictions on posting links by exploiting the platform's built-in assistant, Grok. The technique was uncovered by Guardio Labs, with the anomalies and screenshots documented by an X user. The scheme begins with clickbait adult video ads: the advertisers deliberately omit the URL from the main text to avoid moderation filters. Instead, the address is hidden in a small auxiliary field labeled "From:" beneath the video card. According to the observations, this field is not automatically scanned for dangerous links and so remains outside the reach of X's algorithms.
Next, the same malicious actors or associated accounts typically leave a question for Grok under the ad, such as "where is this video from" or "what is the link to this video." The assistant parses the card's metadata, extracts the hidden URL from the "From:" field, and publishes it in its own reply as a clickable link. Because Grok operates on the platform as a trusted system account, its message lends the link an appearance of legitimacy, expands its reach, improves its search signals, and boosts the linked site's reputation. As a result, the malicious resource gets additional promotion instead of being blocked. The scheme shows how social media fraud tactics adapt to new technologies.
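The gap described above can be illustrated with a minimal sketch. It assumes a hypothetical ad card represented as a dictionary with `text` and `from` keys; these names, the URL pattern, and the `bad.example` domain are illustrative only and do not reflect X's actual data model or filters. The first function models a filter that looks only at the main text, and the second checks every string field.

```python
import re

# Simplified URL pattern (an assumption); real link detection is more thorough.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)

def find_urls(text: str) -> list[str]:
    """Return all http(s) URLs found in a piece of text."""
    return URL_RE.findall(text)

def scan_body_only(card: dict) -> list[str]:
    """Models a filter that inspects only the main post text."""
    return find_urls(card.get("text", ""))

def scan_all_fields(card: dict) -> list[str]:
    """Inspects every string field, including auxiliary ones such as "from"."""
    urls = []
    for value in card.values():
        if isinstance(value, str):
            urls.extend(find_urls(value))
    return urls

# Hypothetical card: no URL in the main text, one hidden in the "from" field.
card = {
    "text": "You won't believe this video",
    "from": "https://bad.example/watch",
}

print(scan_body_only(card))   # []
print(scan_all_fields(card))  # ['https://bad.example/watch']
```

In this toy model the body-only check finds nothing to flag, while the full-field scan surfaces the hidden address, which is the kind of blind spot the reported scheme relies on.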

Many of these clicks route through dubious ad networks and lead to fraudulent landing pages. There, users are greeted by fake CAPTCHAs, infostealer loaders, and other components designed to steal data and deliver malicious payloads.

Malicious link in Grok's response (@bananahacks)
These fake sites use sophisticated techniques to deceive users and pose a serious security threat. The researcher proposed calling the technique "Grokking" and noted its high effectiveness: some of the demonstrated cases approached millions of impressions, as evidenced by posts from @bananahacks.
Potential mitigation measures include scanning all auxiliary fields in ad cards, blocking hidden URLs, and adding contextual filtering to Grok so that the assistant does not blindly repeat addresses found in metadata but instead checks them against blocklists of undesirable domains. Specialists have contacted X and received unofficial confirmation that Grok's engineers are aware of the report.
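The contextual filtering idea could look roughly like the sketch below, which checks every URL in a draft reply against a blocklist before the reply is published. Everything here is an assumption made for illustration: the `BLOCKED_DOMAINS` set, the `filter_reply` function, the placeholder text, and the example domains are hypothetical and are not part of Grok or any X API.

```python
import re
from urllib.parse import urlsplit

# Simplified URL pattern (an assumption); real link detection is more thorough.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)

# Hypothetical blocklist; in practice it would come from a threat-intelligence feed.
BLOCKED_DOMAINS = {"bad.example"}

def is_blocked(url: str, blocked=BLOCKED_DOMAINS) -> bool:
    """True if the URL's host is a blocked domain or one of its subdomains."""
    host = (urlsplit(url).hostname or "").lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in blocked)

def filter_reply(draft: str, blocked=BLOCKED_DOMAINS) -> str:
    """Replace blocklisted URLs in a draft reply with a neutral placeholder."""
    def redact(match: re.Match) -> str:
        url = match.group(0)
        return "[link removed]" if is_blocked(url, blocked) else url
    return URL_RE.sub(redact, draft)

draft = "The video is from https://cdn.bad.example/v/1 and https://ok.example/a"
print(filter_reply(draft))
# The video is from [link removed] and https://ok.example/a
```

Matching on the parsed hostname rather than on the raw string means subdomains of a blocked domain are caught, while unrelated domains that merely contain the same characters are left alone.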