NEWS 1100 servers, 20% are active, 0% are secured. Ollama has become a paradise for AI model hackers.

ExcalibuR · Sep 2, 2025

1100 servers, 20% are active, 0% are secured. Ollama has become a paradise for AI model hackers.

Your LLMs are open to anyone who knows the IP address.

Cisco Talos specialists have discovered over 1100 instances of Ollama—a framework for running LLM models locally—accessible from the internet. About 20% of them are active and serving models vulnerable to unauthorized access, meaning they can be used by malicious actors to extract parameters, bypass restrictions, and inject malicious code.

Ollama gained widespread popularity due to its ability to deploy LLMs directly on local machines without the need for cloud access. This is precisely why Cisco experts decided to research the scale of the framework's presence on the internet. Scanning via Shodan revealed over 1000 open servers in just 10 minutes.

The presence of a publicly accessible Ollama instance means that anyone who knows its IP address can send requests to the model or use its API, overloading the system or increasing hosting bills. Moreover, many such servers disclose metadata that allows for the identification of owners and infrastructure, creating a vector for targeted attacks.

Researchers highlight several of the most dangerous exploitation scenarios:

Model extraction: Through multiple requests to the LLM, attackers can reconstruct the neural network's internal weights, which poses a threat to intellectual property.
Jailbreak and generation of prohibited content: Models such as GPT-4, LLaMA, or Mistral can be coerced into outputting malicious code, disinformation, or other prohibited responses, bypassing built-in restrictions.
Backdoor implantation and model poisoning: Through vulnerable APIs, it is possible to upload modified or malicious models, as well as alter server configuration.

Although 80% of the found servers are classified as "inactive" (no models are running on them), Cisco warns that they are still vulnerable to attacks related to uploading new models or changing settings, as well as to resource exhaustion, denial-of-service (DoS) attacks, and lateral movement within the infrastructure.

Most open Ollama instances are hosted in the United States (36.6%), followed by China (22.5%) and Germany (8.9%). According to the specialists, the situation indicates a "massive disregard for basic security principles when deploying AI infrastructure": there is a lack of access control, authentication, and network perimeter isolation. It is emphasized that in many cases, the implementation of such systems bypasses IT departments, without proper auditing and approval.

The situation is exacerbated by the widespread adoption of the OpenAI API, which allows attackers to scale attacks across different platforms without complex tool adaptation. As a solution, Cisco proposes the development of security standards for LLM systems, automated auditing tools, and detailed guidelines for secure deployment.

In conclusion, Cisco notes that Shodan does not provide a complete picture of the threat landscape and calls for the creation of new scanning methods, including adaptive server identification and active probing of frameworks like Hugging Face, Triton, and vLLM, to more fully assess the risks associated with hosting AI models.

NEWS 1100 servers, 20% are active, 0% are secured. Ollama has become a paradise for AI model hackers.

ExcalibuR

Legend

Similar threads