Ollama Heap Out-of-bounds Read Vulnerability Leads to Remote Process Memory Leak (CVE-2026-7482)

Threat researchers have identified a critical-severity vulnerability in Ollama, tracked as CVE-2026-7482. Successful exploitation may allow a remote, unauthenticated attacker to leak the entire process memory.

Ollama is an open-source framework that lets users download and run large language models (LLMs) such as Llama 3, Mistral, and DeepSeek directly on their local machine. It simplifies the management of these models by packaging them into a single format, a “Modelfile,” that automatically handles configuration and dependencies.

Vulnerability Details

The heap out-of-bounds read vulnerability has a CVSS score of 9.9. The vulnerability exists in the GGUF model loader of Ollama. GGUF (GPT-Generated Unified Format) is a file format for storing large language models, enabling efficient local loading and execution. It serves a similar purpose to formats like PyTorch’s .pt/.pth files (Python pickle-based), safetensors, and ONNX.

The /api/create endpoint accepts attacker-supplied GGUF files whose declared tensor offsets and sizes exceed the file’s actual length. During quantization in fs/ggml/gguf.go and server/quantization.go (WriteTo()), these unchecked values cause the server to read beyond the allocated heap buffer.
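
The kind of check that prevents this class of bug can be illustrated with a minimal sketch. This is not Ollama's actual patch: TensorInfo, tensor_in_bounds, and the data_start parameter are hypothetical names chosen for illustration, and the layout is simplified to a single tensor data region that begins at a known byte offset. The idea is only that a loader should reject any tensor whose declared byte range ends past the end of the file before it reads anything.

```python
from dataclasses import dataclass


@dataclass
class TensorInfo:
    """Hypothetical, simplified view of one tensor entry in a model file header."""
    name: str
    offset: int  # byte offset relative to the start of the tensor data region
    size: int    # byte size claimed by the header


def tensor_in_bounds(t: TensorInfo, data_start: int, file_len: int) -> bool:
    """Return True only if the declared byte range lies inside the file."""
    if t.offset < 0 or t.size < 0:
        return False
    # Python integers do not overflow; in a fixed-width language this
    # addition would itself need an overflow check.
    end = data_start + t.offset + t.size
    return end <= file_len


# A 48-byte file whose tensor data region starts at byte 32.
honest = TensorInfo("blk.0.weight", offset=0, size=16)
inflated = TensorInfo("blk.0.weight", offset=0, size=4096)
print(tensor_in_bounds(honest, data_start=32, file_len=48))
print(tensor_in_bounds(inflated, data_start=32, file_len=48))
```

In this sketch the inflated entry is rejected because its declared range would end well past the last byte of the file.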

Leaked memory may expose sensitive data, including environment variables, API keys, system prompts, and conversation data from concurrent users. Attackers can exfiltrate this data by uploading the resulting model artifact via the unauthenticated /api/push endpoint to a registry they control.

Upstream distributions lack authentication for both /api/create and /api/push. While default deployments bind to 127.0.0.1, the documented OLLAMA_HOST=0.0.0.0 configuration is commonly used, leading to widespread public internet exposure.
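
To audit whether a given OLLAMA_HOST value keeps the server on the loopback interface, a rough sketch follows. It assumes the value is a bare host, a host:port pair, or a URL with a scheme, and that an empty value falls back to the 127.0.0.1 default described above. bind_host and is_loopback_bind are hypothetical helper names, not part of Ollama, and the parsing here may not match Ollama's own handling in every edge case.

```python
import ipaddress
from urllib.parse import urlsplit

DEFAULT_HOST = "127.0.0.1"  # default bind address, per the advisory text


def bind_host(value: str) -> str:
    """Extract the host part from an OLLAMA_HOST-style value (assumed formats)."""
    value = value.strip()
    if not value:
        return DEFAULT_HOST
    try:
        # Bare IP literal such as "0.0.0.0" or "::1" (no port, no scheme).
        return str(ipaddress.ip_address(value))
    except ValueError:
        pass
    if "://" not in value:
        value = "//" + value  # lets urlsplit treat the value as a network location
    return urlsplit(value).hostname or DEFAULT_HOST


def is_loopback_bind(value: str) -> bool:
    """True if the value binds only to loopback; unknown hostnames count as exposed."""
    host = bind_host(value)
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False


for candidate in ["", "127.0.0.1:11434", "0.0.0.0", "http://0.0.0.0:11434", "[::]:11434"]:
    status = "loopback only" if is_loopback_bind(candidate) else "potentially exposed"
    print(repr(candidate), status)
```

A loopback result does not rule out exposure through a reverse proxy or port forward, so this is a first-pass check rather than proof that an instance is unreachable.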

To summarize, the exploitation chain unfolds over three steps:

  1. Upload a crafted GGUF file with an inflated tensor shape to a network-accessible Ollama server using an HTTP POST request.
  2. Call the /api/create endpoint to start model creation, which triggers the out-of-bounds read.
  3. Use the /api/push endpoint to exfiltrate data from the heap memory to an external server.

Affected versions

The vulnerability affects Ollama versions before 0.17.1.
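
For inventory scripts, the version boundary can be expressed as a simple comparison. The sketch below assumes plain dotted numeric version strings with an optional leading "v"; parse_version and is_affected are hypothetical helper names. Any pre-release or build suffix is dropped, so a string such as 0.17.1-rc0 is treated the same as 0.17.1, which may not be what every environment wants.

```python
FIXED_VERSION = (0, 17, 1)  # first fixed release, per the advisory text


def parse_version(version: str) -> tuple:
    """Parse 'v0.17.0' or '0.17.0-rc1' into a tuple of integers, ignoring suffixes."""
    core = version.strip().lstrip("v")
    core = core.split("-", 1)[0].split("+", 1)[0]
    return tuple(int(part) for part in core.split("."))


def is_affected(version: str) -> bool:
    """True if the version sorts before the fixed release."""
    return parse_version(version) < FIXED_VERSION


for v in ["0.9.6", "v0.17.0", "0.17.1", "0.18.0"]:
    print(v, "affected" if is_affected(v) else "not affected")
```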

Mitigations

Users should upgrade to Ollama version 0.17.1 or later to patch the vulnerability.

For more information, please refer to the GitHub Security Advisory.

Qualys Detection

Qualys customers can scan their devices with QIDs 734196 and 5012259 to detect vulnerable assets.

QID 5012259 is currently available via the SwCA capabilities for Container Security.

Please continue to follow Qualys Threat Protection for more coverage on the latest vulnerabilities.

References
https://github.com/advisories/GHSA-x8qc-fggm-mpqg