Summary
An integer overflow vulnerability in the `ggml_nbytes` function allows an attacker to bypass memory validation by crafting a GGUF file with specific tensor dimensions. The overflow causes `ggml_nbytes` to return a far smaller size than the tensor actually requires (e.g., ~4 MB instead of exabytes), leading to a heap-based buffer overflow when the application subsequently processes the tensor. This allows potential Remote Code Execution (RCE) via memory corruption.
Details
The vulnerability exists in `ggml/src/ggml.c` within the `ggml_nbytes` function, which calculates the storage size of a tensor:
```c
// ggml/src/ggml.c
size_t ggml_nbytes(const struct ggml_tensor * tensor) {
    // ...
    size_t nbytes;
    const size_t blck_size = ggml_blck_size(tensor->type);
    if (blck_size == 1) {
        nbytes = ggml_type_size(tensor->type);
        for (int i = 0; i < GGML_MAX_DIMS; ++i) {
            nbytes += (tensor->ne[i] - 1)*tensor->nb[i]; // <--- VULNERABILITY
        }
    }
    // ...
    return nbytes;
}
```
The multiplication `(tensor->ne[i] - 1) * tensor->nb[i]` mixes the signed `int64_t` element count with the unsigned `size_t` stride; the 64-bit product can silently wrap modulo `2^64`, so `nbytes` can end up far smaller than the tensor's true size.
Additionally, the GGUF loader in `gguf.cpp` does not adequately validate that the claimed byte size matches the logical dimensions, nor does it check whether the total logical size exceeds `SIZE_MAX` bytes; it only checks that the element count fits in `INT64_MAX`.
PoC
To reproduce this vulnerability, create a GGUF file containing a tensor with the following properties:
- Type: `GGML_TYPE_F32` (4 bytes per element)
- Dimensions: `ne = [1024, 1024, 4398046511105, 1]`
- Note: `4398046511105` is `2^42 + 1`.
Calculation Breakdown:
- Strides (`nb`) are calculated naturally:
  - `nb[0] = 4`
  - `nb[1] = 4 * 1024 = 4096`
  - `nb[2] = 4096 * 1024 = 4,194,304 (2^22)`
- `ggml_nbytes` loop, `i=2` term: `(ne[2] - 1) * nb[2] = (2^42) * (2^22) = 2^64 = 0` (due to overflow wrapping on 64-bit systems).
- Result: The calculated size (`nbytes`) reflects only the small dimensions (0 and 1), resulting in a ~4 MB allocation.
- Real Size: The tensor logically contains ~2^62 elements, requiring exabytes of memory.
- Crash: Accessing this tensor triggers a heap buffer overflow.
Impact
- Vulnerability Type: Heap-based Buffer Overflow
- Impact: Remote Code Execution (RCE), Denial of Service (DoS).
- Affected Components: Applications using `llama.cpp` to load untrusted GGUF models.
- Attack Vector: An attacker provides a malicious GGUF file which, when loaded, triggers the overflow.
Suggested Fix
- Checked Arithmetic: Use checked arithmetic functions (e.g., `__builtin_mul_overflow`) in `ggml_nbytes` and in the stride calculations to detect overflows.
- Input Validation: Implement strict size validation in `gguf.cpp` when parsing tensor information. Verify that the total size in bytes (not just the element count) fits within addressable memory limits (e.g., `SIZE_MAX`) before accepting the tensor.
```c
// Example validation logic
if (total_elements > SIZE_MAX / ggml_type_size(type)) {
    return false; // the byte-size multiplication would overflow
}
```