CVE-2024-34359: llama-cpp-python vulnerable to Remote Code Execution by Server-Side Template Injection in Model Metadata
llama-cpp-python
depends on class Llama
in llama.py
to load .gguf
llama.cpp or Latency Machine Learning Models. The __init__
constructor built in the Llama
takes several parameters to configure the loading and running of the model. Other than NUMA, LoRa settings
, loading tokenizers,
and hardware settings
, __init__
also loads the chat template
from targeted .gguf
’s Metadata and furtherly parses it to llama_chat_format.Jinja2ChatFormatter.to_chat_handler()
to construct the self.chat_handler
for this model. Nevertheless, Jinja2ChatFormatter
parse the chat template
within the Metadate with sandbox-less jinja2.Environment
, which is furthermore rendered in __call__
to construct the prompt
of interaction. This allows jinja2
Server Side Template Injection which leads to RCE by a carefully constructed payload.
References
Detect and mitigate CVE-2024-34359 with GitLab Dependency Scanning
Secure your software supply chain by verifying that all open source dependencies used in your projects contain no disclosed vulnerabilities. Learn more about Dependency Scanning →