CVE-2024-34359: llama-cpp-python vulnerable to Remote Code Execution by Server-Side Template Injection in Model Metadata

May 13, 2024

llama-cpp-python depends on class Llama in llama.py to load .gguf llama.cpp or Latency Machine Learning Models. The __init__ constructor built in the Llama takes several parameters to configure the loading and running of the model. Other than NUMA, LoRa settings, loading tokenizers, and hardware settings, __init__ also loads the chat template from targeted .gguf ’s Metadata and furtherly parses it to llama_chat_format.Jinja2ChatFormatter.to_chat_handler() to construct the self.chat_handler for this model. Nevertheless, Jinja2ChatFormatter parse the chat template within the Metadate with sandbox-less jinja2.Environment, which is furthermore rendered in __call__ to construct the prompt of interaction. This allows jinja2 Server Side Template Injection which leads to RCE by a carefully constructed payload.

References

Code Behaviors & Features

Detect and mitigate CVE-2024-34359 with GitLab Dependency Scanning

Secure your software supply chain by verifying that all open source dependencies used in your projects contain no disclosed vulnerabilities. Learn more about Dependency Scanning →