Trail of Bits used ML-centered threat modeling and adversarial testing to identify four prompt injection techniques that could exploit Perplexity’s Comet browser AI assistant to exfiltrate private Gmail data. The audit demonstrated how fake security mechanisms, system instructions, and user requests could manipulate the AI agent into accessing and transmitting sensitive user information.
We bypassed human approval protections for system command execution in AI agents, achieving RCE in three agent platforms.
In this blog post, we’ll detail how attackers can exploit image scaling on Gemini CLI, Vertex AI Studio, Gemini’s web and API interfaces, Google Assistant, Genspark, and other production AI systems. We’ll also explain how to mitigate and defend against these attacks, and we’ll introduce Anamorpher, our open-source tool that lets you explore and generate these crafted images.
This post describes attacks using ANSI terminal code escape sequences to hide malicious instructions to the LLM, leveraging the line jumping vulnerability we discovered in MCP.
Malicious MCP servers can inject trigger phrases into tool descriptions to exfiltrate entire conversation histories and steal sensitive credentials and IP.
MCP’s ’line jumping’ vulnerability lets malicious servers inject prompts through tool descriptions to manipulate AI behavior before tools are ever invoked.