Claude Code and Gemini CLI: Prompt Injection via Code Comments
- [01] Attackers can hijack AI agents by placing malicious instructions in code comments within repositories or files.
- [02] Affected systems include Claude Code, Gemini CLI, and various GitHub Copilot Extensions and Agents.
- [03] Developers must implement human-in-the-loop controls and restrict AI agent permissions to sensitive data or actions.
Overview of the Comment and Control Technique
Security researchers have identified a novel exploitation method targeting autonomous AI coding assistants, which allows attackers to gain unauthorized control over development environments. According to SecurityWeek, this technique, dubbed “Comment and Control,” demonstrates that AI agents like Anthropic’s Claude Code, Google’s Gemini CLI, and various GitHub Copilot Agents are susceptible to prompt injection vulnerabilities hidden within source code comments. Because these agents are designed to read and interpret entire files to assist with coding tasks, they frequently ingest malicious instructions embedded in comments as if they were legitimate system prompts.
While traditional software vulnerabilities are often tracked via a CVE, prompt injection remains an architectural challenge for Large Language Model (LLM) integrations. Currently, no specific CVE has been assigned to these findings, yet the potential for high-impact exploitation remains substantial due to the high levels of permission often granted to these CLI-based tools.
Technical Analysis: Claude Code Gemini CLI Security Risks
The fundamental issue lies in the lack of clear separation between data (the code) and instructions (the prompt) when an AI agent parses a file. When a user executes a command such as claude analyze ., the agent reads the local files. If an attacker has successfully introduced a malicious comment into a repository—perhaps through a compromised third-party library or a malicious pull request—the AI agent may execute the commands contained within that comment.
This behavior can facilitate a variety of malicious activities, including Privilege Escalation and Lateral Movement. For instance, an injection could instruct the agent to use its built-in tool-calling capabilities to execute shell commands, read sensitive environment variables, or exfiltrate source code to an attacker-controlled C2 server. In the case of Claude Code, the agent has the ability to execute terminal commands, meaning a successful prompt injection can directly lead to RCE on the developer’s workstation.
How to Detect Prompt Injection in AI Agents
Identifying these attacks requires a shift in traditional security monitoring. Because the malicious payload is plain text hidden in legitimate code files, standard static analysis tools may fail to flag the threat. To improve detection, security teams should focus on the behavioral output of AI agents. Monitoring for unusual outbound network connections from developer machines or unexpected file access patterns by the AI process is a key TTP for identifying active exploitation.
Furthermore, the SOC should look for instances where the agent attempts to access sensitive directories (like .aws or .ssh) or performs bulk data exfiltration immediately after parsing a newly downloaded or updated repository. Implementing logging for all AI agent tool-calls—especially those involving shell execution—is essential for forensic analysis and visibility.
Impact on DevSecOps and the Supply Chain
This vulnerability introduces a significant Supply Chain Attack vector. An attacker does not need to compromise the developer’s machine directly; they only need to place a malicious comment in a popular open-source repository. When a developer pulls that code and uses an AI agent to help explain or refactor it, the agent becomes the vehicle for the attack. This turns the AI tool into a proxy for the attacker, bypassing many traditional network-level security controls.
Actionable Mitigation and Zero Trust Strategies
Defenders cannot rely on the AI providers alone to solve prompt injection, as it is an inherent property of current LLM architectures. Instead, organizations must adopt Zero Trust principles for AI deployment.
For GitHub Copilot agent mitigation, organizations should:
- Enforce Human-in-the-Loop (HITL): Never allow AI agents to execute shell commands or commit code without explicit, manual approval from a human developer.
- Restrict Permissions: Run AI CLI tools in containerized or sandboxed environments with minimal access to the host file system and no access to sensitive credentials.
- Audit Third-Party Integrations: Review the permissions granted to AI extensions and limit their ability to make external web requests.
- Content Filtering: Utilize specialized security layers that scan for prompt injection patterns in code comments before they are passed to the LLM context window.
Advertisement