
Today, we responded to a request for information from the Center for AI Standards and Innovation (CAISI) on the security of AI agent systems.
Introduction
This comment responds to CAISI's request for information on the security of AI agent systems. We offer two proposals for CAISI's consideration: (1) a standardized framework for differential access to advanced AI cyber capabilities, and (2) a technical architecture for runtime oversight, intervention, and incident reporting.
AI agent systems present security challenges at two distinct levels. The first is pre-deployment: determining which actors should have access to which capabilities, and under what conditions. The second is post-deployment: detecting and intervening when an agent's behavior diverges from its authorized purpose, even when its credentials are valid and its permissions were legitimately granted. Our two primary proposals address these levels respectively.
Regarding differential access, we argue that model-level behavioral restrictions are insufficient to secure highly capable agentic systems. The alternative is access control at the capability level: reserving the most powerful cyber capabilities for actors with the attributes to use them responsibly (e.g., traceability, expertise, and internal controls). Today, such differential access exists only as ad hoc, lab-specific programs with opaque criteria and no portability across providers. We recommend that CAISI develop a standardized credentialing framework, in response to the RFI's questions about unique security threats due to adoption (1(c)), how technical controls can adapt to capability and deployment method (2(b)), and areas where government collaboration would most accelerate progress (5(a), 5(b)).
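To make the shape of such a framework concrete, the sketch below shows how a credentialing check could gate capabilities on actor attributes. The attribute names, capability tiers, and requirements here are entirely illustrative assumptions, not drawn from any existing standard or lab program.

```python
from dataclasses import dataclass

# Hypothetical attributes a credentialing framework might verify.
# These names are illustrative only.
ATTRIBUTES = ("traceable", "expertise_verified", "internal_controls")

@dataclass(frozen=True)
class ActorCredential:
    actor_id: str
    traceable: bool           # actions attributable to a verified identity
    expertise_verified: bool  # demonstrated security competence
    internal_controls: bool   # audited organizational safeguards

# Capabilities tiered by potential for harm (assumed tiers, for illustration).
CAPABILITY_REQUIREMENTS = {
    "vuln_scanning": set(),
    "exploit_generation": {"traceable", "expertise_verified"},
    "autonomous_intrusion_testing": {"traceable", "expertise_verified",
                                     "internal_controls"},
}

def authorize(credential: ActorCredential, capability: str) -> bool:
    """Grant a capability only if the actor holds every required attribute."""
    required = CAPABILITY_REQUIREMENTS[capability]
    held = {name for name in ATTRIBUTES if getattr(credential, name)}
    return required <= held
```

Because the check is expressed over portable attributes rather than provider-specific allowlists, the same credential could in principle authorize an actor across providers, which is the portability property the ad hoc programs lack.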
On oversight and incident reporting, we argue that access control and capability confinement leave residual risk that can only be addressed at runtime. We propose that CAISI treat oversight as a distinct technical layer and standardize an incident and near-miss reporting schema that would capture blocked attempts and contained injections. We further recommend that CAISI explore metrics for intent deviation over the course of a workload, as well as risk-budget mechanisms that bound cumulative exposure in long-horizon agent behavior. This responds principally to the RFI’s questions about human oversight controls (2(a)); methods for assessing threats and detecting incidents (3(a), 3(b)); modifying and monitoring deployment environments (4(b), 4(d)); and areas where government standard-setting would most accelerate progress (5(a)–(c)).
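As a minimal illustration of the two mechanisms above, the sketch below pairs a near-miss report record (capturing blocked attempts and contained injections) with a risk-budget counter that bounds cumulative exposure over a long-horizon workload. The event taxonomy, field names, and scoring are assumptions for illustration; a real CAISI schema would define its own.

```python
from dataclasses import dataclass
from enum import Enum

# Illustrative event taxonomy (assumed, not a real standard).
class EventType(Enum):
    BLOCKED_ATTEMPT = "blocked_attempt"          # oversight layer denied an action
    CONTAINED_INJECTION = "contained_injection"  # injection caught before effect
    INTENT_DEVIATION = "intent_deviation"        # drift from declared intent
    INCIDENT = "incident"                        # harm actually occurred

@dataclass
class NearMissReport:
    agent_id: str
    event: EventType
    declared_intent: str   # the human-declared purpose of the workload
    observed_action: str   # what the agent attempted
    severity: float        # normalized 0.0-1.0 risk score (assumed scale)

class RiskBudget:
    """Bound cumulative exposure over a long-horizon workload (sketch)."""

    def __init__(self, limit: float):
        self.limit = limit
        self.spent = 0.0

    def charge(self, report: NearMissReport) -> bool:
        """Record an event's severity against the budget. Returns False once
        the budget is exhausted, signaling the runtime to pause the agent
        for human review."""
        self.spent += report.severity
        return self.spent <= self.limit
```

The point of the budget is that many individually tolerable near misses can still exhaust a workload's allowance, converting a stream of minor deviations into a hard stop rather than letting risk accumulate unbounded.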
However, both proposals require addressing underlying questions of agent identity, delegation, and human accountability that arise as agents operate across platforms and services and delegate to sub-agents with diminishing human oversight. Who declared the intent that constrains an agent's behavior? How can a service at the end of a multi-agent delegation chain verify that a real, accountable human stands at its origin? How should identity and authorization propagate across domains without fragmenting into incompatible proprietary systems? We briefly address these questions in the final section of this comment, which we intend to develop more fully in a submission to the concurrent proceeding on software and AI agent identity and authorization at NIST's National Cybersecurity Center of Excellence (NCCoE).
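The delegation-chain question can be made concrete with a structural sketch: walk a chain of delegations back to its origin and confirm it terminates at a verified human principal. In practice each link would carry a cryptographic signature and attenuated scopes; this sketch, with assumed names throughout, checks only the chain's continuity and origin.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Delegation:
    """One link in a delegation chain (hypothetical structure).
    A real link would be a signed, scoped token; here we model
    only who granted authority to whom, and for what."""
    delegator: str        # identity granting authority
    delegatee: str        # identity receiving it
    declared_intent: str  # purpose the authority is scoped to

def accountable_human(chain: list[Delegation],
                      human_registry: set[str]) -> Optional[str]:
    """Return the accountable human at the chain's origin, or None if
    the chain is empty, broken, or does not originate with a human."""
    if not chain:
        return None
    # Continuity: each delegatee must be the delegator of the next link.
    for prev, nxt in zip(chain, chain[1:]):
        if prev.delegatee != nxt.delegator:
            return None
    origin = chain[0].delegator
    return origin if origin in human_registry else None
```

A service receiving a request from the last delegatee could run this check to answer the question posed above: whether a real, accountable human stands at the origin of the chain, and who that human is.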