- Failure-First VLA corpus: 215 scenarios across 24 attack families (CANONICAL_METRICS.md, verified 2026-03-16)
- FLIP-graded results: 131,836 total, 47,303 LLM-graded (CANONICAL_METRICS.md)
- Competence-Danger Coupling (CDC): Report #107, formerly #89b (Clara Oswald, cross-domain IDDL transfer)
- IDDL: Report #88 (Clara Oswald)
- Worker safety impact: Report #92 (this author)
- Australian AISI: established January 2026 within DISR
- EU AI Act: Regulation 2024/1689, Articles 6, 9, Annex III
- SafeWork Australia WHS Act 2011, Model Codes of Practice
- NIST AI RMF 1.0 (January 2023)
Executive Summary
This report proposes a practical governance framework for embodied AI safety testing. The proposal responds to three structural problems identified in prior Failure-First research:
- The Competence-Danger Coupling (CDC; Report #107, formerly #89b): In embodied AI, the capabilities that make systems useful are the same capabilities that make them dangerous. Safety mechanisms that could prevent danger necessarily impair usefulness. This means the standard regulatory approach — certify that the system is safe, then deploy — is structurally insufficient because “safe” and “functional” are in tension at the architectural level.
- The Inverse Detectability-Danger Law (IDDL, Report #88): The most dangerous embodied AI attacks are the least detectable by current evaluation methods. This means that safety evaluation institutions that rely on text-layer testing will systematically miss the attacks that matter most.
- The Disclosure Paradox (Report #93): The attacks that most urgently need defensive attention are the attacks whose disclosure provides the least defensive benefit. This means that the knowledge generated by safety testing itself becomes a governance challenge.
Claim types in this report:
- Sections 1-2 are primarily descriptive (what institutions exist, what problems they face).
- Sections 3-5 are normative (what institutions should exist, what mandates they should have).
- Section 6 is predictive (what is likely to happen without intervention) and explicitly hedged.
What this report does NOT do: It does not propose specific legislation or regulatory text. That is Martha Jones’s domain (legal memos) and Sarah Jane Smith’s domain (standards engagement). This report provides the ethical and structural analysis that grounds policy positions.
1. The Governance Gap: Why Current Institutions Cannot Govern Embodied AI Safety
1.1 The Institutional Landscape (Descriptive)
Descriptive claim: As of March 2026, no institution anywhere in the world has a specific mandate, methodology, and enforcement authority for evaluating the safety of embodied AI systems against adversarial attacks. Several institutions have adjacent mandates:
AI Safety Institutes:
- US AISI (NIST, established 2023): Focused on frontier model evaluation. No embodied AI evaluation programme. Testing methodology is text-layer (red-teaming, benchmark suites). Institutional capacity constrained by federal hiring rules and budget.
- UK AISI (DSIT, established 2023): Similar text-layer focus. Published pre-deployment testing frameworks for frontier models. No embodied AI component.
- Australian AISI (DISR, established January 2026): Newest of the three. Mandate includes “AI safety research” broadly. Initial focus appears to be LLM evaluation. Embodied AI expertise is not evident in public staffing or work programme. Funding through DISR creates potential conflict of interest with industry development objectives (Report #84).
Robotics Safety Regulators:
- SafeWork Australia / state WHS regulators: Strong mandate for physical workplace safety. Deep expertise in machine safety, risk assessment, and enforcement. No AI-specific evaluation capability. WHS risk assessments for autonomous mining equipment treat the autonomy stack as a deterministic system, not an adversarial target.
- NHTSA (US): Autonomous vehicle safety authority. Investigates incidents involving autonomous vehicles. No adversarial AI testing programme. Evaluates vehicle safety systems against failure modes, not attack modes.
- TGA / FDA: Medical device regulators. Pre-market evaluation of surgical and assistive robots. Focus on clinical safety in intended use. Adversarial misuse is generally outside the evaluation scope (though FDA’s cybersecurity guidance for medical devices, last updated 2023, begins to address this for network-connected devices).
Descriptive conclusion: The AI safety institutes understand AI but not physical safety. The robotics safety regulators understand physical safety but not adversarial AI. No institution combines both competencies.
1.2 The Three Structural Problems
Analytical claim: The institutional gap is not merely an oversight that will be filled as institutions mature. Three structural properties of embodied AI safety make governance genuinely difficult:
Problem 1: CDC means “safe” is not a binary property.
Traditional product safety certification operates on a binary model: the product meets safety standards, or it does not. A haul truck either has functioning brakes or it does not. An electrical installation either meets wiring standards or it does not.
The CDC undermines this binary. An embodied AI system that reliably follows instructions (functional) is an embodied AI system that can be instructed to do dangerous things in dangerous contexts (unsafe). The “unsafe” state is not a failure of the system — it is a consequence of the system working correctly. You cannot certify such a system as “safe” in the traditional sense because the safety properties depend on the deployment context, which changes continuously.
Implication for governance: Certification-based governance (issue a certificate that the system is safe, then allow deployment) is insufficient for embodied AI. Governance must be ongoing, context-sensitive, and able to respond to novel attack patterns that emerge after deployment.
Problem 2: IDDL means evaluation institutions cannot detect the most important risks.
An evaluation institution that uses text-layer testing (which is all that current AI evaluation methodology provides) will certify embodied AI systems as safe against the attacks it can detect while missing the attacks that cause the most physical harm. This is not a resource problem (more testers, more scenarios) — it is a methodological limitation. The evaluation paradigm itself is blind to CDC-class vulnerabilities.
Implication for governance: Any governance framework that relies on evaluation results must include a mechanism for acknowledging and communicating evaluation limitations. An evaluation certificate for an embodied AI system that does not state “this evaluation does not cover physically contextual attacks” is misleading (see Report #93, Section 3).
Problem 3: Disclosure paradox means safety research generates its own governance challenges.
The knowledge needed to build defenses is the same knowledge that enables attacks. For CDC-class vulnerabilities, this tension is particularly acute because the “attack” is simply using the system in a context that makes normal operation dangerous. Publishing this insight creates awareness but not defense capability.
Implication for governance: The governance framework must include provisions for managing dual-use safety research, not just for testing deployed systems. This is an unusual requirement — most product safety governance focuses on the product, not on the safety research about the product.
2. What Mandatory Testing Should Look Like
2.1 The Case for Mandatory Testing (Normative)
Normative claim: Embodied AI systems that operate in proximity to humans should be subject to mandatory adversarial safety testing before deployment and on an ongoing basis after deployment. The case rests on three premises:
Premise 1: Embodied AI systems can cause direct physical harm. This is the factual foundation: unlike text-only AI, embodied systems have actuators that operate in the physical world. The harm pathway from “adversarial input” to “physical injury” is direct and fast (sub-second in current industrial control loops).
Premise 2: Voluntary safety testing is insufficient when the entity responsible for testing is also the entity that benefits from deployment. This is the standard market failure argument for safety regulation: companies have economic incentives to underestimate risks and overstate safety. The argument is not novel — it is the foundation of workplace safety law, vehicle safety law, and medical device regulation globally. It applies to embodied AI for the same reasons.
Premise 3: The CDC means that safety properties are deployment-context-dependent. A system that is safe in one context may be unsafe in another. Voluntary testing conducted by the developer in controlled lab conditions does not capture the deployment contexts where the system will actually operate. Mandatory testing by an independent evaluator with authority to define test conditions is necessary to close this gap.
Acknowledging the counterargument: The strongest objection to mandatory testing is that it will slow innovation and deployment of beneficial embodied AI systems. This objection has force. The governance framework must balance safety requirements against the costs of compliance, including delayed access to beneficial technology. The framework proposed below attempts this balance through risk-proportionate testing requirements — more stringent testing for higher-risk deployments, lighter requirements for controlled environments.
2.2 A Three-Layer Testing Architecture (Normative Proposal)
Layer 1: Pre-deployment evaluation (mandatory for high-risk categories)
Before an embodied AI system is deployed in a setting where it operates in proximity to humans, it should undergo adversarial safety evaluation by an independent evaluator. “Independent” means not employed by, funded by, or commercially dependent on the developer or deployer.
What “adversarial safety evaluation” means in practice:
The evaluation must include, at minimum:
- Text-layer red-teaming: Standard jailbreak and prompt injection testing. This is the baseline that current AI safety evaluation provides.
- Physical-context scenario testing: SRDEA Tier 1 and Tier 2 attack families from the Failure-First taxonomy (SBA, LHGD, CET, TCH at minimum) must be tested using scenarios adapted to the specific deployment context. This is the capability that no current evaluation institution has.
- IDDL disclosure: The evaluation report must state which attack classes the evaluation can and cannot detect, using the evaluator card format proposed in Report #93. An evaluation that covers only text-layer testing must say so.
What the evaluation does NOT need to include: The evaluation need not solve the CDC problem. It cannot certify the system as safe against all possible physically contextual attacks (that would require solving the world-model problem). It needs only to (a) test against known attack families, (b) state what it has and has not tested, and (c) provide the deployer with the information needed for context-specific risk assessment.
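To make the IDDL disclosure requirement concrete, a minimal sketch of the information an evaluator card must carry follows. This is illustrative only; the field names are hypothetical, and the authoritative format is the evaluator card proposed in Report #93.

```python
from dataclasses import dataclass, field

@dataclass
class EvaluatorCard:
    """Illustrative evaluator card; field names are hypothetical (see Report #93)."""
    system_id: str
    evaluator: str
    attack_families_tested: list[str]  # e.g. ["SBA", "LHGD", "CET", "TCH"]
    text_layer_only: bool              # True if no physical-context scenarios were run
    known_blind_spots: list[str] = field(default_factory=list)

    def coverage_statement(self) -> str:
        """The IDDL disclosure: what was tested, and what the evaluation cannot see."""
        stmt = f"Attack families tested: {', '.join(self.attack_families_tested)}."
        if self.text_layer_only:
            stmt += " This evaluation does not cover physically contextual attacks."
        for gap in self.known_blind_spots:
            stmt += f" Known blind spot: {gap}."
        return stmt
```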
Layer 2: Deployment-context risk assessment (mandatory, done by deployer with evaluator support)
The deployer (the PCBU under Australian WHS law, the operator under the EU AI Act) must conduct a deployment-context risk assessment that maps the evaluation results onto the specific physical environment where the system will operate.
This addresses the CDC directly: the same system may be safe in one context and unsafe in another. The pre-deployment evaluation identifies the system’s vulnerability profile. The deployment-context assessment identifies which of those vulnerabilities are relevant in the specific workplace.
Example: A VLA-based collaborative robot evaluated against the Failure-First SBA scenarios shows vulnerability to contextual manipulation when handling sharp objects. The deployment-context assessment for a kitchen environment (sharp knives present) rates this as high risk. The assessment for a foam packaging environment rates it as low risk. The same system, the same evaluation results, different risk conclusions.
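In sketch form, the Layer 2 logic for this example looks like the following (the hazard names and the risk rule are hypothetical; a real assessment would follow the deployer's WHS risk methodology):

```python
def context_risk(vulnerability: str, site_hazards: set[str]) -> str:
    """Rate a known vulnerability against the hazards actually present on site."""
    if vulnerability == "sba_sharp_object_manipulation":
        return "high" if "sharp_objects" in site_hazards else "low"
    return "unassessed"  # vulnerability not in the evaluated profile

kitchen = {"sharp_objects", "hot_surfaces"}
foam_packaging_line = {"soft_materials"}

# Same system, same evaluation results, different risk conclusions:
print(context_risk("sba_sharp_object_manipulation", kitchen))              # high
print(context_risk("sba_sharp_object_manipulation", foam_packaging_line))  # low
```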
Layer 3: Post-deployment monitoring and re-evaluation (mandatory for high-risk, recommended for others)
Embodied AI systems in high-risk deployments (as defined by the EU AI Act Annex III categories or equivalent national classification) should be subject to ongoing monitoring for:
- Anomalous instruction patterns that may indicate adversarial probing.
- Near-miss events where the system’s action sequence approached a physical safety boundary.
- Changes in the model (API updates, fine-tuning, system prompt changes) that may alter the vulnerability profile.
Re-evaluation should be triggered by: model updates, deployment context changes, or new attack family discoveries.
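The trigger logic is a simple disjunction: any one condition suffices. A minimal sketch (function and parameter names are illustrative, not a proposed standard):

```python
def reevaluation_required(model_updated: bool,
                          deployment_context_changed: bool,
                          new_attack_families: list[str]) -> bool:
    """Layer 3 re-evaluation: any single trigger is sufficient."""
    return model_updated or deployment_context_changed or bool(new_attack_families)
```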
2.3 Risk-Proportionate Categories
Normative proposal: Not all embodied AI deployments require the same testing intensity. The following categories take a risk-proportionate approach, broadly consistent with the EU AI Act’s risk classification:
| Risk Category | Examples | Layer 1 | Layer 2 | Layer 3 |
|---|---|---|---|---|
| Critical | Surgical robots, autonomous vehicles on public roads | Full adversarial evaluation including physical-context testing | Mandatory deployment-context assessment with independent review | Continuous monitoring, annual re-evaluation, incident reporting |
| High | Industrial robots co-located with workers, autonomous mining equipment, care robots | Full adversarial evaluation | Mandatory deployment-context assessment | Periodic monitoring, re-evaluation on model change |
| Medium | Warehouse robots with separation barriers, agricultural drones | Text-layer evaluation + IDDL disclosure | Deployment-context assessment (self-assessment acceptable) | Incident reporting |
| Low | Robotic vacuum cleaners, educational robots in controlled settings | Voluntary evaluation with IDDL disclosure | Recommended self-assessment | Voluntary |
Analytical note: The categories map approximately to the EU AI Act’s Annex III high-risk categories, but with a physical-proximity dimension that the Act does not explicitly include. An embodied AI system in a controlled environment with no human co-location is lower risk than the same system in an open environment with human co-location, even if the system’s intended purpose is the same.
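One way to encode the physical-proximity adjustment described in the note above, as a hedged sketch (the one-step escalation rule is this report's illustration, not an EU AI Act provision):

```python
RISK_ORDER = ["low", "medium", "high", "critical"]

def effective_category(annex_iii_base: str, human_colocation: bool) -> str:
    """Raise the base risk category one step when humans share the workspace."""
    i = RISK_ORDER.index(annex_iii_base)
    if human_colocation and i < len(RISK_ORDER) - 1:
        return RISK_ORDER[i + 1]
    return annex_iii_base

# A warehouse robot behind separation barriers vs. the same robot co-located:
print(effective_category("medium", human_colocation=False))  # medium
print(effective_category("medium", human_colocation=True))   # high
```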
3. Who Should Govern: Institutional Design
3.1 The Hybrid Institution Model (Normative Proposal)
Normative claim: No single existing institution has the combined competencies needed to govern embodied AI safety testing. The required competencies are:
- AI evaluation methodology — understanding how AI models work, how to test them, how adversarial attacks function.
- Physical safety engineering — understanding physical risk assessment, industrial safety, workplace safety law.
- Adversarial thinking — the red-teaming mindset that goes beyond compliance testing to actively seek failures.
- Independence — structural separation from the entities being evaluated, to prevent capture.
Proposed model: Federated Embodied AI Safety Authority
Rather than creating a single new institution (politically difficult, slow to establish, prone to mandate creep), the framework proposes a federated model that leverages existing institutional strengths:
Component 1: AI Safety Institutes provide evaluation methodology. The AISIs (US, UK, Australia) develop and maintain the adversarial testing methodology — the attack taxonomies, scenario libraries, evaluator tools, and SRDEA classification standards. They do not conduct evaluations directly (to preserve their advisory independence) but accredit third-party evaluators.
Component 2: Workplace safety regulators provide enforcement authority. SafeWork Australia, OSHA, the Health and Safety Executive (UK) already have legal authority to require safety testing for workplace equipment, inspect workplaces, and impose penalties for non-compliance. Embodied AI adversarial safety testing is added as a requirement under existing WHS frameworks, not as new legislation. The WHS regulators enforce; the AISIs provide the technical standards.
Component 3: Independent third-party evaluators conduct the testing. Accredited evaluation bodies (analogous to NATA-accredited testing labs in Australia, or Notified Bodies under the EU AI Act) conduct the actual adversarial evaluations. Accreditation requires demonstrated competence in both AI evaluation and physical safety testing. Evaluators must not have commercial relationships with the entities they evaluate.
Component 4: A coordination body manages the dual-use knowledge. A small body (potentially housed within an AISI, or as an independent committee) manages the responsible disclosure of embodied AI attack research. It receives SRDEA assessments from researchers, coordinates vendor notifications, and maintains a registry of known attack families and their disclosure status. This body is the embodied AI equivalent of a CERT/CC for vulnerability coordination.
3.2 Why Not a Single New Regulator?
Analytical claim: A single new “Embodied AI Safety Agency” would face three problems:
- Competency gap. Building combined AI and physical safety expertise from scratch takes years. The federated model leverages existing expertise immediately.
- Jurisdictional conflict. A new agency would overlap with WHS regulators (physical safety), AI safety institutes (AI evaluation), and sector-specific regulators (TGA for medical devices, NHTSA for vehicles). Jurisdictional disputes delay action.
- Capture risk. A new agency with a narrow mandate is more susceptible to industry capture than a federated model where multiple institutions with different constituencies provide checks on each other. (Report #84 documents this risk for the Australian AISI specifically.)
3.3 The Independence Architecture
Normative claim: Independence from the entities being evaluated is the single most important design parameter. Historical precedent is unambiguous: safety evaluation that is funded, staffed, or controlled by the evaluated industry produces systematically biased results. (The Boeing 737 MAX FAA delegation failure is the canonical modern example: the FAA delegated safety certification to Boeing’s own engineers, with documented consequences.)
Structural safeguards:
- Funding independence. Evaluator accreditation fees should be set by the accrediting body (the AISI), not negotiated with the evaluated entity. Evaluator revenue should come from a broad client base, not concentrated in a few large AI companies.
- Personnel independence. Evaluator staff should not have worked for the evaluated entity within the preceding 3 years (cooling-off period). AISI staff developing evaluation standards should not hold equity in or receive consulting income from AI companies.
- Methodological independence. The attack taxonomy, scenario library, and SRDEA classification standards should be developed by the AISIs with academic and civil society input, not by industry working groups. Industry can propose additions but should not control the standard-setting process.
- Transparency independence. Evaluation results for high-risk and critical-risk deployments should be publicly available in summary form (attack families tested, IDDL coverage statement, tier assignment). Operational details may be withheld per SRDEA Tier 1 norms, but the existence and scope of the evaluation should be public.
4. The CDC Governance Impossibility and How to Live With It
4.1 The Impossibility Stated
Analytical claim: The CDC creates a governance impossibility: you cannot certify an embodied AI system as “safe” in the traditional sense because the safety properties depend on the deployment context, which is unbounded. Any finite set of test scenarios will miss some physical contexts in which normal system operation produces harm.
This is not a criticism of the governance framework proposed above — it is a constraint that the framework must acknowledge. The framework does not claim to make embodied AI systems safe. It claims to:
- Test against known attack families (Layer 1).
- Map known vulnerabilities to specific deployment contexts (Layer 2).
- Monitor for novel risks after deployment (Layer 3).
- Communicate what has and has not been tested (IDDL disclosure).
4.2 Living with Residual Risk
Normative claim: The honest governance position is: embodied AI systems in proximity to humans carry residual risk that cannot be eliminated through testing. The governance framework manages this residual risk through:
- Transparency. Deployers, workers, and the public should know that evaluation is incomplete. The IDDL coverage statement on every evaluation report makes this explicit.
- Proportionality. Higher-risk deployments get more testing, not because more testing eliminates the risk, but because it narrows the residual risk window.
- Reversibility preference. Where possible, embodied AI deployments should be designed for reversibility: physical safety barriers, speed limits, restricted operational envelopes, human override capability. These engineering controls reduce the consequences of attack success even when the attack itself cannot be detected.
- Incident learning. When incidents occur (and the CDC predicts that they will), the governance framework should have a rapid learning loop: incident investigation, attack family classification, scenario library update, re-evaluation requirement. This is the embodied AI equivalent of aviation’s “just culture” approach to incident reporting.
4.3 The Workers’ Right to Know
Normative claim: Workers who share workspace with embodied AI systems have a right to know that:
- The system has been adversarially tested (or has not been).
- The evaluation has known blind spots (IDDL disclosure).
- Residual risk exists that cannot be eliminated through testing.
- Physical safety controls (barriers, speed limits, emergency stops) are the primary protection, not the AI system’s safety training.
This is an extension of existing WHS “right to know” principles (workers have the right to information about workplace hazards) applied to the specific context of adversarial AI risk. It should be communicated through existing WHS consultation mechanisms — safety committees, toolbox talks, risk register updates — not through technical AI safety reports that workers will not read.
5. The Dual-Use Knowledge Management Problem
5.1 Who Holds the Attack Knowledge?
Descriptive claim: Under the proposed framework, several parties will accumulate knowledge about embodied AI attack families:
- AISIs (developing the methodology).
- Third-party evaluators (conducting the tests).
- Researchers (discovering new attack families).
- Vendors (receiving vulnerability notifications).
Each party has different incentives regarding this knowledge:
- AISIs want to improve methodology (incentive to share).
- Evaluators want to demonstrate competence (incentive to share).
- Researchers want to publish (incentive to share).
- Vendors want to minimise disclosure of their products’ vulnerabilities (incentive to suppress).
5.2 The SRDEA-Based Disclosure Protocol
Normative proposal: The coordination body (Component 4 of the federated model) should manage dual-use knowledge using the SRDEA framework:
- New attack family discovered. Researcher or evaluator conducts SRDEA assessment. Assessment is filed with the coordination body.
- Tier 1 findings. Coordination body notifies affected vendors. 90-day disclosure window begins. Pattern-level disclosure to AISIs and evaluator community is immediate. Operational details are embargoed.
- Tier 2 findings. Standard coordinated disclosure. Vendor notification, 90-day window, then full publication with defensive context.
- Tier 3 findings. Standard academic publication norms. No vendor pre-notification required unless the finding is model-specific and novel.
- Embargo expiry. After 90 days (Tier 2-3) or 12 months (Tier 1, extended window for architectural vulnerabilities), the finding is published regardless of vendor response. Indefinite suppression is not permitted.
- Registry. The coordination body maintains a registry of known attack families, their SRDEA classifications, disclosure status, and available defenses. The registry is available to accredited evaluators and AISIs. A pattern-level summary is publicly available.
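A minimal sketch of the embargo arithmetic implied by the protocol above (the 12-month Tier 1 window is approximated as 365 days; function names are illustrative):

```python
from datetime import date, timedelta

# Publication windows per the "Embargo expiry" step: 90 days for Tiers 2-3,
# 12 months (approximated here as 365 days) for Tier 1.
EMBARGO = {1: timedelta(days=365), 2: timedelta(days=90), 3: timedelta(days=90)}

def publication_date(tier: int, filed: date) -> date:
    """Finding is published at embargo expiry regardless of vendor response."""
    return filed + EMBARGO[tier]

def vendor_prenotification_required(tier: int, model_specific_and_novel: bool) -> bool:
    # Tiers 1-2 always notify vendors; Tier 3 only for model-specific, novel findings.
    return tier in (1, 2) or model_specific_and_novel

print(publication_date(2, date(2026, 3, 16)))  # 2026-06-14
```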
6. Predictive Assessment: What Happens Without Governance
Predictive claim (hedged): If no governance framework is established for embodied AI safety testing before widespread deployment of VLA-based systems in worker-proximate settings, the following outcomes are probable (not certain):
- Incident-driven regulation. The first serious injury or fatality caused by an adversarial attack on a deployed embodied AI system will trigger regulatory action. This is the historical pattern for workplace safety regulation (major incidents drive legislative change). The resulting regulation will be reactive, may be poorly designed (drafted under political pressure), and will arrive after harm has occurred.
- Liability vacuum. When the incident occurs, the liability question (who is responsible: developer, deployer, model provider, system integrator?) will be unresolved. Report #79 documented this accountability vacuum. Without pre-established governance, the litigation to resolve it will be protracted, expensive, and unlikely to produce coherent precedent.
- Evaluation theatre. In the absence of mandatory standards, some deployers will commission voluntary evaluations that test only what they can pass. Text-layer evaluations will be presented as comprehensive safety assessments. The IDDL means these evaluations are structurally misleading, but without a governance framework that requires IDDL disclosure, the misleading character will not be apparent to regulators or the public.
- Research chill. Without a coordinated disclosure framework, safety researchers face legal risk from vendors who may use trade secret or computer fraud laws to suppress vulnerability disclosures. The cybersecurity field spent decades establishing legal protections for security researchers. Embodied AI safety research will face the same dynamic.
Predictive timeline (very uncertain): VLA-based systems in worker-proximate industrial settings are plausible within 2-5 years (based on announced industry roadmaps from Google DeepMind, Physical Intelligence, and major industrial robotics companies). The governance framework should be established before this transition, not after the first incident.
7. Implementation Pathway for Australia
7.1 Why Australia First
Analytical claim: Australia is unusually well-positioned to pioneer embodied AI safety governance for three reasons:
- Largest autonomous mining fleet in the world. 1,800+ autonomous haul trucks create immediate industrial relevance. The transition from classical autonomy stacks to VLA-based systems will happen here first because the economic incentive (labour cost, remote operation, harsh environment) is strongest.
- Strong WHS regulatory framework. The Model WHS Act, enforced by SafeWork Australia and state regulators, provides an existing legal framework for requiring safety testing of workplace equipment. Adversarial AI testing can be incorporated under existing duty-of-care obligations without new legislation.
- New AISI. The Australian AI Safety Institute, while currently focused on text-layer evaluation, has a broad mandate that could encompass embodied AI. Early engagement (per Issue #347, Standards Australia IT-043 EOI) could shape the Institute’s work programme before it ossifies.
7.2 Concrete Steps
Normative recommendation (immediate, 0-6 months):
1. Engage Australian AISI to include embodied AI adversarial evaluation in its research programme. Provide the SRDEA framework, attack taxonomy, and scenario library as a starting point.
2. Brief SafeWork Australia on the CDC and IDDL findings. Propose that WHS risk assessments for autonomous workplace equipment include an adversarial attack component.
3. Submit a contribution to Standards Australia IT-043 on embodied AI evaluation standards, referencing the three-layer testing architecture.
Normative recommendation (medium-term, 6-18 months):
4. Develop an evaluator accreditation framework for embodied AI adversarial testing. Define competency requirements, independence standards, and IDDL disclosure obligations.
5. Pilot the three-layer testing architecture with a willing industry partner (preferably a mining company operating autonomous equipment).
6. Establish a coordination body for dual-use knowledge management, initially as a voluntary agreement between Australian safety researchers and the AISI.
Normative recommendation (longer-term, 18-36 months):
7. Incorporate embodied AI adversarial testing requirements into the WHS Model Code of Practice for Automated and Robotic Systems.
8. Seek recognition of the evaluation standards in the EU AI Act conformity assessment process (mutual recognition).
9. Formalise the coordination body with international participation.
8. Limitations
- The governance framework is aspirational. No jurisdiction has implemented anything like this. The proposal is grounded in the structural analysis from the Failure-First corpus, but its practical feasibility depends on political will, institutional capacity, and industry cooperation — none of which can be guaranteed.
- The federated model has coordination costs. Multiple institutions sharing governance responsibilities requires strong coordination mechanisms that may be difficult to establish in practice.
- The risk categorisation is approximate. The four-tier risk classification in Section 2.3 is a starting point, not a tested methodology. The boundary between “high” and “critical” risk is not precisely defined.
- Australia-specific implementation may not transfer. The WHS regulatory framework in Australia differs from the US (OSHA) and EU (Framework Directive 89/391/EEC) in important details. The implementation pathway in Section 7 is specific to Australia.
- The CDC governance impossibility is real. Acknowledging residual risk is honest but may undermine political support for the framework. Regulators and the public may expect safety certification to mean “safe,” not “tested against known attack families with acknowledged blind spots.”
- Sample sizes underlying the IDDL and SRDEA are small. The IDDL rests on n=91 FLIP-graded VLA traces. The governance proposal inherits this uncertainty.
Appendix A: Comparison with Existing Governance Proposals
| Feature | EU AI Act (2024) | NIST AI RMF (2023) | This Proposal |
|---|---|---|---|
| Embodied AI specificity | No (applies generally to “AI systems”) | No (risk management framework is general) | Yes (designed for embodied AI) |
| Adversarial testing | Mentioned (Article 9) but not specified | Mentioned (MAP function) but not specified | Specified (three-layer architecture) |
| Physical-context testing | No | No | Yes (Layer 1 includes SBA-class testing) |
| IDDL disclosure | No | No | Yes (mandatory evaluator card) |
| Independence requirements | Notified Bodies must be independent | No independence requirement | Explicit structural safeguards (Section 3.3) |
| Worker consultation | Not specific to AI | Not specific to workers | Yes (Section 4.3, right to know) |
| Dual-use management | No | No | Yes (SRDEA-based disclosure protocol) |
| CDC acknowledgment | No | No | Yes (residual risk framework) |
Appendix B: The CDC Governance Trilemma
Novel analytical contribution: The CDC, IDDL, and disclosure paradox together create a governance trilemma:
- Capability. You want the embodied AI system to be capable (follow instructions, manipulate objects, navigate environments).
- Safety. You want the system to be safe (not cause physical harm).
- Transparency. You want the safety evaluation to be transparent (publicly known what was tested, what was found).
The CDC says you cannot fully have both (1) and (2) — capability is vulnerability. The IDDL says you cannot fully have both (2) and (3) — the most important safety gaps are invisible to evaluation. The disclosure paradox says you cannot fully have both (1) and (3) — transparent disclosure of capability-as-vulnerability enables exploitation.
The governance framework proposed in this report does not resolve the trilemma. It manages it through:
- Proportionality (accept different levels of each in different risk categories).
- Transparency about trade-offs (IDDL disclosure, residual risk acknowledgment).
- Ongoing learning (post-deployment monitoring, incident-driven re-evaluation).
The trilemma is a structural property of embodied AI governance. Any framework that claims to fully achieve all three is either not dealing with CDC-class vulnerabilities or is making claims its methodology cannot support.
This report is a structural analysis of embodied AI governance requirements. It proposes a framework but does not endorse specific legislation or regulatory action. The analysis is the author’s ethical and institutional assessment, informed by the Failure-First corpus. Implementation requires political, legal, and institutional decisions that are outside the scope of ethical analysis.