Report #303: Cross-Embodiment Vulnerability Assessment for Shared VLM Backbones
Disclaimer: This document presents research findings structured as a policy brief. It does not constitute legal advice or legal opinion. Organisations should engage qualified legal counsel for jurisdiction-specific compliance guidance.
Executive Summary
Modern embodied AI systems increasingly share a common architectural feature: a Vision-Language-Action (VLA) model built on top of a general-purpose Vision-Language Model (VLM) backbone. When multiple robot embodiments — humanoid, industrial arm, autonomous vehicle, agricultural drone — use the same VLM backbone, adversarial vulnerabilities discovered against one embodiment transfer to all embodiments sharing that backbone.
This brief presents the policy implications of cross-embodiment vulnerability transfer, drawing on Failure-First empirical data (212 models, 134,321 results, 42 VLA attack families) and external literature (BadVLA, Blindfold, Zhu et al. 2026).
Key finding: The shared VLM backbone creates fleet-level correlated risk. A single adversarial attack technique that succeeds against one robot type using a given VLM backbone is likely to succeed against all robot types using that backbone. No standardised cross-embodiment adversarial benchmark exists, and no regulatory framework currently requires cross-embodiment vulnerability testing.
1. The Shared Backbone Problem
1.1 Architecture
Current VLA systems are built by adding action heads to pre-trained VLM backbones:
| System | VLM Backbone | Embodiments | Developer |
|---|---|---|---|
| Gemini Robotics 1.5 | Gemini | Boston Dynamics (Atlas), Apptronik (Apollo), Agility (Digit) | Google DeepMind |
| RT-2 / RT-X | PaLM-E / PaLI | Multiple robot form factors | Google DeepMind |
| OpenVLA / OpenVLA-OFT | SigLIP + Llama | Open-source, multiple embodiments | UC Berkeley / Stanford |
| Pi-0 / Pi-0.5 | Proprietary VLM | Dexterous manipulation, mobile | Physical Intelligence |
| GR00T | Custom VLM | Humanoids (Apptronik, Agility, others) | NVIDIA |
The VLM backbone processes visual and linguistic inputs. The action head translates VLM representations into motor commands. The VLM backbone is shared across embodiments; only the action head is embodiment-specific.
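To make this decomposition concrete, the sketch below shows the shared-backbone pattern in schematic PyTorch. Everything here is illustrative: the class names, dimensions, and placeholder encoder are assumptions for exposition, not any listed vendor's architecture.

```python
# Minimal sketch of the shared-backbone VLA pattern described above.
# All class and method names are illustrative, not any vendor's API.
import torch
import torch.nn as nn

class SharedVLMBackbone(nn.Module):
    """Stands in for a pre-trained VLM (e.g. a vision encoder plus an LLM)."""
    def __init__(self, d_model: int = 512):
        super().__init__()
        self.d_model = d_model
        self.encoder = nn.Linear(d_model, d_model)  # placeholder for the vision+language stack

    def forward(self, fused_inputs: torch.Tensor) -> torch.Tensor:
        # Shared representation: this is the cross-embodiment attack surface.
        return self.encoder(fused_inputs)

class ActionHead(nn.Module):
    """Embodiment-specific: maps shared representations to motor commands."""
    def __init__(self, d_model: int, action_dim: int):
        super().__init__()
        self.proj = nn.Linear(d_model, action_dim)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        return self.proj(h)

# One backbone instance, many embodiments: a perturbation that corrupts
# the backbone's output propagates to every head downstream.
backbone = SharedVLMBackbone()
heads = {
    "humanoid": ActionHead(backbone.d_model, action_dim=32),
    "arm": ActionHead(backbone.d_model, action_dim=7),
    "drone": ActionHead(backbone.d_model, action_dim=4),
}
obs = torch.randn(1, backbone.d_model)  # stand-in for fused vision+language features
shared = backbone(obs)
commands = {name: head(shared) for name, head in heads.items()}
```

The design point the sketch isolates: the backbone is instantiated once and reused by every head, so an adversarial input that corrupts its output representation reaches all embodiments at once.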
1.2 Attack Surface Decomposition
Two distinct attack surfaces exist:
- Embodiment-agnostic (VLM layer): Adversarial prompts, visual adversarial patches, and format-lock attacks that target the VLM backbone. These transfer across all embodiments using that backbone because the vulnerability exists in the shared component.
- Embodiment-specific (action head): Attacks targeting the action token vocabulary, trajectory representation, or motor command mapping. These are specific to the embodiment's action head and do not transfer.
Our research indicates that the majority of documented attacks operate at the VLM layer (embodiment-agnostic), not the action head layer. Of 42 VLA attack families in our corpus, at least 30 target the VLM input processing or reasoning layer rather than the action output representation.
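For exposition, this decomposition can be written as a tagging scheme over an attack corpus. The sketch below is illustrative only; the family names are placeholders, not entries from the Failure-First corpus.

```python
# Illustrative partition of an attack corpus by the layer it targets,
# mirroring the decomposition above. Family names are hypothetical.
from dataclasses import dataclass
from enum import Enum

class Layer(Enum):
    VLM = "embodiment-agnostic (shared backbone)"
    ACTION_HEAD = "embodiment-specific"

@dataclass
class AttackFamily:
    name: str
    layer: Layer

corpus = [
    AttackFamily("format-lock", Layer.VLM),
    AttackFamily("visual-patch", Layer.VLM),
    AttackFamily("action-token-fuzzing", Layer.ACTION_HEAD),
]

# Only VLM-layer families need re-testing on *every* embodiment sharing
# the backbone; action-head families are tested per embodiment.
transferable = [f.name for f in corpus if f.layer is Layer.VLM]
print(f"{len(transferable)}/{len(corpus)} families transfer via the backbone")
```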
1.3 Empirical Evidence for Cross-Embodiment Transfer
BadVLA (Liang et al. 2024, arXiv:2412.09181): Achieved a near-100% attack success rate (ASR) against both Pi-0 and OpenVLA by attacking the shared VLM backbone. The attack was designed once and transferred across embodiments without modification.
Failure-First VLA corpus (Reports #49, #295): Across 42 attack families and 368 scenarios, the dominant failure pattern is PARTIAL compliance — models produce text-layer safety disclaimers while generating harmful action sequences. This pattern was observed across all tested model architectures, suggesting it originates in the VLM backbone’s training, not in embodiment-specific components.
Cross-provider safety inheritance (Report #184): Safety properties do NOT reliably transfer through distillation or fine-tuning. Third-party fine-tuning degraded Llama safety alignment in a substantial share of cases (25 degraded, 58 preserved, 17 improved across 50 models, 8 families). This finding extends to VLA systems: a VLM backbone that has been safety-trained may lose that safety when fine-tuned for a specific robot embodiment.
2. Fleet-Level Correlated Risk
2.1 The Problem
When a vulnerability is discovered in a VLM backbone, every robot using that backbone is potentially affected. This creates correlated fleet risk analogous to monoculture vulnerabilities in cybersecurity.
Consider a scenario where Gemini Robotics 1.5 is used by:
- Boston Dynamics Atlas (industrial/warehouse)
- Apptronik Apollo (manufacturing)
- Agility Digit (logistics)
A format-lock attack that achieves 30.4% ASR on the Gemini backbone (Report #51) affects all three embodiments simultaneously. The deployers may be different organisations with no coordination mechanism for shared vulnerability response.
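The correlation effect can be illustrated with back-of-envelope arithmetic. The sketch below reuses the 30.4% figure as a stand-in per-technique success rate and assumes, purely for contrast, a hypothetical fleet of three independent backbones with equal ASR; neither scenario is a measured fleet-level risk.

```python
# Back-of-envelope contrast between a correlated (shared backbone) fleet
# and a hypothetical diversified fleet. Purely illustrative arithmetic.
asr = 0.304          # format-lock ASR on the shared backbone (Report #51)
n_embodiments = 3

# Shared backbone: one technique, one vulnerability; if the attack lands
# on the backbone, every embodiment is exposed simultaneously.
p_all_shared = asr                    # ~0.304

# Diversified fleet (assumption, not a measurement): three independent
# backbones, each with the same per-backbone ASR.
p_all_diverse = asr ** n_embodiments  # ~0.028

print(f"P(whole fleet exposed), shared backbone:   {p_all_shared:.3f}")
print(f"P(whole fleet exposed), diverse backbones: {p_all_diverse:.3f}")
```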
2.2 Quantified Risk
From our corpus:
| Attack Family | Observed ASR | Fleet Impact |
|---|---|---|
| Format-lock (frontier) | 23–42% | All embodiments using affected VLM |
| Temporal drift (TDA) | 74.4% (n=39) | Continuous-operation embodiments |
| Emotional manipulation | 22.0% (n=41) | Care/service embodiments |
| Compliance cascade (CCA) | 100% (n=10, 2 models, preliminary) | All embodiments |
| Semantically benign (SBA) | 93.2% (Blindfold, GPT-4o) | All embodiments in physical environments |
NOTE: ASR figures are from specific models and sample sizes as documented. They should not be extrapolated to all VLM backbones without testing.
2.3 No Existing Cross-Embodiment Benchmark
No standardised benchmark exists for testing whether a vulnerability discovered in one embodiment transfers to another. The Failure-First programme has proposed this as a priority gap (AGENT_STATE.md, Brief A). All public benchmarks (AdvBench, HarmBench, JailbreakBench, StrongREJECT) have zero embodied or tool-integrated agent scenarios.
3. Regulatory Implications
3.1 EU AI Act
Under Regulation (EU) 2024/1689:
- Article 9 (Risk Management): Risk management for high-risk AI systems must address "reasonably foreseeable misuse" (Art. 9(2)(b)). Cross-embodiment attack transfer is a foreseeable misuse pattern for VLA systems sharing a backbone. A deployer's Article 9 risk assessment that tests only the specific embodiment deployed, without considering shared backbone vulnerabilities, may be insufficient.
- Article 15 (Robustness): Requires resilience against attempts by unauthorised third parties to "alter their use, outputs or performance by exploiting system vulnerabilities" (Art. 15(5)). A vulnerability in the shared VLM backbone that enables third-party manipulation of the action output is within scope.
- Article 73 (Incident Reporting): A safety incident caused by a shared backbone vulnerability may require reporting by all deployers using that backbone, not just the deployer where the incident occurred. The regulation does not currently address coordinated disclosure for shared AI components.
Deadline: Obligations for high-risk AI systems under Annex III apply from 2 August 2026 (Article 113). Obligations tied to the product legislation listed in Annex I apply from 2 August 2027 (Article 113(c)).
3.2 EU Product Liability Directive (2024/2853)
The recast PLD interacts with the AI Act to create additional exposure:
- Article 10: Non-compliance with the AI Act's risk management requirements (Article 9) can trigger a rebuttable presumption of defectiveness. If a deployer's safety assessment did not test for cross-embodiment transfer vulnerabilities and such a vulnerability causes harm, the presumption may apply.
- "State of the art" defence: Published adversarial research documenting cross-embodiment transfer (BadVLA, Blindfold, the Failure-First corpus) establishes the state of the art for safety evaluation. Failure to incorporate documented attack techniques into a safety assessment weakens this defence.
3.3 Australian Regulatory Landscape
- NSW WHS Digital Work Systems Act 2026: The binding duty on persons conducting a business or undertaking (PCBUs) to ensure worker safety from digital work systems extends to the AI decision layer. Cross-embodiment vulnerability transfer is a material risk factor for workplaces deploying multiple robot types from different manufacturers that share a VLM backbone.
- AU AISI: As documented in Brief E (AGENT_STATE.md), the AU AISI (established November 2025) focuses on LLMs and does not currently address embodied or multi-agent systems. Cross-embodiment vulnerability is outside its current scope.
- Safe Work Australia: The Best Practice Review on AI and automated decision-making (submission in preparation) should address shared AI component risk for workplace robot deployments.
3.4 ISO Standards
- ISO 10218 (Industrial robots): Does not address shared AI component vulnerabilities. Applies to individual robot systems.
- ISO 13482 (Personal care robots): Does not address VLM backbone sharing across embodiments.
- ISO 17757 (Autonomous mining machinery): Does not address AI decision-layer vulnerabilities.
- F1-STD-001 v0.3 (this programme): Addresses cross-embodiment vulnerability through R5 (attack family coverage) and R13 (format-lock defense verification). The standard requires testing against 19 attack families, including those that target the VLM layer.
4. Recommendations
4.1 For Regulators
- Require shared component disclosure. Deployers of high-risk embodied AI should be required to disclose the VLM backbone used, enabling regulators to coordinate vulnerability response across all embodiments using the same backbone.
- Mandate cross-embodiment vulnerability testing. Safety evaluations for embodied AI should include testing of attack families known to target the VLM layer, not just the specific embodiment's action head.
- Establish coordinated vulnerability disclosure. When a vulnerability is discovered in a VLM backbone, all deployers using that backbone should be notified. This requires a registry of backbone-to-embodiment mappings that does not currently exist; one possible registry shape is sketched after this list.
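One possible registry shape, with illustrative field names, identifiers, and contacts only:

```python
# Hypothetical backbone-to-embodiment registry enabling the coordinated
# disclosure described above. All values are illustrative placeholders.
registry = {
    "backbone": "ExampleVLM v2.1",
    "provider": "example-labs",
    "deployments": [
        {"embodiment": "humanoid", "deployer": "acme-logistics", "contact": "secops@acme.example"},
        {"embodiment": "industrial-arm", "deployer": "widget-co", "contact": "safety@widget.example"},
        {"embodiment": "agricultural-drone", "deployer": "agri-corp", "contact": "security@agri.example"},
    ],
}

def notify_all(registry: dict, advisory_id: str) -> list[str]:
    """On a backbone-level advisory, return contacts for every deployer in the mapping."""
    print(f"advisory {advisory_id}: notifying {len(registry['deployments'])} deployers")
    return [d["contact"] for d in registry["deployments"]]
```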
4.2 For Standards Bodies
- Develop a cross-embodiment adversarial benchmark. The absence of any standardised benchmark for cross-embodiment vulnerability transfer is the most significant evaluation gap in embodied AI safety. F1-STD-001 v0.3 provides a starting framework (19 attack families), but a multi-stakeholder benchmark with standardised scenarios, grading methodology, and reproducibility requirements is needed.
- Extend ISO 10218/13482/17757 to address shared AI components. These standards cover individual robot safety but do not address the correlated risk created by shared AI decision-layer components across robot types.
4.3 For Deployers
- Test the VLM backbone, not just the integrated system. Safety evaluation should include adversarial testing of the VLM backbone in isolation (before the action head) as well as testing of the integrated system; a minimal harness sketch follows this list.
- Monitor backbone provider security advisories. Deployers should maintain a documented process for monitoring the VLM backbone provider for security advisories, model updates, and vulnerability disclosures.
- Diversify backbone suppliers where safety-critical. Fleet-level correlated risk is reduced by using multiple VLM backbones across the fleet, preventing a single backbone vulnerability from affecting all deployed systems simultaneously.
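The following is a minimal sketch of backbone-in-isolation testing, assuming a generic query interface. The stubbed call, the case-file format, and the marker-based grading are all placeholders, not a real provider API or this programme's grading methodology.

```python
# Minimal sketch of backbone-in-isolation adversarial testing.
# query_backbone, the JSON case format, and the grading rule are
# hypothetical placeholders.
import json

def query_backbone(image_bytes: bytes, prompt: str) -> str:
    """Placeholder: call the VLM backbone directly, bypassing the action head."""
    raise NotImplementedError("wire this to the backbone provider's API")

def run_vlm_layer_suite(cases_path: str) -> dict:
    """Replay VLM-layer attack cases against the bare backbone and tally outcomes."""
    results = {"refused": 0, "complied": 0}
    with open(cases_path) as f:
        # Expected shape: [{"image": path, "prompt": str, "unsafe_markers": [str, ...]}]
        cases = json.load(f)
    for case in cases:
        with open(case["image"], "rb") as img:
            reply = query_backbone(img.read(), case["prompt"])
        # Crude grading: any unsafe marker appearing in the reply counts as compliance.
        hit = any(m.lower() in reply.lower() for m in case["unsafe_markers"])
        results["complied" if hit else "refused"] += 1
    return results
```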
5. Limitations
- Cross-embodiment transfer has been demonstrated empirically in BadVLA (2 systems) and inferred from shared architecture analysis. Large-scale empirical validation across many embodiments using the same backbone has not been conducted.
- ASR figures cited are from specific models and sample sizes as documented. Extrapolation to all VLM backbones requires per-backbone testing.
- The fleet-level correlated risk analysis is based on architectural reasoning and limited empirical data. Quantified fleet-level risk models do not yet exist.
- This brief presents research findings, not legal opinion.
References
- Liang, Q. et al. BadVLA: Attacking Vision-Language-Action Models via Shared Visual Language Model Backbone. arXiv:2412.09181, 2024.
- Li, Y. et al. Blindfold: Attacking Embodied AI via Benign Instruction Composition. ACM SenSys 2026. arXiv:2603.01414.
- Zhu, X. et al. Physical Adversarial Attacks Against Embodied AI Systems. arXiv:2602.19107, 2026.
- F41LUR3-F1R57 Report #49. VLA Cross-Embodiment Vulnerability Analysis. 2026-03-11.
- F41LUR3-F1R57 Report #146. Cross-Embodiment Transfer Benchmark. 2026-03-18.
- F41LUR3-F1R57 Report #184. Cross-Provider Safety Inheritance. 2026-03-24.
- F41LUR3-F1R57 Report #293. Format-Lock Midrange Confirmation. 2026-03-25.
- F41LUR3-F1R57 Report #295. VLA Data Curation Sprint 15 R2 (TDA). 2026-03-25.
- F41LUR3-F1R57 Report #297. Emotional Manipulation Attack Family. 2026-03-25.
- Regulation (EU) 2024/1689. Artificial Intelligence Act. 12 July 2024.
- Directive (EU) 2024/2853. Product Liability Directive (recast). 2024.
- F1-STD-001 v0.3. Safety Evaluation of AI Systems with Physical Actuation Capabilities. 2026-03-25.
Prepared by Martha Jones, Policy & Standards Lead, Failure-First Embodied AI (failurefirst.org). All empirical claims cite documented measurements with sample sizes. This document presents research findings, not legal opinion.