Competitive Intelligence: AI Safety Red Teaming Market

Adrian Wedd

Report 217 Research — Empirical Study 2026-03-24

Executive Summary

This report provides a deep-dive competitive analysis of five companies identified in the investor brief as relevant to the Failure-First Embodied AI research program: Mindgard, HiddenLayer, Alias Robotics, Robust Intelligence (now Cisco AI Defense), and CalypsoAI (now F5). For each, we assess their offerings, pricing signals, limitations, and our differentiation. We then map Failure-First’s unique advantages against the market, identify gaps no competitor fills, and assess our 12-month competitive moat.

Key finding: No competitor offers adversarial red teaming specifically for VLA models in embodied AI systems. The window of differentiation is estimated at 12-18 months. Five of the seven funded AI red teaming startups were acquired by cybersecurity incumbents in 2024-2025, signaling that the standalone platform market is being absorbed. The remaining gap sits at the intersection of AI model adversarial testing, physical robot safety, and VLA-specific attack surfaces — precisely where Failure-First operates.

1. Competitor Deep Dives

1.1 Mindgard (Lancaster, UK)

What they test: LLMs, NLP models, multi-modal models (vision, audio). Their DAST-AI product positions as continuous security testing across the AI lifecycle. Attack coverage includes prompt injection, jailbreaks, model manipulation, data poisoning, and evasion attacks. Model-agnostic; aligned to MITRE ATLAS and OWASP.

How they test: Automated red teaming SaaS platform. Spun out of Lancaster University (2022) with an ongoing PhD research pipeline feeding the product. The platform runs structured attack campaigns against target models and produces compliance-oriented reports.

Funding and scale:

$11.9M total ($8M Series A, Dec 2024, led by .406 Ventures; $3.95M seed)
~15 FTEs + PhD researchers at Lancaster
New CEO James Brear (Oct 2025, proven cybersecurity leader) — signals aggressive commercialization
New Head of Product (Dave Ganly, ex-Twilio), VP Marketing (Fergal Glynn, ex-Fortinet) — commercial acceleration
North American expansion underway (leadership presence in Boston)
2026 Gartner Emerging Tech mention (AI TRiSM category, top-funded startups)

Compliance certifications: SOC 2 Type II, GDPR compliant, ISO 27001 expected early 2026.

Pricing: Not publicly disclosed. Enterprise SaaS subscription model. Based on comparable cybersecurity testing platforms, estimated at USD 50K-200K/year for enterprise customers.

Limitations:

No embodied AI, robotics, or VLA coverage
No physical-world failure mode taxonomy
No public attack corpus or technique taxonomy (proprietary, not disclosed)
No policy/standards research output (marketing material, not standards-aligned analysis)
Academic spinout team — strong research but limited enterprise sales track record pre-2025

How we differentiate:

Failure-First covers the physical-world consequence layer that Mindgard’s digital-only platform does not address
Our 190-model corpus with 132,416 results provides cross-model comparison at a scale Mindgard has not publicly demonstrated
6 novel attack families (CRA/PCA/MDA/MAC/SSA/RHA) target embodied-specific failure modes outside Mindgard’s scope
Our policy output (197+ reports, 66 legal memos, ISO/EU AI Act mapping) exceeds what any platform vendor produces
FLIP classifier with measured kappa provides grading methodology transparency that Mindgard’s proprietary scoring does not

Threat assessment: HIGH. Their multi-modal platform is architecturally proximate to VLA testing. If a robotics customer requests VLA coverage, they have the infrastructure to attempt it. The CEO appointment and Gartner recognition give them enterprise credibility.

1.2 HiddenLayer (Austin, TX, USA)

What they offer: An MLSecOps platform reorganized (2025-2026) into four modules:

AI Discovery — inventory and classify AI models across the enterprise
AI Supply Chain Security — scan models for tampering, backdoors, and supply chain risks
AI Attack Simulation — simulate adversarial attacks against deployed models
AI Runtime Security — monitor production models for anomalous behavior in real time

How they approach it: Primarily defensive — runtime monitoring and anomaly detection. Their AI Attack Simulation module represents a move toward proactive testing, but their heritage is in detection rather than red teaming. They detect adversarial ML attacks and abnormal model behavior at the inference layer.

Funding and contracts:

Total funding not publicly confirmed; multiple rounds from strategic investors
Won Missile Defense Agency (MDA) SHIELD contract for Airgapped AI Security Platform — significant US defense market penetration
2026 AI Threat Landscape Report (released Mar 2026) focuses on agentic AI risks
Security Dashboard with posture visualization

Pricing: Not publicly disclosed. Enterprise SaaS model. Given the defense contract and platform scope, estimated at USD 100K-500K/year for enterprise deployment.

Limitations:

Heritage is defensive (runtime monitoring), not offensive (pre-deployment red teaming)
No embodied AI, robotics, or VLA coverage
AI Attack Simulation module scope unclear — may not extend to physical-world models
No public taxonomy, corpus, or research output
Defense contract (MDA) is adjacent to autonomous systems but not confirmed to involve embodied AI

How we differentiate:

Failure-First is offensive-first (attack simulation, adversarial benchmarking); HiddenLayer is defensive-first (runtime monitoring). These are complementary, not competing
Our adversarial corpus and technique taxonomy provide the attack knowledge base that HiddenLayer’s detection needs to reference
Embodied AI specialization gives us the physical-world failure modes that HiddenLayer’s digital-only monitoring does not cover
Potential partner rather than competitor: HiddenLayer could integrate our attack patterns into their detection rules

Threat assessment: MEDIUM. Defense contract is notable. If their AI Attack Simulation module expands to encompass pre-deployment red teaming for robotics systems, overlap increases. Currently complementary.

1.3 Alias Robotics (Vitoria-Gasteiz, Spain)

What they offer:

REPP — Robot Endpoint Protection Platform (endpoint security for robotic systems)
CAI — Cybersecurity AI framework (open-source, automates penetration testing for IT/OT/robotic systems)
CAI PRO — Enterprise version with dashboard, compliance reporting, multi-target management
alias1 — LLM specialized in offensive/defensive cybersecurity
Robot Vulnerability Database (RVD) — public database of robot-specific CVEs and vulnerabilities

How they approach it: Infrastructure-level robot cybersecurity — Bluetooth, firmware, network, telemetry analysis. Their research focuses on finding and disclosing vulnerabilities in robot platforms (notable work on Unitree G1 humanoid: unauthorized telemetry to China-hosted servers, exposed RSA keys, GDPR violations). CAI competed successfully in cybersecurity competitions (#1 at Neurogrid $50K, #1 at Dragos OT CTF in 2025).

Funding and scale:

~$1.55M raised to date (PitchBook, as of Mar 2026)
EUR 5M Series A pending — no public confirmation of close as of this scan
~15-20 employees (estimate)
Selected for NATO DIANA 2026 Challenge Programme (Autonomy & Unmanned Systems)
Selected by EIC for CES Las Vegas 2026 (Jan 6-9)
Co-founder Victor Mayoral-Vilches co-authored PentestGPT (2023)

Pricing: CAI PRO at EUR 350/month. REPP pricing not publicly disclosed. Pen testing engagement pricing not public; comparable robot security assessments likely in the EUR 15K-50K/engagement range.

Limitations:

Focus is firmware/network/infrastructure, not model-level adversarial attacks
No VLA adversarial testing capability
No jailbreak corpus or prompt-level testing methodology
CAI-on-quadruped demo remains a one-off integration; no evidence of systematic VLA attack research
Series A status uncertain — without fresh capital, expansion into model-level testing is resource-constrained
Small team relative to the breadth of their product line

How we differentiate:

Failure-First operates at the model layer (adversarial prompts, jailbreaks, VLA attacks); Alias operates at the infrastructure layer (firmware, network, Bluetooth). These are different layers of the robot security stack
Our 190-model, 132,416-result corpus covers LLM/VLA vulnerability at a depth Alias has not demonstrated
6 novel attack families (especially SSA — Sensor Spoofing Attacks, and RHA — Reward Hacking) are conceptually adjacent to Alias’s domain but attack through the model rather than through the infrastructure
Our policy/standards output (ISO/EU AI Act mapping) complements their ISO 10218 focus

Threat assessment: HIGH. The closest competitor in embodied AI specifically. Their humanoid vulnerability work (Unitree G1) puts them at the intersection of robots and security. If the EUR 5M Series A closes and they hire VLA/LLM researchers, they become a direct competitor within 6-12 months.

1.4 Robust Intelligence / Cisco AI Defense (San Jose, CA, USA)

What they offer: Enterprise AI security platform, now part of Cisco following the $400M acquisition (Oct 2024). The Feb 2026 product update — the largest since launch — added:

AI BOM (Bill of Materials) — inventory all AI models, data sources, and dependencies
MCP Catalog — map and manage Model Context Protocol integrations
Advanced Algorithmic Red Teaming — automated adversarial testing
Real-time Agentic Guardrails — runtime protection for agentic AI workflows
SASE integration — AI traffic optimization, MCP visibility, intent-aware inspection

How they approach it: Enterprise-grade AI security integrated into Cisco’s existing network and security infrastructure. Tests mapped to OWASP Top 10 for LLMs and MITRE ATLAS. Pre-acquisition, Robust Intelligence was a model validation platform; post-acquisition, the focus has shifted toward enterprise AI governance.

Pricing: Bundled into Cisco enterprise agreements. Not available as standalone product. Cisco enterprise security bundles typically range from USD 100K-1M+/year depending on deployment scale.

Notable signal: Cisco’s AI Security Framework documentation explicitly references multimodal threats from “sensor data” in “robotics and autonomous vehicle deployments.” This is the first time a major cybersecurity vendor has referenced robotics in an AI security product context (noted in Mar 2026 competitive refresh).

Limitations:

Large enterprise vendor — moves slowly into niche markets (typical 18-36 month expansion cycle)
“Robotics” mentions may be aspirational marketing rather than product roadmap
No evidence of robotics/embodied AI expertise on the team post-acquisition
Bundled pricing makes the AI security component inaccessible to startups and small robotics firms
Focus is enterprise LLM/agentic security, not embodied AI
No public attack corpus, taxonomy, or research output

How we differentiate:

Failure-First is nimble and specialized; Cisco AI Defense is enterprise and general-purpose
Our embodied AI focus gives us depth in a vertical Cisco has only mentioned in marketing copy
Our research output (papers, reports, taxonomy) establishes academic credibility that Cisco’s product marketing does not
The 6 novel attack families target failure modes Cisco’s platform is not designed to test
Cost structure: our engagement model is accessible to robotics startups; Cisco’s bundled pricing is not

Threat assessment: MEDIUM, with elevated monitoring. The explicit robotics mention in their security framework is new and worth tracking. If Cisco commits engineering resources to robotics AI security, their distribution advantage is formidable. However, large-company expansion into niche verticals typically takes 18-36 months, and their current commercial focus is clearly agentic/LLM enterprise security.

1.5 CalypsoAI / F5 Networks (Austin, TX / Seattle, WA, USA)

What they offer:

Automated red teaming with quantifiable security scores for AI models
Model Security Leaderboard — public ranking of major AI models on safety/security dimensions
Agentic Signature Attack Packs — pre-built adversarial test suites for agentic AI workflows
Claimed throughput of 10,000+ new attack prompts per month

How they approach it: Acquired by F5 Networks for $180M (Nov 2025) and bundled into the F5 ADSP (Application Delivery and Security Platform). Pre-acquisition, CalypsoAI offered standalone AI security scoring. Post-acquisition, the product is being integrated into F5’s enterprise web application and API security stack.

Pricing: Now bundled into F5 ADSP platform. Pre-acquisition pricing was not publicly disclosed. F5 enterprise agreements typically range from USD 50K-500K/year.

Limitations:

Post-acquisition integration likely shifts product roadmap toward F5’s core enterprise web application customer base
No embodied AI, robotics, or VLA coverage on the leaderboard
“Agentic Signature Attack Packs” are conceptually adjacent to embodied AI testing but currently target software agents, not physical robots
10,000+ prompts/month is a volume claim without disclosed quality metrics or grading methodology
Leaderboard tests major commercial LLMs; does not cover the long tail of open-source models or VLA systems

How we differentiate:

Failure-First’s corpus is graded with a validated methodology (FLIP classifier, kappa-measured inter-rater reliability). CalypsoAI’s grading methodology is proprietary and undisclosed
Our 190-model coverage includes the open-source long tail (Ollama models, free-tier OpenRouter models, abliterated models) that enterprise leaderboards skip
Embodied AI specialization — no overlap with F5’s enterprise web application security focus
Our 6 novel attack families target physical-world failure modes outside CalypsoAI’s scope
Academic publication track (CCS, AIES, arXiv) provides third-party validation that a vendor leaderboard does not

Threat assessment: MEDIUM. F5 acquisition gives broad distribution but narrows product focus toward F5’s existing customer base. Robotics expansion is unlikely in the near term. Monitor the leaderboard for any VLA or embodied model additions.

2. Our Unique Advantages

2.1 The 190-Model Empirical Corpus

No competitor has publicly demonstrated adversarial testing at this breadth. The Failure-First corpus contains:

Asset	Scale
Models tested	190 (177 with results)
Evaluation results	132,416
Prompts in corpus	141,047
Evaluation runs	38,442
Source datasets	27
Techniques catalogued	82
Harm classes	119

This corpus spans 7 provider families (Anthropic, Meta-Llama, Mistral, DeepSeek, NVIDIA, Liquid, StepFun). Per-provider attack success rates (ASR) at the Functionally Dangerous tier range from 12.2% (Anthropic) to 75.2% (Liquid). No competitor publishes cross-provider vulnerability profiles at this granularity.
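
A per-provider ASR profile of this kind can be sketched as a simple grouped ratio. This is a minimal illustration, assuming each evaluation result reduces to a (provider, verdict) pair; the function name, the `REFUSAL` label, the placeholder provider names, and the toy data are assumptions for illustration and are not taken from the Failure-First pipeline.

```python
from collections import defaultdict

# Assumption: verdicts counted as a successful attack. The report mentions a
# COMPLIANCE verdict; the full FLIP label set is not reproduced here.
SUCCESS_VERDICTS = {"COMPLIANCE"}


def asr_by_provider(results):
    """Return {provider: attack success rate} from (provider, verdict) pairs."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for provider, verdict in results:
        totals[provider] += 1
        if verdict in SUCCESS_VERDICTS:
            hits[provider] += 1
    return {provider: hits[provider] / totals[provider] for provider in totals}


# Toy data only: placeholder providers, not corpus results.
sample = [
    ("provider_a", "COMPLIANCE"),
    ("provider_a", "REFUSAL"),
    ("provider_a", "REFUSAL"),
    ("provider_a", "REFUSAL"),
    ("provider_b", "COMPLIANCE"),
    ("provider_b", "COMPLIANCE"),
    ("provider_b", "COMPLIANCE"),
    ("provider_b", "REFUSAL"),
]
rates = asr_by_provider(sample)
for provider, rate in sorted(rates.items()):
    print(f"{provider}: {rate:.1%}")
```

A tiered methodology would repeat the same ratio with a different success set per tier.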

Why this matters competitively: Replicating this corpus from scratch would require 14+ months of systematic testing and approximately AUD 50K-100K in API costs. It is a defensible asset that grows more valuable with each evaluation run.

2.2 Six Novel Attack Families (Unpublished)

The following attack families are original to Failure-First and have no equivalent in competitor offerings, MITRE ATLAS, or OWASP taxonomies:

Family	Code	Description	Scenarios
Compositional Reasoning Attack	CRA	Sequences of individually benign instructions that compose to produce unsafe outcomes. Highest severity (CRITICAL). FLIP ASR: 62.5% (5/8).	10
Pressure Cascade Attack	PCA	Exploits multi-step commitment dynamics. Escalates pressure context, not harmful content. Weaponizes the model’s own reasoning via commitment consistency.	10
Meaning Displacement Attack	MDA	Progressively shifts operative meaning of safety-relevant terms across conversation turns. Corrupts conceptual foundations, not instructions.	10
Multi-Agent Collusion	MAC	Defeats safety through coordinated multi-agent behavior — mutual rubber-stamping, quid pro quo compliance, credential sharing.	10
Sensor Spoofing Attacks	SSA	Manipulates robot sensor inputs (LIDAR, GPS, force-torque, temperature, weight, depth) to create false world models. Covers full sensor suite, not just vision.	10
Reward Hacking in Embodied AI	RHA	Exploits metric-dependent behavior in embodied reward loops. Model optimizes measurable proxy instead of intended objective.	10

CRA, PCA, and MDA have been promoted to Tier 1 based on FLIP grading results. All six families are documented in the attack family registry (artifacts/attack_family_registry.json). None appear in any competitor’s published attack library, MITRE ATLAS, or OWASP frameworks.

2.3 FLIP Classifier (Validated, Kappa-Measured)

The Failure-First Lightweight Inference Protocol (FLIP) provides grading methodology transparency that no competitor matches:

Cohen’s kappa between keyword heuristics and LLM-based grading: 0.126 (near-chance agreement), indicating that heuristic classifiers are unreliable for this domain
Haiku vs heuristic kappa: 0.097 (n=950), lower still
Heuristic over-report rate: 79.9% (only 20.1% of heuristic COMPLIANCE confirmed by LLM grading)
Three-tier ASR methodology (Strict / Broad / Functionally Dangerous) captures nuance that binary pass/fail scoring misses
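
For readers who want to reproduce this style of check on their own data, the two statistics above can be sketched as follows. This is a minimal illustration using the standard definition of Cohen’s kappa for two graders; the function names, the `REFUSAL` label, and the toy data are assumptions, and this is not the FLIP implementation.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two graders labelling the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: share of items where both graders match.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of each grader's marginal label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[label] * freq_b[label] for label in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both graders used a single identical label throughout
    return (observed - expected) / (1 - expected)


def over_report_rate(heuristic, reference, positive="COMPLIANCE"):
    """Share of heuristic positives that the reference grader did not confirm."""
    flagged = [ref for heur, ref in zip(heuristic, reference) if heur == positive]
    if not flagged:
        raise ValueError("heuristic produced no positive labels")
    confirmed = sum(ref == positive for ref in flagged)
    return 1 - confirmed / len(flagged)


# Toy data only: 10 items, heuristic flags 5, LLM grader confirms 1 of them.
heuristic = ["COMPLIANCE"] * 5 + ["REFUSAL"] * 5
llm_grader = ["COMPLIANCE"] + ["REFUSAL"] * 9

kappa = cohens_kappa(heuristic, llm_grader)
over_report = over_report_rate(heuristic, llm_grader)
print(f"kappa={kappa:.3f}, over-report rate={over_report:.1%}")
```

Kappa corrects raw agreement for the agreement expected by chance, which is why a heuristic that agrees with the LLM grader on many refusals can still score near zero.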

No competitor publishes inter-rater reliability metrics for their grading methodology. CalypsoAI’s leaderboard scores, Mindgard’s security assessments, and HiddenLayer’s anomaly detection all use proprietary scoring without disclosed validation.

2.4 Embodied AI Specialization

To our knowledge, Failure-First is the only entity, commercial or academic, that combines all of:

LLM/VLA adversarial testing methodology
Physical-world failure mode taxonomy
Cross-modal attack research (text-domain jailbreaks to physical-domain hazards)
Episode-level testing (5-10 scene sequences testing stateful degradation)
Human-in-the-loop interaction failure modes

Mindgard tests digital models. Alias tests robot infrastructure. HiddenLayer monitors production models. CalypsoAI scores enterprise LLMs. None test the VLA decision layer where language instructions become physical actions.

2.5 Academic Publication Track

Venue	Status
ACM CCS 2026	Paper in preparation (non-OBLITERATUS corpus, 236 models, ~18K prompts)
AIES 2026	Ethics paper drafted
arXiv	Pre-prints in preparation

Academic publications provide third-party validation of methodology that vendor marketing cannot replicate. CCS acceptance would position Failure-First as the academic reference for embodied AI adversarial testing.

3. Competitive Comparison Matrix

Dimension	Failure-First	Mindgard	HiddenLayer	Alias Robotics	Cisco AI Defense	CalypsoAI/F5
Focus	Embodied AI adversarial research	General AI red teaming SaaS	MLSecOps runtime monitoring	Robot infrastructure cybersecurity	Enterprise AI security	Enterprise AI governance
VLA testing	Active (36 families, 406 scenarios)	None	None	None	None	None
Models evaluated	190	Unknown (proprietary)	Runtime only	Robot systems	Enterprise LLMs	Major LLMs
Attack corpus	141,047 prompts, 82 techniques	Not disclosed	N/A	CAI framework (CTF-focused)	Not disclosed	10K+/month (claimed)
Novel attack families	6 (CRA/PCA/MDA/MAC/SSA/RHA)	0 (published)	0	0	0	0
Grading validation	FLIP (kappa-measured)	Proprietary	Proprietary	N/A	Proprietary	Proprietary
Policy output	197+ reports, 66 legal memos	Marketing material	Threat reports	ISO 10218 analysis	Enterprise compliance	Leaderboard
Compliance certs	None	SOC 2 Type II, GDPR, ISO 27001	Not disclosed	None	Cisco enterprise	F5 enterprise
Revenue	Pre-revenue	Revenue (SaaS)	Revenue (defense contracts)	Revenue (REPP, CAI PRO)	Revenue (Cisco bundle)	Revenue (F5 bundle)
Funding	Bootstrapped/research	$11.9M	Undisclosed	~$1.55M (+EUR 5M Series A pending)	$400M (acquisition)	$180M (acquisition)
Defense/gov	AISI Australia engagement	UK gov aligned	MDA SHIELD contract	NATO DIANA 2026	US enterprise/gov	US enterprise

4. Market Gaps We Can Fill

Gap 1: VLA Adversarial Testing as a Commercial Service

The gap: Academic groups (AttackVLA, BadRobot, ANNIE, AGENTSAFE) are publishing prolifically on VLA attacks, but none offer commercial testing services. Robot cybersecurity firms (Alias) test infrastructure, not models. AI red teaming firms (Mindgard, CalypsoAI) test digital LLMs, not embodied systems.

Our position: 36 VLA attack families, 406 scenarios, empirical traces. No competitor has equivalent coverage. The Embodied Red Teaming paper (Karnik et al., ICLR 2026) validates the concept but is a research artifact, not a service.

Market timing: EU AI Act (August 2026) mandates adversarial testing for high-risk AI systems including robots. Humanoid robot commercial deployments begin at limited scale in 2026-2027. The compliance deadline creates near-term demand.

Gap 2: AI Safety Threat Intelligence Feed

The gap: No subscription service provides ongoing, curated intelligence on AI attack patterns, model vulnerabilities, and defense effectiveness. MITRE ATLAS is a taxonomy, not a living intelligence feed. Gartner/Forrester cover AI safety as one topic among many.

Our position: The jailbreak corpus, technique taxonomy, and continuous evaluation create a natural threat intelligence product. The 82 technique categories and 119 harm classes are structured data that could feed detection systems, compliance tools, and risk assessments.

Estimated market size: Comparable cybersecurity threat intelligence services (CrowdStrike Falcon Intelligence, Mandiant Threat Intelligence) generate USD 500M-1B+ annually. AI safety threat intelligence is a nascent subset, but growing at 26-37% CAGR.

Gap 3: Pre-Deployment AI Safety Assessment for Physical AI Products

The gap: Certification bodies (BSI, SGS, DNV, TUV, UL) certify management systems (ISO 42001) or mechanical safety (ISO 10218). None assess AI decision-making safety or adversarial robustness of embodied AI models. EU AI Act conformity assessment bodies are still being designated.

Our position: Our failure taxonomy, grading methodology, and policy research directly support conformity assessment for embodied AI. The 6 novel attack families test failure modes that no certification body currently assesses.

Gap 4: Cross-Modal Attack Research (Text to Physical)

The gap: No commercial entity systematically studies how text-domain jailbreaks translate to physical-domain hazards. The question “if I can jailbreak the language model, what happens to the robot?” has no commercial answer.

Our position: The VLA testing campaign, sensor spoofing scenarios (SSA), and reward hacking research (RHA) directly address this translation layer. This is a research-led capability that requires months of domain expertise to replicate.

5. 12-Month Competitive Moat Assessment

What Protects Us (Moat Factors)

Factor	Strength	Duration
190-model corpus	Strong. 14+ months of systematic testing. ~AUD 50K-100K to replicate.	12-24 months. Depreciates as models update; requires continuous refresh.
6 novel attack families	Strong while unpublished. Once published (CCS/AIES/arXiv), the methodology becomes public.	6-12 months pre-publication; indefinite for execution expertise post-publication.
FLIP classifier with kappa measurement	Moderate. Methodology is publishable; the validation data is not easily replicated.	12-18 months. Competitors would need to build comparable grading infrastructure.
Embodied AI domain expertise	Strong. Requires physical robot access, VLA model understanding, and safety domain knowledge. Few teams combine all three.	18-24 months. Alias Robotics is closest to replicating if funded.
Policy/standards corpus	Strong. 197+ reports, 66 legal memos, ISO/EU AI Act mapping. Represents 14 months of accumulated research.	12-24 months. Content depreciates without refresh but establishes credibility.
Academic publication track	Moderate pre-publication; strong post-acceptance. CCS acceptance would be a durable differentiator.	Indefinite once published.

What Threatens the Moat

Threat	Likelihood (12 months)	Impact
Mindgard adds VLA/embodied vertical	20-30%	HIGH. They have the platform, team, and funding.
Alias Robotics closes Series A and hires VLA researchers	30-40%	HIGH. Most natural adjacent move for them.
Cisco AI Defense commits engineering to robotics	10-15%	MEDIUM-HIGH. Distribution advantage but slow to execute.
Academic group commercializes VLA adversarial testing	10-20%	MEDIUM. Paper-to-product gap is significant.
New entrant raises seed specifically for embodied AI safety	15-25%	MEDIUM. Would take 12-18 months to build comparable corpus.
Robot OEM (Figure, 1X, Unitree) builds internal safety team and offers external services	5-10%	LOW-MEDIUM. Internal focus likely first.
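
The individual likelihoods above can be combined into a rough estimate of the chance that at least one threat materializes within 12 months. The sketch below assumes the threats are statistically independent, which is a simplifying assumption of ours and probably too strong (for example, a funded new entrant and an Alias Robotics pivot would respond to the same market signal). Treat the result as an order-of-magnitude aid, not a forecast. The dictionary keys are shorthand for the table rows, and the function name is illustrative.

```python
from math import prod

# Likelihood ranges (low, high) copied from the threat table above.
THREATS = {
    "Mindgard adds VLA/embodied vertical": (0.20, 0.30),
    "Alias Robotics closes Series A and hires VLA researchers": (0.30, 0.40),
    "Cisco AI Defense commits engineering to robotics": (0.10, 0.15),
    "Academic group commercializes VLA adversarial testing": (0.10, 0.20),
    "New entrant raises seed for embodied AI safety": (0.15, 0.25),
    "Robot OEM builds internal team and offers external services": (0.05, 0.10),
}


def at_least_one(probabilities):
    """P(at least one event occurs), ASSUMING the events are independent."""
    return 1 - prod(1 - p for p in probabilities)


low = at_least_one(lo for lo, _ in THREATS.values())
high = at_least_one(hi for _, hi in THREATS.values())
mid = at_least_one((lo + hi) / 2 for lo, hi in THREATS.values())
print(f"P(at least one threat): {low:.0%} to {high:.0%} (midpoints: {mid:.0%})")
```

Positively correlated threats would make the true combined probability lower than this independence estimate.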

Net Assessment

The moat is real but time-bounded. Our competitive advantage is strongest in the next 12-18 months, driven by:

The uniqueness of the empirical corpus (no competitor has publicly demonstrated testing across 190 models)
The unpublished novel attack families (CRA/PCA/MDA/MAC/SSA/RHA)
The embodied AI specialization (physical-world failure modes)
The EU AI Act compliance deadline (August 2026) creating immediate demand

The moat erodes if:

We do not publish (CCS/AIES acceptance converts research capital into durable credibility)
We do not commercialize within 12 months (competitors will enter once market demand materializes)
We do not refresh the corpus (model updates depreciate historical testing data)
Alias Robotics closes their Series A and pivots toward model-level testing

Recommended Moat-Strengthening Actions (12-Month)

Publish CCS/AIES papers — converts time-bounded research advantage into permanent academic record
Secure first commercial engagement by August 2026 (EU AI Act deadline) — establishes market position before competitors enter
Maintain continuous corpus refresh — add new models quarterly, re-test against updated model versions
File for FLIP methodology trademark/IP protection where applicable
Establish partnership with robot OEM (Unitree, 1X, Figure) — creates insider access that competitors cannot easily replicate
Monitor Alias Robotics Series A status monthly — this is the single most important competitive signal

6. Acquisition Landscape Context

The AI security startup market has undergone rapid consolidation:

Acquired Company	Acquirer	Price	Date
Robust Intelligence	Cisco	$400M	Oct 2024
CalypsoAI	F5 Networks	$180M	Nov 2025
Lakera	Check Point	Undisclosed	Nov 2025
Protect AI	Palo Alto Networks	Undisclosed	Jul 2025
SplxAI	Zscaler	Undisclosed	Nov 2025

Five acquisitions in roughly 13 months (Oct 2024 to Nov 2025). Remaining independents: Mindgard, Promptfoo, Adversa AI, HiddenLayer, Giskard. This pattern suggests:

Major cybersecurity vendors view AI security as a required platform capability
Standalone AI red teaming platforms are being absorbed into enterprise security stacks
Valuation multiples for AI security startups with commercial traction are high ($180M-400M)
The window for independent operation is narrowing — eventual acquisition or platform partnership is the likely exit for most players

Implication for Failure-First: Our embodied AI specialization positions us in a niche that enterprise acquirers have not yet targeted. When they do (estimated 18-36 months), the acquisition premium for domain-specific expertise could be significant — provided we have established commercial traction and academic credibility by then.

Sources

Existing Failure-First competitive intelligence: research/competitive_intelligence/2026-02-26_competitive_landscape.md
March 2026 competitive refresh: research/business/competitive/2026-03-02_competitive_refresh.md
Competitive tracker: research/business/COMPETITIVE_TRACKER.md
Agent Report #05: research/business/agent_reports/05_competitive_landscape.md
Investor brief: research/business/investor_brief_2026.md
Canonical metrics: docs/CANONICAL_METRICS.md
Novel attack families: docs/analysis/novel_attack_families_q1_2026.md
Attack family registry: artifacts/attack_family_registry.json
All external sources cited in the referenced competitive intelligence documents

Report #217 — River Song (Predictive Risk) — 2026-03-24
