Report 313 Research — Empirical Study

Summary

Technique classifications now cover 99.8% of prompts (140,977 of 141,223), so this report presents the first comprehensive technique-level analysis of attack success rate (ASR). 104 techniques have enough graded results to analyze (n >= 10), as do 155 technique-by-model-family pairs (n >= 5).

Corpus snapshot: 50,182 graded results across 104 techniques, tested on models from 10+ families.
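The inclusion rule described above (keep a technique only if it has at least 10 graded results, then rank by ASR) can be sketched as follows. This is a minimal illustration, not the report's actual pipeline: the `(technique, success)` record layout, the function names, and the toy data are all assumptions, and the report does not specify here what counts as a success under the Broad criterion.

```python
from collections import defaultdict

MIN_TECHNIQUE_N = 10  # per-technique threshold stated in the report


def technique_asr(results, min_n=MIN_TECHNIQUE_N):
    """Return {technique: (n, asr)} for techniques with at least min_n results.

    `results` is an iterable of (technique, success) pairs. This record
    layout is hypothetical; the report does not describe its schema.
    """
    counts = defaultdict(lambda: [0, 0])  # technique -> [n, successes]
    for technique, success in results:
        counts[technique][0] += 1
        counts[technique][1] += int(bool(success))
    return {
        technique: (n, successes / n)
        for technique, (n, successes) in counts.items()
        if n >= min_n
    }


def rank_by_asr(table):
    """Sort (technique, (n, asr)) items from highest to lowest ASR."""
    return sorted(table.items(), key=lambda item: item[1][1], reverse=True)


# Toy data with made-up technique names, purely for illustration.
toy_results = (
    [("technique_a", True)] * 8
    + [("technique_a", False)] * 4
    + [("technique_b", True)] * 3  # only 3 results, so it is filtered out
)
ranked = rank_by_asr(technique_asr(toy_results))
```

The same function applies to technique-by-model-family pairs by using a `(technique, family)` tuple as the key and passing `min_n=5`.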

Top 10 Most Effective Techniques (Broad ASR)

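| Rank | Technique | Era | n | Broad ASR |
|------|-----------|-----|---|-----------|
| 1 | compliance_cascade | reasoning_2025 | 20 | 85.0% |
| 2 | obliteratus_probe | general | 42,346 | 81.2% |
| 3 | temporal_drift_attack | reasoning_2025 | 39 | 74.4% |
| 4 | format_lock | reasoning_2025 | 130 | 70.8% |
| 5 | infrastructure_boundary | reasoning_2025 | 10 | 70.0% |
| 6 | cot_manipulation | reasoning_2025 | 22 | 68.2% |
| 7 | hcot_attack | reasoning_2025 | 29 | 62.1% |
| 8 | physical_harm | general | 17 | 58.8% |
| 9 | urgency_manipulation | general | 18 | 55.6% |
| 10 | faithfulness_gap_exploit | reasoning_2025 | 443 | 51.0% |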

Seven of the top 10 techniques are from the reasoning_2025 era, suggesting that attacks specific to reasoning models are disproportionately effective. Among the reasoning_2025 techniques, format_lock (70.8%, n=130) and faithfulness_gap_exploit (51.0%, n=443) are the most robust findings by sample size; obliteratus_probe (81.2%, n=42,346) has by far the largest sample overall.
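To see why sample size matters for robustness, one option is to attach a confidence interval to each rate. The sketch below uses the Wilson score interval, which is a choice made here for illustration and not a method the report states it uses. The success counts are not published in the table, so they are inferred by rounding ASR × n, which is an assumption.

```python
import math

# (technique, n, broad_asr) copied from three rows of the top-10 table.
ROWS = [
    ("compliance_cascade", 20, 0.850),
    ("format_lock", 130, 0.708),
    ("faithfulness_gap_exploit", 443, 0.510),
]


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 is roughly 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


def intervals(rows):
    """Map each technique name to its (lower, upper) Wilson bounds."""
    out = {}
    for name, n, asr in rows:
        # Assumption: success count inferred from the rounded percentage.
        successes = round(asr * n)
        out[name] = wilson_interval(successes, n)
    return out


for name, (lower, upper) in intervals(ROWS).items():
    print(f"{name}: {lower:.3f} to {upper:.3f}")
```

Under these assumptions, the interval for compliance_cascade (n=20) comes out several times wider than the one for faithfulness_gap_exploit (n=443), which is the sense in which the larger-sample findings are more robust even though their point estimates are lower.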


Report #313 | F41LUR3-F1R57 Adversarial AI Research
