Executive Summary
Three headline findings from Sprint 16’s opening session:
- L1B3RT4S achieves a 67-100% attack success rate (ASR) across models from 9B to 671B parameters. Semantic-structural attacks dominate where character-level perturbation achieves 0%. Parameter count does not mitigate the attack.
- Sampling parameter manipulation (SPM) produces a 30 percentage point (pp) ASR increase on one model. This is a novel attack surface absent from all existing benchmarks.
- System-prompt defenses produce 0pp reduction against system-level attacks on the tested model.
All findings carry explicit sample-size caveats and should be treated as directional signals.
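To make the sample-size caveat concrete, the sketch below computes a 95% Wilson score interval for an observed ASR. It is an illustration only, not the report's own analysis method: the function name wilson_interval and the trial counts in the usage section are hypothetical and are not taken from the Sprint 16 data.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, trials: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 gives ~95%)."""
    if trials <= 0:
        raise ValueError("trials must be positive")
    if not 0 <= successes <= trials:
        raise ValueError("successes must be between 0 and trials")

    p = successes / trials                      # observed ASR
    z2 = z * z
    denom = 1 + z2 / trials
    center = (p + z2 / (2 * trials)) / denom    # interval midpoint, shrunk toward 0.5
    half = z * math.sqrt(p * (1 - p) / trials + z2 / (4 * trials * trials)) / denom
    return max(0.0, center - half), min(1.0, center + half)


if __name__ == "__main__":
    # Hypothetical counts chosen only to show how interval width depends on n.
    for successes, trials in [(2, 3), (10, 10), (67, 100)]:
        lo, hi = wilson_interval(successes, trials)
        print(f"{successes}/{trials}: ASR={successes / trials:.0%}, "
              f"95% CI=({lo:.1%}, {hi:.1%})")
```

With only a handful of trials per model the interval spans a large part of the 0-100% range, which is why small-sample ASR figures are best read as directional signals rather than point estimates.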
The corpus expanded to 37 attack families, 255 classified techniques, and 236 models.
Report #319 | F41LUR3-F1R57 Adversarial AI Research