Published
Report 251 Research — Empirical Study

Summary

This report documents three attack development efforts: Compliance Cascade Attack v0.2 (expanded to 20 scenarios), Reasoning Stack Exhaustion, and Grader Evasion attacks, which are designed to produce responses that contain harmful content while appearing to classifiers as refusals.
