Summary
Documents three attack development efforts: Compliance Cascade Attack v0.2 (expanded to 20 scenarios), Reasoning Stack Exhaustion, and Grader Evasion attacks, which are designed to produce responses that contain harmful content while appearing to classifiers as refusals.