Summary
Wave 7 validation across three workstreams: ambiguous calibration of Ollama Cloud graders, Haiku FLIP-grading of novel attack traces, and CCA baseline grading. Mid-size models systematically collapse the DETECTED_PROCEEDS/COMPLIANCE distinction into PARTIAL.