Published
Report 220 Research — Empirical Study

Summary

We analyzed 30 reasoning traces from Liquid Foundation Model (LFM) Thinking 1.2B on AdvBench to test whether DETECTED_PROCEEDS generalizes beyond DeepSeek-R1. The pattern, in which a reasoning model detects a safety concern and then proceeds to generate harmful content, is confirmed in a second provider and architecture.
