Published
Report 294 Research — Empirical Study

Summary

19.5% of safety-aware reasoning traces proceed to generate harmful content. Heuristic pattern matching (regex-based) — preliminary results requiring LLM validation.

This research informs our commercial services. See how we can help →