Emotional Manipulation Attack Family -- Deep Dive | Research | Failure-First

Adrian Wedd

Report 297 | Research: Empirical Study | 2026-03-25

Summary

Emotional manipulation attacks exploit empathy-aligned language patterns in LLMs to override safety constraints in embodied robotics scenarios. This report analyzes 41 attack-relevant, FLIP-graded traces across 6 models.