Published
Report 260 Research — Empirical Study

Summary

Tests whether FLIP grading is vulnerable to Grader Evasion attacks. Introduces and evaluates the Authority Gradient attack, which exploits models’ tendency to calibrate their compliance to the perceived authority of the requester.
