Report 210 Technical Analysis

Summary

Execution plan for collecting 6,950 traces across four benchmark splits for the CCS paper submission. All scenario files validated (710 total rows, 0 errors).

Trace Requirements

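| Run               | Scenarios | Models           | Traces | Priority                        |
|-------------------|-----------|------------------|--------|---------------------------------|
| AdvBench baseline | 520       | 10               | 5,200  | P1 (CCS reviewers expect this)  |
| Novel families    | 60        | 5                | 300    | P2 (unique contribution)        |
| Defense test      | 30        | 5 x 3 conditions | 450    | P3                              |
| F1R57 v1.0        | 100       | 10               | 1,000  | P4                              |
| Total             |           |                  | 6,950  |                                 |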
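
The trace counts follow from scenarios x models x conditions for each run. The sketch below recomputes them as a sanity check. The numbers are copied from the table; the names `RUNS`, `trace_count`, and `plan_totals` are illustrative and not part of any existing tooling.

```python
# Each run: (name, scenarios, models, conditions). Values copied from the table.
# A run with no separate conditions is treated as having one condition (assumption).
RUNS = [
    ("AdvBench baseline", 520, 10, 1),
    ("Novel families", 60, 5, 1),
    ("Defense test", 30, 5, 3),
    ("F1R57 v1.0", 100, 10, 1),
]


def trace_count(scenarios, models, conditions=1):
    """One trace per (scenario, model, condition) combination."""
    return scenarios * models * conditions


def plan_totals(runs):
    """Return per-run trace counts, total scenario rows, and total traces."""
    per_run = {name: trace_count(s, m, c) for name, s, m, c in runs}
    total_scenarios = sum(s for _, s, _, _ in runs)
    return per_run, total_scenarios, sum(per_run.values())


per_run, total_scenarios, total_traces = plan_totals(RUNS)
for name, n in per_run.items():
    print(f"{name}: {n:,} traces")
print(f"Scenario rows: {total_scenarios}, total traces: {total_traces:,}")
```

The scenario sum should match the 710 validated rows and the trace sum should match the 6,950 target.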

Cost Estimate

  • Free tier (5 models): $0
  • Budget paid (3 models): ~$4
  • Premium paid (2 models): ~$27
  • FLIP grading: ~$7
  • Total: ~$38
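
As a quick arithmetic check, the line items can be summed and divided by the planned trace count. This is a rough sketch: the dollar figures are the approximate estimates listed above, and the per-trace average is a derived number that assumes costs are spread evenly over all 6,950 traces, which the plan does not state.

```python
# Approximate cost estimates in USD, copied from the list above.
COSTS_USD = {
    "Free tier (5 models)": 0,
    "Budget paid (3 models)": 4,
    "Premium paid (2 models)": 27,
    "FLIP grading": 7,
}
MODELS_PER_TIER = [5, 3, 2]  # free, budget paid, premium paid
TOTAL_TRACES = 6950          # from the Trace Requirements total

total_cost = sum(COSTS_USD.values())
total_models = sum(MODELS_PER_TIER)
# Derived figure (assumption: cost is averaged uniformly over every trace).
cost_per_trace = total_cost / TOTAL_TRACES

print(f"Models: {total_models}")
print(f"Total: ~${total_cost}")
print(f"Average per trace: ~${cost_per_trace:.4f}")
```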

Model Selection

Ten models span free and paid tiers and cover permissive, mixed, and restrictive safety profiles. The set includes open-weight (Llama, Nemotron, Gemma), reasoning (DeepSeek R1), and frontier (GPT-4o, Claude Sonnet 4) models.

Minimum Viable for CCS

AdvBench (5,200) + Novel families (300) = 5,500 traces. The Defense test and F1R57 runs can be reduced if time is constrained.
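
One way to picture this fallback is a priority-ordered allocation under a trace budget. The sketch below is hypothetical: `allocate` and the idea of a single numeric budget are assumptions for illustration, and it trims lower-priority runs by raw trace count rather than by whole scenarios or models.

```python
# (priority, name, traces) taken from the Trace Requirements table.
RUN_TRACES = [
    (1, "AdvBench baseline", 5200),
    (2, "Novel families", 300),
    (3, "Defense test", 450),
    (4, "F1R57 v1.0", 1000),
]


def allocate(runs, trace_budget):
    """Fill runs in priority order; lower-priority runs get whatever budget is left."""
    allocation = {}
    remaining = max(trace_budget, 0)
    for _, name, traces in sorted(runs):
        take = min(traces, remaining)
        allocation[name] = take
        remaining -= take
    return allocation


# With a budget of 5,500 only the P1 and P2 runs are collected in full.
for name, n in allocate(RUN_TRACES, 5500).items():
    print(f"{name}: {n:,}")
```

A larger budget flows into the Defense test first and then into F1R57, in line with the P3 and P4 ordering.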
