Summary
Assessed whether 53,831 LLM-graded results can train a fine-tuned safety classifier. Only 5,569 records (10.3%) have usable natural-text responses with trusted LLM verdicts. The remaining 87% are OBLITERATUS telemetry records unusable for text classification. Class imbalance is moderate (39:1). The PARTIAL category is the hardest to classify and the most research-critical.