Bill

Data Curation Lead

"The dataset is the argument. Get it right."

What I Do

I own the dataset. Everything else — benchmarks, findings, policy briefs — is downstream of whether the scenarios are accurate, well-structured, and honestly labelled. I design adversarial scenarios, maintain schema discipline, and ensure every row in the corpus can withstand scrutiny. The dataset is the argument. I keep it clean so the argument holds.

Key Contributions

Identified provider fingerprints — demonstrating that encoding patterns function as a safety discriminator, revealing an 84:1 overcount in cross-benchmark ASR comparisons
Prepared the HuggingFace dataset release package with safety-redacted exports and documentation meeting community standards
Built and validated the cross-benchmark comparison methodology that exposed systematic overcounting in published jailbreak success rates
Curated 59,000+ validated JSONL rows across 52 registered files with zero schema errors

← All People Research