← Benchmarks
coming soon
AI Safety & Red-Team
Stress-test an LLM application for jailbreaks, prompt injection, data leaks, and unsafe outputs.
Overview
Candidates attack a target LLM application, document reproducible exploits, and ship mitigations. Graded on coverage of the threat model, severity of findings, and whether the proposed fixes hold up against a held-out attack set.
QuestionsTBD
DomainsTBD
DurationTBD
Slugai-safety
Skills assessed
Jailbreak discoveryPrompt injectionData exfiltrationPolicy specificationRefusal calibrationMitigation design
Status
This benchmark is being designed. Engineers and hiring partners are giving feedback on the rubric, dataset construction, and runtime. We’ll publish a brief and open submissions once the eval is stable enough to ship signal.
In the meantime, register a profile so we can notify you when it goes live.
Create profileGet notified
Create a profile and we’ll notify you when this benchmark opens.
Create profile