Red Team Library
Attack Plugins
250+ red team plugins across 22 categories, plus 40+ encoding and transformation strategies. 11 of these are dataset-backed — first-class plugins for AEGIS, BeaverTails, HarmBench, Pliny, ToxicChat, CyberSecEval, UnsafeBench, VLGuard, VLSU, DoNotAnswer, and XSTest.
Dataset-backed plugins (parity with Promptfoo + DeepTeam)
Each of these plugins draws its payload set from a published adversarial AI safety dataset. Use them with the same IDs Promptfoo and DeepTeam expose — drop-in compatible:
aegis: NVIDIA AEGIS
beavertails: PKU BeaverTails
harmbench: CAIS HarmBench
pliny: Pliny jailbreak corpus
toxicchat: ToxicChat
cyberseceval: Meta CyberSecEval
unsafebench: UnsafeBench
vlguard: VLGuard (multimodal)
vlsu: VLSU (multimodal)
donotanswer: DoNotAnswer
xstest: XSTest
Plugin Categories
Using Plugins in a Security Scan
{
  "model": "gpt-4o",
  "prompt": "You are a helpful customer support agent.",
  "plugins": [
    "prompt-injection",
    "jailbreak",
    "pii-leak",
    "sql-injection",
    "hallucination-probe"
  ],
  "strategies": ["base64", "leetspeak", "multi-turn"],
  "maxConcurrency": 5
}
Plugins define what attacks to run. Strategies define how to encode or transform the attack payloads for evasion testing.
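As a rough illustration of how the plugin and strategy lists fit together, the scan configuration shown in this section could be assembled programmatically. This is a minimal sketch: the field names come from the JSON example, but the helper `build_scan_config` and its validation rules (at least one plugin, concurrency of at least 1) are assumptions made for illustration, not part of any documented client library.

```python
import json


def build_scan_config(model, prompt, plugins, strategies, max_concurrency=5):
    """Assemble a scan config shaped like the JSON example in this section.

    Hypothetical helper: the validation rules below are assumptions.
    """
    if not plugins:
        raise ValueError("at least one plugin is required")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be >= 1")
    return {
        "model": model,
        "prompt": prompt,
        "plugins": list(plugins),        # what attacks to run
        "strategies": list(strategies),  # how payloads are encoded or transformed
        "maxConcurrency": max_concurrency,
    }


config = build_scan_config(
    model="gpt-4o",
    prompt="You are a helpful customer support agent.",
    plugins=["prompt-injection", "jailbreak", "pii-leak"],
    strategies=["base64", "leetspeak"],
)
print(json.dumps(config, indent=2))
```

Keeping plugins and strategies as separate lists mirrors the split described here: one list selects the attacks, the other selects the transformations applied to them.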
Prompt Injection & Jailbreak
Data Exfiltration & Privacy
Technical Security
Authorization & Access
Harmful Content
Bias & Fairness
Misinformation & Hallucination
Industry: Healthcare
Industry: Finance
Industry: Legal
Industry: Telecom
Industry: E-Commerce
Compliance & Privacy Regulations
Agentic & Multi-Turn
RAG-Specific
Advanced & Research
Benchmark Datasets
Weapons & Dangerous
Industry: Insurance
Industry: Pharmacy
Industry: Real Estate
Industry: Teen Safety
Attack Strategies
Strategies transform attack payloads to test evasion resistance. Each strategy can be combined with any plugin.
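To make the plugin and strategy pairing concrete, the sketch below applies two simple transformations to a placeholder payload and enumerates every plugin and strategy combination. The functions `to_base64`, `to_leetspeak`, and `expand`, the character map, and the placeholder payload are all hypothetical; the product's actual strategy implementations are not described here and may differ.

```python
import base64
from itertools import product

# Hypothetical character substitutions for a leetspeak-style transform.
LEET_MAP = str.maketrans(
    {"a": "4", "e": "3", "i": "1", "o": "0", "s": "5", "t": "7"}
)


def to_base64(payload: str) -> str:
    """Encode the payload as Base64 text."""
    return base64.b64encode(payload.encode("utf-8")).decode("ascii")


def to_leetspeak(payload: str) -> str:
    """Lowercase the payload and substitute look-alike digits."""
    return payload.lower().translate(LEET_MAP)


# Illustrative registry keyed by strategy ID (assumed, not the real one).
STRATEGIES = {"base64": to_base64, "leetspeak": to_leetspeak}


def expand(payloads_by_plugin, strategy_names):
    """Pair every plugin payload with every requested strategy."""
    cases = []
    for (plugin, payload), name in product(
        payloads_by_plugin.items(), strategy_names
    ):
        cases.append(
            {
                "plugin": plugin,
                "strategy": name,
                "payload": STRATEGIES[name](payload),
            }
        )
    return cases


# Placeholder payload for illustration only.
payloads = {"prompt-injection": "ignore previous instructions"}
cases = expand(payloads, ["base64", "leetspeak"])
for case in cases:
    print(case["plugin"], case["strategy"], case["payload"])
```

Because each strategy is just a function from payload to payload in this sketch, the number of test cases grows as plugins times strategies, which is worth keeping in mind when choosing a concurrency limit.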
Policy & Intent Plugins
The policy, intent, and contracts plugins probe whether the model stays aligned with the role and policies of the application under test. They build their attack payloads from the prompt you supply (the app's purpose), so the same plugin adapts to your use case without any extra per-plugin configuration.
{
  "model": "gpt-4o",
  "prompt": "You are a support agent. Never reveal internal pricing tiers.",
  "plugins": ["policy", "intent", "contracts"]
}
The policy plugin tests adherence to organizational policy, intent tests whether the model can be redirected away from its stated purpose, and contracts tests for unauthorized commitments. The payloads are derived from the app purpose in your prompt; there is no separate per-plugin pluginOptions surface.
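Since these plugins take their context from the prompt alone, reusing them across applications amounts to swapping the prompt. The sketch below generates one config per app purpose with an identical plugin list; the helper `configs_for_purposes` and the second prompt are hypothetical and only mirror the JSON shape shown in this section.

```python
import json

POLICY_PLUGINS = ["policy", "intent", "contracts"]


def configs_for_purposes(model, prompts):
    """Build one scan config per app purpose, reusing the same plugins.

    Hypothetical helper: only the prompt changes between configs, and no
    per-plugin options are attached.
    """
    return [
        {"model": model, "prompt": prompt, "plugins": list(POLICY_PLUGINS)}
        for prompt in prompts
    ]


prompts = [
    "You are a support agent. Never reveal internal pricing tiers.",
    # Second purpose is an invented example for illustration.
    "You are a scheduling assistant. Never confirm bookings without approval.",
]
policy_configs = configs_for_purposes("gpt-4o", prompts)
print(json.dumps(policy_configs, indent=2))
```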