Promptfoo → EvalGuard

Migrate from Promptfoo in an afternoon. 

Your Promptfoo YAML imports in one command. You get 250+ red-team plugins (5× Promptfoo), 200+ scorers, and the same CLI ergonomics — plus firewall, gateway, observability, and FinOps in the same platform. One bill. Zero stitching.

SOC 2 evidence engine · ISO 42001 mapped · EU AI Act · GDPR

Why move now

Three reasons the calculus changed

Vendor independence

Your red-team tool shouldn't be owned by the model you're testing

OpenAI now owns Promptfoo. That's a conflict if you rely on it to test OpenAI models for jailbreaks, prompt injection, or policy violations. EvalGuard is independent — 91 providers, zero model-vendor ownership.

5× attack depth

250+ plugins vs Promptfoo's ~50

Every OWASP LLM Top 10 category plus indirect prompt injection, data exfiltration, multi-turn jailbreaks, PII leakage, policy violations, and many more. Kept up to date with threat-intel feed sync — not static. Plus 40+ adversarial strategies and 200 DLP patterns.

One platform

Eval + firewall + gateway + observability + FinOps — same workspace

Promptfoo does eval. Then you stitch Helicone for observability, Portkey for gateway, Langfuse for traces. EvalGuard replaces all of them with one auth, one bill, one SLA.

Config mapping

Your promptfoo.yaml maps over cleanly

Most fields need no rename, and the standard assert types are mapped to EvalGuard scorers automatically. The fields that differ are listed below — five code/metric assertion types (javascript, python, webhook, rouge, bleu) need manual handling.

Promptfoo | EvalGuard | Notes
providers: [openai:gpt-4o] | model: gpt-4o | Provider auto-detected from model prefix.
prompts: [...] | prompt: "{{input}}" | Inline prompt template; {{input}} interpolation supported.
tests: [...] | cases: [...] | Same shape: { input, expectedOutput, metadata }.
assert: [{type: contains, value: 'X'}] | scorers: ["contains"] | Simple string array. 200+ built-in scorers; contains/regex/similarity/etc.
assert: [{type: llm-rubric}] | scorers: ["llm-grader"] | Assertion-type names are mapped automatically (is-json → json-valid, model-graded-fact → factuality, similar → semantic-similarity).
assert: [{type: javascript/python/webhook/rouge/bleu}] | manual (see note) | No 1:1 scorer; eval:local skips these with a warning. Use a custom scorer / webhook action / semantic-similarity.
assert config via type+value | scorerOptions: { contains: { value: 'X' } } | Per-scorer config is a separate object.
redteam: {plugins: [...]} | `evalguard scan` command | Red team lives in a separate config for scans; same platform, separate flow.
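
To make the assertion rows concrete, here is a minimal Python sketch of the mapping they describe. It is not EvalGuard's importer. The rename map and the manual-handling list are copied from the table, while the function name `map_asserts`, the returned `manual` key, and the rule that unlisted assertion names carry over unchanged are assumptions made for illustration.

```python
# Illustrative sketch only; not EvalGuard's actual import logic.
# RENAMED and MANUAL are copied from the mapping table above.
# map_asserts() and its return shape are hypothetical.

# Assertion-type renames the table states explicitly.
RENAMED = {
    "llm-rubric": "llm-grader",
    "is-json": "json-valid",
    "model-graded-fact": "factuality",
    "similar": "semantic-similarity",
}

# Assertion types the table says have no 1:1 scorer.
MANUAL = {"javascript", "python", "webhook", "rouge", "bleu"}


def map_asserts(asserts):
    """Turn a list of Promptfoo-style assert dicts into scorers + scorerOptions."""
    scorers = []
    scorer_options = {}
    manual = []
    for a in asserts:
        kind = a["type"]
        if kind in MANUAL:
            # Collected for manual follow-up instead of being mapped.
            manual.append(kind)
            continue
        # Assumption: names not listed in RENAMED carry over unchanged,
        # as the table shows for "contains".
        name = RENAMED.get(kind, kind)
        if name not in scorers:
            scorers.append(name)
        if "value" in a:
            # Per-scorer config lives in a separate object keyed by scorer name.
            scorer_options[name] = {"value": a["value"]}
    return {"scorers": scorers, "scorerOptions": scorer_options, "manual": manual}


result = map_asserts([
    {"type": "contains", "value": "X"},
    {"type": "llm-rubric"},
    {"type": "rouge", "value": 0.5},
])
print(result)
```

The sketch keeps `scorers` as a plain string list and moves each `value` into `scorerOptions`, matching the two assertion rows of the table. Anything collected under `manual` corresponds to the types that `eval:local` is said to skip with a warning.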

CLI commands

Promptfoo CLI | EvalGuard CLI
promptfoo eval | evalguard eval
promptfoo eval --no-cache | evalguard eval:local
promptfoo view | evalguard view
promptfoo redteam run | evalguard scan
promptfoo share | evalguard share
promptfoo init | evalguard init

Migration

In four commands

  1. Install the CLI
     npm i -g @evalguard/cli
  2. Authenticate
     evalguard login --key <your-eg_key>
  3. Import your Promptfoo config
     evalguard import:promptfoo promptfoo.yaml
  4. Run the eval
     evalguard eval

Stuck on a Promptfoo feature that doesn't map cleanly? Tell us — we'll add the shim within a business day.

Your tests don't belong to the vendor you're testing

Move in an afternoon. Free forever tier — 50K traces/month, unlimited projects, AI Gateway included.

Start free — no credit card