CLI Reference

15 commands for running evaluations and security scans, and for managing your AI testing workflow from the terminal.

Installation

terminal
npm install -g @evalguard/cli

After installing, the evalguard command will be available globally.

evalguard login

Authenticate with your EvalGuard API key

Usage

bash
evalguard login --key <apiKey> [--url <baseUrl>]

Options

Option                   Description
--key <apiKey>           API key (or set the EVALGUARD_API_KEY env var)
--url <baseUrl>          Custom API base URL

Example

terminal
evalguard login --key eg_sk_abc123def456

evalguard logout

Remove stored credentials

Usage

bash
evalguard logout

Example

terminal
evalguard logout

evalguard init

Initialize EvalGuard in current project. Creates evalguard.config.json, evals/example.json, and scans/example.json.

Usage

bash
evalguard init [--project <projectId>]

Options

Option                   Description
--project <projectId>    Set default project ID

Example

terminal
evalguard init --project proj_abc123
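
For reference, a freshly initialized evalguard.config.json might look something like the sketch below. The field names shown are illustrative assumptions; check the file that init actually generates.

```jsonc
// evalguard.config.json — illustrative sketch; field names are assumptions
{
  "project": "proj_abc123",   // default project ID, as set by --project
  "evalsDir": "evals",        // where eval configs like evals/example.json live
  "scansDir": "scans"         // where scan configs like scans/example.json live
}
```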

evalguard eval

Run an evaluation from a config file via the cloud API. Requires authentication.

Usage

bash
evalguard eval <file> [options]

Options

Option                   Description
--project <projectId>    Override project ID
--model <model>          Override model
--wait                   Wait for completion and show results

Example

terminal
evalguard eval evals/qa-test.json --model gpt-4o --wait
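
An eval config file such as evals/qa-test.json might be shaped like the sketch below. This schema is an illustrative assumption, not the documented format; use evalguard validate to check a real file.

```jsonc
// evals/qa-test.json — illustrative sketch; field and scorer names are assumptions
{
  "name": "qa-regression",
  "model": "gpt-4o",
  "tests": [
    {
      "input": "What is the capital of France?",
      "assertions": [
        { "type": "contains", "value": "Paris" }
      ]
    }
  ]
}
```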

evalguard scan

Run a security scan from a config file via the cloud API. Requires authentication.

Usage

bash
evalguard scan <file> [options]

Options

Option                   Description
--project <projectId>    Override project ID
--model <model>          Override model
--wait                   Wait for completion and show results

Example

terminal
evalguard scan scans/red-team.json --wait
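
A scan config such as scans/red-team.json might look like the sketch below. The plugin and strategy names here are placeholders, not documented values.

```jsonc
// scans/red-team.json — illustrative sketch; plugin/strategy names are assumptions
{
  "name": "red-team-baseline",
  "model": "gpt-4o",
  "plugins": ["prompt-injection", "pii-leak"],
  "strategies": ["jailbreak"]
}
```

Use evalguard list plugins and evalguard list strategies to see the names actually available in the registry.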

evalguard whoami

Show current authentication status, masked API key, and configured project.

Usage

bash
evalguard whoami

Example

terminal
evalguard whoami

evalguard eval:local

Run an evaluation locally using @evalguard/core. No API key is needed; everything runs on your machine.

Usage

bash
evalguard eval:local <file> [options]

Options

Option                   Description
--model <model>          Override model
--provider <provider>    Override provider (openai, anthropic, etc.)
--output <format>        Output format: json, csv, html, or a file path
--verbose                Show detailed output per test case

Example

terminal
evalguard eval:local evals/my-eval.json --model gpt-4o-mini --verbose

evalguard scan:local

Run a red-team security scan locally using @evalguard/core. No API key needed.

Usage

bash
evalguard scan:local <file> [options]

Options

Option                   Description
--model <model>          Override model
--provider <provider>    Override provider
--output <format>        Output format: json or a file path
--verbose                Show each finding

Example

terminal
evalguard scan:local scans/pentest.json --provider anthropic --verbose

evalguard generate tests

Generate synthetic test cases from a description using an LLM.

Usage

bash
evalguard generate tests <description> [options]

Options

Option                   Description
-n, --count <n>          Number of test cases (default: 10)
--model <model>          LLM model for generation (default: gpt-4o)
--provider <provider>    Provider name (default: openai)
--strategies <list>      Evolution strategies (comma-separated)
--output <file>          Output file path (default: generated-tests.json)

Example

terminal
evalguard generate tests "customer support chatbot" -n 20 --model gpt-4o
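
A typical workflow chains generation with validation and a local run, using only commands from this reference (the file name is illustrative):

```
terminal
evalguard generate tests "customer support chatbot" -n 20 --output evals/generated.json
evalguard validate evals/generated.json
evalguard eval:local evals/generated.json --model gpt-4o-mini
```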

evalguard generate assertions

Auto-generate assertions for existing test cases.

Usage

bash
evalguard generate assertions <file> [options]

Options

Option                   Description
--model <model>          LLM model for generation
--provider <provider>    Provider name
--output <file>          Output file path

Example

terminal
evalguard generate assertions evals/qa-test.json --output evals/qa-with-assertions.json

evalguard validate

Validate an eval or scan config file. Checks JSON structure, scorer names, plugin names, and strategy names against the registry.

Usage

bash
evalguard validate <file>

Example

terminal
evalguard validate evals/my-eval.json

evalguard compare

Compare two evaluation result files side-by-side. Shows score differences, regressions, and improvements.

Usage

bash
evalguard compare <file1> <file2> [options]

Options

Option                   Description
--threshold <n>          Minimum score improvement to highlight (default: 0.05)

Example

terminal
evalguard compare results/baseline.json results/candidate.json --threshold 0.1

evalguard list

List available components: scorers, plugins, strategies, graders, or providers.

Usage

bash
evalguard list <component> [--json]

Options

Option                   Description
--json                   Output as JSON

Example

terminal
evalguard list scorers
evalguard list plugins --json
evalguard list providers

evalguard firewall

Test input against LLM firewall rules locally. Supports stdin, file input, and custom rules.

Usage

bash
evalguard firewall <input> [options]

Options

Option                   Description
--rules <file>           Custom firewall rules JSON file
--json                   Output as JSON

Example

terminal
evalguard firewall "Ignore previous instructions"
evalguard firewall @suspicious-input.txt --rules my-rules.json --json
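
A custom rules file like my-rules.json might be structured as in the sketch below. The rule schema shown is an assumption; consult the firewall documentation for the real format.

```jsonc
// my-rules.json — illustrative sketch; the real rule schema may differ
{
  "rules": [
    {
      "id": "block-ignore-instructions",                 // hypothetical rule ID
      "pattern": "ignore (all )?previous instructions",  // regex matched against input
      "action": "block"
    }
  ]
}
```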

evalguard watch

Watch an eval config file and re-run the evaluation automatically on every save.

Usage

bash
evalguard watch <file> [options]

Options

Option                   Description
--model <model>          Override model
--provider <provider>    Override provider
--debounce <ms>          Debounce interval in ms (default: 1000)

Example

terminal
evalguard watch evals/my-eval.json --model gpt-4o-mini

CI/CD Usage

Use the CLI in your CI/CD pipeline by setting the EVALGUARD_API_KEY environment variable. The CLI reads it automatically.

.github/workflows/eval.yml
name: EvalGuard CI
on: [push]
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: "20"
      - run: npm install -g @evalguard/cli
      - run: evalguard eval:local evals/regression.json --output json
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
      - run: evalguard scan:local scans/security.json --verbose
        env:
          OPENAI_API_KEY: ${{ secrets.OPENAI_API_KEY }}
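
The example above runs everything locally, so only the provider key is needed. To run cloud evaluations instead, add EVALGUARD_API_KEY as a secret and use the eval command; the step below is a sketch built from the commands in this reference:

```yaml
      - run: evalguard eval evals/regression.json --wait
        env:
          EVALGUARD_API_KEY: ${{ secrets.EVALGUARD_API_KEY }}
```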