Reference
Documentation
Everything you need to evaluate, secure, and monitor your AI applications with EvalGuard.
Get started
Install, migrate, and run your first eval.
Getting Started
Install the SDK and run your first eval in under 5 minutes.
Migrating from Promptfoo
Drop-in replacement guide — keep your YAML, gain a dashboard, security, and team features.
Migrating from Helicone
Swap the base URL to EvalGuard's gateway proxy — keep logging, cost tracking, and caching, plus firewall, eval, and red team.
Migrating from Humanloop
Export your Humanloop project, then convert it with `evalguard import:humanloop` — prompts, datasets, and evaluators in one command.
API & SDKs
Reference for the REST API, CLI, and three first-party SDKs.
API Reference
641 REST endpoints for evals, security, gateway, traces, datasets, and more.
CLI Reference
58 commands: eval, scan, firewall, gateway, models-scan, shadow-ai, siem, debug, BYOK, budgets, and more.
TypeScript SDK
93+ methods covering evals, security, traces, gateway, cost, compliance, and more.
Python SDK
99+ methods with full parity — evals, scans, traces, OTLP, Shadow AI, AI-SPM.
Go SDK
93+ methods for Go backends — evals, security, gateway, monitoring, compliance.
gRPC Gateway
Connect-RPC service over HTTP/1.1 + JSON. Same authoritative gateway logic with a strongly typed contract.
Catalogs
What ships in the box — scorers, attack plugins, and provider adapters.
Scorers
200+ built-in scorers across 12 categories: text matching, semantic, LLM-based, JSON & structured, NLP metrics, MCP & agentic, conversation, RAG, safety, multimodal, performance, custom.
Attack Plugins
250+ red-team plugins across 40+ adversarial strategies.
Providers
90+ LLM providers including OpenAI, Anthropic, Gemini, Bedrock, Azure, Vertex, Groq, and more.
Dataset Versioning
Immutable per-dataset snapshots. Pin experiments to a frozen version for bit-perfect reproducible re-runs.
Fine-Tuning
Cross-provider fine-tuning ledger. Pin training to immutable dataset snapshots; track jobs across OpenAI, Anthropic, Vertex, and more.
Integrations & Ops
Wire EvalGuard into your stack and deploy on your terms.
Integrations
15 integrations — Slack, Discord, Teams, PagerDuty, Jira, Linear, GitHub Actions, GitLab CI, and more.
OpenTelemetry
Point any OTLP/HTTP exporter at EvalGuard. Traces, metrics, and logs — no agent install.
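As a rough illustration of pointing an exporter at EvalGuard, the sketch below uses the standard OpenTelemetry SDK environment variables. The endpoint URL and the `x-api-key` header name are placeholders assumed for this example, not values confirmed by EvalGuard; take the real ones from the OpenTelemetry guide.

```shell
# Standard OpenTelemetry SDK environment variables (defined by the OTel spec).
# ASSUMPTION: the endpoint URL below is a placeholder, not EvalGuard's real ingest URL.
export OTEL_EXPORTER_OTLP_ENDPOINT="https://evalguard.example.com/otlp"

# Use OTLP over HTTP rather than gRPC.
export OTEL_EXPORTER_OTLP_PROTOCOL="http/protobuf"

# ASSUMPTION: the header name "x-api-key" is hypothetical; only the key format
# (eg_your_api_key) comes from the Quick Start on this page.
export OTEL_EXPORTER_OTLP_HEADERS="x-api-key=eg_your_api_key"
```

Any application instrumented with an OpenTelemetry SDK and started from this shell should pick these values up without code changes, which is what makes an agent install unnecessary.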
Self-Hosting
Docker Compose, Kubernetes, and Helm deployment guides.
MCP Vendor Presets
One-click registration for 12 popular MCP servers: GitHub, Slack, Atlassian, Linear, Notion, Figma, Stripe, Sentry, PagerDuty, Postgres, Cloudflare, Datadog.
Agent Graph View
Node-and-edge DAG of agent execution — third view mode on the trace detail page.
ClickHouse OLAP
Opt-in dual-write of traces to ClickHouse for sub-second multi-day rollups. Postgres stays the OLTP authority.
Governance
Map findings to compliance frameworks.
Quick Start
Get running with EvalGuard in three commands.
npm install @evalguard/sdk
npx @evalguard/cli login --key eg_your_api_key
npx @evalguard/cli eval my-eval.json
Full getting started guide