Getting Started

Install EvalGuard, run your first evaluation, and connect to your dashboard in under 5 minutes.

Prerequisites

Before installing EvalGuard, make sure you have the following:

- A recent version of Node.js and npm (both the SDK and the CLI are distributed as npm packages)
- An EvalGuard account and API key, if you want to send results to the cloud dashboard (not needed for local evals)

Installation

Install the SDK

Add the EvalGuard SDK to your project for programmatic access to evals, security scans, datasets, and more.

terminal
npm install evalguard

Install the CLI

Install the CLI globally to run evaluations, security scans, and manage configs from your terminal or CI/CD pipelines.

terminal
npm install -g @evalguard/cli

Authenticate

Get your API key from your dashboard settings and authenticate:

terminal
evalguard login --key eg_your_api_key_here
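In CI/CD pipelines, you will usually inject the key as a secret rather than logging in interactively. The SDK example later in this guide reads the EVALGUARD_API_KEY environment variable, so storing the key there keeps one source of truth; the snippet below is a sketch of that pattern, assuming your CI system supports secret variables.

```shell
# Store the key as a CI secret, then export it for the job.
# EVALGUARD_API_KEY is the variable the SDK reads from the environment.
export EVALGUARD_API_KEY="eg_your_api_key_here"

# Reuse the same variable for the CLI login step.
evalguard login --key "$EVALGUARD_API_KEY"
```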

Run Your First Evaluation

Create a simple eval config file and run it locally; no API key is needed for local evals.

1. Create an eval config

evals/my-first-eval.json
{
  "name": "my-first-eval",
  "model": "gpt-4o",
  "prompt": "You are a helpful assistant. Answer the question: {{input}}",
  "scorers": ["exact-match", "contains", "relevance"],
  "cases": [
    { "input": "What is 2+2?", "expectedOutput": "4" },
    { "input": "What color is the sky?", "expectedOutput": "blue" }
  ]
}

2. Run it

terminal
evalguard eval:local evals/my-first-eval.json --verbose

Local evals run entirely on your machine using @evalguard/core. You only need an API key when sending results to the cloud dashboard.
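Conceptually, a local eval renders the prompt template for each case, sends it to the model, and applies each scorer to the model's output. The sketch below illustrates the `{{input}}` templating and two of the scorer names from the config above; it is illustrative only (EvalGuard's real scorers live in @evalguard/core), and the case-insensitive comparison is an assumption, not documented behavior.

```typescript
// Fill {{placeholders}} in a prompt template from a case's fields.
function renderPrompt(template: string, vars: Record<string, string>): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_, key) => vars[key] ?? "");
}

// "exact-match": the model output equals the expected output
// (trimmed, case-insensitive -- an assumption for this sketch).
function exactMatch(output: string, expected: string): boolean {
  return output.trim().toLowerCase() === expected.trim().toLowerCase();
}

// "contains": the expected output appears anywhere in the model output.
function contains(output: string, expected: string): boolean {
  return output.toLowerCase().includes(expected.toLowerCase());
}

// Score a pretend model answer for the second case in the config above.
const prompt = renderPrompt(
  "You are a helpful assistant. Answer the question: {{input}}",
  { input: "What color is the sky?" }
);
const modelOutput = "The sky is blue."; // pretend model response

console.log(exactMatch(modelOutput, "blue")); // false
console.log(contains(modelOutput, "blue"));   // true
```

This also shows why a config often lists several scorers: a strict scorer like exact-match can fail on a perfectly good answer that a looser scorer like contains accepts.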

Run Your First Security Scan

Test your LLM system against prompt injection, jailbreaks, and other attack vectors.

1. Create a scan config

scans/my-first-scan.json
{
  "model": "gpt-4o",
  "prompt": "You are a helpful customer support agent.",
  "attackTypes": ["prompt-injection", "jailbreak", "data-extraction"],
  "plugins": ["pii-leak", "system-prompt-leak", "sql-injection"]
}

2. Run it

terminal
evalguard scan:local scans/my-first-scan.json --verbose
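As a rough intuition for what a plugin like pii-leak checks, the following toy detector flags a few common PII patterns in model output. This is an illustrative sketch, not EvalGuard's plugin implementation, and real detectors use far more robust techniques than these regexes.

```typescript
// Toy PII detector: patterns a "pii-leak" style plugin might look for
// in model output. Illustrative only.
const PII_PATTERNS: Record<string, RegExp> = {
  email: /[\w.+-]+@[\w-]+\.[\w.]+/,
  usPhone: /\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b/,
  ssn: /\b\d{3}-\d{2}-\d{4}\b/,
};

// Return the names of all PII categories found in the output.
function detectPii(output: string): string[] {
  return Object.entries(PII_PATTERNS)
    .filter(([, pattern]) => pattern.test(output))
    .map(([name]) => name);
}

console.log(detectPii("Sure! Jane's email is jane@example.com"));
// → ["email"]
console.log(detectPii("The weather today is sunny."));
// → []
```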

Using the SDK

For programmatic access, use the TypeScript SDK to create evals and scans directly from your code.

example.ts
import { EvalGuardClient } from "evalguard";

const client = new EvalGuardClient({
  apiKey: process.env.EVALGUARD_API_KEY,
});

const evalRun = await client.createEval({
  name: "qa-test",
  model: "gpt-4o",
  prompt: "Answer: {{input}}",
  scorers: ["exact-match", "faithfulness"],
  cases: [
    { input: "Capital of France?", expectedOutput: "Paris" },
  ],
});

console.log(evalRun.id, evalRun.status);
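Eval runs may take time to finish, so code that consumes results often polls for a terminal status. Below is a generic, self-contained polling helper; the `client.getEval` method and the status values in the usage comment are assumptions for illustration, not a documented EvalGuard API.

```typescript
// Generic helper: poll an async status source until it reports a terminal
// state or a timeout elapses.
async function pollUntilDone(
  getStatus: () => Promise<string>,
  terminalStates: string[],
  { intervalMs = 10, timeoutMs = 1000 } = {}
): Promise<string> {
  const deadline = Date.now() + timeoutMs;
  while (true) {
    const status = await getStatus();
    if (terminalStates.includes(status)) return status;
    if (Date.now() > deadline) throw new Error(`timed out in state "${status}"`);
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}

// Hypothetical usage with the SDK (method and status names are assumptions):
//   const final = await pollUntilDone(
//     async () => (await client.getEval(evalRun.id)).status,
//     ["completed", "failed"],
//     { intervalMs: 2000, timeoutMs: 10 * 60 * 1000 }
//   );
```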

View Results in the Dashboard

After running evals or scans with an API key, view your results, trends, and regressions in the EvalGuard dashboard.

Your evaluation results, security scan reports, traces, and compliance status are all available from the dashboard:

Open Dashboard

Next Steps