#Policy engine

Policy as code. Five modes. Compose into workflows.

A policy is the operator's English description of a rule, plus a declared input schema. The same policy can be evaluated in five modes, picked by the caller at request time. Multiple policies compose into linear chains or parallel graphs (DAGs), with conditional edges that branch based on upstream outcomes and a weighted aggregate score rolled up across the whole evaluation.

At a glance — three composition shapes:

┌─────────────────────────────────────────────────────────────────┐
│ Single policy                                                   │
│                                                                 │
│   POST /api/v1/policies/evaluate                                │
│                                                                 │
│   { policy } ──► result                                         │
│                                                                 │
│   One decision. Mode picked at request time.                    │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ Chain  (sequential)                                             │
│                                                                 │
│   POST /api/v1/policies/chain                                   │
│                                                                 │
│   classify ─► score ─► decide ─► generate                       │
│                                                                 │
│   Each step sees the previous step's outcome via `previous` and │
│   `chain[]`. Sequential — total latency = Σ steps.              │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ Graph  (DAG with parallel + conditional)                        │
│                                                                 │
│   POST /api/v1/policies/graph    OR    { graph_slug: "..." }    │
│                                                                 │
│        ┌─► classify ─┐                                          │
│   inputs            ├─► decide ─► generate                      │
│        └─► score   ─┘                                           │
│                                                                 │
│   `depends_on` declares dependencies. Independent nodes run in  │
│   parallel. Conditional edges branch. Total latency = sum over  │
│   levels of the slowest node in each level.                     │
└─────────────────────────────────────────────────────────────────┘
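The two latency rules in the boxes above can be sketched in a few lines. This is an illustrative estimate, not the engine's scheduler: the function names are hypothetical, the `depends_on` map mirrors the field named in the diagram, and conditional edges (which may skip nodes) are ignored.

```python
from typing import Dict, List


def chain_latency_ms(step_latencies: List[int]) -> int:
    """Sequential chain: total latency is the sum of the step latencies."""
    return sum(step_latencies)


def graph_latency_ms(depends_on: Dict[str, List[str]], latency_ms: Dict[str, int]) -> int:
    """Level-by-level estimate for a DAG.

    A node sits on level 0 when it has no dependencies, otherwise one level
    below its deepest dependency. Nodes on the same level are assumed to run
    in parallel, so each level costs as much as its slowest node.
    """
    levels: Dict[str, int] = {}

    def level_of(node: str, visiting: frozenset = frozenset()) -> int:
        if node in levels:
            return levels[node]
        if node in visiting:
            raise ValueError(f"cycle detected at {node!r}")
        deps = depends_on.get(node, [])
        lvl = 0 if not deps else 1 + max(level_of(d, visiting | {node}) for d in deps)
        levels[node] = lvl
        return lvl

    slowest_per_level: Dict[int, int] = {}
    for node, cost in latency_ms.items():
        lvl = level_of(node)
        slowest_per_level[lvl] = max(slowest_per_level.get(lvl, 0), cost)
    return sum(slowest_per_level.values())
```

With made-up latencies of 300 ms (classify), 500 ms (score), 1000 ms (decide), and 2000 ms (generate), the chain costs 3800 ms while the graph shown above costs 3500 ms, because classify and score overlap.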

Score aggregation: every chain, graph, or multi-policy response can carry an aggregate block holding a weighted 0-1 score across the contributing policies. The score is bucketed against tenant thresholds (pass / review / block) and may carry an optional downstream action.

#Continue reading

This page documents the single-policy evaluate endpoint. For workflow composition, see:

Examples — one policy, three modes (validate / generate / decide) end-to-end.
Chains — sequential workflows. POST /api/v1/policies/chain.
Graphs — DAGs with parallel execution + conditional routing. POST /api/v1/policies/graph.
Score aggregation — weighted roll-up across chain / graph / multi-policy responses.

#Five modes

The policy is authored once and opts into the subset of modes it supports. Operators write the rule; callers choose the lens.

POST https://aiengine.velgent.com/api/v1/policies/evaluate

Mode	What you get back	Typical use
`validate`	`{ passed: boolean, reasons?: string[] }`	Gate a workflow on a yes/no answer. Render `reasons` to explain a no.
`generate`	`{ generated_text: string, reason: string }`	Draft a policy-compliant response, summary, or notice from inputs.
`decide`	`{ action_id: string \| null, payload: object \| null, reason: string }`	Pick the right next step from a structured catalogue (escalate, approve, route). The engine returns the action; you fire it.
`score`	`{ score: number (0-1), reasons: string[] }`	Rate inputs on a bounded numeric scale — risk, urgency, alignment, propensity. The policy text defines the semantic.
`classify`	`{ labels: string[], primary_label: string, confidence: number }`	Tag inputs with one or more labels from a per-policy catalogue. Multi-label by default.

For multi-step workflows that combine modes — classify → score → decide → generate — see Chains and Graphs.

The engine never executes actions itself: decide returns the rendered payload for your caller to send downstream (Workday, Slack, your own ticket system). This keeps the engine a pure reasoning surface that holds no integration credentials.
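Because the caller fires the action, a small dispatcher on the caller's side is usually enough. The sketch below is one possible shape, not a client the engine ships: `dispatch_decision` and the `senders` registry are hypothetical names, and each sender is whatever function your code uses to reach the downstream system.

```python
from typing import Any, Callable, Dict, Optional

# A sender takes the rendered payload and delivers it downstream.
Sender = Callable[[Dict[str, Any]], None]


def dispatch_decision(outcome: Dict[str, Any], senders: Dict[str, Sender]) -> Optional[str]:
    """Fire the action the engine picked.

    Returns the action_id that was fired, or None when the engine judged
    that no action is required.
    """
    action_id = outcome.get("action_id")
    if action_id is None:
        return None  # "no action required" is a valid outcome, not an error
    sender = senders.get(action_id)
    if sender is None:
        raise KeyError(f"no sender registered for action {action_id!r}")
    sender(outcome.get("payload") or {})
    return action_id
```

Keeping the registry on the caller's side means integration credentials stay in the code that owns them.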

#Headers

Header	Required	Description
`Authorization`	Yes	`Bearer velgent_live_…` — see Authentication.
`Content-Type`	Yes	`application/json`.
`Idempotency-Key`	No	Any opaque string ≤ 128 chars. Repeated calls within 24h return the original response. Useful when the verdict drives a side effect (a webhook, an approval write-back) and you need replay safety.
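A minimal sketch of assembling these headers on the client. The helper names are hypothetical, and using a UUID as the idempotency key is only one convenient choice; the table above allows any opaque string up to 128 characters.

```python
import uuid
from typing import Dict, Optional


def new_idempotency_key() -> str:
    """Generate an opaque key; a UUID4 string is 36 characters, well under 128."""
    return str(uuid.uuid4())


def build_headers(api_key: str, idempotency_key: Optional[str] = None) -> Dict[str, str]:
    """Build the request headers, adding Idempotency-Key only when supplied."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if idempotency_key is not None:
        if len(idempotency_key) > 128:
            raise ValueError("Idempotency-Key must be at most 128 characters")
        headers["Idempotency-Key"] = idempotency_key
    return headers
```

To get replay safety, generate the key once per logical operation and reuse it on every retry of that operation; a fresh key per attempt would defeat the purpose.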

#Request body

Field	Type	Required	Default	Description
policies	array<PolicyRef>	Yes	—	The policies to evaluate. 1–20 entries per request. See PolicyRef object below.
mode	"validate" \| "generate" \| "decide" \| "score" \| "classify"	No	"validate"	The mode to evaluate every policy in this batch against. Each policy's `supported_modes` gates which modes it accepts; a mode not in that list returns `status: "unsupported_mode"` for that policy row. Default is `validate` for backwards compatibility with pre-V2 callers.
inputs	object	No	{}	Free-form key/value payload. Each policy declares the input schema it expects; missing required keys land as `status: "input_missing"` on that policy's result row rather than 4xx-ing the whole batch. The same inputs feed every mode — validate predicates, generate prompts, and decide action templates all read from the same bag.
context	object	No	{}	Side-channel metadata (request id, user id, environment). Echoed into the audit log; not passed to predicates or LLM prompts.
aggregation	"all_pass" \| "any_pass" \| "first_match" \| null	No	null	Optional roll-up summary across the batch. Validate-mode only — generate and decide results are not aggregated (the summary fields stay null). Omit when you want raw per-policy results.

#PolicyRef object

Field	Type	Required	Default	Description
slug	string	Yes	—	The policy slug as configured in the admin console. 1–128 chars.
version	integer \| null	No	null	Pin a specific historical version (replay, canary, rollback). When null, the policy's currently published version is used.
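The request body and PolicyRef tables translate into a small builder. This sketch checks the documented limits on the client before sending; the function names and the example slug are hypothetical, and the server remains the authority on validation.

```python
from typing import Any, Dict, List, Optional

MODES = ("validate", "generate", "decide", "score", "classify")
AGGREGATIONS = ("all_pass", "any_pass", "first_match")


def policy_ref(slug: str, version: Optional[int] = None) -> Dict[str, Any]:
    """Build one PolicyRef; version=None means the currently published version."""
    if not 1 <= len(slug) <= 128:
        raise ValueError("slug must be 1 to 128 characters")
    if version is not None and version < 1:
        raise ValueError("version must be >= 1 when pinned")
    return {"slug": slug, "version": version}


def build_evaluate_body(
    policies: List[Dict[str, Any]],
    mode: str = "validate",
    inputs: Optional[Dict[str, Any]] = None,
    context: Optional[Dict[str, Any]] = None,
    aggregation: Optional[str] = None,
) -> Dict[str, Any]:
    """Assemble the JSON body for POST /api/v1/policies/evaluate."""
    if not 1 <= len(policies) <= 20:
        raise ValueError("policies must hold 1 to 20 entries")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if aggregation is not None and aggregation not in AGGREGATIONS:
        raise ValueError(f"unknown aggregation: {aggregation!r}")
    return {
        "policies": policies,
        "mode": mode,
        "inputs": inputs or {},
        "context": context or {},
        "aggregation": aggregation,
    }
```

For example, `build_evaluate_body([policy_ref("refund-policy", version=3)], mode="decide", inputs={"amount": 200})` pins version 3 of a hypothetical `refund-policy` slug and leaves `context` and `aggregation` at their defaults.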

#Response

Field	Type	Required	Default	Description
request_id	string (UUID)	Yes	—	Unique identifier for this evaluation. Use it when reporting issues — it links straight to the audit-log row.
results	array<PolicyResult>	Yes	—	One row per policy in the request, in the same order. See PolicyResult object below.
summary	EvaluateSummary \| null	No	—	Populated only when `aggregation` was set on the request and the mode is `validate`. See EvaluateSummary object below.

#PolicyResult object

Field	Type	Required	Default	Description
slug	string	Yes	—	The policy slug, echoed from the request.
version	integer \| null	Yes	—	The version that actually ran. Null when `status` is `not_found` or `not_published`.
mode	"validate" \| "generate" \| "decide" \| "score" \| "classify" \| null	Yes	—	The mode this row was evaluated in — the caller's requested mode, not the policy's authoring mode. Null when the policy didn't resolve to a version.
status	"ok" \| "not_found" \| "not_published" \| "input_missing" \| "unsupported_mode" \| "not_implemented" \| "eval_error"	Yes	—	Whether the evaluation itself succeeded. `status: "ok"` is set even when a validate-mode policy returned `passed: false` — the verdict lives in `outcome`. See the Status values table for the full list.
outcome	object \| null	Yes	—	The verdict. Shape depends on the requested mode — see Outcome shapes below. Null when `status != "ok"`.
failed_predicates	array<string> \| null	Yes	—	Validate-mode only. The predicates from the compiled DSL tree that did not pass, rendered as English the administrator can read. Null for generate / decide.
detail	string \| null	Yes	—	Human-readable explanation for non-`ok` statuses. For `unsupported_mode` it spells out which modes the policy does support; for `input_missing` it names the field; for `eval_error` it surfaces the upstream cause.
citations	array<Citation>	Yes	—	Pointers to retrieved knowledge chunks that influenced a generate or decide evaluation. Reserved field — always an empty array today. Populated once RAG retrieval ships in a future slice. See Citation object.
latency_ms	integer	Yes	—	How long this single policy took to evaluate, in milliseconds. Validate mode completes in microseconds for purely deterministic policies; generate and decide include an LLM round trip (typically 1–3 seconds against Anthropic Sonnet).

#Outcome shapes

The outcome object is polymorphic — parse it based on mode.

// mode: "validate"
"outcome": {
  "passed":  false,
  "reasons": ["approver_role must be in {director, vp} when risk = medium"]
}

// mode: "generate"
"outcome": {
  "generated_text": "Hi Sarah, I've reviewed your refund request for the blender...",
  "reason":         "Generated by claude-sonnet-4-6."
}

// mode: "decide"
"outcome": {
  "action_id": "approve_refund",
  "payload":   {
    "system": "workday",
    "amount": "200",
    "user":   "sarah@acme.com"
  },
  "reason":    "Customer is within 30-day window and product is defective."
}

// mode: "score"
"outcome": {
  "score":   0.85,
  "reasons": ["Within warranty window", "Customer in good standing"]
}

// mode: "classify"
"outcome": {
  "labels":        ["billing", "urgent"],
  "primary_label": "billing",
  "confidence":    0.92
}
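Since the shape depends on `mode`, a client typically branches on that field of the PolicyResult row before touching `outcome`. The sketch below is one way to do it; `describe_outcome` is a hypothetical helper that reduces a row to a one-line summary.

```python
from typing import Any, Dict


def describe_outcome(row: Dict[str, Any]) -> str:
    """Turn one PolicyResult row into a one-line summary, branching on mode."""
    out = row.get("outcome")
    if row.get("status") != "ok" or out is None:
        # outcome is null for every non-ok status; detail explains why.
        return f"{row.get('status')}: {row.get('detail')}"
    mode = row.get("mode")
    if mode == "validate":
        if out["passed"]:
            return "passed"
        return "failed: " + "; ".join(out.get("reasons") or [])
    if mode == "generate":
        return out["generated_text"]
    if mode == "decide":
        # action_id may legitimately be null: no action required.
        if out["action_id"] is None:
            return f"no action ({out['reason']})"
        return f"action {out['action_id']} ({out['reason']})"
    if mode == "score":
        return f"score {out['score']:.2f}"
    if mode == "classify":
        return f"{out['primary_label']} ({out['confidence']:.0%})"
    raise ValueError(f"unknown mode: {mode!r}")
```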

For decide, action_id may be null — meaning the LLM judged "no action required given the policy and inputs." That's a first-class outcome, not an error. When action_id is non-null, payload is the action's template with ${input.X} placeholders substituted against the request's inputs bag.
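To make the placeholder substitution concrete, here is a rough sketch of how a `${input.X}` template could be rendered. It is a guess at the behaviour for illustration only, not the engine's implementation: it assumes placeholders appear only inside string values, that `X` is a plain identifier, and that substituted values are stringified (which would match the string `"200"` in the decide sample above).

```python
import re
from typing import Any, Dict

# Matches ${input.name} where name is a plain identifier (an assumption).
_PLACEHOLDER = re.compile(r"\$\{input\.([A-Za-z_][A-Za-z0-9_]*)\}")


def render_template(template: Any, inputs: Dict[str, Any]) -> Any:
    """Recursively substitute ${input.X} placeholders in a payload template.

    Raises KeyError when a placeholder names an input that was not supplied.
    """
    if isinstance(template, str):
        return _PLACEHOLDER.sub(lambda m: str(inputs[m.group(1)]), template)
    if isinstance(template, dict):
        return {key: render_template(value, inputs) for key, value in template.items()}
    if isinstance(template, list):
        return [render_template(value, inputs) for value in template]
    return template  # numbers, booleans, and None pass through unchanged
```

Running a template through a sketch like this in a test can help confirm that the input names a policy declares line up with the placeholders in its action catalogue.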

#Citation object (reserved)

Field	Type	Required	Default	Description
source_id	string	Yes	—	ID of the knowledge collection the chunk came from.
chunk_id	string	Yes	—	ID of the specific chunk within the collection.
score	number	Yes	—	Similarity score from the retriever (cosine / BM25 / hybrid — opaque to the engine).

Citations reserved for RAG

The citations[] field is always an empty array today. It's documented now so your parser handles the shape when RAG retrieval ships — at which point generate and decide responses will include pointers to the knowledge chunks that informed the output. The wire shape won't change.

#EvaluateSummary object

Validate-mode only. Generate and decide batches return summary: null even when aggregation is set.

Field	Type	Required	Default	Description
all_pass	boolean \| null	Yes	—	True only when every validate-mode policy returned `passed: true`. Null when `aggregation != "all_pass"`.
any_pass	boolean \| null	Yes	—	True when at least one validate-mode policy returned `passed: true`. Null when `aggregation != "any_pass"`.
first_match	integer \| null	Yes	—	Index into `results` of the first validate-mode policy that returned `passed: true`. Null when no policy matched, or when `aggregation != "first_match"`.
passed	integer	Yes	—	Count of `status: "ok"` results whose outcome was a pass.
failed	integer	Yes	—	Count of `status: "ok"` results whose outcome was a fail.
errored	integer	Yes	—	Count of results whose `status` was anything other than `ok`.
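The field definitions above can be reproduced on the client, which is handy for checking a response or for rolling up results when `aggregation` was omitted. This sketch follows the table literally; `summarize` is a hypothetical name, and treating a non-`ok` row as preventing `all_pass` is an assumption the table does not spell out.

```python
from typing import Any, Dict, List, Optional


def summarize(results: List[Dict[str, Any]], aggregation: Optional[str]) -> Dict[str, Any]:
    """Recompute an EvaluateSummary-shaped dict from validate-mode results."""
    # True / False for rows that produced a verdict, None for errored rows.
    verdicts = [
        r["outcome"]["passed"] if r.get("status") == "ok" and r.get("outcome") else None
        for r in results
    ]
    first = next((i for i, v in enumerate(verdicts) if v is True), None)
    return {
        "all_pass": all(v is True for v in verdicts) if aggregation == "all_pass" else None,
        "any_pass": any(v is True for v in verdicts) if aggregation == "any_pass" else None,
        "first_match": first if aggregation == "first_match" else None,
        "passed": sum(v is True for v in verdicts),
        "failed": sum(v is False for v in verdicts),
        "errored": sum(r.get("status") != "ok" for r in results),
    }
```

Only the field matching the requested `aggregation` is populated; the three counts are always present.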

#Status values

`status`	Meaning
`ok`	Evaluation completed. Inspect `outcome` for the mode-appropriate verdict.
`not_found`	No policy with this slug exists in your org.
`not_published`	The slug exists but no version is currently published (or the pinned `version` isn't published).
`input_missing`	A required input is absent from the request. `detail` names the field.
`unsupported_mode`	The policy doesn't have the requested mode in its `supported_modes` list. `detail` lists the supported modes. Toggle the missing mode on in the admin composer to enable it.
`not_implemented`	The requested mode is recognized but not yet wired into this engine version. Reserved for forward compatibility; you shouldn't see this today.
`eval_error`	An internal evaluation error, such as an unavailable model, a malformed payload, or a hallucinated `action_id` outside the catalogue. Retry with backoff for transient cases; check observability in the admin console for persistent ones.

#Partial failure

A batch is partial-failure tolerant by design. If one policy slug doesn't resolve or one input is missing, that row gets a non-ok status and a populated detail — the rest of the batch still runs. The endpoint returns 200 whenever the request itself was well-formed; you inspect results[].status to decide what's actionable.
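In practice this means a 200 response still needs a pass over `results[].status`. A minimal sketch of that triage, assuming only `eval_error` is worth retrying (the one status the table above describes as potentially transient); the bucket names are hypothetical.

```python
from typing import Any, Dict, List

# Only eval_error is described as potentially transient.
RETRYABLE = {"eval_error"}


def partition_results(results: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Split result rows into usable verdicts, retry candidates, and config problems."""
    buckets: Dict[str, List[Dict[str, Any]]] = {"ok": [], "retry": [], "fix": []}
    for row in results:
        status = row.get("status")
        if status == "ok":
            buckets["ok"].append(row)
        elif status in RETRYABLE:
            buckets["retry"].append(row)
        else:
            # not_found, not_published, input_missing, unsupported_mode, ...
            buckets["fix"].append(row)
    return buckets
```

Rows in `fix` will not succeed on retry; their `detail` field says what to change in the request or in the admin console.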

#Errors

This operation can return any of the common error codes. The request-level errors below are raised before any per-policy result is produced:

Status	Code	When
`400`	`invalid_request`	`policies` is empty or has more than 20 entries.
`400`	`invalid_request`	A slug is longer than 128 chars or `version` is < 1.
`400`	`invalid_request`	`aggregation` is not one of `"all_pass"`, `"any_pass"`, `"first_match"`.
`422`	`invalid_request`	`mode` is not one of `"validate"`, `"generate"`, `"decide"`, `"score"`, `"classify"`.
`401`	`unauthorized`	API key missing or invalid.
`429`	`rate_limited`	Per-key rate limit exceeded — see Rate limits.

Per-policy errors don't 4xx the batch

Not-found slugs, missing inputs, unsupported modes, and per-policy eval errors all land as non-ok rows inside a 200 response — never as a 4xx. This keeps a 20-policy batch tolerant of a single misconfigured slug.
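Request-level errors, by contrast, do need handling around the HTTP call. The sketch below retries only on `429` with exponential backoff and raises on everything else. It rests on several assumptions: `send` is a caller-supplied function returning `(status_code, parsed_body)`, the error body exposes the code under a `code` key, and the retry schedule is an arbitrary choice rather than documented guidance.

```python
import time
from typing import Any, Callable, Dict, Tuple


class RequestError(Exception):
    """Raised for request-level failures that should not be retried."""

    def __init__(self, status: int, code: str) -> None:
        super().__init__(f"{status} {code}")
        self.status = status
        self.code = code


def call_with_retry(
    send: Callable[[], Tuple[int, Dict[str, Any]]],
    max_attempts: int = 4,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call send(), backing off on 429 and failing fast on other errors."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        status, body = send()
        if status == 200:
            return body
        if status == 429 and attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # 0.5s, 1s, 2s, ...
            continue
        # 400, 401, 422, or a 429 on the final attempt.
        raise RequestError(status, str(body.get("code", "unknown")))
    raise RuntimeError("unreachable")
```

Pairing this with an `Idempotency-Key` that stays constant across attempts keeps a retried request from producing a second side effect.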

#Authoring a policy

The admin console at admin.velgent.com/policies is where the policy lives. A policy version carries:

English text — what the policy says, in plain English. The source of truth.
Declared inputs — names + types + required flag. Same schema whether the caller is in validate / generate / decide / score / classify mode.
Supported modes — subset of {validate, generate, decide, score, classify}. The runtime gate.
Action catalogue (decide mode) — array of structured actions with id, label, description, and an optional payload_template.
Classify-label catalogue (classify mode) — the set of labels the policy is allowed to emit. Hallucinated labels are rejected.
Score mapping (aggregation) — how this policy's outcome maps to a 0-1 contribution score when it's part of a chain or graph. See Score aggregation.
Weight (aggregation) — relative weight in chain/graph aggregation. Default 1.0.
Knowledge refs — reserved for RAG retrieval. Empty today.

The composer compiles the English into a DSL document at save time. The DSL drives the deterministic validate path (no LLM at runtime). Generate, decide, score, and classify modes pass the English directly to Claude — the DSL isn't consulted there.

#The playground

Once a policy is published, the playground (Playground button on the policy detail page) lets administrators exercise it through all five modes from the admin UI, using the same code path real API callers hit. Useful for sanity-checking a new policy before wiring it into a workflow.

Authoring lives in the admin console

This endpoint runs already-published policies. To author a new policy, diff a version, or roll back, head to admin.velgent.com/policies. The compile step uses Claude Sonnet by default; per-tenant model routing is configurable under Engine Settings → Model routing.
