#Policy engine

Policy as code. Five modes. Compose into workflows.

A policy is the operator's English description of a rule, plus a declared input schema. The same policy can be evaluated in five modes, picked by the caller at request time. Multiple policies compose into linear chains or parallel graphs (DAGs), with conditional edges that branch based on upstream outcomes and a weighted aggregate score rolled up across the whole evaluation.

At a glance — three composition shapes:

┌─────────────────────────────────────────────────────────────────┐
│ Single policy                                                   │
│                                                                 │
│   POST /api/v1/policies/evaluate                                │
│                                                                 │
│   { policy } ──► result                                         │
│                                                                 │
│   One decision. Mode picked at request time.                    │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ Chain  (sequential)                                             │
│                                                                 │
│   POST /api/v1/policies/chain                                   │
│                                                                 │
│   classify ─► score ─► decide ─► generate                       │
│                                                                 │
│   Each step sees previous step's outcome via `previous` and     │
│   `chain[]`. Sequential — total latency = Σ steps.              │
└─────────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────────┐
│ Graph  (DAG with parallel + conditional)                        │
│                                                                 │
│   POST /api/v1/policies/graph    OR    { graph_slug: "..." }    │
│                                                                 │
│        ┌─► classify ─┐                                          │
│   inputs            ├─► decide ─► generate                      │
│        └─► score   ─┘                                           │
│                                                                 │
│   `depends_on` declares dependencies. Independent nodes run in  │
│   parallel. Conditional edges branch. Total latency = sum over  │
│   levels of the slowest node in each level.                     │
└─────────────────────────────────────────────────────────────────┘
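
To make the two latency formulas in the diagrams concrete, here is a minimal Python sketch. The node names come from the graph diagram; the per-node latencies and the helper names (`levels`, `chain_latency_ms`, `graph_latency_ms`) are hypothetical and exist only for illustration. It assumes every node in a level starts together and the level finishes when its slowest node does.

```python
from typing import Dict, List, Tuple


def levels(depends_on: Dict[str, List[str]]) -> List[List[str]]:
    """Group nodes into levels: a node sits one level below its deepest dependency."""
    depth: Dict[str, int] = {}

    def visit(node: str, path: Tuple[str, ...]) -> int:
        if node in path:
            raise ValueError(f"cycle through {node!r}")
        if node not in depth:
            deps = depends_on.get(node, [])
            # A node with no dependencies gets depth 0.
            depth[node] = 1 + max((visit(d, path + (node,)) for d in deps), default=-1)
        return depth[node]

    for node in depends_on:
        visit(node, ())
    grouped: List[List[str]] = [[] for _ in range(max(depth.values(), default=-1) + 1)]
    for node, d in depth.items():
        grouped[d].append(node)
    return grouped


def chain_latency_ms(latency_ms: Dict[str, int]) -> int:
    """Sequential chain: every step waits for the previous one."""
    return sum(latency_ms.values())


def graph_latency_ms(depends_on: Dict[str, List[str]], latency_ms: Dict[str, int]) -> int:
    """Graph: slowest node per level, summed across levels."""
    return sum(max(latency_ms[n] for n in level) for level in levels(depends_on))


# Hypothetical per-node latencies in milliseconds.
example_latency = {"classify": 900, "score": 1200, "decide": 1500, "generate": 2000}
# Dependencies matching the graph diagram above.
example_graph = {
    "classify": [],
    "score": [],
    "decide": ["classify", "score"],
    "generate": ["decide"],
}
```

With these made-up numbers the chain costs 5600 ms, while the graph costs 4700 ms because `classify` and `score` overlap.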

Score aggregation: chain, graph, and multi-policy responses can include an aggregate block: a weighted 0-1 score across contributing policies, bucketed against tenant thresholds (pass / review / block), with an optional downstream action.

#Continue reading

This page documents the single-policy evaluate endpoint. For workflow composition, see:

  • Examples — one policy, three modes (validate / generate / decide) end-to-end.
  • Chains — sequential workflows. POST /api/v1/policies/chain.
  • Graphs — DAGs with parallel execution + conditional routing. POST /api/v1/policies/graph.
  • Score aggregation — weighted roll-up across chain / graph / multi-policy responses.

#Five modes

The policy is authored once and opts into the subset of modes it supports. Operators write the rule; callers choose the lens.

POST https://aiengine.velgent.com/api/v1/policies/evaluate

| Mode | What you get back | Typical use |
| --- | --- | --- |
| validate | { passed: boolean, reasons?: string[] } | Gate a workflow on a yes/no answer. Render reasons to explain a no. |
| generate | { generated_text: string, reason: string } | Draft a policy-compliant response, summary, or notice from inputs. |
| decide | { action_id: string \| null, payload: object \| null, reason: string } | Pick the right next step from a structured catalogue (escalate, approve, route). The engine returns the action; you fire it. |
| score | { score: number (0-1), reasons: string[] } | Rate inputs on a bounded numeric scale: risk, urgency, alignment, propensity. The policy text defines the semantics. |
| classify | { labels: string[], primary_label: string, confidence: number } | Tag inputs with one or more labels from a per-policy catalogue. Multi-label by default. |

For multi-step workflows that combine modes — classify → score → decide → generate — see Chains and Graphs.

The engine never executes actions itself — decide returns the rendered payload for your caller to send downstream (Workday, Slack, your own ticket system). Keeps the engine a pure reasoning surface with no integration credentials.

#Headers

| Header | Required | Description |
| --- | --- | --- |
| Authorization | Yes | Bearer velgent_live_… (see Authentication). |
| Content-Type | Yes | application/json. |
| Idempotency-Key | No | Any opaque string ≤ 128 chars. Repeated calls within 24h return the original response. Useful when the verdict drives a side effect (a webhook, an approval write-back) and you need replay safety. |
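
As a minimal sketch of the header rules above, the following Python helper assembles the three headers and enforces the 128-character limit on the client side. The function names (`build_headers`, `new_idempotency_key`) are hypothetical, and using a UUID as the key is just one convenient choice, since the key is described only as an opaque string.

```python
import uuid
from typing import Dict, Optional

MAX_IDEMPOTENCY_KEY_CHARS = 128  # limit stated in the Headers table


def new_idempotency_key() -> str:
    """One possible opaque key: a random UUID (36 characters)."""
    return str(uuid.uuid4())


def build_headers(api_key: str, idempotency_key: Optional[str] = None) -> Dict[str, str]:
    """Assemble request headers; Idempotency-Key is added only when supplied."""
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if idempotency_key is not None:
        if len(idempotency_key) > MAX_IDEMPOTENCY_KEY_CHARS:
            raise ValueError("Idempotency-Key must be at most 128 characters")
        headers["Idempotency-Key"] = idempotency_key
    return headers
```

For replay safety the key should be generated once per logical operation and reused on every retry of that operation, rather than regenerated per attempt.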

#Request body

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| policies | array<PolicyRef> | Yes | | The policies to evaluate. 1–20 entries per request. See PolicyRef object below. |
| mode | "validate" \| "generate" \| "decide" \| "score" \| "classify" | No | "validate" | The mode to evaluate every policy in this batch against. Each policy's supported_modes gates which modes it accepts; a mode not in that list returns status: "unsupported_mode" for that policy row. Default is validate for backwards compatibility with pre-V2 callers. |
| inputs | object | No | {} | Free-form key/value payload. Each policy declares the input schema it expects; missing required keys land as `status: "input_missing"` on that policy's result row rather than 4xx-ing the whole batch. The same inputs feed every mode: validate predicates, generate prompts, and decide action templates all read from the same bag. |
| context | object | No | {} | Side-channel metadata (request id, user id, environment). Echoed into the audit log; not passed to predicates or LLM prompts. |
| aggregation | "all_pass" \| "any_pass" \| "first_match" \| null | No | null | Optional roll-up summary across the batch. Validate-mode only: generate and decide results are not aggregated (summary stays null). Omit when you want raw per-policy results. |

#PolicyRef object

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| slug | string | Yes | | The policy slug as configured in the admin console. 1–128 chars. |
| version | integer \| null | No | null | Pin a specific historical version (replay, canary, rollback). When null, the policy's currently published version is used. |
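
The request fields above can be assembled and checked on the client before anything goes over the wire. The sketch below is one way to do that in Python with only the standard library. The helper names (`build_body`, `build_request`) and the example slug are hypothetical; the limits mirror the ones stated in the tables (1–20 policies, slugs of 1–128 chars, versions of at least 1).

```python
import json
import urllib.request
from typing import Any, Dict, List, Optional

EVALUATE_URL = "https://aiengine.velgent.com/api/v1/policies/evaluate"
MODES = {"validate", "generate", "decide", "score", "classify"}
AGGREGATIONS = {"all_pass", "any_pass", "first_match"}


def build_body(
    policies: List[Dict[str, Any]],
    mode: str = "validate",
    inputs: Optional[Dict[str, Any]] = None,
    context: Optional[Dict[str, Any]] = None,
    aggregation: Optional[str] = None,
) -> Dict[str, Any]:
    """Build the JSON body, rejecting values the endpoint documents as invalid."""
    if not 1 <= len(policies) <= 20:
        raise ValueError("policies must contain 1-20 entries")
    for ref in policies:
        if not 1 <= len(ref.get("slug", "")) <= 128:
            raise ValueError("slug must be 1-128 chars")
        version = ref.get("version")
        if version is not None and version < 1:
            raise ValueError("version must be >= 1 or null")
    if mode not in MODES:
        raise ValueError(f"unknown mode: {mode!r}")
    if aggregation is not None and aggregation not in AGGREGATIONS:
        raise ValueError(f"unknown aggregation: {aggregation!r}")
    return {
        "policies": policies,
        "mode": mode,
        "inputs": inputs or {},
        "context": context or {},
        "aggregation": aggregation,
    }


def build_request(api_key: str, body: Dict[str, Any]) -> urllib.request.Request:
    """Wrap the body in a POST request; sending it is left to the caller."""
    return urllib.request.Request(
        EVALUATE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": "application/json"},
        method="POST",
    )


# "refund-eligibility" is a made-up slug used only for illustration.
example_body = build_body(
    [{"slug": "refund-eligibility", "version": None}],
    mode="validate",
    inputs={"days_since_purchase": 12},
    aggregation="all_pass",
)
example_request = build_request("<your-api-key>", example_body)
# To send: urllib.request.urlopen(example_request)
```

Validating locally is optional, since the endpoint performs the same checks, but it turns a round trip into an immediate exception during development.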

#Response

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| request_id | string (UUID) | Yes | | Unique identifier for this evaluation. Use it when reporting issues; it links straight to the audit-log row. |
| results | array<PolicyResult> | Yes | | One row per policy in the request, in the same order. See PolicyResult object below. |
| summary | EvaluateSummary \| null | No | | Populated only when aggregation was set on the request and the mode is validate. See EvaluateSummary object below. |

#PolicyResult object

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| slug | string | Yes | | The policy slug, echoed from the request. |
| version | integer \| null | Yes | | The version that actually ran. Null when `status` is `not_found` or `not_published`. |
| mode | "validate" \| "generate" \| "decide" \| "score" \| "classify" \| null | Yes | | The mode this row was *evaluated in*: the caller's requested mode, not the policy's authoring mode. Null when the policy didn't resolve to a version. |
| status | "ok" \| "not_found" \| "not_published" \| "input_missing" \| "unsupported_mode" \| "not_implemented" \| "eval_error" | Yes | | Whether the evaluation itself succeeded. status: "ok" is set even when a validate-mode policy returned passed: false; the verdict lives in outcome. See the Status values table for the full list. |
| outcome | object \| null | Yes | | The verdict. Shape depends on the requested mode; see Outcome shapes below. Null when status != "ok". |
| failed_predicates | array<string> \| null | Yes | | Validate-mode only. The predicates from the compiled DSL tree that did not pass, rendered as English the administrator can read. Null for generate / decide. |
| detail | string \| null | Yes | | Human-readable explanation for non-`ok` statuses. For `unsupported_mode` it spells out which modes the policy does support; for `input_missing` it names the field; for `eval_error` it surfaces the upstream cause. |
| citations | array<Citation> | Yes | | Pointers to retrieved knowledge chunks that influenced a generate or decide evaluation. Reserved field: always an empty array today. Populated once RAG retrieval ships in a future slice. See Citation object. |
| latency_ms | integer | Yes | | How long this single policy took to evaluate. Validate-mode is microseconds for pure deterministic policies; generate and decide include an LLM round trip (typically 1–3 seconds against Anthropic Sonnet). |

#Outcome shapes

The outcome object is polymorphic — parse it based on mode.

// mode: "validate"
"outcome": {
  "passed":  false,
  "reasons": ["approver_role must be in {director, vp} when risk = medium"]
}

// mode: "generate"
"outcome": {
  "generated_text": "Hi Sarah, I've reviewed your refund request for the blender...",
  "reason":         "Generated by claude-sonnet-4-6."
}

// mode: "decide"
"outcome": {
  "action_id": "approve_refund",
  "payload":   {
    "system": "workday",
    "amount": "200",
    "user":   "sarah@acme.com"
  },
  "reason":    "Customer is within 30-day window and product is defective."
}

// mode: "score"
"outcome": {
  "score":   0.85,
  "reasons": ["Within warranty window", "Customer in good standing"]
}

// mode: "classify"
"outcome": {
  "labels":        ["billing", "urgent"],
  "primary_label": "billing",
  "confidence":    0.92
}

For decide, action_id may be null — meaning the LLM judged "no action required given the policy and inputs." That's a first-class outcome, not an error. When action_id is non-null, payload is the action's template with ${input.X} placeholders substituted against the request's inputs bag.
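
Because the outcome shape varies by mode, client code typically branches on the row's `mode` after checking `status`. The following Python sketch shows one way to do that, using the field names from the shapes above. The function name `describe_row` and the wording of the returned strings are hypothetical.

```python
import json
from typing import Any, Dict


def describe_row(row: Dict[str, Any]) -> str:
    """Turn one PolicyResult row into a short human-readable line."""
    if row["status"] != "ok":
        # outcome is null here; detail carries the explanation.
        return f"{row['slug']}: {row['status']} ({row.get('detail')})"

    mode, outcome = row["mode"], row["outcome"]
    if mode == "validate":
        if outcome["passed"]:
            return "passed"
        return "failed: " + "; ".join(outcome.get("reasons") or [])
    if mode == "generate":
        return outcome["generated_text"]
    if mode == "decide":
        if outcome["action_id"] is None:
            # A null action_id is a valid "no action required" verdict.
            return f"no action: {outcome['reason']}"
        payload = json.dumps(outcome["payload"], sort_keys=True)
        return f"fire {outcome['action_id']} with {payload}"
    if mode == "score":
        return f"score {outcome['score']:.2f}"
    if mode == "classify":
        return f"{outcome['primary_label']} ({', '.join(outcome['labels'])})"
    raise ValueError(f"unexpected mode: {mode!r}")
```

Checking `status` first matters: a failed validation still arrives with `status: "ok"`, while a row that never ran has no outcome to parse.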

#Citation object (reserved)

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| source_id | string | Yes | | ID of the knowledge collection the chunk came from. |
| chunk_id | string | Yes | | ID of the specific chunk within the collection. |
| score | number | Yes | | Similarity score from the retriever (cosine / BM25 / hybrid; opaque to the engine). |

Citations reserved for RAG

The citations[] field is always an empty array today. It's documented now so your parser handles the shape when RAG retrieval ships — at which point generate and decide responses will include pointers to the knowledge chunks that informed the output. The wire shape won't change.

#EvaluateSummary object

Validate-mode only. Generate and decide batches return summary: null even when aggregation is set.

| Field | Type | Required | Default | Description |
| --- | --- | --- | --- | --- |
| all_pass | boolean \| null | Yes | | True only when every validate-mode policy returned `passed: true`. Null when `aggregation != "all_pass"`. |
| any_pass | boolean \| null | Yes | | True when at least one validate-mode policy returned `passed: true`. Null when `aggregation != "any_pass"`. |
| first_match | integer \| null | Yes | | Index into `results` of the first validate-mode policy that returned `passed: true`. Null when no policy matched, or when `aggregation != "first_match"`. |
| passed | integer | Yes | | Count of `status: "ok"` results whose outcome was a pass. |
| failed | integer | Yes | | Count of `status: "ok"` results whose outcome was a fail. |
| errored | integer | Yes | | Count of results whose `status` was anything other than `ok`. |
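
The field definitions above are enough to recompute a summary from raw validate-mode rows, which can be handy in tests or when aggregation was omitted. The Python sketch below does so; the function name `summarize` is hypothetical. One assumption is not stated in the table: rows with a non-ok status are treated here as preventing `all_pass`, and the engine may handle that case differently.

```python
from typing import Any, Dict, List, Optional


def summarize(results: List[Dict[str, Any]], aggregation: Optional[str]) -> Dict[str, Any]:
    """Rebuild an EvaluateSummary-shaped dict from validate-mode result rows."""
    ok_rows = [r for r in results if r["status"] == "ok"]
    passed = sum(1 for r in ok_rows if r["outcome"]["passed"])
    failed = len(ok_rows) - passed
    errored = len(results) - len(ok_rows)
    # Index of the first row that ran and passed, or None if there is none.
    first_match = next(
        (i for i, r in enumerate(results) if r["status"] == "ok" and r["outcome"]["passed"]),
        None,
    )
    return {
        # Assumption: an errored row means not every policy passed.
        "all_pass": (passed == len(results)) if aggregation == "all_pass" else None,
        "any_pass": (passed > 0) if aggregation == "any_pass" else None,
        "first_match": first_match if aggregation == "first_match" else None,
        "passed": passed,
        "failed": failed,
        "errored": errored,
    }
```

Only the field matching the requested aggregation is populated; the other two stay null, while the three counters are always present.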

#Status values

| status | Meaning |
| --- | --- |
| ok | Evaluation completed. Inspect outcome for the mode-appropriate verdict. |
| not_found | No policy with this slug exists in your org. |
| not_published | The slug exists but no version is currently published (or the pinned version isn't published). |
| input_missing | A required input is absent from the request. detail names the field. |
| unsupported_mode | The policy doesn't have the requested mode in its supported_modes list. detail lists the supported modes. Toggle the missing mode on in the admin composer to enable it. |
| not_implemented | The requested mode is structurally supported but a future engine version will wire it in. Reserved for forward-compatibility; you shouldn't see this today. |
| eval_error | An internal evaluation error: model unavailable, payload malformed, action_id hallucinated outside the catalogue. Retry with backoff for transient cases; check the admin observability for persistent ones. |

#Partial failure

A batch is partial-failure tolerant by design. If one policy slug doesn't resolve or one input is missing, that row gets a non-ok status and a populated detail — the rest of the batch still runs. The endpoint returns 200 whenever the request itself was well-formed; you inspect results[].status to decide what's actionable.
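
In practice this means a 200 response still needs a pass over `results[].status`. The sketch below sorts rows into three buckets in Python. The bucket names and the function name `triage` are hypothetical; treating only `eval_error` as retryable follows the Status values table, where it is the one status described as potentially transient.

```python
from typing import Any, Dict, List

RETRYABLE_STATUSES = {"eval_error"}  # the only status documented as possibly transient


def triage(results: List[Dict[str, Any]]) -> Dict[str, List[Dict[str, Any]]]:
    """Split result rows into usable, worth retrying, and needs a fix."""
    buckets: Dict[str, List[Dict[str, Any]]] = {"ok": [], "retry": [], "fix": []}
    for row in results:
        if row["status"] == "ok":
            buckets["ok"].append(row)
        elif row["status"] in RETRYABLE_STATUSES:
            buckets["retry"].append(row)
        else:
            # not_found, not_published, input_missing, unsupported_mode, ...:
            # retrying unchanged will not help; the request or policy needs a change.
            buckets["fix"].append(row)
    return buckets
```

Rows in the `fix` bucket carry a populated `detail`, which is usually the quickest pointer to what needs changing.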

#Errors

This operation can return any of the common error codes. Request-level errors (i.e. before any per-policy result is produced):

| Status | Code | When |
| --- | --- | --- |
| 400 | invalid_request | policies is empty or has more than 20 entries. |
| 400 | invalid_request | A slug is longer than 128 chars or version is < 1. |
| 400 | invalid_request | aggregation is not one of "all_pass", "any_pass", "first_match". |
| 422 | invalid_request | mode is not one of "validate", "generate", "decide", "score", "classify". |
| 401 | unauthorized | API key missing or invalid. |
| 429 | rate_limited | Per-key rate limit exceeded. See Rate limits. |

Per-policy errors don't 4xx the batch

Not-found slugs, missing inputs, unsupported modes, and per-policy eval errors all land as non-ok rows inside a 200 response — never as a 4xx. This keeps a 20-policy batch tolerant of a single misconfigured slug.
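
For the request-level errors in the table, a caller mostly needs to know which ones are worth retrying. The Python sketch below captures one reasonable policy: retry 429 with capped exponential backoff and treat 400, 401, and 422 as permanent until the request is fixed. The helper names and default timings are hypothetical, and treating 5xx responses as retryable is an assumption, since this page does not list them.

```python
from typing import List

RETRYABLE_HTTP = {429}  # rate_limited, per the errors table


def is_retryable(http_status: int) -> bool:
    """True for rate limiting and (by assumption) server-side 5xx errors."""
    return http_status in RETRYABLE_HTTP or 500 <= http_status < 600


def backoff_delays(attempts: int, base_s: float = 0.5, cap_s: float = 8.0) -> List[float]:
    """Delays in seconds before each retry: base, 2x base, 4x base, ... capped at cap_s."""
    return [min(cap_s, base_s * (2 ** i)) for i in range(attempts)]
```

A production client would usually add random jitter to each delay so that many callers do not retry in lockstep.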

#Authoring a policy

The admin console at admin.velgent.com/policies is where the policy lives. A policy version carries:

  • English text — what the policy says, in plain English. The source of truth.
  • Declared inputs — names + types + required flag. Same schema whether the caller is in validate / generate / decide / score / classify mode.
  • Supported modes — subset of {validate, generate, decide, score, classify}. The runtime gate.
  • Action catalogue (decide mode) — array of structured actions with id, label, description, and an optional payload_template.
  • Classify-label catalogue (classify mode) — the set of labels the policy is allowed to emit. Hallucinated labels are rejected.
  • Score mapping (aggregation) — how this policy's outcome maps to a 0-1 contribution score when it's part of a chain or graph. See Score aggregation.
  • Weight (aggregation) — relative weight in chain/graph aggregation. Default 1.0.
  • Knowledge refs — reserved for RAG retrieval. Empty today.
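
To illustrate how the score mapping and weight fields might combine, here is a hedged Python sketch of a weighted roll-up. It assumes a weighted arithmetic mean and uses made-up thresholds (0.8 for pass, 0.5 for review); the actual formula and the tenant thresholds are defined on the Score aggregation page and may differ. The function names are hypothetical.

```python
from typing import List, Tuple


def weighted_score(contributions: List[Tuple[float, float]]) -> float:
    """Weighted mean of (score, weight) pairs; scores are expected in the 0-1 range."""
    total_weight = sum(weight for _, weight in contributions)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(score * weight for score, weight in contributions) / total_weight


def bucket(score: float, pass_at: float = 0.8, review_at: float = 0.5) -> str:
    """Map an aggregate score to pass / review / block using illustrative thresholds."""
    if score >= pass_at:
        return "pass"
    if score >= review_at:
        return "review"
    return "block"
```

For example, a policy scoring 0.85 at the default weight of 1.0 combined with a policy scoring 0.4 at weight 2.0 averages to about 0.55, which lands in the review bucket under these illustrative thresholds.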

The composer compiles the English into a DSL document at save time. The DSL drives the deterministic validate path (no LLM at runtime). Generate, decide, score, and classify modes pass the English directly to Claude — the DSL isn't consulted there.

#The playground

Once a policy is published, the playground (Playground button on the policy detail page) lets administrators exercise it through all five modes from the admin UI, using the same code path real API callers hit. Useful for sanity-checking a new policy before wiring it into a workflow.

Authoring lives in the admin console

This endpoint runs already-published policies. To author a new policy, diff a version, or roll back, head to admin.velgent.com/policies. The compile step uses Claude Sonnet by default; per-tenant model routing is configurable under Engine Settings → Model routing.

