#Score aggregation
Multi-policy evaluations (chains, graphs, and multi-policy batches)
roll up into a single weighted aggregate score across all
contributing policies. Operators declare per-policy weights at
authoring time and per-tenant thresholds in engine settings; the
engine returns a uniform aggregate block on every response.
At a glance:
For each step that ran successfully:
score = map(step.mode, step.outcome, step.score_mapping)
│
▼
contribution = score × weight
│
▼
weighted_score = Σ contributions / Σ weights ← renormalised
│
▼
threshold = bucket(weighted_score, tenant.thresholds)
│
▼
action = tenant.actions[threshold] ← may be null
Skipped and failed steps are excluded from both the numerator and the denominator, so the score renormalises across the steps that actually ran. Branches that didn't fire (because their conditional edge wasn't true) therefore don't drag the aggregate down.
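The renormalisation above can be sketched in a few lines of Python. This is an illustrative sketch, not the engine's implementation: the `StepResult` record, its field names, and the `"skipped"` / `"failed"` status strings are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StepResult:
    """Hypothetical per-step record; field names are illustrative only."""
    step_id: str
    status: str              # "ok", "skipped", or "failed" (assumed values)
    score: Optional[float]   # mapped 0-1 score, None if the step is excluded
    weight: float


def weighted_score(steps: list[StepResult]) -> Optional[float]:
    """Sum of score * weight over steps that ran, divided by the sum of their weights."""
    ran = [s for s in steps if s.status == "ok" and s.score is not None]
    total_weight = sum(s.weight for s in ran)
    if total_weight == 0:
        return None  # nothing contributed, so there is no aggregate
    return sum(s.score * s.weight for s in ran) / total_weight


steps = [
    StepResult("privacy_check", "ok", 1.0, 0.4),
    StepResult("geo_licensing", "ok", 0.85, 0.4),
    StepResult("customer_tier", "skipped", None, 0.2),  # branch did not fire
]
print(round(weighted_score(steps), 3))  # 0.925
```

Because the skipped step leaves both sums, the result is 0.74 / 0.8 = 0.925 rather than 0.74 / 1.0.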
#Aggregate response block
Returned on every chain / graph / multi-policy response. The block is null when no policy contributed: every step was an excluded generate-mode policy, or every step failed or was skipped.
"aggregate": {
  "weighted_score": 0.84,
  "threshold": "review",
  "contributions": [
    { "step_id": "privacy_check", "mode": "validate", "score": 1.0, "weight": 0.4, "contribution": 0.40 },
    { "step_id": "geo_licensing", "mode": "score", "score": 0.85, "weight": 0.4, "contribution": 0.34 },
    { "step_id": "customer_tier", "mode": "classify", "score": 0.5, "weight": 0.2, "contribution": 0.10 }
  ],
  "action": {
    "kind": "queue_for_review",
    "params": { "queue_id": "compliance-tier-2" }
  }
}
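Since both the aggregate block and its action can be null, callers may want to guard for both cases. The helper below is a minimal client-side sketch; `summarize_aggregate` is a hypothetical name, and the sample payload is abbreviated (empty contributions list) for brevity.

```python
import json
from typing import Optional


def summarize_aggregate(response: dict) -> str:
    """Describe the aggregate block, tolerating a null aggregate or null action."""
    aggregate: Optional[dict] = response.get("aggregate")
    if aggregate is None:
        return "no aggregate: no policy contributed"
    action = aggregate.get("action")
    kind = action["kind"] if action else "none configured"
    return f'{aggregate["threshold"]} at {aggregate["weighted_score"]:.2f}, action: {kind}'


# Abbreviated payload shaped like the block above.
body = json.loads("""
{"aggregate": {"weighted_score": 0.84, "threshold": "review", "contributions": [],
               "action": {"kind": "queue_for_review",
                          "params": {"queue_id": "compliance-tier-2"}}}}
""")
print(summarize_aggregate(body))                 # review at 0.84, action: queue_for_review
print(summarize_aggregate({"aggregate": None}))  # no aggregate: no policy contributed
```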
#Per-mode score mapping
How a policy's mode-specific outcome maps to a 0-1 contribution
score. Operators author overrides on the policy via the admin
(or via POST /api/admin/policies with the score_mapping
field); defaults apply otherwise.
| Mode | Default mapping | Operator override |
|---|---|---|
| validate | passed → 1.0, failed → 0.0 | {"type": "passed_to_score", "invert": true} flips polarity |
| score | Passthrough (outcome.score directly) | None needed |
| decide | Unmapped action → neutral 0.5 | {"type": "decide_to_score", "actions": {"approve": 1.0, "reject": 0.0, ...}} |
| classify | Unmapped label → neutral 0.5 | {"type": "classify_to_score", "labels": {"low": 1.0, "high": 0.0, ...}} |
| generate | Excluded from aggregation | {"type": "generated_text_present"} → 1.0 if text exists, 0.0 otherwise |
Convention: higher score = more compliant / lower risk /
better outcome. Operators whose policies are inverted (for example
a risk model where high = bad) flip the polarity through the
mapping (or use "invert": true for validate).
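The mapping table can be read as a single dispatch function. The sketch below is one possible reading, not the engine's code: the outcome keys `passed`, `score`, and `primary_label` come from the response example in this page, while `action` (decide) and `text` (generate) are assumed field names, and `map_score` is a hypothetical helper.

```python
from typing import Optional

NEUTRAL = 0.5  # score for an unmapped decide action or classify label


def map_score(mode: str, outcome: dict, mapping: Optional[dict] = None) -> Optional[float]:
    """Map a mode-specific outcome to a 0-1 score; None means excluded."""
    mapping = mapping or {}
    if mode == "validate":
        score = 1.0 if outcome["passed"] else 0.0
        return 1.0 - score if mapping.get("invert") else score
    if mode == "score":
        return outcome["score"]  # passthrough
    if mode == "decide":
        # "action" is an assumed outcome key for decide-mode policies
        return mapping.get("actions", {}).get(outcome["action"], NEUTRAL)
    if mode == "classify":
        return mapping.get("labels", {}).get(outcome["primary_label"], NEUTRAL)
    if mode == "generate":
        if mapping.get("type") == "generated_text_present":
            # "text" is an assumed outcome key for generate-mode policies
            return 1.0 if outcome.get("text") else 0.0
        return None  # excluded from aggregation by default
    raise ValueError(f"unknown mode: {mode}")


tier_mapping = {"type": "classify_to_score", "labels": {"premium": 1.0}}
print(map_score("classify", {"primary_label": "premium"}))                # 0.5
print(map_score("classify", {"primary_label": "premium"}, tier_mapping))  # 1.0
```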
#Per-tenant thresholds + actions
Stored in the org's settings and adjusted on the admin engine-settings
page (PUT /api/admin/engine-settings/aggregation).
thresholds:
  pass: 0.9      # score >= 0.9 → "pass"
  review: 0.7    # 0.7 <= score < 0.9 → "review"
                 # below 0.7 → "block"
actions:
  pass: { kind: "auto_approve", params: {} }
  review: { kind: "queue_for_review", params: { queue_id: "compliance-tier-2" } }
  block: { kind: "reject", params: { message: "Compliance denied." } }
Default thresholds are pass=0.9, review=0.7. Default actions
are empty, so aggregate.action returns null until the tenant
wires them. The engine never executes actions; the payload is
returned for the caller's stack to fire (the same "pure reasoning"
stance as decide-mode actions).
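The bucketing and action lookup described above amount to two small functions. This is a sketch under the documented defaults; `bucket` and `resolve_action` are hypothetical names, and the dict shapes mirror the settings shown above.

```python
from typing import Optional

DEFAULT_THRESHOLDS = {"pass": 0.9, "review": 0.7}


def bucket(weighted_score: float, thresholds: Optional[dict] = None) -> str:
    """Return "pass", "review", or "block" for a 0-1 weighted score."""
    thresholds = thresholds or DEFAULT_THRESHOLDS
    if weighted_score >= thresholds["pass"]:
        return "pass"
    if weighted_score >= thresholds["review"]:
        return "review"
    return "block"


def resolve_action(threshold: str, actions: dict) -> Optional[dict]:
    """Look up the tenant's action payload; None when nothing is configured."""
    return actions.get(threshold)


actions = {"review": {"kind": "queue_for_review",
                      "params": {"queue_id": "compliance-tier-2"}}}
print(bucket(0.84))                            # review
print(resolve_action(bucket(0.84), actions))   # the queue_for_review payload
print(resolve_action(bucket(0.95), actions))   # None, no "pass" action configured
```

The lookup only returns the payload; firing it stays with the caller, in line with the engine never executing actions.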
#Example
A graph with one validate + one score + one classify policy, weighted 0.4 / 0.4 / 0.2:
curl -X POST https://aiengine.velgent.com/api/v1/policies/graph \
  -H "Authorization: Bearer velgent_live_..." \
  -H "Content-Type: application/json" \
  -d '{
    "graph_slug": "compliance/transaction-review",
    "inputs": { "amount": 50000, "country": "US", "customer_tier": "premium" }
  }'
Response (abbreviated):
{
  "request_id": "...",
  "steps": [
    { "id": "privacy_check", "status": "ok", "outcome": { "passed": true } },
    { "id": "geo_licensing", "status": "ok", "outcome": { "score": 0.85, "reasons": [...] } },
    { "id": "customer_tier", "status": "ok", "outcome": { "primary_label": "premium", "labels": ["premium"], "confidence": 0.92 } }
  ],
  "aggregate": {
    "weighted_score": 0.84,
    "threshold": "review",
    "contributions": [
      { "step_id": "privacy_check", "mode": "validate", "score": 1.0, "weight": 0.4, "contribution": 0.40 },
      { "step_id": "geo_licensing", "mode": "score", "score": 0.85, "weight": 0.4, "contribution": 0.34 },
      { "step_id": "customer_tier", "mode": "classify", "score": 0.5, "weight": 0.2, "contribution": 0.10 }
    ],
    "action": { "kind": "queue_for_review", "params": { "queue_id": "compliance-tier-2" } }
  }
}
(customer_tier's score is 0.5 because the classify policy has no
explicit score_mapping, so the "premium" label falls back to the
neutral default. An operator could add a classify_to_score mapping
with labels {"premium": 1.0, "standard": 0.7, "trial": 0.3} to lift
the tier contribution.)
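To see how much the label mapping matters, the arithmetic of this example can be replayed with the scores and weights from the response. A small sketch; the `aggregate` helper is hypothetical and rounds to two decimals only for display.

```python
# (score, weight) per step, taken from the contributions in the response
contributions = {
    "privacy_check": (1.0, 0.4),
    "geo_licensing": (0.85, 0.4),
    "customer_tier": (0.5, 0.2),   # unmapped label, neutral 0.5
}


def aggregate(contribs: dict) -> float:
    """Weighted mean of the scores, rounded to two decimals for display."""
    total_weight = sum(weight for _, weight in contribs.values())
    weighted_sum = sum(score * weight for score, weight in contribs.values())
    return round(weighted_sum / total_weight, 2)


print(aggregate(contributions))  # 0.84

# With "premium" mapped to 1.0 the tier step contributes 0.20 instead of 0.10.
contributions["customer_tier"] = (1.0, 0.2)
print(aggregate(contributions))  # 0.94
```

Under the default thresholds, 0.94 would land in "pass" rather than "review".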
Chain runs write operation: "policy_chain" to audit_logs; graph
runs write operation: "policy_graph". Filter your activity
dashboards on this field to measure the two separately. Both
operations still write one policy_evaluations row per executed step.