Policy Engine
Administrator-authored policies that govern the rest of your stack. Each policy is plain English + a declared input schema, usable in five modes (validate, generate, decide, score, classify). Compose multiple policies into linear chains or parallel DAGs with conditional routing. Get per-step results plus a weighted aggregate score back in one call. The engine never executes actions — it returns the rendered payload for your stack to fire.
What it does
- Plain-English authoring — no DSL or code review required.
- Five modes per policy — validate (pass/fail), generate (compliant text), decide (action + payload), score (0-1 number), classify (multi-label tag).
- Chains for sequential workflows — classify → score → decide → generate, each step seeing previous outcomes.
- Graphs (DAGs) for parallel + conditional — independent steps run concurrently; conditional edges branch on upstream outcomes.
- Per-tenant weighted aggregation — single 0-1 score across all contributing policies, bucketed pass / review / block.
- Per-policy action catalogue — decide mode returns structured payloads with ${input.X} substitution.
- Pure reasoning surface — engine never executes actions; the caller fires.
- Versioned + slug-based — author once in the admin, integrations reference by slug. Atomic publish, one-click rollback.
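The `${input.X}` substitution in the action catalogue can be pictured as a plain template render over the payload. The sketch below is illustrative only: `render_payload`, the payload field names, and the choice to leave unknown placeholders untouched are assumptions, not the engine's documented behaviour.

```python
import re

# Matches ${input.name}; the engine's real placeholder grammar may differ.
_PLACEHOLDER = re.compile(r"\$\{input\.([A-Za-z_][A-Za-z0-9_]*)\}")


def render_payload(template, inputs):
    """Recursively substitute ${input.X} placeholders in a payload template."""
    if isinstance(template, dict):
        return {key: render_payload(value, inputs) for key, value in template.items()}
    if isinstance(template, list):
        return [render_payload(value, inputs) for value in template]
    if isinstance(template, str):
        def _substitute(match):
            name = match.group(1)
            # Unknown names are left as-is (an assumption of this sketch).
            return str(inputs[name]) if name in inputs else match.group(0)
        return _PLACEHOLDER.sub(_substitute, template)
    return template  # numbers, booleans and None pass through unchanged


payload = render_payload(
    {"to": "${input.customer}", "note": "Refund of ${input.cost} approved"},
    {"customer": "sarah@acme.com", "cost": 200},
)
```

The caller, not the engine, would then fire whatever action this rendered payload describes.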
How it works
- 01 Author once, in English
Administrators write each policy in plain English — a refund rule, a risk score, an incident classifier. Declare the inputs it needs. Pick which modes it supports.
- 02 Compose into workflows
String policies into chains (sequential) or graphs (DAGs — parallel + conditional). Each step's outcome feeds the next; branches fire only when their condition is met.
- 03 Decide. Score. Act.
One call returns per-step results plus a weighted aggregate score bucketed against tenant thresholds. Pass / review / block — with a structured action your stack fires.
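One way to read "weighted aggregate score bucketed against tenant thresholds" is a weighted mean followed by a threshold lookup. The sketch below assumes that formula, assumes higher scores are better, and uses placeholder thresholds (`review_at`, `pass_at`); none of these are confirmed by the engine. The input figures are the ones from the sample aggregate response in this document, and a weighted mean does reproduce its 0.83.

```python
def aggregate(contributions, review_at=0.5, pass_at=0.8):
    """Weighted mean of step scores, bucketed as pass / review / block.

    The thresholds are hypothetical; real values are set per tenant.
    """
    total_weight = sum(c["weight"] for c in contributions)
    if total_weight == 0:
        raise ValueError("no weighted contributions")
    score = sum(c["weight"] * c["score"] for c in contributions) / total_weight
    if score >= pass_at:
        bucket = "pass"
    elif score >= review_at:
        bucket = "review"
    else:
        bucket = "block"
    return {"score": round(score, 2), "bucket": bucket}


result = aggregate([
    {"step_id": "risk", "weight": 2.0, "score": 0.91},
    {"step_id": "decide", "weight": 1.0, "score": 0.67},
])
# (2.0 * 0.91 + 1.0 * 0.67) / 3.0 rounds to 0.83, which lands in "pass"
```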
The model
Deterministic where it matters. Reasoning where it pays.
Most policy engines either force administrators into a DSL (fast, inflexible) or call an LLM on every evaluation (flexible, expensive, non-deterministic). The Policy Engine splits the difference per mode.
Validate is pure DSL at runtime: no LLM, no network hop, microsecond P50 latency, deterministic verdicts. Verdicts ship with the exact failed predicates so your UI can render a useful "why this was rejected" without a second round trip.
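To make "verdicts ship with the exact failed predicates" concrete, here is a minimal sketch of a deterministic evaluator. The predicate shape (`path`, `op`, `value`) borrows from the edge conditions shown in the graph example; the operator names, the `validate` function, the response keys, and the 100 limit are all hypothetical.

```python
import operator

# Hypothetical operator set for this sketch.
OPS = {
    "eq": operator.eq, "ne": operator.ne,
    "lt": operator.lt, "lte": operator.le,
    "gt": operator.gt, "gte": operator.ge,
}


def validate(predicates, inputs):
    """Evaluate every predicate and report the ones that failed."""
    failed = []
    for pred in predicates:
        actual = inputs.get(pred["path"])
        # A missing input fails the predicate instead of raising.
        passed = actual is not None and OPS[pred["op"]](actual, pred["value"])
        if not passed:
            failed.append({**pred, "actual": actual})
    return {"verdict": "fail" if failed else "pass", "failed_predicates": failed}


verdict = validate(
    [
        {"path": "cost", "op": "lte", "value": 100},   # illustrative limit
        {"path": "customer", "op": "ne", "value": ""},
    ],
    {"customer": "sarah@acme.com", "cost": 200},
)
```

Because the failed list carries the predicate and the offending value, a UI can explain the rejection without calling back.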
Generate, Decide, Score and Classify call claude-sonnet-4 at runtime: the English policy plus declared inputs go into a structured prompt. The response is compliant text (generate), a JSON action choice (decide), a 0-1 number (score), or one-or-more catalogue labels (classify). Decide and classify are both constrained to the policy's catalogue, so the engine rejects hallucinated IDs before they reach your caller.
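The catalogue constraint amounts to a membership check on the model's output before anything is returned. A rough sketch, assuming the decide response is a JSON object with an `id` field (as in the sample `action` object) and that the catalogue is a set of allowed IDs; `parse_decision` and `CatalogueError` are hypothetical names.

```python
import json


class CatalogueError(ValueError):
    """Raised when the model output is unusable or names an unknown action."""


def parse_decision(raw, catalogue):
    """Parse a model's JSON action choice and reject IDs outside the catalogue."""
    try:
        choice = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise CatalogueError(f"not valid JSON: {exc}") from exc
    action_id = choice.get("id") if isinstance(choice, dict) else None
    if action_id not in catalogue:
        raise CatalogueError(f"unknown action id: {action_id!r}")
    return choice


CATALOGUE = {"auto_approve", "manual_review", "deny"}  # illustrative IDs
decision = parse_decision('{"id": "auto_approve", "payload": {}}', CATALOGUE)
```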
Chains and graphs are pure orchestration — the engine runs each step in its declared mode and threads outcomes forward through the ${step.outcome.*} substitution surface. Graph steps with no dependency on each other run in parallel (asyncio, capped at 8 concurrent). Conditional edges evaluate against upstream outcomes via a small null-safe expression DSL — no LLM round-trip just to decide whether to branch.
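The orchestration described above can be sketched with an `asyncio.Semaphore` for the concurrency cap and a null-safe path lookup for edge conditions. This is an illustration under assumptions, not the engine's implementation: `run_graph`, `resolve`, `edge_allows`, the skip semantics, and the `fake_step` stand-in are all hypothetical, and only the `eq` operator is covered.

```python
import asyncio

MAX_CONCURRENT = 8  # the cap mentioned in the text


def resolve(path, outcomes):
    """Null-safe lookup: returns None as soon as any hop in the path is missing."""
    node = outcomes
    for key in path.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node


def edge_allows(edge, outcomes):
    """Evaluate an edge's `when` clause; only `eq` is sketched here."""
    when = edge.get("when")
    if when is None:
        return True
    if when["op"] != "eq":
        raise ValueError(f"unsupported op: {when['op']}")
    return resolve(when["path"], outcomes) == when["value"]


async def run_graph(steps, edges, run_step):
    """Run steps concurrently; assumes an acyclic graph with valid step ids."""
    limiter = asyncio.Semaphore(MAX_CONCURRENT)
    outcomes, tasks = {}, {}

    async def run(step):
        incoming = [e for e in edges if e["to"] == step["id"]]
        upstream = set(step.get("depends_on", [])) | {e["from"] for e in incoming}
        for dep in upstream:
            await tasks[dep]  # wait for upstream steps to finish
        if not all(edge_allows(e, outcomes) for e in incoming):
            outcomes[step["id"]] = {"skipped": True, "outcome": None}
            return
        async with limiter:  # at most MAX_CONCURRENT steps evaluate at once
            outcomes[step["id"]] = {"outcome": await run_step(step, outcomes)}

    # Every task is registered before any of them starts running.
    for step in steps:
        tasks[step["id"]] = asyncio.create_task(run(step))
    await asyncio.gather(*tasks.values())
    return outcomes


async def fake_step(step, outcomes):
    """Stand-in for a real policy evaluation."""
    await asyncio.sleep(0)
    return {"label": "P1"} if step["id"] == "sev" else {"ok": True}


STEPS = [
    {"id": "pii", "slug": "pii-scan", "mode": "validate"},
    {"id": "sev", "slug": "severity", "mode": "classify"},
    {"id": "respond", "slug": "auto-respond", "mode": "generate",
     "depends_on": ["pii", "sev"]},
]
EDGES = [{"from": "sev", "to": "respond",
          "when": {"op": "eq", "path": "sev.outcome.label", "value": "P1"}}]

outcomes = asyncio.run(run_graph(STEPS, EDGES, fake_step))
```

Here `pii` and `sev` have no dependency on each other, so they run side by side; `respond` waits for both and runs only when the edge condition holds. A skipped upstream step resolves to `None`, so a downstream condition quietly evaluates to false instead of raising.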
Per-tenant compiler model routing is exposed in the admin console under Engine Settings → Model routing. The policy_compile category accepts the same provider list as summarise — Anthropic, Groq, local Mistral, any OpenAI-compatible endpoint.
Knowledge retrieval (RAG) is reserved on the wire today: every policy can declare knowledge_refs and every response includes a citations[] field. Responses carry empty arrays until the retriever ships. The wire shape is final, so write your parser now; on the day citations start populating, nothing on your side changes.
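A parser that is ready for citations today only needs to treat the field as an always-present list. A minimal sketch, assuming `citations` sits at the top level of the response and treating each entry as opaque, since the per-citation shape is not described here; `extract_citations` is a hypothetical helper.

```python
def extract_citations(response):
    """Return the citations list, tolerating a missing or null field."""
    citations = response.get("citations") or []
    if not isinstance(citations, list):
        raise TypeError("citations must be a list")
    return citations


# Today: the field is present but empty, and calling code already handles it.
today = extract_citations({"results": [], "citations": []})
```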
Use it
One endpoint family, three shapes. Single policy for a one-shot verdict. Chain for sequential workflows. Graph for parallel + conditional. Every response carries per-step results plus a weighted aggregate bucketed against your tenant thresholds.
POST /api/v1/policies/evaluate
{
  "policies": [{ "slug": "customer-refund" }],
  "mode": "validate", // or "generate" / "decide" / "score" / "classify"
  "inputs": { "customer": "sarah@acme.com", "cost": 200, ... }
}

POST /api/v1/policies/chain
{
  "steps": [
    { "id": "triage", "slug": "ticket-classifier", "mode": "classify" },
    { "id": "risk", "slug": "fraud-score", "mode": "score" },
    { "id": "decide", "slug": "refund-policy", "mode": "decide" }
  ],
  "inputs": { ... }
}
// → results: [...] · aggregate: { score: 0.83, bucket: "pass", action: {...} }

POST /api/v1/policies/graph
{
  "graph_slug": "incident-triage-v3", // or inline "graph": { steps, edges }
  "inputs": { ... }
}
// inline graph shape: independent steps run concurrently
{
  "steps": [
    { "id": "pii", "slug": "pii-scan", "mode": "validate" },
    { "id": "sev", "slug": "severity", "mode": "classify" },
    { "id": "respond", "slug": "auto-respond", "mode": "generate",
      "depends_on": ["pii", "sev"] }
  ],
  "edges": [
    { "from": "sev", "to": "respond",
      "when": { "op": "eq", "path": "sev.outcome.label", "value": "P1" } }
  ]
}

"aggregate": {
  "score": 0.83,
  "bucket": "pass", // pass | review | block
  "contributions": [
    { "step_id": "risk", "weight": 2.0, "score": 0.91 },
    { "step_id": "decide", "weight": 1.0, "score": 0.67 }
  ],
  "action": { "id": "auto_approve", "payload": { ... } }
}
Get started
From English to your first verdict in an afternoon.
- 01 Author the policy
In the admin console, paste the rule in English. The compiler infers the input schema and produces a previewable predicate tree.
- 02 Review and publish
Diff against the previous version, run it against sample inputs, then publish. Old in-flight calls finish on the previous version.
- 03 Evaluate from your stack
POST inputs to /api/v1/policies/evaluate from anywhere — your backend, a ServiceNow flow, a Zapier step. Get back a structured verdict.
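Calling the evaluate endpoint from a backend might look like the sketch below, using only the Python standard library. The base URL, the bearer-token header, and both helper names are assumptions; only the path and body shape come from the request sample in this document.

```python
import json
import urllib.request


def build_evaluate_request(base_url, token, slug, mode, inputs):
    """Build (but do not send) a POST to the single-policy evaluate endpoint."""
    body = json.dumps(
        {"policies": [{"slug": slug}], "mode": mode, "inputs": inputs}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/v1/policies/evaluate",
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/json",
            # The auth scheme is an assumption; use whatever your tenant requires.
            "Authorization": f"Bearer {token}",
        },
    )


def evaluate(request, timeout=10):
    """Send the request and decode the JSON verdict."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


request = build_evaluate_request(
    "https://policy.example.com",  # hypothetical host
    "YOUR_TOKEN",
    "customer-refund",
    "validate",
    {"customer": "sarah@acme.com", "cost": 200},
)
# verdict = evaluate(request)  # uncomment once the host and token are real
```

Keeping request construction separate from sending makes the body easy to unit-test without network access.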