Stop duplicate subscription entitlements when payment webhooks retry
Payment webhooks will retry. If your handler is slow, returns a 5xx, or hits a transient DB outage, you can receive the same event (payment_succeeded, invoice_paid, charge_refunded) multiple times. If your downstream actions aren’t idempotent, you end up with:
- Double-granted entitlements (two seats, two plan upgrades, two credits)
- Refunds that don’t revoke access (or revoke twice)
- Finance ops spending days reconciling “why does the ledger say X but the product says Y?”
This is a classic “AI can spot it, but AI shouldn’t do it” problem: an LLM can help classify messy cases, but the actual financial/product actions must be deterministic, approval-gated where needed, and fully auditable.
Below is a workflow pattern where AI suggests and Autom Mate executes under control.
End-to-end governed workflow (Autom Mate)
1) Trigger
- Trigger: Incoming payment provider webhook (e.g., invoice.paid, charge.refunded, payment_failed) via REST/HTTP/Webhook action into Autom Mate.
- Store the raw payload immediately (immutable) for later audit/replay.
2) Validation (before any side effects)
- Signature / authenticity check: Validate webhook signature (provider-specific) using PYTHON library.
- Schema validation: Ensure required fields exist (event id, customer id, amount, currency, status) using Condition modules.
- Idempotency gate (hard stop):
- Look up event_id (and optionally event_type) in an “event ledger” table.
- If already processed → exit (return 200 OK) with “duplicate” decision logged.
- If not processed → create a “processing” record.
(Why this matters: payment APIs and webhook docs commonly recommend idempotency keys / dedupe by event id to prevent duplicate transactions and duplicate downstream effects. (cashfree.com))
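The three validation steps above can be sketched in plain Python. This is a minimal sketch under stated assumptions, not Autom Mate's API or any provider's exact scheme: it assumes the provider signs the raw body with HMAC-SHA256 and sends a hex digest, and it uses a local SQLite table as a stand-in for the event ledger. The names `verify_signature`, `claim_event`, `handle_webhook`, and the `event_ledger` table are hypothetical.

```python
import hashlib
import hmac
import json
import sqlite3

# Fields named in the schema-validation step, plus event_type (assumed field names).
REQUIRED_FIELDS = ("event_id", "event_type", "customer_id", "amount", "currency", "status")


def verify_signature(raw_body: bytes, signature_hex: str, secret: bytes) -> bool:
    """Assumed scheme: hex HMAC-SHA256 over the raw body. Check your provider's docs."""
    expected = hmac.new(secret, raw_body, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(expected.encode(), signature_hex.encode())


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    """Create the hypothetical event ledger; the primary key is what enforces dedupe."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS event_ledger ("
        "event_id TEXT PRIMARY KEY, event_type TEXT, status TEXT NOT NULL)"
    )
    return conn


def claim_event(conn: sqlite3.Connection, event_id: str, event_type: str) -> bool:
    """Insert a 'processing' record. Returns False if event_id was already claimed."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO event_ledger (event_id, event_type, status) "
                "VALUES (?, ?, 'processing')",
                (event_id, event_type),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def handle_webhook(conn, raw_body: bytes, signature_hex: str, secret: bytes):
    """Run all checks before any side effect. Returns (http_status, decision)."""
    if not verify_signature(raw_body, signature_hex, secret):
        return 400, "invalid_signature"
    try:
        event = json.loads(raw_body)
    except ValueError:  # covers malformed JSON and undecodable bytes
        return 400, "invalid_json"
    if not isinstance(event, dict) or any(event.get(f) is None for f in REQUIRED_FIELDS):
        return 400, "invalid_schema"
    if not claim_event(conn, event["event_id"], event["event_type"]):
        return 200, "duplicate"  # acknowledge so the provider stops retrying
    return 200, "accepted"
```

Doing the insert and relying on the primary-key violation, rather than a separate SELECT followed by an INSERT, closes the race where two retries arrive at the same moment. A "failed" state that permits replay is left out of this sketch.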
3) AI triage (advisory only)
- Use an LLM step to suggest a classification and next action:
- “Safe auto-grant” vs “needs review” vs “block/hold”
- Confidence + rationale
- Risk flags (amount unusually high, customer recently refunded, multiple retries, etc.)
Important: AI output is never used as the final executor. It only proposes.
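One way to keep the AI step advisory is to treat the model's reply as untrusted input and reduce it to a small, validated record. The sketch below assumes the LLM is prompted to reply with a JSON object; the field names (`label`, `confidence`, `rationale`, `risk_flags`) and the label values are illustrative, not part of any Autom Mate schema. Anything that fails validation falls back to "needs_review", so a malformed or hallucinated reply can never widen what gets auto-approved.

```python
import json
from dataclasses import dataclass

# Hypothetical label vocabulary mirroring the three classes listed above.
ALLOWED_LABELS = {"safe_auto_grant", "needs_review", "block_hold"}


@dataclass(frozen=True)
class TriageSuggestion:
    label: str
    confidence: float
    rationale: str
    risk_flags: tuple = ()


# Conservative default used whenever the model output cannot be trusted.
FALLBACK = TriageSuggestion("needs_review", 0.0, "unparseable or invalid model output")


def parse_triage(llm_text: str) -> TriageSuggestion:
    """Turn raw LLM text into a validated suggestion, or the conservative fallback."""
    try:
        data = json.loads(llm_text)
        label = data["label"]
        confidence = float(data["confidence"])
        rationale = str(data.get("rationale", ""))
        flags = tuple(str(f) for f in data.get("risk_flags", []))
    except (ValueError, KeyError, TypeError, AttributeError):
        return FALLBACK
    if not isinstance(label, str) or label not in ALLOWED_LABELS:
        return FALLBACK
    if not 0.0 <= confidence <= 1.0:  # also rejects NaN
        return FALLBACK
    return TriageSuggestion(label, confidence, rationale, flags)
```

The returned object carries no executable instruction. It is only an input to the approval step, which decides what actually runs.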
4) Approvals (human or policy-based)
- Policy-based auto-approval for low-risk cases:
- e.g., amount < $100, known customer, no prior disputes, event is terminal
- Human approval required for high-risk cases:
- e.g., refunds after fulfillment, large upgrades, repeated retries, mismatched currency
- Approval request sent to a finance/product ops channel via Autom Mate library (if your environment has Teams/Slack installed) or REST/HTTP/Webhook action to your chat/ITSM tool.
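The policy side of this step can be written as an ordered list of explicit rules, each with an id that can be logged later. The sketch below is illustrative: the $100 limit and the risk factors come from the examples above, while the retry threshold, the rule ids, the set of "terminal" event types, and the `ApprovalContext` fields are assumptions. Note that the AI label can only push a case toward human review, never toward auto-approval.

```python
from dataclasses import dataclass
from decimal import Decimal

AUTO_APPROVE_LIMIT = Decimal("100.00")  # from the "amount < $100" example
MAX_RETRIES_FOR_AUTO = 3                # assumed threshold for "repeated retries"
TERMINAL_EVENTS = {"invoice.paid"}      # assumed set of terminal event types


@dataclass(frozen=True)
class ApprovalContext:
    event_type: str
    amount: Decimal
    currency: str
    expected_currency: str
    known_customer: bool
    prior_disputes: int
    retry_count: int
    ai_label: str  # advisory input only


def decide_approval(ctx: ApprovalContext):
    """Return (route, rule_id), where route is 'auto' or 'human'."""
    if ctx.currency != ctx.expected_currency:
        return ("human", "H1_currency_mismatch")
    if ctx.event_type not in TERMINAL_EVENTS:
        return ("human", "H2_non_terminal_or_refund")
    if ctx.retry_count >= MAX_RETRIES_FOR_AUTO:
        return ("human", "H3_repeated_retries")
    if ctx.amount >= AUTO_APPROVE_LIMIT:
        return ("human", "H4_amount_over_limit")
    if not ctx.known_customer or ctx.prior_disputes > 0:
        return ("human", "H5_customer_risk")
    if ctx.ai_label != "safe_auto_grant":
        return ("human", "H6_ai_flagged")
    return ("auto", "A1_low_risk")
```

Because every branch returns a rule id, the audit trail can record which rule approved or escalated a given event.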
5) Deterministic execution (the controlled “do” step)
Once approved (or policy-approved), Autom Mate executes exactly once:
- Grant / update entitlements in your product system via REST/HTTP/Webhook action
- Use a deterministic idempotency key like entitlement:{customer_id}:{invoice_id}
- Post to internal ledger / billing DB via Database library microservice (if available in your deployment) or REST/HTTP/Webhook action to your finance service
- Update ticket / case (optional) via Autom Mate library (if you use an ITSM tool that’s installed) or REST/HTTP/Webhook action
Autom Mate’s orchestration model is designed for multi-step flows with monitoring and traceability across actions.
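To show why a deterministic key matters, here is a small sketch of a grant that is safe to call more than once. The key format follows the entitlement:{customer_id}:{invoice_id} pattern above. `InMemoryEntitlements` is a hypothetical stand-in for your product system; in a real deployment the key would be sent along with the request and the receiving service would have to enforce it.

```python
def entitlement_key(customer_id: str, invoice_id: str) -> str:
    """Same inputs always produce the same key, so a retry maps to the same operation."""
    return f"entitlement:{customer_id}:{invoice_id}"


class InMemoryEntitlements:
    """Hypothetical stand-in for the product system that stores entitlements."""

    def __init__(self):
        self._applied = {}  # idempotency key -> result of the first call
        self.seats = {}     # customer_id -> current seat count

    def grant(self, key: str, customer_id: str, seats: int) -> dict:
        if key in self._applied:
            # Replay: return the stored result and perform no new side effect.
            return self._applied[key]
        self.seats[customer_id] = self.seats.get(customer_id, 0) + seats
        result = {"customer_id": customer_id, "seats": self.seats[customer_id]}
        self._applied[key] = result
        return result


# Usage: the second call with the same invoice is a no-op.
service = InMemoryEntitlements()
key = entitlement_key("cus_42", "inv_1001")
first = service.grant(key, "cus_42", seats=2)
second = service.grant(key, "cus_42", seats=2)
```

Deriving the key from business identifiers rather than a random value means a replay after a crash produces the same key, even if the original run's state was lost.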
6) Logging / audit trail
Log every decision and action:
- Raw webhook payload hash
- Signature validation result
- Idempotency decision (new vs duplicate)
- AI suggestion + confidence (advisory)
- Approval identity + timestamp (or policy rule id)
- Execution steps + responses
- Autom version executed (for change/audit traceability)
Autom Mate supports execution monitoring and version-aware traceability, which is useful when auditors ask “what logic ran at the time?”
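As an illustration of what one audit entry could contain, the sketch below assembles the fields listed above into a single JSON log line. The field names and the `audit_record` helper are assumptions for this example, not an Autom Mate log format. Only a SHA-256 hash of the raw payload is logged here, on the assumption that the payload itself is stored separately.

```python
import hashlib
import json
from datetime import datetime, timezone


def audit_record(raw_payload: bytes, *, signature_valid: bool, idempotency_decision: str,
                 ai_suggestion: dict, approved_by: str, steps: list, autom_version: str) -> dict:
    """Build one audit entry covering the decision points of a single run."""
    return {
        "payload_sha256": hashlib.sha256(raw_payload).hexdigest(),
        "signature_valid": signature_valid,
        "idempotency_decision": idempotency_decision,  # "new" or "duplicate"
        "ai_suggestion": ai_suggestion,                # advisory only
        "approved_by": approved_by,                    # user identity or policy rule id
        "steps": steps,                                # executed steps and their responses
        "autom_version": autom_version,
        "logged_at": datetime.now(timezone.utc).isoformat(),
    }


def to_log_line(record: dict) -> str:
    """Serialize with sorted keys so identical records produce identical lines."""
    return json.dumps(record, sort_keys=True, separators=(",", ":"))


# Usage with made-up values:
record = audit_record(
    b'{"event_id":"evt_1"}',
    signature_valid=True,
    idempotency_decision="new",
    ai_suggestion={"label": "safe_auto_grant", "confidence": 0.93},
    approved_by="policy:A1_low_risk",
    steps=[{"step": "grant_entitlement", "status": 200}],
    autom_version="v12",
)
line = to_log_line(record)
```

Hashing the payload lets an auditor confirm that a stored payload is the one that was processed without copying customer data into every log line.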
7) Exception handling / rollback
- If entitlement grant succeeds but ledger post fails:
- Mark the run as “partial”
- Create a compensating action (e.g., revoke entitlement) only if your policy allows automatic rollback
- Otherwise route to human approval for rollback
- If webhook processing fails mid-run:
- Keep the idempotency record in “failed” with error details
- Allow safe replay (same event id) without double-granting
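The branching above can be sketched as a small function that runs the two side effects in order and reports what happened. The callables `grant`, `post_ledger`, and `revoke` are placeholders for whatever actions your workflow uses, and the status strings are assumptions for this example.

```python
REPLAYABLE_STATES = {"failed"}  # assumed: only failed runs may be replayed


def may_replay(ledger_status: str) -> bool:
    """A replay is allowed only for records left in a replayable state."""
    return ledger_status in REPLAYABLE_STATES


def execute_with_compensation(grant, post_ledger, revoke, *, auto_rollback_allowed: bool) -> dict:
    """Run grant then ledger post; on a ledger failure, compensate or escalate."""
    try:
        grant()
    except Exception as exc:
        # Nothing was applied, so the run is simply failed and can be replayed.
        return {"status": "failed", "error": str(exc), "rollback": "not_needed"}
    try:
        post_ledger()
    except Exception as exc:
        outcome = {"status": "partial", "error": str(exc)}
        if not auto_rollback_allowed:
            outcome["rollback"] = "pending_human_approval"
            return outcome
        try:
            revoke()  # compensating action
            outcome["rollback"] = "compensated"
        except Exception as revoke_exc:
            # The compensation itself failed: escalate, never retry blindly.
            outcome["rollback"] = "compensation_failed"
            outcome["rollback_error"] = str(revoke_exc)
        return outcome
    return {"status": "processed", "error": None, "rollback": "not_needed"}
```

Replay stays safe only if the grant step uses the same deterministic idempotency key on every attempt, so a second run for the same event id cannot grant twice.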
Two mini examples
Example A — Duplicate invoice.paid arrives 3 times
- Webhook #1: passes signature, not seen before → approved by policy → entitlement granted → ledger posted → mark processed.
- Webhooks #2 and #3: same event_id → Autom Mate exits early, logs “duplicate”, returns 200.
Example B — charge.refunded after fulfillment
- AI suggests “high risk: refund after fulfillment; needs review.”
- Autom Mate opens an approval task for finance/product ops.
- If approved: revoke entitlement + create a credit memo entry (deterministic steps).
- If rejected: keep entitlement, log decision, attach rationale.
Why AI alone is risky here
- Webhook payloads can be ambiguous; AI may misread “final” vs “intermediate” states.
- AI can hallucinate or over-generalize edge cases (partial refunds, multi-invoice customers, proration).
- Financial/product actions must be idempotent, deterministic, and auditable—especially when retries and replays are expected. (cashfree.com)
Autom Mate’s role is to enforce the guardrails: validations, approvals, deterministic execution, and a complete audit trail.
Discussion questions
- Where do you draw the line for policy auto-approval vs human approval (amount threshold, customer risk tier, event type)?
- Do you prefer compensating rollbacks (revoke entitlement) or manual remediation when downstream posting fails?