When invoice OCR “looks right” but creates duplicate vendor payments
A common AP failure mode isn’t bad intent—it’s good automation without governance.
Example: invoices arrive as PDFs in a shared mailbox. An OCR/LLM step extracts vendor name, invoice number, amount, and due date, and suggests GL coding. It usually works… until:
- the vendor re-sends the same invoice with a different PDF filename
- the invoice number is read differently (O vs 0, missing hyphens)
- a credit memo is misread as an invoice
- the ERP already has the bill, but the bot can’t reliably detect it
Result: duplicate bills get created and paid, and the post-mortem is painful because the “why” is spread across email threads, OCR logs, and ERP history.
This is exactly where “AI suggests, Autom Mate executes under control” matters.
Why AI alone is risky for AP actions
AI is great at:
- extracting fields from messy PDFs
- suggesting likely vendor matches
- proposing GL coding
AI is not safe for:
- deciding whether something is a duplicate
- deciding whether to create a payable
- deciding whether to release a payment hold
Those are financial control decisions: they require segregation of duties (SoD), auditability, and deterministic policy enforcement.
Proposed pattern: AI triage + deterministic duplicate controls + governed execution
Systems (example)
- Inbound: AP mailbox / invoice intake tool
- ERP/AP: your ERP (NetSuite/SAP/Dynamics/etc.)
- Collaboration: Microsoft Teams
Integration note: since we can’t assume a specific ERP connector exists, ERP calls are modeled as REST/HTTP/Webhook actions. Autom Mate supports API-based integrations (REST/SOAP/custom) with centralized auth and logging.
End-to-end workflow (copyable)
1) Trigger
- Trigger: New invoice email received (or intake system webhook)
- REST/HTTP/Webhook action: inbound webhook into Autom Mate (or email ingestion → webhook)
2) Validation (hard gates before any ERP write)
- Validate required fields exist: vendor identifier, invoice number, currency, amount, invoice date
- Validate invoice total matches line totals (if line items provided)
- Normalize fields deterministically:
- uppercase invoice number
- strip spaces/hyphens
- map vendor legal name to vendor_id via a mapping table
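The validation and normalization gates above can be sketched in a few lines of Python. This is a minimal illustration, not Autom Mate's implementation: the field names, the `line_totals` key, and the choice to pass amounts as strings are all assumptions.

```python
import re
from decimal import Decimal

# Hypothetical field names; adjust to whatever your intake payload uses.
REQUIRED_FIELDS = ("vendor_id", "invoice_number", "currency", "amount", "invoice_date")

def normalize_invoice_number(raw: str) -> str:
    """Uppercase and strip whitespace/hyphens so 'inv-10 08' and 'INV1008' compare equal."""
    return re.sub(r"[\s\-]", "", raw).upper()

def validate_invoice(fields: dict) -> list:
    """Return a list of validation errors; an empty list means the hard gate passes."""
    errors = [
        f"missing field: {name}"
        for name in REQUIRED_FIELDS
        if fields.get(name) in (None, "")
    ]
    line_totals = fields.get("line_totals")  # optional, assumed list of amounts
    amount = fields.get("amount")
    if line_totals and amount not in (None, ""):
        # Decimal avoids float rounding surprises when summing money.
        if sum(Decimal(str(x)) for x in line_totals) != Decimal(str(amount)):
            errors.append("invoice total does not match line totals")
    return errors
```

Any non-empty result should stop the flow before an ERP write is attempted.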
3) AI triage (suggestion only)
- REST/HTTP/Webhook action: call your OCR/LLM service to extract fields + confidence
- Autom Mate stores:
- extracted fields
- confidence scores
- the raw PDF hash (SHA-256)
Important: AI output is treated as untrusted input until it passes deterministic checks.
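As a rough sketch of what gets stored at this step, the snippet below hashes the raw PDF bytes with SHA-256 and wraps the extraction output in a record that is marked untrusted by default. The `TriageRecord` shape and its `trusted` flag are hypothetical, introduced only for illustration.

```python
import hashlib
from dataclasses import dataclass

def pdf_fingerprint(pdf_bytes: bytes) -> str:
    """SHA-256 over the file content; the filename plays no part in the hash."""
    return hashlib.sha256(pdf_bytes).hexdigest()

@dataclass(frozen=True)
class TriageRecord:
    # Hypothetical record shape for what is stored after AI triage.
    fields: dict          # extracted fields, exactly as returned by the OCR/LLM
    confidence: dict      # per-field confidence scores
    pdf_sha256: str       # fingerprint of the raw PDF
    trusted: bool = False # stays False until deterministic checks pass
```

Note that a content hash only catches byte-identical re-sends. A re-exported or re-scanned PDF will hash differently, which is why the hash is one signal among several.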
4) Deterministic duplicate detection (policy)
Run multiple checks (any hit → “potential duplicate”):
- Exact match: (vendor_id, normalized_invoice_number) exists in ERP
- Near match: invoice number similarity above threshold AND amount matches
- Document hash match: same PDF hash seen in last N days
- PO/receipt match: same PO + amount + date window
REST/HTTP/Webhook actions:
- Query ERP for existing bills by vendor + invoice number
- Query internal “invoice fingerprint” table (could be ERP custom table or internal DB)
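Here is one way the four checks could look once the ERP and fingerprint queries have returned candidate records. Treat it as a sketch under assumptions: the `Invoice` shape, the 0.85 similarity threshold, the window sizes, and scoping the near match to the same vendor are all illustrative choices, and `difflib.SequenceMatcher` is just one simple similarity measure.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from difflib import SequenceMatcher
from typing import Iterable, Optional

@dataclass(frozen=True)
class Invoice:
    vendor_id: str
    invoice_number: str          # already normalized (uppercase, no spaces/hyphens)
    amount: Decimal
    invoice_date: date
    received_on: date
    pdf_sha256: str
    po_number: Optional[str] = None

def duplicate_signals(new: Invoice, existing: Iterable[Invoice], *,
                      similarity_threshold: float = 0.85,
                      hash_window_days: int = 90,
                      date_window_days: int = 30) -> set:
    """Return the set of duplicate signals raised by comparing `new` to known invoices."""
    signals = set()
    for old in existing:
        same_vendor = old.vendor_id == new.vendor_id
        if same_vendor and old.invoice_number == new.invoice_number:
            signals.add("exact_match")
        elif same_vendor and old.amount == new.amount:
            ratio = SequenceMatcher(None, old.invoice_number, new.invoice_number).ratio()
            if ratio >= similarity_threshold:
                signals.add("near_match")
        if (old.pdf_sha256 == new.pdf_sha256
                and 0 <= (new.received_on - old.received_on).days <= hash_window_days):
            signals.add("hash_match")
        if (new.po_number and old.po_number == new.po_number
                and old.amount == new.amount
                and abs((new.invoice_date - old.invoice_date).days) <= date_window_days):
            signals.add("po_match")
    return signals
```

Returning the set of signals rather than a single boolean lets the approval step count them and show the approver which checks fired.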
5) Approvals (human or policy-based)
Route based on risk tier:
- Low risk: no duplicate signals + high extraction confidence → auto-create bill on payment hold
- Medium risk: one duplicate signal or low confidence → require AP lead approval
- High risk: multiple duplicate signals or vendor is “high-risk” → require AP lead + Finance manager approval
Autom Mate library: Microsoft Teams (approval request + buttons)
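The routing rules above are simple enough to write as a pure function, which makes them easy to test and audit. In this sketch the 0.95 confidence floor and the approver role names are assumptions, not values from any product.

```python
from enum import Enum

class RiskTier(Enum):
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"

# Hypothetical role names for the approval chain per tier.
REQUIRED_APPROVERS = {
    RiskTier.LOW: [],                               # auto-create on payment hold
    RiskTier.MEDIUM: ["ap_lead"],
    RiskTier.HIGH: ["ap_lead", "finance_manager"],
}

def risk_tier(signal_count: int, min_confidence: float, high_risk_vendor: bool,
              confidence_floor: float = 0.95) -> RiskTier:
    """Map duplicate signals, extraction confidence, and vendor risk to a tier."""
    if signal_count >= 2 or high_risk_vendor:
        return RiskTier.HIGH
    if signal_count == 1 or min_confidence < confidence_floor:
        return RiskTier.MEDIUM
    return RiskTier.LOW
```

Because the function has no side effects, the same inputs always produce the same tier, which is the property you want when someone asks later why an invoice skipped approval.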
6) Deterministic execution (the key part)
After approval (or policy auto-approval):
- REST/HTTP/Webhook action: Create bill in ERP with:
- idempotency key = vendor_id + normalized_invoice_number + amount
- default payment_hold = true
- REST/HTTP/Webhook action: Attach the original PDF + extracted metadata to the ERP record
- REST/HTTP/Webhook action: If explicitly approved to pay, remove hold / schedule payment
This keeps the “write” steps deterministic and repeatable, even if the trigger retries.
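A minimal sketch of the idempotency key and the create-bill request follows. The `Idempotency-Key` header name, the payload field names, and rounding the amount to two decimals are assumptions; your ERP's API may expect the key in a different place, and some currencies use a different number of minor units.

```python
import hashlib
from decimal import Decimal

def idempotency_key(vendor_id: str, normalized_invoice_number: str, amount: Decimal) -> str:
    """Derive a stable key so retries of the same bill map to the same ERP write."""
    # Quantize so Decimal("150") and Decimal("150.00") produce the same key.
    canonical = f"{vendor_id}|{normalized_invoice_number}|{amount.quantize(Decimal('0.01'))}"
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def build_create_bill_request(vendor_id: str, normalized_invoice_number: str,
                              amount: Decimal) -> dict:
    """Build (but do not send) a hypothetical create-bill request."""
    key = idempotency_key(vendor_id, normalized_invoice_number, amount)
    return {
        "headers": {"Idempotency-Key": key},  # header name is an assumption
        "json": {
            "vendor_id": vendor_id,
            "invoice_number": normalized_invoice_number,
            "amount": str(amount.quantize(Decimal("0.01"))),
            "payment_hold": True,  # bills are always created on hold
        },
    }
```

If the ERP does not honor idempotency keys natively, the same key can be stored in the fingerprint table and checked before each write.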
7) Logging / audit trail
Autom Mate logs should capture:
- trigger payload (email headers / webhook id)
- AI extraction output + confidence
- duplicate-check queries + results
- approval chain (who approved, when, what they saw)
- ERP request/response ids
Autom Mate supports systematic event logging and monitoring for auditability.
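If you also mirror these events into your own store, one structured line per step keeps the post-mortem in a single place. The record shape below is hypothetical and is not Autom Mate's log format; the `correlation_id` is assumed to be the webhook id or email message id from the trigger.

```python
import json
from datetime import datetime, timezone

def audit_event(stage: str, correlation_id: str, detail: dict) -> str:
    """Serialize one audit event as a JSON line with a UTC timestamp."""
    record = {
        "ts": datetime.now(timezone.utc).isoformat(),
        "stage": stage,                    # e.g. "trigger", "duplicate_check", "approval"
        "correlation_id": correlation_id,  # ties all events for one invoice together
        "detail": detail,
    }
    # default=str keeps Decimal/date values serializable without custom encoders.
    return json.dumps(record, sort_keys=True, default=str)
```

For example, `audit_event("approval", "wh-123", {"approver": "ap_lead", "decision": "duplicate"})` yields a single line that can be appended to a log file or table.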
8) Exception handling
- If ERP create fails after approval:
- open an exception task in Teams with the error + retry button
- If bill created but attachment upload fails:
- keep bill on hold
- retry attachment upload
- If a duplicate is detected after bill creation:
- automatically re-apply payment hold
- notify AP lead
- create a reversal/void task (policy-controlled)
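For the attachment case, a bounded retry with backoff is usually enough. The sketch below assumes a hypothetical `upload` callable that raises on failure; the attempt count and delays are illustrative defaults.

```python
import time
from typing import Callable

def upload_with_retry(upload: Callable[[], None], *, attempts: int = 3,
                      base_delay_s: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try the upload up to `attempts` times with exponential backoff.

    Returns True on success. On False, the caller should leave the bill
    on payment hold and open an exception task.
    """
    for attempt in range(1, attempts + 1):
        try:
            upload()
            return True
        except Exception:
            if attempt < attempts:
                sleep(base_delay_s * 2 ** (attempt - 1))
    return False
```

The function never touches the payment hold itself, so a failed upload cannot release a payment by accident.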
Two mini examples
Example A — Vendor resend causes duplicate
- Trigger: vendor emails the same invoice twice (different PDF filename)
- AI extracts the same invoice number, but the second time reads INV-1008 as INV-100B
- Duplicate controls catch:
- near-match invoice number + exact amount match
- Outcome:
- Autom Mate routes to AP lead approval in Teams
- AP lead marks as duplicate → no ERP write, audit log preserved
Example B — Legit invoice number reuse across subsidiaries
- Trigger: vendor uses same invoice number for two different legal entities
- Duplicate controls catch exact invoice number match, but vendor_id differs by entity
- Outcome:
- Autom Mate requires Finance manager approval
- Approved → bill created with correct entity + hold removed only after second approval
Discussion questions
- What’s your “minimum safe” duplicate policy: exact match only, or do you also use fuzzy matching + amount/date windows?
- Do you prefer auto-create on hold (then approve release) or approve before create for medium-risk invoices?