Problem (showing up a lot right now)
CI/CD supply-chain incidents keep repeating a familiar pattern: a compromised GitHub Action / workflow path leaks secrets (PATs, cloud keys, package registry tokens) into logs or exfiltrates them, and teams respond with a mix of panic-rotations, ad-hoc repo permission changes, and “AI says rotate everything” playbooks.
The hard part isn’t knowing what to do—it’s doing it safely, consistently, and provably across dozens/hundreds of repos without breaking builds or locking out responders.
Proposed community topic
A governed, deterministic “CI/CD secret exposure response” pattern in Autom Mate that:
- accepts high-signal alerts (from GitHub / secret scanners / SIEM)
- validates scope and blast radius
- requires explicit approvals for destructive steps
- executes rotations and permission changes deterministically
- produces an audit trail you can hand to security + compliance
End-to-end workflow (trigger → validation → approvals → deterministic execution → logging → exception handling)
1) Trigger
- Trigger on an inbound Webhook from your detector (e.g., secret scanner, SIEM, or GitHub security alert webhook).
- Payload includes: repo, workflow name, commit SHA, suspected secret type, detection confidence, and evidence pointers.
2) Validation (guardrails before any action)
- Enrich and validate via REST/HTTP calls:
- Confirm the repo exists and is in-scope (org allowlist).
- Confirm the workflow run/commit SHA matches the alert.
- Classify the secret type (GitHub token vs cloud key vs package registry token).
- Check whether the secret is actually referenced by active pipelines (to avoid unnecessary outages).
- If confidence < threshold or scope is ambiguous → route to “human review only” path.
3) Approvals (two-stage)
- Approval A (Security): approve the containment plan (what will be revoked/rotated, what will be paused).
- Approval B (Service Owner): approve downtime-impacting steps (e.g., disabling workflows, rotating prod deploy keys).
4) Deterministic execution (no free-form AI actions)
Execute a fixed, pre-approved runbook (all steps parameterized, no improvisation):
- Containment:
- Reduce token permissions / revoke compromised token(s) where possible.
- Temporarily disable the affected workflow(s) or block the risky trigger path.
- Rotation:
- Rotate the specific secret(s) (package registry token, cloud key, etc.).
- Update secret stores / repo secrets.
- Recovery:
- Re-enable workflows after a clean test run.
- Open/append a ticket with the full action log.
5) Logging & auditability
- Write a structured “incident action ledger” per run:
- who approved what
- exact API calls executed
- before/after state (where safe)
- timestamps + correlation IDs
6) Exception handling / rollback
- If rotation breaks builds:
- auto-create a high-priority ticket and page the owner.
- roll back only non-security-critical changes (e.g., re-enable workflow with restricted permissions), while keeping revoked tokens revoked.
- If an API call fails mid-run:
- stop the run, mark partial completion, and require re-approval to continue.
Two mini examples
Mini example 1: “Compromised GitHub Action leaked secrets into logs”
- Trigger: alert includes repo + run ID.
- Validation: confirm the run used a vulnerable action version.
- Approval: Security approves revoking GitHub tokens + rotating package registry token.
- Execution: revoke token(s), rotate npm token, update repo secrets, re-run pipeline.
Mini example 2: “Suspicious workflow added to a repo (possible GhostAction-style implant)”
- Trigger: webhook from repo monitoring tool.
- Validation: diff workflow YAML, confirm new outbound network destinations.
- Approval: Service owner approves disabling workflows; Security approves org-wide token review.
- Execution: disable workflow, quarantine branch, rotate only the secrets that workflow could access.
Discussion questions
- What’s your minimum approval bar for revocation vs rotation vs workflow disablement?
- Do you prefer “pause everything” containment, or “surgical containment” based on workflow permission scoping?