Zero‑Downtime Zapier→Make/n8n Migration Guide (14‑Day Shadow‑Run + 15‑Minute Cutover)

A field‑tested, numbers‑first playbook for solo operators moving a revenue‑critical Zap to Make or n8n without downtime. You’ll instrument a 14‑day shadow‑run with correlation IDs and parity gates, keep a risk register, then execute a 15‑minute cutover with a webhook toggle, a one‑click rollback, and a post‑cutover validation script. All cost math is framed around Zapier tasks vs Make credits vs n8n executions, with Q2‑2026 plan facts noted so you can price the move before you flip the switch.

1) Gate 0 — Readiness and SSO parity (Pass/No‑Go)

Use this gate to decide whether to schedule a shadow‑run at all. If any Pass/No‑Go gate fails, fix it first.

Pass/No‑Go gates:

  • SSO parity required? Choose a target plan that actually supports it.
    • Zapier SSO: Team (and up).
    • Make SSO: Enterprise only.
    • n8n SSO: Business/Enterprise.
  • Webhook throughput headroom during parallel run:
    • Zapier incoming webhooks: ~20,000 requests/5 minutes per user; bursts may return 200 with delayed processing. Design retries with backoff.
    • Make incoming webhooks: ~300 requests/10 seconds per scenario; above that returns 429. Fan‑out or shard if needed.
    • n8n: no single posted SLA for inbound retries; rely on upstream retries + your relay’s queue.
  • Data protection: If PII/PHI is present, confirm data processing terms on the target and redact logs.
  • Vendor lock‑in: If you depend on Premium app steps, special search semantics, or MCP/Agent steps in Zapier, confirm the target has equivalents or plan a custom node/module.

Outcome: Green‑light only when SSO, throughput, and compliant logging are achievable on the chosen target plan.

2) Translation map — from Zap to Make/n8n

Inventory your current Zap and map each step to target constructs. Label risk, idempotency, and cost impact.

Make a one‑row‑per‑step table with these columns:

  • Step # / Name
  • Type: Trigger | Read | Search/Lookup | Transform | Write (creates) | Write (updates) | Branch/Filter | Delay/Rate‑limit
  • Idempotent? yes/no (writes are usually no; plan to guard)
  • External rate limit? e.g., 100 req/min
  • Zapier task cost? yes/no (Filters/Paths/Tables/Forms don’t count tasks)
  • Target equivalent (Make Module | n8n Node)
  • Notes on differences (pagination, search returns, null handling)
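
One way to keep the translation map machine‑checkable is to store each row as a typed record. The sketch below is illustrative only: the StepRow field names, the two helper functions, and the sample rows are assumptions made for this example, not an export format of Zapier, Make, or n8n.

```typescript
// Hypothetical shape for one translation-map row; names are illustrative.
type StepType =
  | "trigger" | "read" | "search" | "transform"
  | "write_create" | "write_update" | "branch" | "delay";

interface StepRow {
  step: number;
  name: string;
  type: StepType;
  idempotent: boolean;
  externalRateLimit?: string;   // e.g. "100 req/min"
  countsAsZapierTask: boolean;
  targetEquivalent: string;     // Make module or n8n node
  notes?: string;
}

// Writes that are not idempotent need a guard before cutover.
function unguardedWrites(rows: StepRow[]): StepRow[] {
  return rows.filter(
    (r) => (r.type === "write_create" || r.type === "write_update") && !r.idempotent,
  );
}

// Upper bound for zapier_tasks_per_run: only successful actions on the
// path actually taken are billed, so observed numbers may be lower.
function zapierTasksPerRun(rows: StepRow[]): number {
  return rows.filter((r) => r.countsAsZapierTask).length;
}

// Sample rows, made up for illustration.
const exampleMap: StepRow[] = [
  { step: 1, name: "Catch Hook", type: "trigger", idempotent: true,
    countsAsZapierTask: false, targetEquivalent: "Make Custom webhook / n8n Webhook" },
  { step: 2, name: "Only paid orders", type: "branch", idempotent: true,
    countsAsZapierTask: false, targetEquivalent: "Make Filter / n8n IF" },
  { step: 3, name: "Find or Create Record", type: "write_create", idempotent: false,
    externalRateLimit: "100 req/min", countsAsZapierTask: true,
    targetEquivalent: "Search + Create behind a router / IF" },
];
```

Running unguardedWrites over the map gives you the list of steps that still need an idempotency key or a fingerprint pre‑check.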

Example mappings:

  • Zapier “Catch Hook” → Make “Custom webhook” OR n8n “Webhook” node (use production URL for live, test URL for bench).
  • Zapier “Find or Create Record” →
    • Make: [Search module] → [Create module if empty] with a router branch.
    • n8n: [HTTP/DB/Search node] → IF → [Create] with an idempotency key.
  • Zapier “Paths/Filters” → Make “Router/Filter” or n8n “IF/Switch”.

Outcome: A translation map that exposes cost drivers (writes, AI/code execution) and reliability risks (search ambiguity, rate‑limited APIs).

3) Shadow‑run architecture + instrumentation

Run production traffic through your source Zap while duplicating the same payloads to the target workflow. Log both sides and compare.

Shadow‑run architecture patterns:

  • Dual‑post relay (recommended): Point the upstream system to a small relay (Cloudflare Worker, API Gateway, Fly.io) that:
    1. Adds a correlation ID (ULID/UUIDv7) and Idempotency-Key header.
    2. Posts payload A to Zapier (authoritative) and payload B to Make/n8n (shadow).
    3. Writes a compact log row per event.
  • Native dual hooks: If the source can notify two webhooks, send to Zapier and target directly and add the correlation ID in a pre‑hook function.
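
The dual‑post relay can be sketched as a pure function that decides what to send where, leaving the actual HTTP calls to your runtime (Worker, Lambda, or similar). Everything named here is an assumption for illustration: the RelayConfig shape, the X-Correlation-Id and X-Shadow-Mode header names, and the placeholder URLs. The body is forwarded untouched so the existing Zap keeps seeing the payload it expects.

```typescript
type Destination = "zapier" | "target";

interface RelayConfig {
  primary: Destination;               // authoritative handler
  mirror: Destination | "none";       // receives the shadow copy
  urls: Record<Destination, string>;  // your webhook URLs
}

interface Delivery {
  role: "primary" | "shadow";
  url: string;
  headers: Record<string, string>;
  body: string;
}

// Build the list of outbound posts for one inbound event.
function planDeliveries(rawBody: string, corrId: string, cfg: RelayConfig): Delivery[] {
  const build = (role: Delivery["role"], dest: Destination): Delivery => ({
    role,
    url: cfg.urls[dest],
    headers: {
      "Content-Type": "application/json",
      "Idempotency-Key": corrId,
      "X-Correlation-Id": corrId,                          // hypothetical header name
      "X-Shadow-Mode": role === "shadow" ? "true" : "false", // hypothetical header name
    },
    body: rawBody,
  });

  const deliveries: Delivery[] = [build("primary", cfg.primary)];
  const mirror = cfg.mirror;
  if (mirror !== "none" && mirror !== cfg.primary) {
    deliveries.push(build("shadow", mirror));
  }
  return deliveries;
}

// Shadow-run configuration: Zapier is authoritative, the target only mirrors.
const shadowRunConfig: RelayConfig = {
  primary: "zapier",
  mirror: "target",
  urls: {
    zapier: "https://zapier.example.invalid/hook",   // placeholder
    target: "https://target.example.invalid/hook",   // placeholder
  },
};
```

If the receiving platform cannot read request headers in your setup, inject corr_id into the body instead and account for that in the target's field mapping.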

Correlation ID spec:

  • Key: corr_id (ULID), deterministic when a natural key exists (e.g., order_id|created_at → hash → ULID), else random per event.
  • Propagate: Include on every outbound call and write to every resulting record’s "external_id" or "note" field when possible.
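
A deterministic corr_id can be derived roughly as follows. This is a sketch under stated assumptions: it produces a ULID‑formatted string (10 timestamp characters plus 16 characters taken from a SHA‑256 of the natural key), which gives up the randomness a normal ULID carries in exchange for repeatability. The function names are hypothetical, and it relies on Node's built‑in crypto module.

```typescript
import { createHash } from "node:crypto";

const CROCKFORD = "0123456789ABCDEFGHJKMNPQRSTVWXYZ";

// Encode a millisecond timestamp as the 10-character ULID time component.
function encodeTime(ms: number): string {
  let out = "";
  let t = Math.floor(ms);
  for (let i = 0; i < 10; i++) {
    out = CROCKFORD.charAt(t % 32) + out;
    t = Math.floor(t / 32);
  }
  return out;
}

// Encode bytes as Crockford base32, 5 bits per character (10 bytes -> 16 chars).
function encodeBytes(bytes: Uint8Array): string {
  let out = "";
  let value = 0;
  let bits = 0;
  for (let i = 0; i < bytes.length; i++) {
    value = (value << 8) | (bytes[i] ?? 0);
    bits += 8;
    while (bits >= 5) {
      out += CROCKFORD.charAt((value >>> (bits - 5)) & 31);
      bits -= 5;
    }
    value &= (1 << bits) - 1; // keep only the bits not yet emitted
  }
  return out;
}

// Same natural key + same created_at always yields the same corr_id.
function deterministicCorrId(naturalKey: string, createdAtMs: number): string {
  const digest = createHash("sha256").update(naturalKey).digest();
  return encodeTime(createdAtMs) + encodeBytes(digest.subarray(0, 10));
}

// Example: order_id|created_at as the natural key (values are made up).
const createdAt = "2026-04-01T12:00:00Z";
const exampleCorrId = deterministicCorrId(`order_1001|${createdAt}`, Date.parse(createdAt));
```

Because the time component comes from created_at, these IDs still sort by event time. When no natural key exists, fall back to a random ULID or UUIDv7 from a library you trust.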

Log schema (Notion/Airtable/DB):

  • timestamp
  • corr_id
  • source_summary (e.g., Zap step path taken)
  • target_summary (scenario/workflow path)
  • result: pass | soft_fail | hard_fail | skipped
  • latency_ms: zapier_latency_ms | target_latency_ms per event (derive p50/p95 per platform on the dashboard)
  • write_fingerprint: hash of the would‑be side effect (e.g., email|amount|invoice_date)
  • idempotency_key
  • failure_class: mapping_error | rate_limited | auth | timeout | data_contract | other

Parity assertions (evaluate per event):

  • Count: 1 input → exactly 1 target run (no dupes).
  • Branch parity: the same logical branch taken.
  • Field parity: selected fields equal after normalization (trim, lower, dates to ISO).
  • Side‑effect fingerprint parity: match fingerprints; in shadow, direct writes OFF or routed to a sandbox.
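
Field parity and fingerprint parity both depend on normalizing values the same way on each side. The following is a minimal sketch, assuming three hypothetical normalizer names (trim, lower, iso_date) and a SHA‑256 over pipe‑joined fields via Node's built‑in crypto module; adapt the field list to whatever defines the side effect in your workflow.

```typescript
import { createHash } from "node:crypto";

type Normalizer = "trim" | "lower" | "iso_date";

// Apply normalizers in order; null/undefined become the empty string.
function normalize(value: unknown, steps: Normalizer[]): string {
  let s = value === null || value === undefined ? "" : String(value);
  for (const step of steps) {
    if (step === "trim") {
      s = s.trim();
    } else if (step === "lower") {
      s = s.toLowerCase();
    } else if (step === "iso_date") {
      const d = new Date(s);
      if (!isNaN(d.getTime())) s = d.toISOString(); // leave unparseable values as-is
    }
  }
  return s;
}

// Hash of the would-be side effect, e.g. email|amount|invoice_date.
// Assumes the fields themselves never contain "|".
function writeFingerprint(parts: string[]): string {
  return createHash("sha256").update(parts.join("|")).digest("hex");
}

// The two sides differ only in case, whitespace, and date format.
const sourceFp = writeFingerprint([
  normalize(" Ada@Example.com ", ["trim", "lower"]),
  normalize("49.00", ["trim"]),
  normalize("2026-04-01", ["iso_date"]),
]);
const targetFp = writeFingerprint([
  normalize("ada@example.com", ["trim", "lower"]),
  normalize("49.00 ", ["trim"]),
  normalize("2026-04-01T00:00:00Z", ["iso_date"]),
]);
const fingerprintsMatch = sourceFp === targetFp;
```

Feed only ISO‑style date strings into iso_date; other formats parse differently across runtimes and time zones.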

Safety controls:

  • Shadow write suppression: in Make/n8n, put every write behind a SHADOW_MODE guard. In Make, use a Router that sends shadow traffic to a "log‑only" module; in n8n, gate the write with an IF node.
  • Idempotency: on write modules/HTTP calls, include Idempotency-Key: {corr_id} when the API supports it; otherwise pre‑check by fingerprint.
  • Backpressure: respect Zapier’s delayed processing note and Make’s 429 window; relay retries with exponential backoff and jitter.
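
The backoff‑with‑jitter rule for the relay can be sketched as below. This uses the common "full jitter" variant (a random delay between zero and an exponentially growing ceiling); the base, cap, and the choice of which status codes to retry are assumptions to tune for your traffic.

```typescript
// Retry on rate limiting and server errors; other 4xx are treated as permanent.
function isRetryable(status: number): boolean {
  return status === 429 || status >= 500;
}

// Full-jitter backoff: random delay in [0, min(cap, base * 2^attempt)).
// `rand` is injectable so the schedule can be tested deterministically.
function backoffDelayMs(
  attempt: number,
  baseMs: number = 500,
  capMs: number = 30000,
  rand: () => number = Math.random,
): number {
  const ceiling = Math.min(capMs, baseMs * Math.pow(2, attempt));
  return Math.floor(rand() * ceiling);
}

// Delays for the first five attempts with a fixed "random" value of 0.5.
const sampleDelays = [0, 1, 2, 3, 4].map((a) => backoffDelayMs(a, 500, 30000, () => 0.5));
```

Note that a 200 from Zapier during a burst may still mean delayed processing, so a retry decision based on status codes alone does not cover that case; the parity log is what catches it.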

4) 14‑day operating plan + parity gates

Run the parallel system for two weeks to capture real edge cases and weekly cycles.

Daily loop (15–20 min):

  • Review dashboard: events, pass rate, hard_fail rate, p95 latency deltas (target minus Zapier), top failure classes.
  • Triage 10 newest hard_fails; tag root cause; ship a fix if under 30 minutes.
  • Sample 10 passes; manually eyeball the fingerprints and transformed fields.

Parity thresholds to meet before cutover:

  • Event coverage ≥ 95% of typical weekly volume.
  • Pass rate (no diffs) ≥ 98.5% for 3 consecutive days.
  • Hard_fail rate ≤ 0.2% daily, none in the last 24h for P0 paths.
  • p95 latency delta ≤ +500 ms vs Zapier for customer‑visible steps.
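
The thresholds above can be turned into a small gate check over daily rollups. This sketch makes several assumptions that the text leaves open: coverage is measured over the most recent 7 days, the hard_fail and latency limits are applied to the same 3‑day window as the pass rate, and "last 24h" is approximated by the most recent day. The DayStats shape is hypothetical.

```typescript
interface DayStats {
  events: number;
  passes: number;       // events with no diffs
  hardFails: number;
  p0HardFails: number;  // hard fails on P0 paths
  p95DeltaMs: number;   // target p95 minus Zapier p95
}

// Returns the names of failed gates; an empty array means go.
function evaluateParityGates(days: DayStats[], typicalWeeklyEvents: number): string[] {
  const failures: string[] = [];
  const week = days.slice(-7);
  const last3 = days.slice(-3);
  const lastDay = days[days.length - 1];

  const weekEvents = week.reduce((sum, d) => sum + d.events, 0);
  if (weekEvents < 0.95 * typicalWeeklyEvents) failures.push("coverage");

  const passOk =
    last3.length === 3 &&
    last3.every((d) => d.events > 0 && d.passes / d.events >= 0.985);
  if (!passOk) failures.push("pass_rate");

  if (last3.some((d) => d.events > 0 && d.hardFails / d.events > 0.002)) {
    failures.push("hard_fail_rate");
  }
  if (!lastDay || lastDay.p0HardFails > 0) failures.push("p0_hard_fail");
  if (last3.some((d) => d.p95DeltaMs > 500)) failures.push("latency");

  return failures;
}
```

Run it on Day 13 with your real rollups; anything in the returned list is an agenda item for the Go/No‑Go meeting.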

Minimal dashboard (metrics to chart):

  • Events/day, pass%, soft_fail%, hard_fail%.
  • p50/p95 latency by platform.
  • Top 3 failure classes with counts.

Risk register (copy this table):

  • risk_id | description | likelihood (L/M/H) | impact (L/M/H) | owner | mitigation | trigger_condition | rollback_action | status

Go/No‑Go meeting (Day 13):

  • Confirm SSO on target is live (if required) and enforced.
  • Confirm rate‑limit headroom at observed p95 + 20%.
  • Dry‑run the rollback toggle (see section 7) and measure time to restore (target ≤ 60 seconds).
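
The headroom check amounts to comparing an observed p95 request count per rate‑limit window, plus 20%, against the platform limit. A minimal sketch, assuming you already have per‑window request counts from your relay log (the function names and the nearest‑rank percentile method are choices made for this example):

```typescript
// Nearest-rank 95th percentile of a list of numbers.
function p95(values: number[]): number {
  if (values.length === 0) return 0;
  const sorted = values.slice().sort((a, b) => a - b);
  const idx = Math.ceil((95 * sorted.length) / 100) - 1;
  return sorted[idx] ?? 0;
}

// True when observed p95 plus the safety margin still fits under the limit.
function hasHeadroom(observedP95: number, limitPerWindow: number, margin: number = 0.2): boolean {
  return observedP95 * (1 + margin) <= limitPerWindow;
}

// Example with made-up counts per 10-second window, checked against
// the ~300 requests/10s per-scenario figure quoted for Make above.
const countsPer10s = [40, 55, 61, 48, 230, 72, 66, 90, 120, 58];
const okForMake = hasHeadroom(p95(countsPer10s), 300);
```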

5) Cost model — tasks vs credits vs executions (with 2026 facts)

Use this sheet to model current and target spend. Build one row per workflow.

Columns (keep names verbatim):

  • workflow_name
  • runs_per_month
  • zapier_tasks_per_run (successful action steps only; Filters/Paths/Tables/Forms = 0)
  • zapier_included_tasks | zapier_plan_cost_usd | zapier_overage_cost_per_task_usd (if applicable)
  • make_credits_per_run (≈ sum of modules; Code App ~ 2 credits/1s exec)
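  • make_included_credits | make_plan_cost_usd | make_overage_cost_per_credit_usd (if applicable)
  • n8n_executions_per_run (1 per workflow trigger + any sub‑workflow calls)
  • n8n_included_executions | n8n_plan_cost_usd | n8n_overage_cost_per_execution_usd (if applicable)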
  • Derived: zapier_monthly_cost, make_monthly_cost, n8n_monthly_cost, winner

Formulas (paste into your sheet):

zapier_monthly_tasks = runs_per_month * zapier_tasks_per_run
zapier_overage_tasks = MAX(zapier_monthly_tasks - zapier_included_tasks, 0)
zapier_monthly_cost = zapier_plan_cost_usd + zapier_overage_tasks * zapier_overage_cost_per_task_usd

make_monthly_credits = runs_per_month * make_credits_per_run
make_monthly_cost = make_plan_cost_usd + MAX(make_monthly_credits - make_included_credits, 0) * make_overage_cost_per_credit_usd

n8n_monthly_executions = runs_per_month * n8n_executions_per_run
n8n_monthly_cost = n8n_plan_cost_usd + MAX(n8n_monthly_executions - n8n_included_executions, 0) * n8n_overage_cost_per_execution_usd

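min_monthly_cost = MIN(zapier_monthly_cost, make_monthly_cost, n8n_monthly_cost)
winner = IF(zapier_monthly_cost = min_monthly_cost, "zapier", IF(make_monthly_cost = min_monthly_cost, "make", "n8n"))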
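
For a quick sanity check outside the spreadsheet, the same arithmetic can be sketched in code. All prices, quotas, and per‑run counts below are placeholders invented for the example, not real plan figures; the PlatformPlan shape and function names are likewise hypothetical.

```typescript
interface PlatformPlan {
  planCostUsd: number;
  includedUnits: number;      // tasks, credits, or executions in the plan
  overagePerUnitUsd: number;  // 0 if the plan has no metered overage
  unitsPerRun: number;
}

// plan cost + max(usage - included, 0) * overage price
function monthlyCost(runsPerMonth: number, plan: PlatformPlan): number {
  const units = runsPerMonth * plan.unitsPerRun;
  const overageUnits = Math.max(units - plan.includedUnits, 0);
  return plan.planCostUsd + overageUnits * plan.overagePerUnitUsd;
}

// Name of the platform with the lowest monthly cost.
function cheapest(costs: Record<string, number>): string {
  let winner = "";
  let best = Infinity;
  for (const name of Object.keys(costs)) {
    const cost = costs[name] ?? Infinity;
    if (cost < best) {
      best = cost;
      winner = name;
    }
  }
  return winner;
}

// Placeholder inputs only; substitute observed numbers from your shadow-run.
const runsPerMonth = 5000;
const monthlyCosts: Record<string, number> = {
  zapier: monthlyCost(runsPerMonth, { planCostUsd: 100, includedUnits: 15000, overagePerUnitUsd: 0.01, unitsPerRun: 4 }),
  make: monthlyCost(runsPerMonth, { planCostUsd: 60, includedUnits: 40000, overagePerUnitUsd: 0.002, unitsPerRun: 6 }),
  n8n: monthlyCost(runsPerMonth, { planCostUsd: 50, includedUnits: 10000, overagePerUnitUsd: 0.005, unitsPerRun: 1 }),
};
const cheapestPlatform = cheapest(monthlyCosts);
```

With these made‑up inputs, only the Zapier row goes into overage (20,000 tasks against 15,000 included), which is the kind of effect the per‑run multiplier is meant to expose.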

Seed facts to note (Q2‑2026—verify at purchase time):

  • Zapier: Tasks bill only for successful action steps; Filters/Paths/Tables/Forms do not consume tasks. Webhook triggers on Professional, Team, Enterprise. SSO on Team+.
  • Make: Credit‑based; most modules ≈ 1 credit/action. Free includes 1,000 credits/mo. Paid plans allow 1‑minute scheduling. Webhook throughput ~300 requests/10s per scenario. SSO Enterprise‑only.
  • n8n Cloud: Execution‑based quotas (e.g., Pro ~10k; Business ~40k). SSO on Business/Enterprise.

Tip: In the shadow‑run, log tasks/credits/executions per event so your sheet uses observed numbers—not guesses.

6) Cutover window — 15‑minute runbook

A precise, time‑boxed plan to switch traffic with minimal risk. Practice this once the day before.

Prep (T‑30 to T‑10 minutes):

  • Freeze deploys to all related systems.
  • Confirm dashboards/alerts active; open your parity log view.
  • Pre‑enable target scenario/workflow in “ready” state (live webhook URL responding 200, writes guarded by a feature flag).
  • If inbound traffic reaches you through your own domain, shorten DNS TTLs; better, use a proxy/flag switch (see section 7) so you don’t depend on DNS propagation.

Cutover (15‑minute window):

  1. T‑00: Flip inbound route from Zapier to target.
    • If using a relay: toggle PRIMARY_DESTINATION = target (feature flag or env var) and publish.
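    • If using direct hooks: update the source system’s webhook URL to the target, and turn the Zap OFF in the same step so no event is handled twice.
  2. T+01: In Zapier: confirm the Zap is OFF. In Make/n8n: disable the SHADOW_MODE flag so writes are now real. Events that arrived between T‑00 and this step ran with writes suppressed, so replay them by corr_id from your relay log or the source app’s event log.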
  3. T+03: Trigger a known‑good test event end‑to‑end; verify target side‑effects.
  4. T+05: Sample 10 live events; check the parity dashboard, which now compares target output against your recorded expectations rather than against Zapier. Watch p95 latency.
  5. T+10: Announce “cutover complete” in your change log; keep rollback toggle armed.

Success criteria during window:

  • No 4xx/5xx spike on the target webhook.
  • First 10 live events complete with correct side‑effects within expected latency.
  • No duplicate writes detected in monitoring (idempotency guard working).

7) One‑click rollback patterns

Your safety net. Aim for ≤ 60 seconds to restore Zapier as the handler if needed.

Pattern A — Proxy flag (fastest):

  • Front all inbound webhooks with a proxy (Cloudflare Worker, API Gateway, Fly.io). Route by an env var/feature flag:
    • PRIMARY_DESTINATION = zapier|target
    • SECONDARY_MIRROR = zapier|target|none
  • One‑click rollback: flip the flag back to zapier, redeploy (Workers publish is near‑instant), and re‑enable the Zap.

Pattern B — Dual webhooks in source app:

  • If the upstream app supports multiple destinations, keep both registered. Rollback = disable target endpoint and re‑enable Zap.

Operational checklist for rollback:

  • Toggle route back to Zapier.
  • Re‑enable the Zap.
  • Re‑enter SHADOW_MODE on target to prevent writes while you investigate.
  • Rehydrate missed events: replay from your relay queue or source app’s event log into Zapier (respect Zapier’s webhook headroom and add jittered batches).
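
Replaying in jittered batches can be planned ahead of time so the rollback itself stays mechanical. The sketch below only computes a schedule; sending is left to your relay. Batch size, gap, and jitter values are assumptions to size against the webhook limits noted in section 1, and planReplay is a hypothetical helper name.

```typescript
interface ReplayBatch<T> {
  startAfterMs: number; // delay from the start of the replay
  events: T[];
}

// Split events into ordered batches, each offset by a fixed gap plus jitter.
// `rand` is injectable so the schedule can be tested deterministically.
function planReplay<T>(
  events: T[],
  batchSize: number,
  gapMs: number,
  jitterMs: number,
  rand: () => number = Math.random,
): ReplayBatch<T>[] {
  if (batchSize <= 0) throw new Error("batchSize must be positive");
  const batches: ReplayBatch<T>[] = [];
  for (let i = 0; i < events.length; i += batchSize) {
    const index = i / batchSize;
    batches.push({
      startAfterMs: index * gapMs + Math.floor(rand() * jitterMs),
      events: events.slice(i, i + batchSize),
    });
  }
  return batches;
}

// Five missed events, two per batch, one second apart, up to 200 ms jitter.
const replaySchedule = planReplay(["e1", "e2", "e3", "e4", "e5"], 2, 1000, 200, () => 0.5);
```

Replay each event with its original corr_id and Idempotency-Key so the duplicate checks on the receiving side still apply.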

Test it in rehearsal:

  • Time the full restore.
  • Verify that replay does not create dupes (idempotency/fingerprint checks on Zap side if available).

8) Post‑cutover validation script (outline + alerts)

Keep this simple and local—no need for a full test harness. The goal is to continuously sample recent events for correctness and latency.

Inputs:

  • events: last 100 corr_ids from the target’s webhook table.
  • assertions: JSON list of field checks and branch expectations from your translation map.

Script outline (Node/TS pseudo‑code):

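// Assumes your own helpers: getZapierShadowLog, getTargetLog, branchKey,
// normalize, and record. Diffs are collected instead of thrown so that one
// bad event never stops the loop or gets recorded as a pass.
for (const e of events) {
  const src = await getZapierShadowLog(e.corr_id)   // your log store
  const tgt = await getTargetLog(e.corr_id)
  const diffs: string[] = []

  if (!src || !tgt) {
    diffs.push('missing log row')
  } else {
    if (branchKey(src) !== branchKey(tgt)) diffs.push('branch parity')
    if (src.write_fingerprint !== tgt.write_fingerprint) diffs.push('fingerprint parity')

    // Field equality with normalizers
    for (const a of assertions) {
      const left = normalize(src.fields[a.key], a.normalize)
      const right = normalize(tgt.fields[a.key], a.normalize)
      if (left !== right) diffs.push(`field:${a.key}`)
    }
  }

  record({
    corr_id: e.corr_id,
    parity: diffs.length === 0 ? 'pass' : 'fail',
    diffs,
    latency_delta_ms: src && tgt ? tgt.latency_ms - src.latency_ms : null
  })
}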

Alerting:

  • Page/notify on: ≥ 3 hard fails in 5 minutes, or p95 delta > +750 ms for 10 minutes.
  • Auto‑open a ticket with the failure_class and top 3 fingerprints involved.
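
The two paging rules can be sketched as window checks over recent data. Assumptions in this example: hard‑fail timestamps and p95‑delta samples come from your own log store, the latency rule needs roughly one sample per minute, and "for 10 minutes" is read as every sample in the window exceeding the limit. Function and type names are hypothetical.

```typescript
// Rule 1: at least `threshold` hard fails inside the trailing window.
function hardFailAlert(
  failTimestampsMs: number[],
  nowMs: number,
  windowMs: number = 5 * 60 * 1000,
  threshold: number = 3,
): boolean {
  const recent = failTimestampsMs.filter((t) => t > nowMs - windowMs && t <= nowMs);
  return recent.length >= threshold;
}

interface LatencySample {
  atMs: number;
  p95DeltaMs: number; // target p95 minus baseline p95
}

// Rule 2: p95 delta above the limit for the whole trailing window.
// Requires a minimum number of samples so a single reading cannot page.
function latencyAlert(
  samples: LatencySample[],
  nowMs: number,
  windowMs: number = 10 * 60 * 1000,
  limitMs: number = 750,
  minSamples: number = 10,
): boolean {
  const recent = samples.filter((s) => s.atMs > nowMs - windowMs && s.atMs <= nowMs);
  return recent.length >= minSamples && recent.every((s) => s.p95DeltaMs > limitMs);
}

// Page when either rule fires.
function shouldPage(failTimestampsMs: number[], samples: LatencySample[], nowMs: number): boolean {
  return hardFailAlert(failTimestampsMs, nowMs) || latencyAlert(samples, nowMs);
}
```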

After 24–48 hours with no alerts, archive the logs, close out the risk register, and disarm the rollback toggle.

Bonus: Keep the relay in place in “mirror disabled” mode for 1 week. If you must roll back days later, you can flip the flag quickly without changing the webhook URL in the upstream app.