Jordan: April ninth. Wednesday night. I'm reconciling invoices — the monthly ritual where I pull up OpenAI, Anthropic, and Zapier side by side and pretend I'm not scared to look. And this time... the Zapier number doesn't make sense. I'm staring at a task count that's forty-two percent over my plan limit. Forty-two percent. Which means every task past the cap has been billed at one-point-two-five times the base rate. I didn't get a warning. I didn't get an email that said "hey, you're close." The Zaps just... kept running. And Zapier kept charging.
Jordan: So I start digging. Which Zaps burned through the overage? It's not the critical ones — not the client onboarding flows, not the invoice sync. It's a batch enrichment Zap I built for one client's CRM. Runs every six hours, hits an API, updates about three hundred records. Totally non-critical. Could have paused for a week and nobody would have noticed. But it didn't pause. Because I had no system telling it to pause. No alert at sixty percent. No throttle at eighty. No kill switch at a hundred. Just... an open tap running into a paid meter.
Jordan: That overage cost me about two hundred and thirty dollars. Not catastrophic. But here's what kept me up — I have twelve clients. If three of them had spiked the same way in the same month, that's nearly seven hundred dollars of margin I didn't budget for. Gone. Because a non-critical Zap didn't know it was non-critical.
Jordan: That Thursday morning I built the system I'm showing you today.
Jordan: The problem is not that AI costs are unpredictable. OpenAI publishes per-token prices. Anthropic publishes per-token prices. Zapier tells you exactly what a task costs past your plan limit — one-point-two-five times base. The problem is that nobody builds the wiring between those numbers and an actual stop signal. So today you're getting that wiring. One usage ledger that tracks tokens, tasks, and Make credits per client. Slack alerts at sixty, eighty, and a hundred percent of budget. Auto-throttle rules that slow non-critical flows before they spike. And a kill switch on every platform — Zapier, Make, n8n — that shuts down the stuff that doesn't matter before it eats the margin on the stuff that does.
Jordan: Okay, so here's what I was dealing with that Thursday morning. Twelve clients. Three cost surfaces per client — LLM tokens across OpenAI and Anthropic, Zapier tasks, and Make credits. And every one of those surfaces has a different unit, a different price, and a different way of telling you how much you've used. OpenAI gives you a usage dashboard with exports. Anthropic has their Usage and Cost API — which is actually great, you can pull spend programmatically. Zapier shows task counts in your account settings. Make shows credit consumption in the scenario logs. But none of them talk to each other. And none of them can tell you "client X is at seventy-eight percent of their monthly budget across all platforms."
Jordan: That's the gap. Not visibility per vendor — visibility per client, across vendors. So the first thing I built was a cost ledger. Google Sheets. One row per usage event. Twelve columns — date, client name, source, model or plan, metric type, quantity, unit, and then the pricing fields. Input price per million tokens, output price per million tokens, task unit price for Zapier, credit unit price for Make. And then a calculated cost column that picks the right formula based on the metric type.
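The calculated-cost column Jordan describes can be sketched as a small function. The metric-type labels and parameter names here are placeholder assumptions, not a required schema — match them to whatever your own sheet uses:

```python
# Sketch of the ledger's calculated-cost column as a Python function.
# Metric-type labels ("input_tokens", "zapier_task", etc.) are assumed
# names for illustration -- rename to match your own ledger.

def row_cost(metric_type, quantity,
             input_price_per_m=0.0, output_price_per_m=0.0,
             task_unit_price=0.0, credit_unit_price=0.0):
    """Pick the right pricing formula for one usage event."""
    if metric_type == "input_tokens":
        return quantity / 1_000_000 * input_price_per_m
    if metric_type == "output_tokens":
        return quantity / 1_000_000 * output_price_per_m
    if metric_type == "zapier_task":
        return quantity * task_unit_price
    if metric_type == "make_credit":
        return quantity * credit_unit_price
    raise ValueError(f"unknown metric type: {metric_type}")

# Example: 2M input tokens at a hypothetical $3.00 per million
print(row_cost("input_tokens", 2_000_000, input_price_per_m=3.00))  # 6.0
```

In the actual sheet this is an `IFS` formula keyed on the metric-type column; the function form just makes the branching explicit.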
Jordan: This takes roughly twenty minutes to set up from scratch. If you grab the template from the Resources page, it takes about five — you just replace the bracketed fields with your client names and your current pricing.
Jordan: Now — the pricing tab matters more than people think. OpenAI, Anthropic, and Google all publish per-million-token rates. You can look them up right now. But here's the catch with Make — they moved from operations to credits, and most non-AI modules still cost one credit per operation. Fine. But their built-in AI modules? Variable. The credit cost scales with tokens, processing time, even file size in some cases. So if you're running a Make scenario that calls an AI module, you cannot just count operations anymore. You have to track credits, and you have to know that a single AI step might burn five or ten credits where a normal HTTP module burns one.
Jordan: I learned this the fun way. Had a client whose content generation scenario looked like it was running fifty operations a month. Fifty. Totally fine. Except those fifty operations included an AI summarizer that was consuming about three hundred credits. The scenario log showed it. I just... wasn't looking at the right number.
Jordan: So the ledger is layer one. Layer two is alerts. And this is where it gets satisfying, because the math is simple. You set a monthly budget per client per source in a Budgets tab. The Rollup tab pulls actual spend from the ledger, divides by budget, and flags when you cross sixty, eighty, or a hundred percent. A column called Alert Level does the threshold check. Another column called Last Notified tracks which alert already fired so you don't get duplicate Slack messages.
Jordan: The Slack integration is one Zap or one Make scenario — watches the Rollup tab for new alerts, posts to your channel, and updates Last Notified. That's it. At sixty percent, you get an informational ping. At eighty, a warning. At a hundred, an action alert. And the hundred-percent alert is the one that triggers the kill switch — but I'll get to that.
Jordan: Now, someone's going to say — and I've gotten this DM — "Jordan, OpenAI has a usage dashboard. Anthropic has budget alerts built in. Why am I building a spreadsheet?" Fair question. And the answer is: use those tools. Absolutely use them. Set budget alerts in every vendor console you have access to. But here's what those dashboards cannot do. They cannot turn off a Zapier Zap. They cannot stop a Make scenario. They cannot deactivate an n8n workflow. They can tell you the house is on fire. They cannot turn off the stove. That's what the kill switch is for, and that's why the ledger exists — to be the single source that connects vendor spend to platform actions.
Jordan: Alright — layer three. Throttling. This is the part that runs quietly in the background and prevents you from ever needing the kill switch in the first place. The idea is simple: slow down non-critical flows so they don't spike your usage in bursts.
Jordan: On Zapier, the tool is Delay After Queue. You add it near the top of any Zap that handles bursty events — webhooks, bulk table updates, anything that might fire fifty times in a minute. You give the queue a name, set a wait between runs — I usually start with ten to fifteen seconds — and now that Zap processes one run at a time instead of stampeding through your task limit. Zapier's own docs recommend this pattern specifically to avoid hitting their rate limits. And those limits are real — four hundred fifty requests per minute per Zap-table combination, a hundred fifty per five seconds. If you're doing anything with Zapier Tables, you will hit these without a queue.
Jordan: On n8n — if you're self-hosting — the lever is an environment variable called N8N_CONCURRENCY_PRODUCTION_LIMIT. Set it to whatever your infrastructure and your external APIs can handle. I run mine at five for most client instances. If you're using queue mode with workers, you set worker concurrency separately. And then inside individual workflows, you add Wait nodes to pace loops that call rate-limited APIs. The combination of concurrency caps plus in-workflow waits keeps you under external rate limits without manual babysitting.
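A minimal config sketch of the n8n side, assuming a self-hosted instance — the value of five is just the number Jordan uses, not a recommendation for every setup:

```shell
# n8n self-hosted: cap parallel production executions on the main instance.
# Tune the number to your infrastructure and your external APIs' limits.
export N8N_CONCURRENCY_PRODUCTION_LIMIT=5

# In queue mode, concurrency is set per worker instead, e.g.:
#   n8n worker --concurrency=5
```

Wait nodes inside individual workflows then handle per-loop pacing; the env var only caps how many executions run at once.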
Jordan: Make is the easiest here, honestly. Make has automatic retry with backoff built in. If a module hits a rate limit or a timeout, Make retries with staged delays and caps parallel retries at three per scenario. So you don't need to build a queue — you need to not fight the queue that already exists. Space your scenario schedules conservatively. If you're replaying a backlog, chunk it through a Data Store instead of dumping everything at once.
Jordan: Throttle first, scale later. I have that written on a sticky note on my monitor. It's not glamorous advice. But every time I've ignored it, I've regretted it within a week.
Jordan: Layer four. The kill switch. This is the part that actually stops the bleeding when a client's usage hits a hundred percent of budget.
Jordan: The principle is critical — you only kill non-critical automations. You never put a client's invoice sync, their onboarding flow, or their payment processing on the kill list. You decide in advance which flows are non-critical — batch enrichments, report generators, content schedulers, analytics syncs — and you tag them. In the Budgets tab, each source-client pair has a Priority column: critical or non-critical. And a Kill Switch Armed column: true or false.
Jordan: When the Rollup tab hits a hundred percent and the priority is non-critical and the kill switch is armed — that's when the automation fires.
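That three-part gate can be sketched as a single predicate — field values mirror the Priority and Kill Switch Armed columns, and the labels are assumptions for illustration:

```python
# Sketch of the kill-switch gate: all three conditions must hold.
# "non-critical" and the boolean armed flag mirror the Budgets tab's
# Priority and Kill Switch Armed columns.

def should_kill(spend, budget, priority, armed):
    """Fire only for non-critical, armed pairs at or past 100% of budget."""
    return spend >= budget and priority == "non-critical" and armed

print(should_kill(1020, 1000, "non-critical", True))   # True  -> kill
print(should_kill(1020, 1000, "critical", True))       # False -> never
print(should_kill(1020, 1000, "non-critical", False))  # False -> disarmed
```

The `armed` flag is the safety: you can stage the whole system, watch it in Slack, and only flip the column to true once the test plan has passed.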
Jordan: On Zapier, you use Zapier Manager. It's a built-in app that can turn Zaps on and off. Your kill-switch Zap watches the Rollup tab, loops through a list of non-critical Zap IDs, and calls Zapier Manager to turn each one off. Then it posts to Slack with a re-enable link — a webhook URL that triggers a second Zap to turn everything back on. One click. That's the re-enable.
Jordan: On Make, you hit the Scenarios API. POST to /api/v2/scenarios/{scenarioId}/stop. That deactivates the scenario. To re-enable, same endpoint but with /start. You need a personal access token with scenarios-write scope. I run this from a separate Make scenario — the toggler scenario — so it never accidentally disables itself. Yes, I learned that one the hard way too.
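A minimal sketch of building that stop/start call. The base URL is an assumption — Make's API host depends on your zone (eu1, us1, and so on) — and the token shown is a placeholder, not sent anywhere here:

```python
import urllib.request

MAKE_BASE = "https://eu1.make.com"  # assumed zone -- use your Make zone's host

def toggle_scenario(token, scenario_id, action):
    """Build the stop/start request for Make's Scenarios API.
    action is "stop" to deactivate, "start" to re-enable."""
    assert action in ("stop", "start")
    url = f"{MAKE_BASE}/api/v2/scenarios/{scenario_id}/{action}"
    return urllib.request.Request(
        url,
        method="POST",
        headers={"Authorization": f"Token {token}"},
    )

req = toggle_scenario("YOUR_PAT", 123456, "stop")
print(req.full_url)  # https://eu1.make.com/api/v2/scenarios/123456/stop
```

To actually fire it you'd pass the request to `urllib.request.urlopen(req)` — and, per Jordan's hard-won advice, run it from a toggler scenario that is never on its own kill list.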
Jordan: On n8n, the built-in n8n node has Activate and Deactivate workflow operations. There's even an official template — template 3229 — that demonstrates scheduled activation and deactivation using the native API. Quick note if you're on n8n v2: the UI labels changed to Publish and Unpublish, but the activation endpoints still work through the API and the built-in node. So your kill-switch workflow deactivates the non-critical workflows by ID, posts confirmation to Slack, and provides a webhook-triggered re-enable path. Same pattern as Zapier and Make — just different verbs.
Jordan: Actually — I want to flag something about the n8n endpoint documentation. The built-in node and the official templates clearly support Activate and Deactivate operations, and community implementations confirm the REST endpoints work. But as of right now, n8n doesn't have a single static reference page that enumerates those endpoints the way Make's Scenarios API docs do. So if you're building this on n8n, follow the template and the built-in node operations rather than trying to reverse-engineer endpoint paths from the docs. It works. The documentation just isn't as tidy as Make's.
Jordan: So that's the full system. Ledger feeds the Rollup. Rollup triggers alerts at sixty, eighty, a hundred. Throttle rules keep non-critical flows from spiking in the first place. And the kill switch catches anything that gets through the throttle and hits the cap.
Jordan: I want to be honest about one thing, though. This system adds complexity. You now have a spreadsheet to maintain, alert automations to monitor, and kill-switch Zaps or scenarios that could theoretically misfire. If you accidentally put a critical flow on the non-critical list, you could disable something that matters. That's a real risk. Which is why the test plan matters — and I mean actually running it. Add a fake client to your Budgets tab with tiny budgets. Push test rows into the ledger that cross each threshold. Watch the Slack alerts fire. Watch the kill switch toggle your test Zaps off. Click the re-enable link. Confirm everything comes back. Do this before you arm it on a real client. Takes about thirty minutes. Saves you from the kind of mistake that's much harder to explain than a two-hundred-dollar overage.
Jordan: And recalibrate monthly. Token prices change — Gemini started billing for Search Grounding features back in January. Make's credit costs shift when they update AI modules. Your clients' usage patterns drift as they adopt the tools you built for them. The budget numbers in that spreadsheet are not set-and-forget. They're a monthly five-minute review.
Jordan: That two-hundred-and-thirty-dollar overage from April? It would not have happened with this system. The sixty-percent alert would have pinged me mid-month. The throttle would have slowed the enrichment Zap down before it burned through the remaining tasks. And if it somehow still hit the cap, the kill switch would have turned it off — and I would have gotten a Slack message with a one-click re-enable link instead of a surprise on my invoice.
Jordan: That's the difference between monitoring costs and controlling them. Monitoring tells you what happened. Control stops what's about to happen.
Jordan: Here's your one move this week. Open a Google Sheet — or grab the Guardrails Template Pack on the Resources page, which has the ledger, the alert formulas, and the kill-switch recipes for all three platforms ready to go. Pick your highest-spend client. Set their budget. Wire the sixty-percent alert to Slack. Just that. You can add the throttle and the kill switch next week. But the alert alone will change how you think about your margins.
Jordan: I'm Jordan. This is Headcount Zero. Go build the guardrails.