Template · May 29, 2026

Solo SLA Loop Starter Kit (Copy‑Paste Templates for a Human‑in‑the‑Loop Incident Loop)

Copy‑paste templates to stand up a minimal, human‑in‑the‑loop SLA communication loop: a Notion ops manual, Statuspage API updater snippets (with approval gate), Slack cadence workflow, a two‑model credit calculator, and a post‑incident report template. Built for solo operators who need reliability signals without hiring.

From Episode: Build Your SLA Monitoring Loop: From Alert to Credit in One Afternoon

Contents

1) Notion Ops Manual — SLA definitions, routing, and comms templates
2) Statuspage updater snippets — curl, Make, and Zapier (with approval gate)
3) Slack incident channel workflow — cadence enforcement without spam
4) SLA Credit Calculator — two ready models with spreadsheet formulas
5) Post‑incident report (PIR) — fill‑in template for consistency

How to use this kit:

Copy each section into your tools (Notion, Slack, Make/Zapier, your terminal/spoke service).
Replace every field in [BRACKETS] with your details. Keep the suggested defaults unless you have stronger reasons.
Test in a sandbox/private page first. Do not auto‑publish public updates in the first [APPROVAL_WINDOW_MINUTES] minutes — approve manually.
Ship the loop: Monitor → Draft update (private) → Slack cadence reminders → Approve → Publish → Auto‑credit → Post‑incident report.

Suggested defaults you can keep today:

[UPTIME_SLO_PERCENT]=99.9
[UPDATE_CADENCE_MINUTES]=20 or 30
[APPROVAL_WINDOW_MINUTES]=15
Credit model: 5% per 30 minutes of downtime, capped at 50% (see Calculator section).

1) Notion Ops Manual — SLA definitions, routing, and comms templates

Copy this whole block into a Notion page called “Ops Manual → SLAs & Incidents.” Replace [BRACKETS].

SLA Policy for [SERVICE_NAME]

Scope

Covered service(s): [SERVICE_SCOPE]
Environments: [ENVIRONMENTS] (e.g., Production only)
Business hours (local time [TIMEZONE]): [OFFICE_HOURS_START]–[OFFICE_HOURS_END], Mon–Fri
Off‑hours policy: [OFF_HOURS_POLICY] (e.g., best‑effort with slower updates)

Availability Target (Uptime SLO)

Target: [UPTIME_SLO_PERCENT]% monthly
Minutes in month: [MINUTES_IN_MONTH] (e.g., 43,200 for 30 days)
Allowed downtime this month (mins) = ROUND((1 - [UPTIME_SLO_PERCENT]/100) * [MINUTES_IN_MONTH], 1)
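
To sanity-check the allowed-downtime formula before pasting it into Notion, here is a minimal Python sketch of the same arithmetic. The function name `allowed_downtime_minutes` is illustrative and not part of any tool in this kit.

```python
def allowed_downtime_minutes(slo_percent: float, minutes_in_month: int) -> float:
    """Mirror of ROUND((1 - SLO/100) * minutes_in_month, 1)."""
    return round((1 - slo_percent / 100) * minutes_in_month, 1)


# 99.9% target over a 30-day month (43,200 minutes)
print(allowed_downtime_minutes(99.9, 43_200))  # 43.2
```

With the suggested 99.9% default, a 30-day month leaves roughly 43 minutes of downtime before the target is missed.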

Support SLAs

Time to First Response (TTFR): within [TTFR_MINUTES] minutes via [PRIMARY_CHANNELS] during business hours
Next Response Time: within [NEXT_RESPONSE_MINUTES] minutes while ticket is open (business hours)
Resolution goal (not a guarantee): [TARGET_MTTR_HOURS] hours for Sev‑1 and Sev‑2, [TARGET_MTTR_DAYS] days for Sev‑3

Severity Levels (tie to customer impact)

Sev‑1 Critical: [CRITERIA_SEV1] (e.g., full outage for >[SEV1_MINUTES] mins or data loss)
Sev‑2 Major: [CRITERIA_SEV2] (e.g., degraded core function; workarounds exist)
Sev‑3 Minor: [CRITERIA_SEV3] (e.g., minor feature or narrow subset)

Components and Mapping

Statuspage Page ID: [STATUSPAGE_PAGE_ID]
Components:
- [COMPONENT_NAME_1] → [COMPONENT_ID_1]
- [COMPONENT_NAME_2] → [COMPONENT_ID_2]

Alert Routing

Monitor source(s): [MONITOR_TOOL] (checks: [CHECK_TYPES])
Open incident if: [OPEN_CRITERIA] (e.g., 2 consecutive failures across 2 regions)
Route to Slack channel: #[INCIDENT_CHANNEL_PREFIX]-[DATE]
Escalation: [ESCALATION_RULE] (e.g., after [ESCALATE_MINUTES] mins, call [PHONE_NUMBER])
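
The example open criterion ("2 consecutive failures across 2 regions") can be sketched as a small check. This is only one possible reading of that rule; the input shape (a dict of region name to a list of pass/fail results, oldest first) and the function name are assumptions, not something your monitor tool provides in this form.

```python
def should_open_incident(checks_by_region, consecutive_failures=2, min_regions=2):
    """Return True when enough regions show a run of recent failed checks.

    checks_by_region maps a region name to a list of booleans
    (True = check passed), ordered oldest to newest.
    """
    failing_regions = 0
    for results in checks_by_region.values():
        recent = results[-consecutive_failures:]
        # A region counts only if it has enough samples and all of them failed
        if len(recent) == consecutive_failures and not any(recent):
            failing_regions += 1
    return failing_regions >= min_regions
```

Requiring agreement across regions keeps a single flaky probe from opening an incident on its own.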

External Communication Runbook

Update cadence during active incidents: every [UPDATE_CADENCE_MINUTES] minutes with new info or next‑ETA.
First 10–15 mins: draft internally, do not auto‑publish. Require manual approval.
Message templates (fill and reuse):
- Investigating: “We’re investigating increased [SYMPTOM] affecting [AFFECTED_COMPONENTS]. Next update by [NEXT_UPDATE_ETA].”
- Identified: “We’ve identified a cause related to [CAUSE_HINT]. Mitigation in progress. Next update by [NEXT_UPDATE_ETA].”
- Monitoring: “A fix has been rolled out. We’re monitoring recovery. Next update by [NEXT_UPDATE_ETA].”
- Resolved: “This incident is resolved. Impact: [IMPACT_SUMMARY]. Duration: [DURATION]. Root cause: [ROOT_CAUSE_ONE_LINE].”
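
If you would rather fill these templates from a script than by hand, a minimal sketch follows. It assumes the next-update ETA is simply the current time plus your cadence, shown as HH:MM; the field names (`symptom`, `components`, and so on) are illustrative stand-ins for the bracketed placeholders.

```python
from datetime import datetime, timedelta

# Templates copied from the runbook; bracketed placeholders become format fields.
TEMPLATES = {
    "investigating": "We're investigating increased {symptom} affecting {components}. Next update by {next_eta}.",
    "identified": "We've identified a cause related to {cause_hint}. Mitigation in progress. Next update by {next_eta}.",
    "monitoring": "A fix has been rolled out. We're monitoring recovery. Next update by {next_eta}.",
    "resolved": "This incident is resolved. Impact: {impact}. Duration: {duration}. Root cause: {root_cause}.",
}


def render_update(status: str, now: datetime, cadence_minutes: int, **fields: str) -> str:
    """Fill a template; next_eta is now + cadence, formatted as HH:MM."""
    next_eta = (now + timedelta(minutes=cadence_minutes)).strftime("%H:%M")
    return TEMPLATES[status].format(next_eta=next_eta, **fields)


print(render_update("investigating", datetime(2026, 5, 29, 14, 0), 20,
                    symptom="API latency", components="API"))
```

A missing field raises `KeyError`, which is a useful guard against publishing a message with an unfilled placeholder.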

SLA Credits (Policy Reference)

Model A (Uptime bands):
- Above 99% and below 99.98%: 10% credit; 95%–99%: 25% credit; <95%: 50% credit. Credits are a percentage of the monthly fee for the affected service; 99.98% and above earns no credit.
Model B (Per‑interval): 5% credit per 30 minutes of downtime, capped at 50% per month.
Applied on request within [CREDIT_REQUEST_WINDOW_DAYS] days and against future invoices only.

Data Retention

Retain incident timelines & post‑incident reports for [RETENTION_MONTHS] months in Notion.

Ownership

Incident Commander (IC): [PRIMARY_OWNER_NAME] ([PRIMARY_OWNER_CONTACT])
Delegate: [DELEGATE_NAME] ([DELEGATE_CONTACT])

2) Statuspage updater snippets — curl, Make, and Zapier (with approval gate)

Use these snippets to integrate a monitor → (draft) → approve → publish flow. Always store your key as a secret and use the Authorization header.

Key notes before you paste:

Auth header format: Authorization: OAuth [STATUSPAGE_API_KEY].
As of June 30, 2026, query‑param API keys are deprecated — use the header above.
Prefer: build the JSON → post to Slack for approval → on ✅ approval, call the API.

Environment placeholders:

[STATUSPAGE_API_KEY] (store in secret manager)
[STATUSPAGE_PAGE_ID]
[INCIDENT_ID]
[COMPONENT_ID_1], [COMPONENT_ID_2]
[INCIDENT_NAME], [PUBLIC_MESSAGE], [NEXT_UPDATE_ETA]

A) Create Incident (no notifications yet)

curl -X POST \
  https://api.statuspage.io/v1/pages/[STATUSPAGE_PAGE_ID]/incidents \
  -H 'Content-Type: application/json' \
  -H 'Authorization: OAuth [STATUSPAGE_API_KEY]' \
  -d '{
    "incident": {
      "name": "[INCIDENT_NAME]",
      "status": "investigating",
      "impact_override": "[none|minor|major|critical]",
      "deliver_notifications": false,
      "body": "[PUBLIC_MESSAGE]",
      "component_ids": ["[COMPONENT_ID_1]", "[COMPONENT_ID_2]"]
    }
  }'

B) Append Update (identified/monitoring) — notify subscribers

# Appending an update: PATCH the incident itself with the new status and body.
curl -X PATCH \
  https://api.statuspage.io/v1/pages/[STATUSPAGE_PAGE_ID]/incidents/[INCIDENT_ID] \
  -H 'Content-Type: application/json' \
  -H 'Authorization: OAuth [STATUSPAGE_API_KEY]' \
  -d '{
    "incident": {
      "status": "[investigating|identified|monitoring|resolved]",
      "deliver_notifications": true,
      "body": "[PUBLIC_MESSAGE] Next update by [NEXT_UPDATE_ETA]."
    }
  }'

C) Resolve Incident — final public message

# Resolving: PATCH the incident itself with status "resolved" and the final message.
curl -X PATCH \
  https://api.statuspage.io/v1/pages/[STATUSPAGE_PAGE_ID]/incidents/[INCIDENT_ID] \
  -H 'Content-Type: application/json' \
  -H 'Authorization: OAuth [STATUSPAGE_API_KEY]' \
  -d '{
    "incident": {
      "status": "resolved",
      "deliver_notifications": true,
      "body": "Resolved. Impact: [IMPACT_SUMMARY]. Duration: [DURATION]. Root cause: [ROOT_CAUSE_ONE_LINE]."
    }
  }'

Optional: set component statuses explicitly (instead of just associating components) by using a components object when creating the incident:

{
  "incident": {
    "name": "[INCIDENT_NAME]",
    "status": "investigating",
    "impact_override": "major",
    "deliver_notifications": false,
    "body": "[PUBLIC_MESSAGE]",
    "components": { "[COMPONENT_ID_1]": "major_outage" }
  }
}

Make (Integromat) HTTP module values:

URL: https://api.statuspage.io/v1/pages/[STATUSPAGE_PAGE_ID]/incidents
Method: POST
Headers: Content-Type: application/json, Authorization: OAuth [STATUSPAGE_API_KEY]
Body (raw): paste the JSON from A) with mapped fields from your monitor alert
Next step: Slack → Post message to #[INCIDENTS_CHANNEL] with the JSON preview + “React ✅ to publish”
Filter: only proceed to HTTP “Append Update” step if Slack reaction contains ✅ within [APPROVAL_WINDOW_MINUTES]
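
The approval filter above boils down to one question: did a ✅ arrive inside the approval window? A minimal sketch of that check, assuming you have already collected reactions into dicts with a `name` and an `at` timestamp (that shape is hypothetical; map it from whatever your Slack step actually returns):

```python
from datetime import datetime, timedelta

APPROVAL_EMOJI = "white_check_mark"  # Slack's short name for the ✅ reaction


def is_approved(reactions: list[dict], posted_at: datetime, window_minutes: int) -> bool:
    """True if a ✅ reaction landed between posting and the approval deadline."""
    deadline = posted_at + timedelta(minutes=window_minutes)
    return any(
        r["name"] == APPROVAL_EMOJI and posted_at <= r["at"] <= deadline
        for r in reactions
    )
```

Anything outside the window, or any other emoji, leaves the update unpublished, which is the safe default for a solo operator.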

Zapier (Webhooks by Zapier → Slack):

Trigger: [MONITOR_ALERT_TRIGGER]
Action 1: Code step (build incident JSON from trigger)
Action 2: Slack → Send a message to #[INCIDENTS_CHANNEL] including the JSON preview
Action 3 (Path A, if approved): Webhooks by Zapier → Custom Request
- Method: POST
- URL: https://api.statuspage.io/v1/pages/[STATUSPAGE_PAGE_ID]/incidents
- Data: JSON from Action 1
- Headers: Authorization: OAuth [STATUSPAGE_API_KEY], Content-Type: application/json
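
The Code step in Action 1 could look roughly like the sketch below. It builds the same JSON body as snippet A and keeps `deliver_notifications` off. The function name and its arguments are illustrative; map them from whatever fields your monitor trigger exposes.

```python
import json

VALID_IMPACTS = {"none", "minor", "major", "critical"}


def build_incident_payload(name, message, component_ids, impact="major"):
    """Build the create-incident body from snippet A, notifications off."""
    if impact not in VALID_IMPACTS:
        raise ValueError(f"impact must be one of {sorted(VALID_IMPACTS)}")
    return {
        "incident": {
            "name": name,
            "status": "investigating",
            "impact_override": impact,
            "deliver_notifications": False,  # stays off until a human approves
            "body": message,
            "component_ids": list(component_ids),
        }
    }


payload = build_incident_payload("API timeouts", "We're investigating.", ["abc123"])
print(json.dumps(payload, indent=2))  # paste-ready preview for the Slack message
```

Printing the payload as indented JSON gives you the preview to post in Slack before anything is sent to Statuspage.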

Security & safety:

Keep deliver_notifications: false until human approval.
Log incident_id responses in your DB/sheet: [STORAGE_LOCATION] for later updates.
Rate‑limit and de‑dup: only open a new incident if one isn’t already open for the same component/symptom in the last [DEDUP_WINDOW_MINUTES] minutes.
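
The de-dup rule can be sketched as a lookup against the incidents you have already logged. This assumes each logged row carries `component_id`, `symptom`, `status`, and `opened_at`; those field names are hypothetical and should match whatever you store in [STORAGE_LOCATION].

```python
from datetime import datetime, timedelta


def should_open_new_incident(logged_incidents, component_id, symptom, now, dedup_window_minutes):
    """False if an unresolved incident for the same component and symptom
    was opened inside the de-dup window; True otherwise."""
    cutoff = now - timedelta(minutes=dedup_window_minutes)
    for inc in logged_incidents:
        same_target = inc["component_id"] == component_id and inc["symptom"] == symptom
        if same_target and inc["status"] != "resolved" and inc["opened_at"] >= cutoff:
            return False
    return True
```

When the check returns False, append an update to the existing incident instead of opening a second one.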

3) Slack incident channel workflow — cadence enforcement without spam

Create or reuse a dedicated channel per incident. Use one of these patterns and pin the block below.

Channel naming:

#[INC]-[YYYYMMDD]-[SHORT_SLUG] (e.g., #inc-20260529-api-timeouts)
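
If channels are created from automation, the naming pattern can be generated rather than typed. A minimal sketch, assuming lowercase letters, digits, and hyphens only, and an 80-character cap on channel names (treat that limit as an assumption to verify in your workspace); the function name is illustrative.

```python
import re
from datetime import date

MAX_CHANNEL_LENGTH = 80  # assumed Slack channel-name limit


def incident_channel_name(day: date, summary: str, prefix: str = "inc") -> str:
    """Build a name like inc-20260529-api-timeouts (without the leading #)."""
    # Lowercase, then collapse every run of other characters into one hyphen
    slug = re.sub(r"[^a-z0-9]+", "-", summary.lower()).strip("-")
    name = f"{prefix}-{day:%Y%m%d}-{slug}"
    return name[:MAX_CHANNEL_LENGTH].rstrip("-")


print(incident_channel_name(date(2026, 5, 29), "API timeouts"))  # inc-20260529-api-timeouts
```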

Pinned message template:

Incident: [INCIDENT_NAME]
Opened: [OPENED_AT_LOCAL]
IC: [INCIDENT_COMMANDER]  |  Delegate: [DELEGATE]
Update cadence: every [UPDATE_CADENCE_MINUTES] minutes until resolution
First 10–15 mins: draft internally; do not auto‑publish
Latest public status: [LINK_TO_STATUSPAGE_INCIDENT]
Next update due by: [NEXT_UPDATE_ETA]

Slack quick commands (fastest setup):

Start cadence (20m): /remind #inc-… "Post public update (what changed + next ETA)." every 20 minutes
Start cadence (30m): /remind #inc-… "Post public update (what changed + next ETA)." every 30 minutes
Stop cadence when resolved: /remind list → “Mark as complete”

Workflow Builder (button‑start with approval signal):

Trigger: “Shortcut” named “Start Incident Cadence”. Inputs: [UPDATE_CADENCE_MINUTES].
Step: Post a message to the channel with the pinned template.
Step: Add a Delay for [UPDATE_CADENCE_MINUTES] minutes.
Step: Post “Reminder: publish an update only if there’s new info. Else push the next ETA.”
Loop: Repeat the Delay and Reminder steps until someone posts “/resolve” or adds the 🟢 emoji to the pinned message.
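
To preview what the delay-and-remind loop would produce for a given incident, here is a minimal sketch that lists the reminder times between opening and resolution. It models only the schedule, not Slack itself, and the function name is illustrative.

```python
from datetime import datetime, timedelta


def reminder_times(opened_at: datetime, cadence_minutes: int, resolved_at: datetime) -> list:
    """Times a cadence reminder would fire strictly before resolution."""
    if cadence_minutes <= 0:
        raise ValueError("cadence_minutes must be positive")
    step = timedelta(minutes=cadence_minutes)
    times = []
    due = opened_at + step
    while due < resolved_at:
        times.append(due)
        due += step
    return times


opened = datetime(2026, 5, 29, 14, 0)
resolved = datetime(2026, 5, 29, 15, 5)
print([t.strftime("%H:%M") for t in reminder_times(opened, 20, resolved)])
# ['14:20', '14:40', '15:00']
```

A 65-minute incident at a 20-minute cadence therefore implies three public updates before the resolved message.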

Copy‑ready update blocks (paste, then customize):

Investigating: “We’re investigating increased [SYMPTOM] impacting [COMPONENTS/USERS]. Next update by [NEXT_UPDATE_ETA].”
Identified: “Cause identified ([CAUSE_HINT]). Mitigating now. Next update by [NEXT_UPDATE_ETA].”
Monitoring: “Fix deployed. Monitoring metrics and user reports. Next update by [NEXT_UPDATE_ETA].”
Resolved: “Resolved. Impact: [IMPACT_SUMMARY]. Duration: [DURATION]. Root cause: [ROOT_CAUSE_ONE_LINE].”

Noise guardrails:

Never post “no change.” If nothing changed, post a shorter note with a fresh next‑ETA.
Collapse duplicates: centralize thread(s) with links, archive stray chatter.

4) SLA Credit Calculator — two ready models with spreadsheet formulas

Decide your policy and paste one of these into a Notion table or a spreadsheet. Fields in [BRACKETS] are your inputs.

Inputs (all models):

[MONTHLY_FEE] (e.g., 2000)
[TOTAL_DOWNTIME_MINUTES] (e.g., 65)
[TOTAL_MINUTES_IN_MONTH] (e.g., 43200)

Derived:

Uptime % (stored as a fraction, e.g., 0.9985 = 99.85%) = 1 - ([TOTAL_DOWNTIME_MINUTES]/[TOTAL_MINUTES_IN_MONTH])

Model A — Uptime bands (tiered credits)

Bands:
- 99.98% and above → 0% (no credit)
- Above 99% and below 99.98% → 10%
- 95%–99% → 25%
- <95% → 50%
Spreadsheet formula (credit %):

=IF([Uptime%]<0.95,50, IF([Uptime%]<=0.99,25, IF([Uptime%]<0.9998,10,0)))

Credit $: =[MONTHLY_FEE] * ([Credit%]/100)
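
For anyone who prefers to check Model A outside a spreadsheet, this sketch mirrors the nested IF above. It treats uptime as a fraction and Credit % as a whole-number percentage that is divided by 100 to get dollars; the function names are illustrative.

```python
def model_a_credit_percent(uptime: float) -> int:
    """Tiered credit for an uptime fraction such as 0.9985."""
    if uptime < 0.95:
        return 50
    if uptime <= 0.99:
        return 25
    if uptime < 0.9998:
        return 10
    return 0


def credit_dollars(monthly_fee: float, credit_percent: float) -> float:
    """Convert a whole-number credit percentage into a dollar amount."""
    return round(monthly_fee * credit_percent / 100, 2)


uptime = 1 - 65 / 43_200                 # 65 minutes down in a 30-day month
percent = model_a_credit_percent(uptime)
print(percent, credit_dollars(2000, percent))  # 10 200.0
```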

Model B — Per‑interval credits (simple, SMB‑friendly)

Parameters: [INTERVAL_MIN]=30, [CREDIT_PER_INTERVAL_PERCENT]=5, [MAX_CREDIT_PERCENT]=50
Intervals = CEILING([TOTAL_DOWNTIME_MINUTES]/[INTERVAL_MIN], 1)
Credit % = MIN([Intervals]*[CREDIT_PER_INTERVAL_PERCENT], [MAX_CREDIT_PERCENT])
Credit $ = [MONTHLY_FEE] * ([Credit%]/100)

Example (paste into your sheet to validate):

Given [MONTHLY_FEE]=2000, [TOTAL_DOWNTIME_MINUTES]=65, [TOTAL_MINUTES_IN_MONTH]=43200 → Uptime ≈ 99.85%
Model A: Uptime ≈ 99.85% falls in the 10% band → Credit %=10% → Credit $= $200
Model B: Intervals = CEILING(65/30)=3 → Credit %=15% → Credit $= $300
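
The Model B example can also be reproduced in a few lines of Python. This is a sketch of the same arithmetic with the kit's default parameters; the function name and return shape are illustrative.

```python
import math


def model_b_credit(monthly_fee, downtime_minutes, interval_min=30,
                   credit_per_interval_percent=5, max_credit_percent=50):
    """Return (intervals, credit_percent, credit_dollars) for Model B."""
    # Any started interval counts as a full one, hence the ceiling
    intervals = math.ceil(downtime_minutes / interval_min)
    credit_percent = min(intervals * credit_per_interval_percent, max_credit_percent)
    return intervals, credit_percent, round(monthly_fee * credit_percent / 100, 2)


print(model_b_credit(2000, 65))  # (3, 15, 300.0)
```

Because partial intervals round up, 65 minutes of downtime is billed as three 30-minute intervals, and the cap only takes effect once downtime passes 270 minutes.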

Implementation tip:

Store incident metadata in a sheet/db row: [INCIDENT_ID], [START_AT], [END_AT], [DURATION_MIN], [AFFECTED_COMPONENT], [CREDIT_MODEL], [CREDIT_$].
Auto‑email a draft credit note to [BILLING_CONTACT_EMAIL] when an incident is marked Resolved and [TOTAL_DOWNTIME_MINUTES] > 0.
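
One way to model the stored row and the credit-note trigger is sketched below. The class and field names follow the placeholders in the tip, but they are illustrative; rounding partial minutes up is an assumption you may want to change to match your policy.

```python
import math
from dataclasses import dataclass
from datetime import datetime


@dataclass
class IncidentRecord:
    incident_id: str
    start_at: datetime
    end_at: datetime
    affected_component: str
    credit_model: str
    credit_dollars: float
    status: str = "resolved"

    @property
    def duration_min(self) -> int:
        """Downtime in whole minutes; partial minutes round up (assumption)."""
        seconds = (self.end_at - self.start_at).total_seconds()
        return math.ceil(seconds / 60)


def should_draft_credit_note(record: IncidentRecord) -> bool:
    """Draft a credit note only for resolved incidents with real downtime."""
    return record.status == "resolved" and record.duration_min > 0
```

Keeping the trigger as a draft, rather than sending the credit automatically, preserves the same human approval step used for public updates.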

5) Post‑incident report (PIR) — fill‑in template for consistency

Copy this into a new Notion page titled “Post‑Incident Report (PIR) Template.” Use it after every Sev‑1/Sev‑2.

Post‑Incident Report — [INCIDENT_NAME]

Summary

Date: [DATE]
Duration: [DURATION] (start [START_AT_LOCAL] → end [END_AT_LOCAL])
Severity: [SEVERITY]
Components affected: [COMPONENTS]
Customer impact: [IMPACT_SUMMARY]

Timeline (UTC)

[YYYY‑MM‑DD HH:MM] — Detected by [SOURCE]
[YYYY‑MM‑DD HH:MM] — Incident opened on Statuspage (link: [INC_LINK])
[YYYY‑MM‑DD HH:MM] — [KEY_UPDATE]
[YYYY‑MM‑DD HH:MM] — Resolved

Root Cause

Primary cause: [ROOT_CAUSE]
Contributing factors: [FACTORS]
Why not detected earlier: [GAP]

Remediation

Fix implemented: [FIX]
Validation/monitoring in place: [VALIDATION]
Owner: [OWNER]

SLA & Credits

Downtime minutes: [TOTAL_DOWNTIME_MINUTES]
Credit model used: [CREDIT_MODEL]
Calculated credit: [CREDIT_PERCENT]% → $[CREDIT_DOLLARS]
Applied on invoice: [INVOICE_MONTH]

Follow‑ups (checklist)

Add/adjust monitor to catch [MISSED_SIGNAL]
Update runbook section: [SECTION]
Backfill tests/alerts by [DATE]
Notify affected customers with PIR link by [DATE]
