Checklist

When Clients Ask for On‑Prem: The 1‑Page Checklist

A vendor‑neutral, citable one‑pager to satisfy “on‑prem” asks without self‑hosting. Use it to prove residency, ZDR/MAM status, per‑tenant BYO keys, strict logging, and a visible kill‑switch — plus a clear decision box for when self‑host is actually required.

Use this when a prospect says "on‑prem." It’s a vendor‑neutral proof pack you can complete in under an hour if your stack already supports per‑tenant BYO keys, region pinning, strict logging, and a visible kill‑switch. Work top‑to‑bottom, attach the screenshots noted, and ship the PDF to procurement the same day.

  1. 1

    Capture the exact requirement in writing

    Paste the clause(s) that triggered the ask and extract five fields: required region(s), allowable retention window, training/use restrictions, logging/export expectations, and third‑party processor stance. Note the reviewer’s email and due date.

  2. 2

    Map the workload to eligible endpoints

    List the specific models and endpoints you’ll use and exclude any that can’t meet ZDR/residency (e.g., OpenAI Assistants/Threads/Vector Stores; Vertex Search/Maps grounding). Keep only stateless/ZDR‑eligible paths for this workload.

  3. 3

    Implement BYO API keys per tenant

    Create a tenant record with encrypted key storage (KMS/Secrets Manager), owner email, last‑4 of key, and expiry/rotation date. Keys must be scoped to the client’s own provider project/workspace and revocable without code changes.

  4. 4

    Pin inference to the client’s required region

    Set region controls explicitly per request: OpenAI use the regional domain (e.g., eu.api.openai.com) or project‑level residency; Anthropic set inference_geo: "us" and verify usage.inference_geo in responses; Vertex use region endpoints/location (e.g., us-central1); Azure deploy Direct Models in the required geography. Capture a settings screenshot for each.

  5. 5

    Set retention and abuse‑monitoring controls (ZDR/MAM)

    Apply the strongest eligible setting and record it: OpenAI ZDR or MAM at org/project; Anthropic ZDR on API; Vertex disable in‑memory caching and note any exceptions; Azure request/verify Content Logging disabled on approval. Document what is and isn’t covered.

  6. 6

    Verify endpoint eligibility and exclusions

    For every endpoint in scope, note “ZDR‑eligible” or “stateful/not eligible” with the doc title you’ll cite. Remove any feature with unavoidable storage from this workload or carve it out explicitly.

  7. 7

    Define strict, structured run logging

    Log every call as redacted JSON with: timestamp, correlation_id (UUIDv4), tenant_id, provider, model, region, provider request_id, token counts, latency ms, billing_cents, input_hash/output_hash (SHA‑256 of redacted text), redactions_applied, and result/err codes. Do not store raw secrets or full PII.

  8. 8

    Set log retention and export controls

    Match the client’s window (e.g., 30–90 days) and auto‑export immutable daily JSONL/CSV to a client‑owned bucket (S3/GCS/Blob) with object‑lock or WORM enabled. Provide a read‑only portal link to downloads and an audit trail of exports.

  9. 9

    Implement a one‑click kill‑switch

    Gate all model calls behind a boolean feature flag: “AI Integration Enabled.” The toggle must short‑circuit requests instantly, log who flipped it and when, and surface as a visible red button in the client portal. Schedule a monthly fire‑drill test.

  10. 10

    Expose controls and status in the client portal

    Show provider, model, region, ZDR/MAM status, Azure/Vertex/OpenAI/Anthropic residency note, log retention, last key rotation date, export link, and kill‑switch state. Make the kill‑switch actionable; keep other fields read‑only with clear tooltips.

  11. 11

    Run and record a verification test

    Execute three test prompts per provider with region pinning on, then verify: Anthropic usage.inference_geo, OpenAI regional host or project residency, Vertex location, Azure deployment geography and content‑logging state. Confirm a structured log entry was created and the export job captured it.

  12. 12

    Assemble the procurement evidence pack

    Combine into a single PDF/folder: requirement summary, endpoint/feature scope table, region/ZDR screenshots, logging field spec + sample redacted entry, export policy, kill‑switch screenshot + test log, and a change‑control note (who updates settings, how fast).

  13. 13

    Decision box: when self‑hosting is justified

    Only escalate to self‑host if any are true: strict air‑gap is contractually required; the needed geography isn’t supported; the workload depends on a feature with unavoidable storage; third‑party processing is disallowed. Document the exception scope and keep all other paths managed.