When Clients Ask for On‑Prem: The 1‑Page Checklist
A vendor‑neutral, citable one‑pager to satisfy “on‑prem” asks without self‑hosting. Use it to prove residency, ZDR/MAM status, per‑tenant BYO keys, strict logging, and a visible kill‑switch — plus a clear decision box for when self‑host is actually required.
Use this when a prospect says "on‑prem." It’s a vendor‑neutral proof pack you can complete in under an hour if your stack already supports per‑tenant BYO keys, region pinning, strict logging, and a visible kill‑switch. Work top‑to‑bottom, attach the screenshots noted, and ship the PDF to procurement the same day.
- 1
Capture the exact requirement in writing
Paste the clause(s) that triggered the ask and extract five fields: required region(s), allowable retention window, training/use restrictions, logging/export expectations, and third‑party processor stance. Note the reviewer’s email and due date.
- 2
Map the workload to eligible endpoints
List the specific models and endpoints you’ll use and exclude any that can’t meet ZDR/residency (e.g., OpenAI Assistants/Threads/Vector Stores; Vertex Search/Maps grounding). Keep only stateless/ZDR‑eligible paths for this workload.
- 3
Implement BYO API keys per tenant
Create a tenant record with encrypted key storage (KMS/Secrets Manager), owner email, last‑4 of key, and expiry/rotation date. Keys must be scoped to the client’s own provider project/workspace and revocable without code changes.
- 4
Pin inference to the client’s required region
Set region controls explicitly per request: OpenAI use the regional domain (e.g.,
eu.api.openai.com) or project‑level residency; Anthropic setinference_geo: "us"and verifyusage.inference_geoin responses; Vertex use region endpoints/location(e.g.,us-central1); Azure deploy Direct Models in the required geography. Capture a settings screenshot for each. - 5
Set retention and abuse‑monitoring controls (ZDR/MAM)
Apply the strongest eligible setting and record it: OpenAI ZDR or MAM at org/project; Anthropic ZDR on API; Vertex disable in‑memory caching and note any exceptions; Azure request/verify Content Logging disabled on approval. Document what is and isn’t covered.
- 6
Verify endpoint eligibility and exclusions
For every endpoint in scope, note “ZDR‑eligible” or “stateful/not eligible” with the doc title you’ll cite. Remove any feature with unavoidable storage from this workload or carve it out explicitly.
- 7
Define strict, structured run logging
Log every call as redacted JSON with:
timestamp,correlation_id(UUIDv4),tenant_id,provider,model,region, providerrequest_id, token counts, latency ms,billing_cents,input_hash/output_hash(SHA‑256 of redacted text),redactions_applied, and result/err codes. Do not store raw secrets or full PII. - 8
Set log retention and export controls
Match the client’s window (e.g., 30–90 days) and auto‑export immutable daily JSONL/CSV to a client‑owned bucket (S3/GCS/Blob) with object‑lock or WORM enabled. Provide a read‑only portal link to downloads and an audit trail of exports.
- 9
Implement a one‑click kill‑switch
Gate all model calls behind a boolean feature flag: “AI Integration Enabled.” The toggle must short‑circuit requests instantly, log who flipped it and when, and surface as a visible red button in the client portal. Schedule a monthly fire‑drill test.
- 10
Expose controls and status in the client portal
Show provider, model, region, ZDR/MAM status, Azure/Vertex/OpenAI/Anthropic residency note, log retention, last key rotation date, export link, and kill‑switch state. Make the kill‑switch actionable; keep other fields read‑only with clear tooltips.
- 11
Run and record a verification test
Execute three test prompts per provider with region pinning on, then verify: Anthropic
usage.inference_geo, OpenAI regional host or project residency, Vertexlocation, Azure deployment geography and content‑logging state. Confirm a structured log entry was created and the export job captured it. - 12
Assemble the procurement evidence pack
Combine into a single PDF/folder: requirement summary, endpoint/feature scope table, region/ZDR screenshots, logging field spec + sample redacted entry, export policy, kill‑switch screenshot + test log, and a change‑control note (who updates settings, how fast).
- 13
Decision box: when self‑hosting is justified
Only escalate to self‑host if any are true: strict air‑gap is contractually required; the needed geography isn’t supported; the workload depends on a feature with unavoidable storage; third‑party processing is disallowed. Document the exception scope and keep all other paths managed.