Use any LLM, keep Swiss personal data on-shore — the FADP AI gateway

An employee pastes a customer email into ChatGPT to draft a reply. The email has a name, an AHV number, an IBAN. Thirty seconds saved.

It is also a cross-border transfer of personal data to a US-hosted service — the exact act the revised Swiss Federal Act on Data Protection (and the GDPR) build their guardrails around. One person, one email. Now multiply by your whole company, every day.

That is the real generative-AI problem in a Swiss enterprise. It is not “is the model any good.” It is “where did the personal data just go.”

Three tempting non-answers

“Ban it.” You get shadow IT. People paste into the model on their phone instead, and now you have the same transfer with none of the oversight.

“We have a DPA, and the provider is Data-Privacy-Framework certified, so it’s fine.” A transfer mechanism makes the transfer lawful; it does not make the personal data stay. You still owe a data-protection impact assessment, you still owe proportionality and data minimisation (FADP Art. 6; GDPR Art. 5(1)(c)), and you still inherit the residual risk — provider-side logging, sub-processors, foreign-authority access. “Trust us” is a contract, not a control.

“Self-host an open model.” Real, and sometimes right — but expensive, usually a step down in capability, and it does not absolve you of anything: a weaker model in your own datacentre still processes the same personal data, and your staff will still want the frontier models for the hard tasks.

The reframe

The violation isn’t the model misbehaving. It’s the data leaving the boundary.

So the fix isn’t a better model or a better promise. It’s a boundary — one that keeps the personal data on-shore while still letting you call whatever model you like.

The gateway: tokenize → call → detokenize

Put a small proxy between your application and the LLM API. On the way out it swaps every identifier for a stable placeholder; on the way back it restores them.

Draft a reply to Hans Muster (AHV 756.1234.5678.97, IBAN CH93 0076 …) about his double charge.
        │  🛡️  the only thing that crosses the border ↓
Draft a reply to [PERSON_1] ([AHV_1], [IBAN_1]) about his double charge.
        │  the model answers on placeholders ↓
"Dear [PERSON_1], we're sorry about the double charge and will refund CHF 240…"
        │  🛡️  restored on the way back ↓
"Dear Hans Muster, we're sorry about the double charge and will refund CHF 240…"

The model drafts a perfect, personalised reply — and it only ever saw [PERSON_1]. The real AHV and IBAN never left your network. Even if the provider logs every prompt it ever receives, it logged placeholders.
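A minimal sketch of that round trip in Python, under stated assumptions: the names `Vault`, `tokenize` and `detokenize` and the two patterns are hypothetical illustrations, not the gateway's actual code. Free-form names such as Hans Muster need the local named-entity step discussed under "Where it stops", so this sketch only handles AHV numbers and email addresses.

```python
import re

# Illustrative patterns only (assumption): a real gateway would cover more
# identifier types and validate checksums as well.
PATTERNS = {
    "AHV": re.compile(r"\b756\.\d{4}\.\d{4}\.\d{2}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
}
TOKEN_RE = re.compile(r"\[[A-Z]+_\d+\]")


class Vault:
    """In-memory token <-> value map for a single session."""

    def __init__(self):
        self.by_value = {}   # (kind, value) -> token
        self.by_token = {}   # token -> value
        self.counters = {}   # kind -> last number issued

    def token_for(self, kind, value):
        key = (kind, value)
        if key not in self.by_value:
            n = self.counters.get(kind, 0) + 1
            self.counters[kind] = n
            token = f"[{kind}_{n}]"
            self.by_value[key] = token
            self.by_token[token] = value
        # The same value always maps to the same placeholder.
        return self.by_value[key]


def tokenize(text, vault):
    """Replace every matched identifier with a stable placeholder."""
    for kind, pattern in PATTERNS.items():
        text = pattern.sub(lambda m, k=kind: vault.token_for(k, m.group(0)), text)
    return text


def detokenize(text, vault):
    """Restore known placeholders; unknown ones are left untouched."""
    return TOKEN_RE.sub(lambda m: vault.by_token.get(m.group(0), m.group(0)), text)


vault = Vault()
prompt = tokenize("Reply to hans@example.ch about AHV 756.1234.5678.97.", vault)
print(prompt)  # the only text that would be sent to the model
print(detokenize("We have updated the record for [AHV_1].", vault))
```

Because the placeholder is stable within a session, the model can refer to the same customer several times and the answer still restores cleanly.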

See it live. kevin.ars.md — pick a Swiss business document, pick Gemini, Claude, or DeepSeek, and watch the identifiers get tokenised before the prompt crosses the border and restored on the way back. There’s a “paste your own” tab too; the cleaning happens in your browser.

Why it has to be deterministic — and offline

The tempting shortcut is to use an LLM, or a cloud “PII detection API,” to do the redaction. Don’t.

A model-based redactor inherits the same blind spots that let a model leak in the first place — you cannot use an unreliable thing as your reliability boundary. And a cloud PII API is worse: to find the personal data, you have already sent the personal data somewhere. The boundary has to be dumb, local, and unbluffable.

So it’s regex and checksums. A Swiss AHV validated by its EAN-13 check digit; an IBAN by ISO-7064 mod-97; a card by Luhn. That cannot be socially engineered, runs air-gapped, and fails closed — if something looks like an identifier, it is withheld, not waved through. The principle is old and dull and correct: never trust the model to police itself; put a deterministic code boundary around it.
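The three checks named above fit in a few lines each. This is a sketch, assuming hypothetical function names and deliberately loose input cleaning; it shows the arithmetic, not a hardened detector.

```python
import re


def ahv_is_valid(s):
    """Swiss AHV number: 13 digits, prefix 756, EAN-13 check digit."""
    digits = re.sub(r"\D", "", s)
    if len(digits) != 13 or not digits.startswith("756"):
        return False
    # EAN-13 weights alternate 1, 3, 1, 3, ... over the first 12 digits.
    total = sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits[:12]))
    return (10 - total % 10) % 10 == int(digits[12])


def iban_is_valid(s):
    """IBAN: ISO 7064 mod-97 check over the rearranged, letter-expanded string."""
    iban = re.sub(r"\s", "", s).upper()
    if not re.fullmatch(r"[A-Z]{2}\d{2}[A-Z0-9]{11,30}", iban):
        return False
    rearranged = iban[4:] + iban[:4]          # move country code and check digits to the end
    numeric = "".join(str(int(c, 36)) for c in rearranged)  # A=10 ... Z=35
    return int(numeric) % 97 == 1


def luhn_is_valid(s):
    """Payment card number: Luhn checksum."""
    digits = [int(d) for d in re.sub(r"\D", "", s)]
    if len(digits) < 12:
        return False
    total = 0
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:        # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


print(ahv_is_valid("756.1234.5678.97"))
print(iban_is_valid("GB82 WEST 1234 5698 7654 32"))
print(luhn_is_valid("4111 1111 1111 1111"))
```

In a fail-closed design the checksum is best used to label a match, not to release it: a string that has the shape of an AHV number but fails the check would still be withheld.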

“But does redacting the prompt make the model dumber?”

Fair question — so I measured it, rather than guessing. Three real Swiss business documents, three models (Gemini 3.5 Flash, Claude Sonnet 4.6, DeepSeek V4 Pro), each task run twice — on the raw document and on the sanitised-then-restored one — and judged blind by a vendor-diverse panel.

Privacy: total in every run, at no cost to correctness. No raw personal data reached a model, and the round-trip was flawless: the restored answer was correct every time.
Utility: a small, task-dependent cost. Near-neutral on extraction and summarisation; a mild style tax on open-ended customer copy that a tuned gateway prompt only partly closes. Correctness — names, amounts, numbers — was intact in both arms.

The headline is not “it’s free.” It’s “near-total privacy for a small, measurable, mitigable utility cost” — and for most teams that is an easy trade for a provable residency boundary. (The numbers and the caveats.)

What it looks like in production

An OpenAI-compatible reverse proxy: your app changes its base_url and nothing else. (Or an SDK/middleware in the app, or an API-gateway plugin.)
It runs on-premise or in a Swiss region. The token↔value map — the one place the real personal data lives — never leaves your network. It’s in memory, per-session, and never persisted.
Every request emits an audit line for the DPO: “14:32 — 1 name, 1 AHV, 1 IBAN redacted before egress.” Evidence for your record of processing activities, for free.
It is model-agnostic. Gemini today, Claude tomorrow, a model hosted abroad next week — the boundary doesn’t care, because nothing sensitive reaches any of them.
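The audit line can be derived from the list of placeholder kinds issued for a request. A small sketch, assuming a hypothetical `audit_line` helper, hypothetical label names, and a plain hyphen as separator; it logs counts and a timestamp only, never the values themselves.

```python
from collections import Counter
from datetime import datetime

# Human-readable labels for placeholder kinds (illustrative assumption).
LABELS = {"PERSON": "name", "AHV": "AHV", "IBAN": "IBAN"}


def audit_line(redacted_kinds, now):
    """Summarise what was redacted in one request, without any real values."""
    counts = Counter(redacted_kinds)  # keeps first-seen order of kinds
    parts = []
    for kind, n in counts.items():
        label = LABELS.get(kind, kind)
        parts.append(f"{n} {label if n == 1 else label + 's'}")
    return f"{now:%H:%M} - {', '.join(parts)} redacted before egress"


print(audit_line(["PERSON", "AHV", "IBAN"], datetime(2025, 1, 1, 14, 32)))
```

Keeping the line free of identifiers means the audit log itself can be stored and shared without becoming a second copy of the personal data.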

Where it stops

Structured identifiers (AHV, IBAN, card, phone, email) are the deterministic core. Free-form personal data — names, street addresses — needs a named-entity model; run a small one locally alongside, and fail closed on the high-risk flows.
Some tasks genuinely need the real value (validate this IBAN; compute an age from a date of birth). Tokenisation is a per-field policy, not a blanket switch.
This is data minimisation and residency, not a DPIA, not a lawful-basis analysis, not a contract. It removes the transfer question for the data it redacts; it does not remove your other obligations.
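A per-field policy can be as simple as a lookup keyed on task and identifier kind. The sketch below assumes hypothetical task names and a hypothetical `decide` function; the one design choice it illustrates is that anything not explicitly listed falls back to tokenisation.

```python
from enum import Enum


class Action(Enum):
    TOKENIZE = "tokenize"    # replace with a placeholder before egress
    PASS_REAL = "pass_real"  # the task needs the real value; an explicit, reviewed exception


# Hypothetical policy table: (task, identifier kind) -> action.
POLICY = {
    ("draft_reply", "IBAN"): Action.TOKENIZE,
    ("draft_reply", "AHV"): Action.TOKENIZE,
    ("validate_iban", "IBAN"): Action.PASS_REAL,
}


def decide(task, kind):
    """Look up the action for a field; unknown combinations are tokenised."""
    return POLICY.get((task, kind), Action.TOKENIZE)


print(decide("draft_reply", "IBAN"))
print(decide("validate_iban", "IBAN"))
print(decide("new_unreviewed_task", "AHV"))
```

Every `PASS_REAL` entry is a transfer of real data, so each one still falls under the obligations listed above.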

Defence in depth, deliberately dumb — the outer layer, not the only one.

The point

You do not have to choose between using frontier AI and keeping Swiss personal data in Switzerland. A few hundred lines of deterministic code at the egress boundary buys you both — and, unlike a promise, you can prove it and audit it.

Try it: kevin.ars.md · how it works · the benchmark.

Use any LLM. Keep the personal data in Switzerland.
