When teams start using AI, the first risk is usually not the model. It is the prompt.
People ask a good question, paste in useful context, and quietly include names, emails, contract numbers, internal notes, patient details, or payment data. The resulting prompt helps the model answer, but it also exposes data the model should never have seen in the first place.
The simplest way to think about it is this:
- Keep the information the model needs to do the task.
- Redact identifiers that are not needed.
- Mask values when the structure matters but the original value does not.
- Route the request to a safer workflow when the data is too sensitive.
The data classes you should watch first
Start with the highest-risk categories.
1. Personal identifiers
Names, emails, phone numbers, ID numbers, passport numbers, user handles, and employee IDs are the easiest values to leak and the easiest to miss.
If the model does not need the exact identity, replace it with a stable placeholder.
Example:
- Maria Lopez -> [PERSON_01]
- maria@acme.es -> [EMAIL_01]
- ES12345678Z -> [ID_01]
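A minimal sketch of stable placeholders, assuming regex-based detection is good enough for illustration. The patterns, the `redact` function, and the `known_names` parameter are hypothetical; detecting names reliably usually needs a directory lookup or an entity recognizer rather than a fixed list.

```python
import re

# Illustrative patterns only (assumption); real detection needs far broader coverage.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
    "ID": re.compile(r"\b[A-Z]{2}\d{8}[A-Z]\b"),
}


def redact(text, known_names=()):
    """Replace identifiers with numbered placeholders and return the mapping."""
    mapping = {}   # original value -> placeholder
    counters = {}  # label -> how many placeholders were issued so far

    def placeholder(label, value):
        # The same value always gets the same placeholder within one call.
        if value not in mapping:
            counters[label] = counters.get(label, 0) + 1
            mapping[value] = f"[{label}_{counters[label]:02d}]"
        return mapping[value]

    # Names come from a caller-supplied list (hypothetical stand-in for real detection).
    for name in known_names:
        if name in text:
            text = text.replace(name, placeholder("PERSON", name))
    for label, pattern in PATTERNS.items():
        text = pattern.sub(lambda m, label=label: placeholder(label, m.group(0)), text)
    return text, mapping


redacted, mapping = redact(
    "Maria Lopez (maria@acme.es, ID ES12345678Z) wrote again. Reply to maria@acme.es.",
    known_names=["Maria Lopez"],
)
print(redacted)
# [PERSON_01] ([EMAIL_01], ID [ID_01]) wrote again. Reply to [EMAIL_01].
```

Because the email appears twice and maps to the same placeholder both times, the model can still tell that both mentions refer to one address.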
2. Contract and account identifiers
Reference numbers often look harmless, but they can reveal a customer relationship, a case, or a transaction path.
If the model only needs the shape of the problem, mask the identifier.
Example:
- CT-48291 -> [CONTRACT_01]
- INV-1042 -> [INVOICE_01]
3. Financial values
Amounts, invoices, payment schedules, and revenue data should be handled carefully. A model may need to know that a value exists, but not the exact figure.
Example:
- EUR 48,200 -> [AMOUNT_01]
- payment overdue by 37 days -> payment overdue by [DAYS_01]
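The masking examples for reference numbers and financial values can be sketched the same way. This is an illustration under assumptions: the `CT-` and `INV-` prefixes come from the examples above, while the currency codes, the rule list, and the `mask_values` function are hypothetical.

```python
import re

# (label, pattern) pairs, applied in order. All patterns are illustrative assumptions.
MASK_RULES = [
    ("CONTRACT", re.compile(r"\bCT-\d+\b")),
    ("INVOICE", re.compile(r"\bINV-\d+\b")),
    ("AMOUNT", re.compile(r"\b(?:EUR|USD|GBP)\s?\d+(?:,\d{3})*(?:\.\d+)?")),
    ("DAYS", re.compile(r"\b\d+ days?\b")),
]


def mask_values(text):
    """Replace each matched value with a numbered placeholder of its type."""
    counts = {}

    def numbered(label):
        def replace(match):
            # Every occurrence gets a fresh number, even if the value repeats.
            counts[label] = counts.get(label, 0) + 1
            return f"[{label}_{counts[label]:02d}]"
        return replace

    for label, pattern in MASK_RULES:
        text = pattern.sub(numbered(label), text)
    return text


print(mask_values(
    "Contract CT-48291, invoice INV-1042: EUR 48,200, payment overdue by 37 days"
))
# Contract [CONTRACT_01], invoice [INVOICE_01]: [AMOUNT_01], payment overdue by [DAYS_01]
```

The sentence keeps its shape (a contract, an invoice, an amount, a delay), so the model can still reason about the situation without seeing the real figures.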
4. Health and regulated data
Patient names, diagnoses, treatment notes, claim information, and other regulated records need stronger controls. In many cases, the right answer is not redaction alone but a routed flow with explicit policy.
5. Internal secrets
API keys, tokens, credentials, internal URLs, IPs, access policies, and unreleased product details do not belong in prompts.
These should be blocked, not just masked.
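Blocking differs from masking in that the prompt never goes out at all. A minimal sketch, assuming a small set of regex detectors: the pattern names, `find_secrets`, `guard`, and `BlockedPrompt` are hypothetical, and a real scanner would cover many more credential formats.

```python
import re

# Illustrative detectors only (assumption); not an exhaustive secret scanner.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "private_key_block": re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),
    "private_ipv4": re.compile(
        r"\b(?:10(?:\.\d{1,3}){3}"
        r"|192\.168(?:\.\d{1,3}){2}"
        r"|172\.(?:1[6-9]|2\d|3[01])(?:\.\d{1,3}){2})\b"
    ),
}


class BlockedPrompt(Exception):
    """Raised when a prompt contains something that must never be sent."""


def find_secrets(prompt):
    """Return the sorted names of all detectors that match the prompt."""
    return sorted(
        name for name, pattern in SECRET_PATTERNS.items() if pattern.search(prompt)
    )


def guard(prompt):
    """Return the prompt unchanged, or raise BlockedPrompt if a secret is found."""
    found = find_secrets(prompt)
    if found:
        # Report only the detector names, never the matched secret itself.
        raise BlockedPrompt("prompt blocked, detected: " + ", ".join(found))
    return prompt
```

Calling `guard("Summarize this ticket")` returns the text unchanged, while a prompt containing a key-shaped string raises `BlockedPrompt` before anything is sent.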
A practical prompt policy
One of the easiest ways to reduce risk is to classify every prompt before it reaches the model.
Use three decisions:
- Safe to send as-is: no sensitive context.
- Safe after transformation: redact, mask, or anonymize first.
- Do not send: block or route to a human.
That policy can be simple enough to explain to every team:
- Sales can send lead context, but not unmasked personal data.
- Legal can send contract summaries, but not full client files unless approved.
- Support can send ticket context, but not secret tokens or payment details.
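The three decisions can be sketched as a small classifier that takes the data classes detected in a prompt and returns the strictest applicable decision. The class names and the mapping below are assumptions chosen to mirror the categories in this article; your own policy may assign them differently.

```python
from enum import Enum


class Decision(Enum):
    SEND = "safe to send as-is"
    TRANSFORM = "safe after transformation"
    BLOCK = "do not send"


# Ordered from least to most restrictive.
SEVERITY = [Decision.SEND, Decision.TRANSFORM, Decision.BLOCK]

# Hypothetical mapping from detected data class to decision.
CLASS_DECISION = {
    "personal": Decision.TRANSFORM,
    "contract": Decision.TRANSFORM,
    "financial": Decision.TRANSFORM,
    "health": Decision.BLOCK,   # block here means "route to a human or safer flow"
    "secret": Decision.BLOCK,
}


def classify(detected_classes):
    """Return the strictest decision across all detected data classes."""
    decision = Decision.SEND
    for data_class in detected_classes:
        # Unknown classes fail closed: treat them as blocked.
        candidate = CLASS_DECISION.get(data_class, Decision.BLOCK)
        if SEVERITY.index(candidate) > SEVERITY.index(decision):
            decision = candidate
    return decision


print(classify([]).value)                      # safe to send as-is
print(classify(["personal"]).value)            # safe after transformation
print(classify(["personal", "secret"]).value)  # do not send
```

Taking the strictest decision means one secret in an otherwise harmless prompt is enough to stop it.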
What redaction should preserve
Good redaction does not destroy the task.
If the model still needs to understand the request, preserve:
- the type of entity
- the relationship between entities
- the task objective
- the sequence of events
- the size or scale when relevant
For example, this prompt keeps the structure:
"Summarize the renewal risk for [PERSON_01] at [COMPANY_01]. The contract is [CONTRACT_01] and the payment exposure is [AMOUNT_01]."
The model still has what it needs to reason. It just does not see the real identity.
Common mistakes
Manual redaction after the fact
If people paste raw data into ChatGPT and then try to clean it up later, the sensitive data has already left the environment.
The gateway has to sit before the model, not after it.
Redacting too much
If you remove all context, the model becomes less useful and users go back to sending raw data.
Good privacy controls preserve meaning while removing exposure.
Assuming every tool is equally safe
A browser extension, a team workspace, a consumer chat app, and a private gateway are not the same thing.
The policy must be specific to the route, the user, and the data class.
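One way to make a policy specific to route, user, and data class is a first-match rule table. The sketch below is hypothetical throughout: the route names, the use of a team as a stand-in for the user, and the actions are illustrative, not a recommended rule set.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    route: str       # where the prompt is going, "*" matches any route
    team: str        # who is sending it, "*" matches any team
    data_class: str  # what was detected, "*" matches any class
    action: str      # "redact", "mask", or "block"


# Hypothetical rules, evaluated top to bottom; the first match wins.
RULES = [
    Rule("*", "*", "secret", "block"),
    Rule("consumer_chat", "*", "*", "block"),
    Rule("private_gateway", "legal", "contract", "mask"),
    Rule("private_gateway", "*", "personal", "redact"),
]


def action_for(route, team, data_class):
    """Return the action of the first matching rule, or "block" if none match."""
    request = (route, team, data_class)
    for rule in RULES:
        if all(pattern in ("*", value) for pattern, value in zip(rule[:3], request)):
            return rule.action
    return "block"  # fail closed when no rule covers the combination


print(action_for("private_gateway", "sales", "personal"))  # redact
print(action_for("consumer_chat", "sales", "personal"))    # block
```

The same personal data is redacted on one route and blocked on another, which is the point: the decision depends on the combination, not on the data class alone.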
A simple operational checklist
Before a prompt reaches an LLM, ask:
- Does this prompt contain personal, financial, health, or confidential business data?
- Does the model actually need the exact value?
- Can we replace identifiers with placeholders?
- Should this request be routed to a safer flow or a human?
- Is the final prompt logged and auditable?
If the answer to any of those questions is unclear, you probably need a gateway layer.
Where Privacy Gateway fits
Privacy Gateway is the control layer that sits between your company and the model.
It detects sensitive data, applies policy, masks what should not be exposed, and only then sends the prompt forward. That lets teams use AI without turning every request into a data-loss event.
If you want the practical version of that flow, see our Privacy Gateway overview.