Content·Regulation·20 Jan 2026·9 min

LGPD and LLMs: what changes when the model "sees" the data

Self-hosting, redaction, semantic anonymization. A practical map out of legal uncertainty and into a running, compliant system.

When the model is yours, LGPD applies the way it always has. When the model belongs to a third party and processes your customers' personal data, the picture changes, because that data now leaves your systems. Below are the three decisions that actually cost money.

Decision 1 — Where the model runs

A model behind a public API isn't forbidden in itself; it is forbidden without the right contract and controls. Self-hosting open weights removes part of the problem but adds operational cost. A hybrid setup, with sensitive workloads in-house and general ones in the cloud, is usually the practical path.
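As a rough illustration of the hybrid approach, the sketch below routes a request to an in-house or cloud backend depending on whether the text appears to contain personal data. The backend names, the `choose_backend` function and the two regular expressions (CPF numbers and e-mail addresses) are assumptions made for this example; a real detector would need far broader coverage.

```python
import re

# Hypothetical backend labels; replace with your own deployment targets.
IN_HOUSE = "in_house"
CLOUD = "cloud"

# Illustrative patterns only: Brazilian CPF numbers and e-mail addresses.
SENSITIVE_PATTERNS = [
    re.compile(r"\b\d{3}\.?\d{3}\.?\d{3}-?\d{2}\b"),  # CPF, with or without punctuation
    re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),      # e-mail address
]


def contains_personal_data(text: str) -> bool:
    """Return True if any illustrative pattern matches the text."""
    return any(pattern.search(text) for pattern in SENSITIVE_PATTERNS)


def choose_backend(text: str) -> str:
    """Send anything that looks sensitive in-house; everything else may go to the cloud."""
    return IN_HOUSE if contains_personal_data(text) else CLOUD


if __name__ == "__main__":
    print(choose_backend("Customer CPF 123.456.789-09 asked for a refund"))
    print(choose_backend("Summarize the public release notes"))
```

Defaulting to the in-house backend whenever the detector fires keeps the failure mode conservative: a false positive costs compute, while a false negative would send personal data to a third party.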

Decision 2 — What the model sees

Redacting personal data before the call covers 80% of cases. For the remaining 20%, semantic anonymization via embeddings or token substitution preserves utility without exposing the data. It is worth the investment.
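A minimal sketch of redaction by token substitution might look like the following. It swaps each detected value for a stable placeholder before the call and restores it in the response afterwards. The `redact` and `restore` helpers, the placeholder format and the two patterns are assumptions for illustration, not a complete detector.

```python
import re

# Illustrative patterns only; real systems need a much broader detector.
PATTERNS = {
    "CPF": re.compile(r"\b\d{3}\.?\d{3}\.?\d{3}-?\d{2}\b"),
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+"),
}


def redact(text: str) -> tuple[str, dict[str, str]]:
    """Replace personal data with placeholders.

    Returns the redacted text and a placeholder -> original value mapping.
    The same value always gets the same placeholder, so the model can still
    tell that two mentions refer to the same person.
    """
    token_for_value: dict[str, str] = {}
    for kind, pattern in PATTERNS.items():
        def replace(match: re.Match, kind: str = kind) -> str:
            value = match.group(0)
            if value not in token_for_value:
                # Number placeholders per kind: [CPF_1], [CPF_2], ...
                index = sum(
                    token.startswith(f"[{kind}_")
                    for token in token_for_value.values()
                ) + 1
                token_for_value[value] = f"[{kind}_{index}]"
            return token_for_value[value]

        text = pattern.sub(replace, text)
    mapping = {token: value for value, token in token_for_value.items()}
    return text, mapping


def restore(text: str, mapping: dict[str, str]) -> str:
    """Put the original values back into a model response."""
    for token, value in mapping.items():
        text = text.replace(token, value)
    return text


if __name__ == "__main__":
    original = "Client ana@example.com, CPF 123.456.789-09, wrote from ana@example.com."
    redacted, mapping = redact(original)
    print(redacted)
    print(restore(redacted, mapping))
```

The mapping itself is personal data, so in this sketch it should stay inside your own system and never be sent with the prompt.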

Decision 3 — What gets logged

Prompt-and-response logging creates a new personal-data store. It needs the same retention rules, access control and right-to-be-forgotten handling as the original store. Almost no project treats this properly at the start.
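To make the point concrete, here is a small in-memory sketch of a prompt log that enforces a retention window and supports erasure by data subject. The `PromptLog` class, its method names and the `subject_id` field are hypothetical, and access control is left out for brevity; a production store would sit in a database with the same safeguards as the original data.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class LogEntry:
    subject_id: str       # hypothetical identifier of the data subject
    prompt: str
    response: str
    created_at: datetime


@dataclass
class PromptLog:
    retention: timedelta
    entries: list[LogEntry] = field(default_factory=list)

    def record(self, subject_id: str, prompt: str, response: str,
               now: Optional[datetime] = None) -> None:
        """Store one prompt/response pair, tagged with its data subject."""
        now = now or datetime.now(timezone.utc)
        self.entries.append(LogEntry(subject_id, prompt, response, now))

    def purge_expired(self, now: Optional[datetime] = None) -> int:
        """Drop entries older than the retention window; return how many were removed."""
        now = now or datetime.now(timezone.utc)
        cutoff = now - self.retention
        kept = [entry for entry in self.entries if entry.created_at >= cutoff]
        removed = len(self.entries) - len(kept)
        self.entries = kept
        return removed

    def erase_subject(self, subject_id: str) -> int:
        """Delete every entry for one data subject; return how many were removed."""
        kept = [entry for entry in self.entries if entry.subject_id != subject_id]
        removed = len(self.entries) - len(kept)
        self.entries = kept
        return removed
```

Tagging each entry with a subject identifier at write time is the design choice that makes erasure requests cheap; retrofitting that link onto free-text logs later is much harder.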

LGPD for LLMs isn't different from LGPD for any system. It's just new, so it feels different.
