Encryption

Forze encrypts data with envelope encryption: a key backend you control (a KMS) holds the key-encryption key (the KEK), and the framework only ever asks it to wrap and unwrap short-lived data keys (DEKs). The KEK never leaves the backend, so the same model covers managed keys and full bring-your-own-key (BYOK) — the difference is only which backend, and whose tenant, the key belongs to.

The result is the same shape you already know from multi-tenancy: a cross-cutting concern declared on the spec, applied by adapters without a line in your handlers, and fail-closed — a field marked for encryption that finds no key wired refuses to persist as plaintext rather than degrading silently.

The envelope

Each value is sealed into a self-describing EncryptedEnvelope — the wrapped DEK, the AEAD ciphertext, and the algorithm metadata travel together, so a reader needs only the key backend to open it, never an out-of-band scheme. The Keyring generates a DEK, has the backend wrap it, and caches and reuses it across a scope up to max_dek_messages before rotating — one KMS round-trip amortized over many values.
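The amortization can be pictured with a toy model. This is a minimal sketch, not Forze's Keyring: CountingBackend and ToyKeyring are hypothetical names, the "wrap" is a placeholder rather than real encryption, and only the max_dek_messages budget comes from the text above.

```python
from __future__ import annotations

import secrets
from dataclasses import dataclass
from typing import Optional


class CountingBackend:
    """Hypothetical stand-in for a KMS: it only counts wrap calls."""

    def __init__(self) -> None:
        self.wrap_calls = 0

    def wrap(self, dek: bytes) -> bytes:
        self.wrap_calls += 1
        # A real backend would encrypt the DEK under the KEK; this toy just tags it.
        return b"wrapped:" + dek


@dataclass
class ToyKeyring:
    """Illustrative only: reuses one DEK for up to max_dek_messages values."""

    backend: CountingBackend
    max_dek_messages: int
    _dek: Optional[bytes] = None
    _wrapped: Optional[bytes] = None
    _used: int = 0

    def data_key(self) -> tuple:
        # Rotate when nothing is cached or the cached DEK has used up its budget.
        if self._dek is None or self._used >= self.max_dek_messages:
            self._dek = secrets.token_bytes(32)
            self._wrapped = self.backend.wrap(self._dek)
            self._used = 0
        self._used += 1
        return self._dek, self._wrapped


backend = CountingBackend()
keyring = ToyKeyring(backend=backend, max_dek_messages=100)
for _ in range(250):
    keyring.data_key()

print(backend.wrap_calls)  # 3 wrap calls cover 250 values
```

With a budget of 100, the 250 values trigger a wrap on the 1st, 101st, and 201st call, which is the round-trip saving the cache is there to buy.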

Every ciphertext is bound to associated data (AAD): at minimum the tenant and the field name, so an envelope lifted from one tenant or column cannot be replayed into another. The cipher is AEAD (AES-256-GCM by default, ChaCha20-Poly1305 optional), so tampering fails the open rather than returning garbage.
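To see why a changed context fails the open, here is a small sketch of AAD binding. It is an illustration under assumptions: HMAC-SHA256 stands in for the AEAD authentication tag (a real AEAD such as AES-256-GCM computes its own tag), and build_aad, tag, and verify are hypothetical helpers, not Forze functions. The length-prefixed AAD layout is likewise an assumption, not the framework's wire format.

```python
import hashlib
import hmac
import secrets


def build_aad(tenant: str, field_name: str) -> bytes:
    """Length-prefix each part so ("ab", "c") and ("a", "bc") never collide."""
    parts = [tenant.encode(), field_name.encode()]
    return b"".join(len(p).to_bytes(4, "big") + p for p in parts)


def tag(key: bytes, aad: bytes, ciphertext: bytes) -> bytes:
    # HMAC stands in for the AEAD tag: it authenticates both AAD and ciphertext.
    message = len(aad).to_bytes(8, "big") + aad + ciphertext
    return hmac.new(key, message, hashlib.sha256).digest()


def verify(key: bytes, aad: bytes, ciphertext: bytes, expected: bytes) -> bool:
    # Constant-time comparison, as any tag check should be.
    return hmac.compare_digest(tag(key, aad, ciphertext), expected)


key = secrets.token_bytes(32)
ciphertext = b"opaque-bytes"  # placeholder for real ciphertext
sealed_tag = tag(key, build_aad("tenant-a", "ssn"), ciphertext)

same_context = verify(key, build_aad("tenant-a", "ssn"), ciphertext, sealed_tag)
other_tenant = verify(key, build_aad("tenant-b", "ssn"), ciphertext, sealed_tag)
other_field = verify(key, build_aad("tenant-a", "email"), ciphertext, sealed_tag)
```

Only same_context verifies. Presenting the same bytes under another tenant or another field name changes the AAD, so the tag no longer matches and the open is rejected.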

Wiring the keyring

CryptoDepsModule composes the whole stack from a key backend and a directory that maps a tenant to its KEK reference:

from forze.application.execution import CryptoDepsModule
from forze.application.contracts.crypto import KeyRef, StaticKeyDirectory
from forze_vault.adapters import VaultTransitKeyManagement

# `vault` is assumed to be an already-configured Vault client created elsewhere.
CryptoDepsModule(
    kms=VaultTransitKeyManagement(client=vault),  # mount lives on the client config
    directory=StaticKeyDirectory(KeyRef(key_id="app-kek")),  # one KEK for the deployment
)

That registers the key manager, the AEAD, the directory, and the composed Keyring under their dep keys. Integrations that opt into encryption resolve the keyring from here — they never construct one. For per-tenant keys, swap the directory (see Per-tenant keys below).

MockKeyManagement is dev/test only

The in-memory MockKeyManagement from forze_mock derives keys locally — it is for tests and local runs, never production. It exists so the encryption paths exercise end-to-end without a real KMS; it protects nothing.

Where data gets encrypted

Three surfaces opt in independently, each declaring its own coverage tier. You encrypt only what needs it.

Document fields

A document spec names the fields to seal. Tiers run weakest to strongest — none < field < envelope — and a spec derives its tier from what it marks:

DocumentSpec(
    name="patients",
    read=Patient,
    encryption=FieldEncryption(
        encrypted=frozenset({"ssn", "diagnosis"}),
        searchable=frozenset({"email"}),   # deterministic — see below
        binds_record_id=True,              # bind the row id into the AAD
    ),
)

A single FieldEncryption policy declares the whole shape, and the SearchSpec over the same table shares the same object — so the two can't drift. encrypted fields are randomized AEAD ciphertext; searchable fields use a deterministic cipher so equality queries still match. Setting binds_record_id=True folds the record's id into the AAD of every randomized field, so a ciphertext can't be copied between rows — it applies only to randomized fields, never searchable ones (whose ciphertext must stay record-independent to compare).

Marking a field requires a wired keyring

A spec that marks any field for encryption but finds no KeyringDepKey (or no deterministic cipher, when it declares searchable fields) raises at factory time rather than writing plaintext. The check is fail-closed by design.

The same FieldEncryption policy carries across planes. Point a SearchSpec, an AnalyticsSpec, or a graph node/edge kind at it and those surfaces seal the same fields on write and decrypt them on every read path: search results, warehouse rows (offset / cursor / chunked / projections), and graph get / neighbors / walk / shortest-path.

Encrypted fields stay confidential: they are never content-searchable, aggregatable, or matchable in a graph predicate. That follows from randomized encryption itself, not from an implementation limit, because the backend only ever sees ciphertext. So encrypt what you store and return but never query by, and use searchable (deterministic) fields for the equality lookups you do need. Each plane fails closed the same way (core.{search,analytics,graph}.encryption_wiring).

One caveat when sharing a policy: binds_record_id needs a stable per-record id, so it applies to the document and graph (key-addressed) planes only. An AnalyticsSpec (warehouse rows have no id) and an endpoint-identity graph edge reject it at wiring. Leave it off the policy you share with those, or give them their own.

The downstream caches inherit the encryption: when a search route encrypts, its result-snapshot runs (the frozen models kept for stable re-pagination) are sealed at rest too, so the snapshot store never re-exposes what the document sealed. This is automatic and needs no extra config.

Object storage

Object bytes encrypt per route with a single flag — the stored object is the envelope, decrypted transparently on read:

S3StorageConfig(bucket="uploads", encrypt=True)

Outbox and inbox

The transactional outbox chooses how far ciphertext travels, via OutboxEncryptionTier on the spec — none < at_rest < end_to_end:

Tier	Encrypted where	Decrypted by
`none`	nowhere	—
`at_rest`	the outbox row	the relay, before publishing
`end_to_end`	row and broker payload	the consumer, before the handler

OutboxSpec(name="events", codec=codec, encryption="end_to_end")

At at_rest the payload is ciphertext in your database and plaintext on the wire; at end_to_end it stays sealed through the broker and is opened only by the consumer after dedup — the message broker never sees plaintext. The payload AAD is reconstructable from the envelope headers (tenant and event id), so any transport carries it: queue, stream, or pub/sub, across every messaging backend. Legacy plaintext rows written before a tier was raised still relay.
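The tier table can be read as a simple ordering. The sketch below is illustrative only: Tier and sealed_at are hypothetical names, not the OutboxEncryptionTier type, and they model nothing beyond the three rows of the table.

```python
from enum import IntEnum


class Tier(IntEnum):
    """Hypothetical ordering mirroring none < at_rest < end_to_end."""

    NONE = 0
    AT_REST = 1
    END_TO_END = 2


def sealed_at(tier: Tier) -> dict:
    """Report whether the payload is ciphertext at each hop for a given tier."""
    return {
        "outbox_row": tier >= Tier.AT_REST,
        "broker_payload": tier >= Tier.END_TO_END,
    }


for tier in Tier:
    print(tier.name, sealed_at(tier))
```

Raising the tier only ever adds sealed hops, which is why the tiers can be compared with a plain ordering.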

Idempotency result cache

The idempotency store replays an operation's full result for a duplicate request, so a Forze-owned store (Redis/Postgres) holds that return value at rest. Seal it with one flag — the result is sealed on commit and opened on replay (metadata stays plaintext), bound to (tenant, op:key):

IdempotencySpec(name="orders", encrypt_result=True)

Searchable fields and rotation

Deterministic (searchable) fields need a stable root secret, set on the crypto module — the same plaintext always seals to the same ciphertext, which is what lets equality queries hit:

CryptoDepsModule(
    kms=...,
    directory=...,
    deterministic_root=load_secret("search-root"),          # >= 32 bytes
    deterministic_previous_root=load_secret("search-root-prev"),  # rotation only
)

Rotating the root is a two-phase overlap: set deterministic_previous_root to the old value, and reads match values written under either root while new writes use the current one. Run reencrypt_documents to re-index every searchable value under the new root, then drop the previous one.
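The overlap phase can be sketched with a keyed PRF. This is a simplification under stated assumptions: HMAC-SHA256 stands in for the deterministic cipher (it is one-way, whereas a real deterministic cipher can be decrypted), and token and lookup_tokens are hypothetical helpers rather than Forze functions.

```python
import hashlib
import hmac
from typing import Optional


def token(root: bytes, field_name: str, value: str) -> bytes:
    """Keyed PRF standing in for a deterministic cipher: same input, same output."""
    message = field_name.encode() + b"\x00" + value.encode()
    return hmac.new(root, message, hashlib.sha256).digest()


def lookup_tokens(
    value: str,
    field_name: str,
    current_root: bytes,
    previous_root: Optional[bytes] = None,
) -> set:
    """During the overlap, an equality query matches values under either root."""
    roots = [current_root]
    if previous_root is not None:
        roots.append(previous_root)
    return {token(root, field_name, value) for root in roots}


old_root = b"o" * 32  # placeholder secrets, for illustration only
new_root = b"n" * 32

stored_before_rotation = token(old_root, "email", "a@example.com")
stored_after_rotation = token(new_root, "email", "a@example.com")

# Phase 1: both roots configured, so old and new rows both match.
during_overlap = lookup_tokens("a@example.com", "email", new_root, old_root)

# Phase 2: previous root dropped, so only re-indexed rows match.
after_drop = lookup_tokens("a@example.com", "email", new_root)
```

A row that was never re-indexed stops matching once the previous root is dropped, which is why the re-encryption pass has to finish before that step.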

Searchable fields trade secrecy for queryability

Deterministic encryption leaks equality — identical plaintexts are visible as identical ciphertexts. Mark a field searchable only when you must query it by exact value; otherwise leave it randomized in FieldEncryption.encrypted.

Declaring a minimum

As with tenant isolation, coverage can be prescriptive. Set required_encryption on a deps module and wiring refuses to assemble any surface whose derived tier is weaker — a fail-closed floor checked once, at startup, never per request:

PostgresDepsModule(
    client=...,
    required_encryption="field",  # every document route must seal something
)

A spec that forgot to mark a field, or a storage route left in the clear, fails to wire instead of quietly persisting plaintext. Leave it unset (the default) and nothing is enforced — coverage stays opt-in per spec.
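A startup floor check of this kind might look like the following. It is a sketch, assuming the document tier ordering none < field < envelope; check_floor, TIER_RANK, and the route names are hypothetical, and the real wiring error may differ.

```python
from typing import Dict, Optional

# Ordering taken from the document tiers: none < field < envelope.
TIER_RANK = {"none": 0, "field": 1, "envelope": 2}


def check_floor(derived: Dict[str, str], required: Optional[str]) -> None:
    """Raise once, at wiring time, if any surface is weaker than the floor."""
    if required is None:
        return  # default: nothing is enforced
    too_weak = sorted(
        name
        for name, tier in derived.items()
        if TIER_RANK[tier] < TIER_RANK[required]
    )
    if too_weak:
        raise RuntimeError(
            f"encryption below required tier {required!r}: {too_weak}"
        )


routes = {"patients": "field", "audit_log": "none"}  # hypothetical derived tiers

check_floor(routes, required=None)  # passes: no floor set
try:
    check_floor(routes, required="field")
    error = None
except RuntimeError as exc:
    error = str(exc)

print(error)
```

Because the check runs over derived tiers before anything serves traffic, a forgotten field surfaces as a failed deploy instead of a plaintext row.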

Per-tenant keys (BYOK)

Stronger isolation gives each tenant its own KEK, so one tenant's data is unreadable with another's key — and a tenant can supply or revoke their own. Swap the static directory for a per-tenant one:

from forze.application.contracts.crypto import TenantTemplateKeyDirectory

directory = TenantTemplateKeyDirectory(
    template="tenant/{tenant_id}/kek",
    default_key_id="shared-kek",  # used when no tenant is bound
)

CryptoDepsModule(kms=VaultTransitKeyManagement(client=vault), directory=directory)
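Conceptually, the directory turns a tenant into a KEK reference. The function below is a guess at that mapping for illustration, not the TenantTemplateKeyDirectory implementation: resolve_key_id is a hypothetical name, and the real class may validate or escape tenant ids differently.

```python
from typing import Optional


def resolve_key_id(
    template: str, default_key_id: str, tenant_id: Optional[str]
) -> str:
    """Map a tenant to its KEK id, falling back when no tenant is bound."""
    if tenant_id is None:
        return default_key_id
    return template.format(tenant_id=tenant_id)


print(resolve_key_id("tenant/{tenant_id}/kek", "shared-kek", "acme"))
print(resolve_key_id("tenant/{tenant_id}/kek", "shared-kek", None))
```

Under this reading, each tenant's values are wrapped under a distinct key name, and unbound work falls back to the shared key.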

The KEK itself is provisioned per tenant through the same TenantProvisionerPort seam onboarding uses for schemas and buckets. forze_vault ships VaultTransitTenantProvisioner, which resolves the tenant through that same directory and creates its Transit key on onboarding (teardown opt-in via allow_deletion, since deleting a KEK is irreversible data loss):

from forze_vault.adapters import VaultTransitTenantProvisioner

TenancyDepsModule(
    tenant_management={"main"},
    tenant_provisioner=VaultTransitTenantProvisioner(
        client=vault, directory=directory  # same directory the keyring resolves with
    ),
)

Compose it with other provisioners (a schema, a bucket, a key) via CompositeTenantProvisioner so onboarding a tenant readies every backend at once.

Observability

The keyring exports the same pull-based metrics as the rest of the engine. Pass your keyrings to instrument_crypto by label, for example instrument_crypto({"default": keyring}, meter=meter), to see DEK generation, unwrap calls, cache hits, and cold misses. Together these show whether max_dek_messages is sized right for your traffic. See Observability for the meter setup.