Security Considerations

This page explains how OpenWit protects data and control traffic from ingestion to query. It starts with an overview of the security layers and the mechanism each one uses, then covers each layer in detail.

Security layers at a glance

OpenWit applies defense in depth. The table lists each layer and the mechanism used at that layer.

| Layer | Security mechanism |
| --- | --- |
| Ingestion (Kafka, gRPC, HTTP) | API key or header token authentication, optional OAuth2 or mTLS |
| Inter-node communication | mTLS with certificates, signed gossip messages |
| Data at rest | Optional AES256 encryption in object storage through OpenDAL |
| Metadata access | Postgres roles with connection-level TLS |
| Access control | Role-based configuration for per-tenant routing and token scopes |
| Auditing | Control-plane logs every administrative change such as schema, route, node state |
| Secrets management | Integrate with Vault or Kubernetes Secrets for credentials |
| Network boundaries | Only Proxy and Control are exposed. All other nodes are internal only |

Network boundaries and exposed surfaces

Only two services sit on the edge: Proxy for client queries and Control for administration. All other roles remain inside the cluster network. Keeping the surface area small lowers risk and makes inspection and firewall rules simple. Use TLS on the public endpoints.
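The exposure rule can be checked mechanically against a service inventory. The following is a minimal sketch, assuming a hypothetical inventory format (a list of dicts with `name`, `role` and `public` keys) and hypothetical role names for the internal nodes; it is not part of OpenWit itself.

```python
# Roles that this page says may be reachable from outside the cluster.
EDGE_ROLES = {"proxy", "control"}


def find_exposure_violations(services):
    """Return the names of services that are public but are not edge roles.

    `services` is a hypothetical inventory: a list of dicts with
    "name", "role" and "public" keys.
    """
    return sorted(
        svc["name"]
        for svc in services
        if svc["public"] and svc["role"].lower() not in EDGE_ROLES
    )


# Illustrative inventory; the names and internal roles are made up.
inventory = [
    {"name": "proxy-0", "role": "proxy", "public": True},
    {"name": "control-0", "role": "control", "public": True},
    {"name": "storage-0", "role": "storage", "public": False},
    {"name": "ingest-0", "role": "ingest", "public": True},  # misconfigured
]

print(find_exposure_violations(inventory))  # ['ingest-0']
```

A check like this could run in CI against deployment manifests so that a newly exposed internal node is caught before rollout.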

Ingestion security

Producers send data over Kafka, gRPC or HTTP. At the gateway you enable API key or header token authentication for simple producer access, or OAuth2 when you need a standard token flow. For transport security or private networks, you can use mTLS so clients present certificates and the gateway verifies them before accepting payloads. Choose the lightest option that fits the trust level of your producers, then keep it consistent across environments.
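As an illustration of the API key option, the sketch below checks a key presented in a request header against a set of stored key hashes. The header name `x-api-key`, the function names and the choice to store SHA-256 hashes rather than raw keys are assumptions made for the example, not OpenWit's documented behavior.

```python
import hashlib
import hmac

# Hypothetical header name; use whatever header your gateway is configured for.
API_KEY_HEADER = "x-api-key"


def hash_key(api_key: str) -> str:
    """Hash a key so the gateway never has to store it in plaintext."""
    return hashlib.sha256(api_key.encode("utf-8")).hexdigest()


def is_authorized(headers: dict, known_key_hashes: set) -> bool:
    """Return True if the request carries a key whose hash is known."""
    presented = headers.get(API_KEY_HEADER, "")
    if not presented:
        return False
    presented_hash = hash_key(presented)
    # compare_digest runs in constant time for equal-length inputs, and every
    # stored hash is checked so timing does not reveal which entry matched.
    matches = [hmac.compare_digest(presented_hash, h) for h in known_key_hashes]
    return any(matches)


known = {hash_key("producer-a-key")}
print(is_authorized({"x-api-key": "producer-a-key"}, known))  # True
print(is_authorized({"x-api-key": "guess"}, known))           # False
```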

The gateway also performs schema validation and normalization, which keeps malformed or unexpected fields out of the system. This protects downstream nodes while keeping the ingestion path predictable.
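To make the idea concrete, here is a minimal sketch of validate-then-normalize for a single record. The three-field schema and the normalization rules are hypothetical; OpenWit's actual schemas and validation logic are not described on this page.

```python
# Hypothetical schema: field name -> accepted Python type(s).
EXPECTED_FIELDS = {
    "timestamp": (int, float),
    "service": str,
    "message": str,
}


def validate_and_normalize(record: dict) -> dict:
    """Reject records with unknown, missing or mistyped fields, then normalize."""
    unknown = set(record) - set(EXPECTED_FIELDS)
    if unknown:
        raise ValueError(f"unexpected fields: {sorted(unknown)}")
    for name, expected_type in EXPECTED_FIELDS.items():
        if name not in record:
            raise ValueError(f"missing field: {name}")
        value = record[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"wrong type for field: {name}")
    return {
        "timestamp": float(record["timestamp"]),
        "service": record["service"].strip().lower(),
        "message": record["message"],
    }


clean = validate_and_normalize(
    {"timestamp": 1700000000, "service": " Checkout ", "message": "order placed"}
)
print(clean["service"])  # checkout
```

Rejecting unknown fields outright, rather than silently dropping them, makes producer mistakes visible at the edge instead of downstream.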

Inter-node communication

Nodes communicate through gossip and gRPC, and move data with Arrow Flight. mTLS between nodes means each side verifies its peer before sending control or data. Gossip messages are signed, which prevents forged cluster metadata from being accepted. Together these protect both discovery and RPC.
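The page does not say which signature scheme the gossip layer uses. As a sketch of the general idea only, the example below signs a gossip payload with HMAC-SHA256 over a canonical JSON encoding, assuming a shared cluster key; a real deployment might use asymmetric signatures instead. All function names are hypothetical.

```python
import hashlib
import hmac
import json


def canonical_bytes(payload: dict) -> bytes:
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(payload, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign_gossip(payload: dict, key: bytes) -> dict:
    """Wrap a payload with an HMAC-SHA256 tag (assumed scheme, for illustration)."""
    tag = hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()
    return {"payload": payload, "sig": tag}


def verify_gossip(message: dict, key: bytes) -> bool:
    """Accept a message only if its tag matches the payload under our key."""
    payload = message.get("payload")
    sig = message.get("sig")
    if not isinstance(payload, dict) or not isinstance(sig, str):
        return False
    expected = hmac.new(key, canonical_bytes(payload), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode("ascii"), sig.encode("utf-8"))


cluster_key = b"example-only-key"  # in practice, load this from a secret store
msg = sign_gossip({"node": "storage-1", "state": "alive"}, cluster_key)
forged = {"payload": {"node": "storage-1", "state": "dead"}, "sig": msg["sig"]}

print(verify_gossip(msg, cluster_key))     # True
print(verify_gossip(forged, cluster_key))  # False
```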

Data at rest

When Parquet and index files are ready, Storage uploads them to the object store through OpenDAL. You can enable AES256 encryption at rest through your object store so uploaded artifacts are encrypted on disk. Common backends are S3, Azure and GCS, all of which support encryption at rest.
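One way to confirm that uploads are encrypted is to inspect the metadata the object store returns for an uploaded file. The sketch below is S3-specific: it relies on the `x-amz-server-side-encryption` response header, which S3 sets to `AES256` for objects encrypted with S3-managed keys. Azure and GCS report encryption status differently, so treat this as an illustration rather than a general check. The function name is hypothetical.

```python
# S3 reports the server-side encryption algorithm in this response header.
SSE_HEADER = "x-amz-server-side-encryption"


def is_encrypted_at_rest(response_headers: dict, accepted=("AES256",)) -> bool:
    """Return True if the object's headers report an accepted SSE algorithm.

    Header names are matched case-insensitively, as HTTP requires.
    """
    normalized = {name.lower(): value for name, value in response_headers.items()}
    return normalized.get(SSE_HEADER) in accepted


# Headers as they might come back from a HEAD request on an uploaded file.
print(is_encrypted_at_rest({"X-Amz-Server-Side-Encryption": "AES256"}))  # True
print(is_encrypted_at_rest({"Content-Type": "application/octet-stream"}))  # False
```

If your bucket uses KMS-managed keys, pass `accepted=("AES256", "aws:kms")` so both modes count as encrypted.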

Metadata store protection

Postgres is the source of truth for batch, file and index metadata. Use roles to separate duties and connection-level TLS so credentials and catalog queries are encrypted on the wire. This keeps discovery and pruning accurate and safe even when the database runs outside the data plane.
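On the client side, connection-level TLS for Postgres is controlled by the libpq `sslmode` parameter. The sketch below builds a connection URI that only allows the certificate-verifying modes (`verify-ca` and `verify-full`), since `require` encrypts the connection but does not verify the server certificate. The host, database, user and certificate path are placeholders, and the helper name is hypothetical. The password is deliberately left out of the URI.

```python
from urllib.parse import quote, urlencode

# libpq sslmode values that verify the server certificate.
STRICT_SSL_MODES = {"verify-ca", "verify-full"}


def build_metadata_dsn(host, dbname, user, sslrootcert, port=5432, sslmode="verify-full"):
    """Build a libpq connection URI that refuses unverified or plaintext modes."""
    if sslmode not in STRICT_SSL_MODES:
        raise ValueError(f"sslmode {sslmode!r} does not verify the server certificate")
    query = urlencode({"sslmode": sslmode, "sslrootcert": sslrootcert})
    return (
        f"postgresql://{quote(user, safe='')}@{host}:{port}/"
        f"{quote(dbname, safe='')}?{query}"
    )


# Placeholder values for illustration only.
dsn = build_metadata_dsn(
    host="db.internal",
    dbname="openwit_meta",
    user="openwit_reader",
    sslrootcert="/etc/openwit/ca.pem",
)
print(dsn)
```

On the server side, `hostssl` entries in `pg_hba.conf` are the usual way to make Postgres reject plaintext connections.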

Access control model

OpenWit uses role-based configuration. You can route per tenant and scope tokens to specific datasets or operations. This keeps producers and readers limited to what they need while letting the same cluster serve multiple tenants. Keep scopes narrow and map them to clear routes.
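A scoped token can be modeled as a tenant plus the datasets and operations it may touch. The sketch below shows such a check; the `TokenScope` type, its fields and the operation names are hypothetical and only illustrate the "narrow scope" idea, not OpenWit's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenScope:
    """Hypothetical scope attached to a token."""
    tenant: str
    datasets: frozenset
    operations: frozenset


def is_allowed(scope: TokenScope, tenant: str, dataset: str, operation: str) -> bool:
    """Allow a request only if tenant, dataset and operation are all in scope."""
    return (
        scope.tenant == tenant
        and dataset in scope.datasets
        and operation in scope.operations
    )


# A producer token that may only write logs for one tenant.
producer = TokenScope(
    tenant="acme",
    datasets=frozenset({"logs"}),
    operations=frozenset({"ingest"}),
)

print(is_allowed(producer, "acme", "logs", "ingest"))   # True
print(is_allowed(producer, "acme", "logs", "query"))    # False
print(is_allowed(producer, "other", "logs", "ingest"))  # False
```

Denying by default, so that anything not listed in the scope is refused, keeps a misconfigured token from granting more than intended.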

Auditing and change tracking

The control plane logs every administrative change. Examples include schema updates, routing changes and node state transitions. With structured logs in place you can trace who changed what and when, then correlate that with data plane effects like indexing or upload.
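A structured audit entry needs, at minimum, a timestamp, an actor, an action and a target. The sketch below emits one JSON line per change; the field names and the `audit_event` helper are assumptions for illustration and do not reflect OpenWit's actual log format.

```python
import json
from datetime import datetime, timezone


def audit_event(actor: str, action: str, target: str, detail: dict, now=None) -> str:
    """Render one administrative change as a single JSON log line."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    entry = {
        "ts": timestamp,    # when the change happened (UTC)
        "actor": actor,     # who made the change
        "action": action,   # what kind of change, e.g. "schema.update"
        "target": target,   # which object was changed
        "detail": detail,   # free-form context for later correlation
    }
    return json.dumps(entry, sort_keys=True)


line = audit_event(
    actor="admin@example.com",
    action="route.update",
    target="tenant/acme",
    detail={"from": "ingest-a", "to": "ingest-b"},
)
print(line)
```

One JSON object per line keeps the log easy to filter by actor or action and to join against data plane events on the timestamp.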

Secrets management

Integrate with Vault or Kubernetes Secrets for credentials. Keep API keys, OAuth client secrets, DB passwords and certificates out of images and code. Reference them from the runtime environment only. Rotate them on a schedule.
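Kubernetes Secrets can be exposed to a container as environment variables, and a Vault agent can populate the environment in a similar way. A minimal sketch of reading credentials from the runtime environment and failing fast when one is absent follows; the variable names and the `load_secrets` helper are hypothetical.

```python
import os


class MissingSecretError(RuntimeError):
    """Raised at startup when a required credential is not provided."""


def load_secrets(names, env=None):
    """Read required secrets from the environment, failing fast if any is absent.

    `env` defaults to os.environ; passing a dict makes the function easy to test.
    """
    env = os.environ if env is None else env
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Report only the names, never the values.
        raise MissingSecretError("missing secrets: " + ", ".join(missing))
    return {name: env[name] for name in names}


# Hypothetical variable names, injected here as a plain dict for illustration.
runtime_env = {"OPENWIT_DB_PASSWORD": "example", "OPENWIT_API_KEY": "example"}
secrets = load_secrets(["OPENWIT_DB_PASSWORD", "OPENWIT_API_KEY"], env=runtime_env)
print(sorted(secrets))  # ['OPENWIT_API_KEY', 'OPENWIT_DB_PASSWORD']
```

Failing at startup, rather than on first use, surfaces a missing or mis-mounted secret before the node begins serving traffic.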

Quick reference table

| Area | What to check |
| --- | --- |
| Perimeter | Only Proxy and Control are reachable from outside. Certificates are valid and TLS is required. |
| Ingestion | Gateways enforce token or OAuth2. mTLS is enabled where required. Schema validation rejects bad payloads. |
| Inter-node | Nodes present valid certificates. Gossip signatures verify. |
| Storage | Object store shows AES256 at rest. Uploads succeed and are recorded. |
| Metadata | Postgres enforces roles and TLS. Connections reject plaintext. |
| Access | Token scopes map to the correct tenants and routes. |
| Auditing | Control-plane logs contain schema, route and node state events with timestamps. |
| Secrets | Credentials are read from Vault or Kubernetes Secrets. No secrets in images. |