Security Considerations
This page explains how OpenWit protects data and control traffic from ingestion to query. It starts with an overview of the security layers and the mechanism each one uses, then covers each layer in turn.
Security layers at a glance
OpenWit applies defense in depth. The table lists each layer and the mechanism used at that layer.
| Layer | Security mechanism |
|---|---|
| Ingestion (Kafka, gRPC, HTTP) | API key or header token authentication, optional OAuth2 or mTLS |
| Inter-node communication | mTLS with certificates, signed gossip messages |
| Data at rest | Optional AES256 encryption in object storage through OpenDAL |
| Metadata access | Postgres roles with connection-level TLS |
| Access control | Role-based configuration for per-tenant routing and token scopes |
| Auditing | Control plane logs every administrative change, such as schema, route and node state changes |
| Secrets management | Integrate with Vault or Kubernetes Secrets for credentials |
| Network boundaries | Only Proxy and Control are exposed. All other nodes are internal only |
Network boundaries and exposed surfaces
Only two services sit on the edge: Proxy for client queries and Control for administration. All other roles remain inside the cluster network. Keeping the surface area small lowers risk and makes inspection and firewall rules simple. Use TLS on the public endpoints.
Ingestion security
Producers send data over Kafka, gRPC or HTTP. At the gateway, enable API key or header token authentication for simple producer access, or OAuth2 when you need a standard token flow. For stronger transport security, or on private networks, you can use mTLS so that clients present certificates and the gateway verifies them before accepting payloads. Choose the lightest option that fits the trust level of your producers, then keep it consistent across environments.
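The API key check described above can be sketched as follows. This is a minimal illustration, not OpenWit's gateway code: the header names `x-producer-id` and `x-api-key`, the `API_KEYS` table and the `is_authorized` function are all hypothetical.

```python
import hmac

# Hypothetical key table: producer id -> API key.
# A real deployment would load this from a secrets backend, not from source code.
API_KEYS = {"producer-a": "s3cr3t-key-a"}


def is_authorized(headers: dict, keys: dict = API_KEYS) -> bool:
    """Return True when the request names a known producer and carries its key."""
    producer = headers.get("x-producer-id", "")
    presented = headers.get("x-api-key", "")
    expected = keys.get(producer)
    if expected is None:
        return False
    # compare_digest is designed to reduce timing leaks when comparing secrets.
    return hmac.compare_digest(presented.encode(), expected.encode())
```

A gateway would call a check like this before parsing the payload, so unauthenticated requests are rejected as cheaply as possible.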
The gateway also performs schema validation and normalization, which keeps malformed or unexpected fields out of the system. This protects downstream nodes and keeps the ingestion path predictable.
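To make the idea of validation and normalization concrete, here is a small sketch. The `SCHEMA` fields and the `validate_and_normalize` function are assumptions chosen for illustration; OpenWit's actual schema format and normalization rules are not shown in this page.

```python
# Hypothetical schema: field name -> required Python type.
SCHEMA = {"timestamp": int, "service": str, "message": str}


def validate_and_normalize(record: dict, schema: dict = SCHEMA) -> dict:
    """Reject records with missing, unexpected or wrongly typed fields."""
    unexpected = set(record) - set(schema)
    if unexpected:
        raise ValueError(f"unexpected fields: {sorted(unexpected)}")
    normalized = {}
    for name, expected_type in schema.items():
        if name not in record:
            raise ValueError(f"missing field: {name}")
        value = record[name]
        # This sketch has no boolean fields. bool is a subclass of int in Python,
        # so booleans are rejected outright to keep them out of int fields.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"field {name} must be {expected_type.__name__}")
        # Normalization here only trims whitespace from strings.
        normalized[name] = value.strip() if isinstance(value, str) else value
    return normalized
```

Rejecting unknown fields, rather than silently dropping them, makes producer mistakes visible at the edge.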
Inter-node communication
Nodes communicate through gossip and gRPC, and move data with Arrow Flight. Inter-node connections use mTLS, so each side verifies its peer's certificate before sending control or data traffic. Gossip messages are signed, which prevents forged cluster metadata from being accepted. Together these protect both discovery and RPC.
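The effect of signing gossip messages can be illustrated with a short sketch. It assumes a shared-key HMAC-SHA256 scheme purely for simplicity; this page does not say which signature scheme OpenWit uses, and a real cluster may well use asymmetric signatures instead. The function names and message layout are hypothetical.

```python
import hashlib
import hmac
import json
from typing import Optional


def sign_gossip(payload: dict, key: bytes) -> dict:
    """Wrap a payload with an HMAC-SHA256 tag over its canonical JSON form."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": tag}


def verify_gossip(message: dict, key: bytes) -> Optional[dict]:
    """Return the decoded payload if the tag matches, otherwise None."""
    body = message.get("body", "")
    presented = message.get("sig", "")
    expected = hmac.new(key, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(presented.encode(), expected.encode()):
        return None  # forged or tampered message: drop it
    return json.loads(body)
```

The receiver verifies the tag over the exact bytes it received and only then parses them, so a tampered body never reaches the cluster state.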
Data at rest
When Parquet and index files are ready, Storage uploads them to the object store through OpenDAL. You can enable AES256 encryption at rest in the object store so that uploaded artifacts are encrypted on disk. Common backends include S3, Azure and GCS, all of which support encryption at rest.
Metadata store protection
Postgres is the source of truth for batch, file and index metadata. Use roles to separate duties and connection-level TLS so credentials and catalog queries are encrypted on the wire. This keeps discovery and pruning accurate and safe even when the database runs outside the data plane.
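As an illustration of connection-level TLS, the sketch below builds a libpq-style connection string and refuses any `sslmode` that could fall back to plaintext. The `sslmode` and `sslrootcert` keywords are standard libpq parameters; the `build_dsn` helper and the host, database and role names in the usage example are hypothetical.

```python
# libpq sslmode values that always encrypt the connection.
# "disable", "allow" and "prefer" can result in plaintext, so they are rejected.
TLS_MODES = {"require", "verify-ca", "verify-full"}


def build_dsn(host: str, dbname: str, user: str, sslrootcert: str,
              sslmode: str = "verify-full", port: int = 5432) -> str:
    """Build a keyword/value connection string that enforces TLS.

    Assumes values contain no spaces (libpq requires quoting otherwise).
    The password is intentionally left out; supply it from a secrets backend.
    """
    if sslmode not in TLS_MODES:
        raise ValueError(f"sslmode {sslmode!r} does not guarantee TLS")
    return (
        f"host={host} port={port} dbname={dbname} user={user} "
        f"sslmode={sslmode} sslrootcert={sslrootcert}"
    )


# Example with a hypothetical read-only role for catalog queries.
dsn = build_dsn("metadata-db.internal", "openwit_meta", "catalog_reader",
                "/etc/ssl/certs/pg-ca.pem")
```

Defaulting to `verify-full` means the client also checks that the server certificate matches the host name, which matters most when the database runs outside the data plane.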
Access control model
OpenWit uses role-based configuration. You can route per tenant and scope tokens to specific datasets or operations. This keeps producers and readers limited to what they need while letting the same cluster serve multiple tenants. Keep scopes narrow and map them to clear routes.
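A scope check of the kind described above might look like the following sketch. The `TokenScope` structure and `is_allowed` function are hypothetical; OpenWit's real configuration format for roles and scopes is not shown in this page.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenScope:
    """Hypothetical scope attached to a token: one tenant, explicit allow-lists."""
    tenant: str
    datasets: frozenset
    operations: frozenset


def is_allowed(scope: TokenScope, tenant: str, dataset: str, operation: str) -> bool:
    """Allow a request only if tenant, dataset and operation are all in scope."""
    return (
        scope.tenant == tenant
        and dataset in scope.datasets
        and operation in scope.operations
    )


# A producer token that may only write to one dataset of one tenant.
producer = TokenScope("acme", frozenset({"app_logs"}), frozenset({"write"}))
```

Because the scope is an allow-list, anything not named explicitly is denied, which keeps narrow scopes the default.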
Auditing and change tracking
The control plane logs every administrative change. Examples include schema updates, routing changes and node state transitions. With structured logs in place, you can trace who changed what and when, then correlate that with data-plane effects such as indexing or upload.
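A structured audit record that answers "who changed what and when" could be shaped like the sketch below. The field names and the `audit_event` helper are assumptions for illustration, not OpenWit's actual log format.

```python
import json
from datetime import datetime, timezone
from typing import Optional


def audit_event(actor: str, action: str, target: str,
                details: Optional[dict] = None,
                now: Optional[datetime] = None) -> str:
    """Serialize one administrative change as a single JSON log line."""
    timestamp = (now or datetime.now(timezone.utc)).isoformat()
    event = {
        "timestamp": timestamp,    # when
        "actor": actor,            # who
        "action": action,          # what kind of change
        "target": target,          # which schema, route or node
        "details": details or {},  # change-specific context
    }
    return json.dumps(event, sort_keys=True)


# Example: a schema update recorded at a fixed time.
line = audit_event("alice", "schema.update", "app_logs",
                   {"added_fields": ["trace_id"]},
                   now=datetime(2024, 1, 1, tzinfo=timezone.utc))
```

One JSON object per line is easy to ship to a log pipeline and to join against data-plane events by timestamp.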
Secrets management
Integrate with Vault or Kubernetes Secrets for credentials. Keep API keys, OAuth client secrets, DB passwords and certificates out of images and code. Reference them from the runtime environment only, and rotate them on a schedule.
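Reading credentials from the runtime environment can be sketched as follows. Kubernetes Secrets can be exposed to a container as mounted files or as environment variables, and this helper checks both. The `read_secret` function, the default `/run/secrets` directory and the naming convention are hypothetical.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def read_secret(name: str, secrets_dir: str = "/run/secrets",
                env: Optional[Mapping[str, str]] = None) -> str:
    """Look up a secret from a mounted file first, then from the environment."""
    env = os.environ if env is None else env
    path = Path(secrets_dir) / name
    if path.is_file():
        # Mounted secret files often end with a newline; strip it.
        return path.read_text().strip()
    value = env.get(name.upper())
    if value:
        return value
    # Fail loudly rather than falling back to a default baked into the image.
    raise KeyError(f"secret {name!r} not found in {secrets_dir} or environment")
```

Because the value is resolved at startup from outside the image, rotating a secret only requires updating the backing store and restarting or reloading the service.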
Quick reference table
| Area | What to check |
|---|---|
| Perimeter | Only Proxy and Control are reachable from outside. Certificates are valid and TLS is required. |
| Ingestion | Gateways enforce token or OAuth2. mTLS is enabled where required. Schema validation rejects bad payloads. |
| Inter-node | Nodes present valid certificates. Gossip signatures verify. |
| Storage | Object store shows AES256 at rest. Uploads succeed and are recorded. |
| Metadata | Postgres enforces roles and TLS. Connections reject plaintext. |
| Access | Token scopes map to the correct tenants and routes. |
| Auditing | Control-plane logs contain schema, route and node state events with timestamps. |
| Secrets | Credentials are read from Vault or Kubernetes Secrets. No secrets in images. |