Configuration

This page explains every section of the OpenWit configuration and how the fields work together. To make it easier to follow, the configuration file is broken into small parts, each followed by the explanation it needs.

Values like ${POSTGRES_DSN} or ${TLS_CERT_PATH} are placeholders. Supply them through environment variables or a secrets manager. Do not hardcode secrets.

1. Deployment and cluster basics

environment: "production"

deployment:
  mode: "local"
  discovery_mode: "gossip"

mandatory_nodes:
  required_types:
    - "control"
    - "proxy"
    - "ingestor"
    - "storage"
    - "indexer"
    - "search"
    - "janitor"
  minimum_counts:
    control: 1
    proxy: 2
    ingestor: 2
    storage: 3
    indexer: 2
    search: 2
    janitor: 1

Explanation

  • environment sets the runtime posture. Use production for hard defaults. Use development or local for monolith runs.
  • deployment.mode chooses between a single process and a distributed cluster. local or monolith is for development; distributed is for production, where gossip and the control plane are active. Do not use local in production because you lose HA.
  • discovery_mode tells nodes how to find peers. gossip is decentralized, kubernetes uses DNS, static is explicit endpoints. On k8s choose kubernetes. On VMs choose gossip.
  • mandatory_nodes.required_types lists the roles that must exist. minimum_counts states how many of each are needed for basic health. In production, keep control ≥ 3 for quorum and scale ingest, storage, and search by load. Too few replicas risks unavailability; too many wastes resources. A production-oriented sketch follows.
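
Here is that sketch for a small cluster. The counts are examples, not shipped defaults; size them to your own load.

environment: "production"

deployment:
  mode: "distributed"            # local/monolith is for development only
  discovery_mode: "kubernetes"   # use "gossip" on plain VMs

mandatory_nodes:
  # required_types stays as listed in the sample
  minimum_counts:
    control: 3     # three or more for quorum
    proxy: 2
    ingestor: 3    # scale with ingest volume
    storage: 3
    indexer: 2
    search: 2
    janitor: 1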

2. Control plane, health, autoscaling

control_plane:
  enabled: true
  grpc_endpoint: "http://localhost:7019"

  leader_election:
    enabled: true
    lease_duration_seconds: 30
    renew_deadline_seconds: 20
    retry_period_seconds: 5

  monitoring:
    interval_seconds: 10
    thresholds:
      cpu_usage_percent: 80
      memory_usage_percent: 85
      queue_depth_warning: 10000
      queue_depth_critical: 50000
      error_rate_percent: 5

  optimization:
    enabled: true
    decision_interval_seconds: 60
    targets:
      max_latency_p99_ms: 1000
      min_throughput_messages_per_sec: 10000
      max_memory_usage_percent: 80
      max_cpu_usage_percent: 75

  segment_remediation:
    enabled: true
    check_interval_seconds: 60
    max_concurrent_actions: 5
    action_timeout_seconds: 300
    retry_attempts: 3
    backoff_multiplier: 2.0
    max_retries: 3
    stuck_threshold_minutes: 10
    auto_remediate: true
    max_remediate_per_cycle: 100

  autoscaling:
    enabled: true
    scale_up:
      cpu_threshold_percent: 70
      memory_threshold_percent: 75
      queue_depth_threshold: 5000
    scale_down:
      cpu_threshold_percent: 30
      memory_threshold_percent: 40
      queue_depth_threshold: 1000
    min_replicas: 1
    max_replicas: 20
    cooldown_seconds: 300

  backpressure_advanced:
    max_buffer_size: 10000
    high_watermark: 0.8
    low_watermark: 0.3
    backoff_duration_ms: 100
    max_backoff_ms: 5000
    retry_config:
      max_attempts: 3
      base_delay_ms: 1000
      max_delay_ms: 30000
      exponential_base: 2.0

Explanation

  • control_plane.enabled turns on orchestration. Nodes call grpc_endpoint for control APIs.
  • leader_election.* picks an active control instance. Keep the lease about 30 s so failover is not too slow or too twitchy.
  • monitoring.interval_seconds is the poll cadence. thresholds trigger warnings based on CPU, memory, queue depth, error rate. Tune to your SLOs.
  • optimization.* runs decisions every decision_interval_seconds and aims for the targets you set. These targets are the basis for scaling and placement.
  • segment_remediation.* heals stuck segments with bounded concurrency, timeouts, and backoff. Keep concurrency small to avoid a thundering herd.
  • autoscaling.* defines the scale-up and scale-down triggers. Guard them with min_replicas, max_replicas, and a long cooldown_seconds so you can observe the effect of each change (see the sketch after this list).
  • backpressure_advanced.* caps internal queues and sets retry backoff. Buffers that are too small lead to frequent 429 responses; buffers that are too large risk memory pressure.
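
Here is the sketch: a conservative starting point for the scaling and retry knobs, assuming you prefer slow, observable changes. The numbers are illustrative only.

control_plane:
  autoscaling:
    enabled: true
    min_replicas: 2          # keep a redundancy floor
    max_replicas: 10         # hard ceiling to contain cost
    cooldown_seconds: 600    # wait ten minutes between actions and watch the effect
  backpressure_advanced:
    max_buffer_size: 20000   # more buffer trades memory for fewer 429 responses
    retry_config:
      max_attempts: 3
      base_delay_ms: 1000
      max_delay_ms: 30000
      exponential_base: 2.0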

3. Proxy front door

proxy:
  enabled: true
  routing:
    strategy: "round_robin"
    health_check_interval_ms: 5000
    max_retries: 3
    retry_timeout_ms: 1000

  http:
    enabled: true
    port: 4318
    max_concurrent_requests: 10000
    request_timeout_ms: 30000
    max_payload_size_mb: 100
    upstream_pool:
      max_idle_per_host: 10
      idle_timeout_seconds: 90
      connection_timeout_seconds: 30

  grpc:
    enabled: true
    port: 4317
    max_concurrent_streams: 1000
    channel_pool:
      max_channels_per_backend: 10
      keepalive_time_ms: 10000
      keepalive_timeout_ms: 5000
      idle_timeout_minutes: 5

  kafka:
    enabled: true
    consumer:
      group_id_prefix: "openwit-proxy"
      session_timeout_ms: 30000
      heartbeat_interval_ms: 3000
      max_poll_records: 500
    producer:
      batch_size: 16384
      linger_ms: 10
      compression_type: "snappy"
      max_in_flight_requests: 5
    health_check:
      enabled: true
      interval_seconds: 30
      timeout_seconds: 5
      metadata_timeout_ms: 5000
    health_reporting:
      report_interval_seconds: 10
      include_upstream_health: true
      include_connection_stats: true
      include_latency_metrics: true

  weights:
    cpu_weight: 0.3
    memory_weight: 0.3
    connection_weight: 0.4

Explanation

  • routing.strategy decides how to spread client traffic. The sample uses round_robin; a weighted strategy uses the weights block to balance on CPU, memory, and connections (see the sketch below).
  • HTTP has explicit concurrency, timeout, and payload limits plus an upstream connection pool. gRPC has stream limits and a channel pool with keepalive. Kafka pass-through has consumer, producer, and broker health settings.
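
The sketch below shows the weighted variant. The weights should sum to 1.0. The literal strategy value "weighted" is an assumption for illustration; confirm the accepted values for your release.

proxy:
  routing:
    strategy: "weighted"       # assumed value name; the sample uses "round_robin"
  weights:
    cpu_weight: 0.3
    memory_weight: 0.3
    connection_weight: 0.4     # 0.3 + 0.3 + 0.4 = 1.0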

4. Ingestion: Kafka, gRPC, HTTP

ingestion:
  sources:
    kafka: { enabled: true }
    grpc:  { enabled: true }
    http:  { enabled: false }

  kafka:
    brokers: "ns1005571.ip-147-135-65.us:9092"
    group_id: "ingestion-group"
    topics:
      - "v6.qtw.traces.*.*"
    consumer_type: "standard"

    high_performance:
      num_consumers: 8
      processing_threads: 16
      num_grpc_clients: 4
      channel_buffer_size: 100000

    fetch_min_bytes: 1048576
    fetch_max_bytes: 52428800
    max_partition_fetch_bytes: 10485760

    enable_zero_copy: true
    preallocate_buffers: true
    buffer_pool_size: 1000

    topic_index_config:
      default_index_position: 3
      patterns:
        - match: "v6.qtw.*.*.*"
          index_position: 3

    auto_generate:
      enabled: false
      prefix: "auto_"
      strategy: "hash"

    consumer:
      session_timeout_ms: 180000
      heartbeat_interval_ms: 3000
      max_poll_interval_ms: 900000
      max_poll_records: 50000

    batching:
      batch_size: 1000
      batch_timeout_ms: 5000

    telemetry_type: "traces"

Explanation

  • sources toggles entry points. Enable only what you need to keep the surface small.
  • Kafka basics: brokers, group_id, topics, consumer_type. You can switch to high_performance to raise parallelism and buffer sizes.
  • high_performance.* sets consumer count, processing threads, gRPC clients, and channel buffer size. These lift throughput at higher memory cost.
  • fetch_* settings control how much data each pull retrieves. Large values raise RAM usage. Start conservative, then ramp up.
  • enable_zero_copy, preallocate_buffers, and buffer_pool_size are memory speedups. Watch buffer lifetime and overall footprint.
  • topic_index_config controls which part of the topic name is used as the index position. auto_generate can create names with a prefix and strategy.
  • consumer.* sets liveness and polling limits. batching.batch_size is 1000 in the sample to avoid gRPC size limits, and batch_timeout_ms forces a flush on time. A higher-throughput variant is sketched below.
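
The higher-throughput variant, sketched under the assumption that the consumer_type value is literally "high_performance"; the counts and fetch sizes are illustrative and cost memory.

ingestion:
  kafka:
    consumer_type: "high_performance"    # assumed value; the sample uses "standard"
    high_performance:
      num_consumers: 8
      processing_threads: 16
      num_grpc_clients: 4
      channel_buffer_size: 100000
    fetch_min_bytes: 1048576             # 1 MiB minimum per fetch
    fetch_max_bytes: 52428800            # 50 MiB ceiling per fetch
    max_partition_fetch_bytes: 10485760  # 10 MiB per partition
    batching:
      batch_size: 1000                   # stays under gRPC message limits
      batch_timeout_ms: 5000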

  grpc:
    bind: "0.0.0.0"
    port: 50051
    runtime_size: 8
    max_concurrent_requests: 10000
    connection_pool_size: 100

    max_message_size: 208715200
    max_receive_message_size_mb: 200
    max_send_message_size_mb: 100

    keepalive_time_ms: 60000
    keepalive_timeout_ms: 20000
    max_connection_idle_seconds: 300
    max_connection_age_seconds: 0
    max_connection_age_grace_seconds: 10

    otlp_traces_enabled: true
    otlp_metrics_enabled: true
    otlp_logs_enabled: true

    worker_threads: 4
    max_concurrent_streams: 1000

    compression_enabled: true
    accepted_compression_algorithms: ["gzip", "zstd"]

    tls:
      enabled: false
      cert_path: "${TLS_CERT_PATH}"
      key_path: "${TLS_KEY_PATH}"

Explanation

  • bind and port set where the server listens. runtime_size, max_concurrent_requests, and connection_pool_size control concurrency; match them to the expected load.
  • Size limits allow big batches. Ensure downstream can handle them. Very large messages create memory spikes.
  • Keepalive settings prevent idle disconnects. OTLP toggles pick signal types. Compression reduces bandwidth at some CPU cost.
  • TLS is off in the sample, which is a risk in production. Turn it on and load certificates from secrets, as sketched below.
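
A minimal sketch of the same TLS block turned on, with certificate material resolved from placeholders rather than hardcoded paths.

ingestion:
  grpc:
    tls:
      enabled: true
      cert_path: "${TLS_CERT_PATH}"   # resolved from env or a secrets manager
      key_path: "${TLS_KEY_PATH}"     # never commit key material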

  http:
    port: 4318
    max_concurrent_requests: 5000
    request_timeout_ms: 30000
    max_payload_size_mb: 100

Explanation

  • HTTP is fine for canaries and checks. Prefer gRPC or Kafka for sustained load.

5. Processing, WAL, pipeline, backpressure

processing:
  buffer:
    max_size_messages: 1000000
    max_size_bytes: 10737418240
    flush_interval_seconds: 10
    flush_size_messages: 50000
  worker_threads: 16
  io_threads: 8
  compression: "snappy"
  dedup_enabled: true
  dedup_window_seconds: 60

Explanation

  • Buffer caps protect memory. Flushes happen on time or on size. Dedup removes duplicates inside a 60 s window, so memory needs grow with ingest rate and window length; a smaller-node sketch follows.
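
The smaller-node sketch: roughly a quarter of the sample buffer, assuming a modest ingest rate. Re-derive the figures from your own rate and dedup window.

processing:
  buffer:
    max_size_messages: 250000      # quarter of the sample cap
    max_size_bytes: 2147483648     # 2 GiB instead of 10 GiB
    flush_interval_seconds: 10
    flush_size_messages: 25000     # flush earlier on smaller hosts
  dedup_enabled: true
  dedup_window_seconds: 60         # memory grows with rate times window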

wal:
  enabled: true
  dir: "./data/wal"
  max_size_bytes: 53687091200
  segment_size_bytes: 1073741824
  sync_on_write: false
  long_term_retention_hours: 168

wal_cleanup:
  enabled: true
  interval_minutes: 30
  batch_size: 100
  short_term:
    delete_after_storage: true
  long_term:
    retention_hours: 168
    delete_after_index: true

Explanation

  • WAL ensures durability before ack. The sample sets sync_on_write: false for throughput; enable sync for strict durability, as sketched below. Cleanup removes short-term WAL after storage and long-term WAL after indexing.
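
The strict-durability sketch: fsync on every write, everything else unchanged. Expect noticeably lower write throughput.

wal:
  enabled: true
  dir: "./data/wal"
  sync_on_write: true              # fsync before ack; safest, slowest
  long_term_retention_hours: 168   # one week, matching wal_cleanup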

pipeline:
  batch_size: 10000
  batch_timeout_seconds: 10
  worker_threads: 16

backpressure:
  enabled: true
  initial_queue_size: 1000
  max_queue_size: 100000
  target_utilization_percent: 70.0

Explanation

  • Pipeline batch and timeout should match ingestion and gRPC limits. Backpressure caps queues and targets safe utilization.

6. System metrics and LSM engine

lsm_engine:
  memtable_size_limit_mb: 256

system_metrics:
  collection_interval_seconds: 5
  enable_cpu_metrics: true
  enable_memory_metrics: true
  enable_disk_metrics: true
  enable_network_metrics: true

Explanation

  • LSM memtable limit is small and safe for local storage paths. System metrics collect CPU, memory, disk, network every 5 s.

7. Storage, compaction, cloud backends, Parquet

storage:
  backend: "local"
  data_dir: "./data/storage"

  ram_time_threshold_seconds: 3600
  disk_time_threshold_seconds: 86400
  parquet_split_size_mb: 500

Explanation

  • Local backend with a data dir. Data sits in RAM for 1 h, on disk for 1 day, then is ready for cloud uploads at about 500 MB splits.

  compaction:
    enabled: true
    target_size_mb: 100
    min_files_to_compact: 5
    max_files_per_task: 20
    min_file_age_minutes: 5

    scan_interval_minutes: 30
    task_timeout_minutes: 60
    stale_check_minutes: 10

    parallelism: 2
    memory_limit_mb: 1024
    max_concurrent_reads: 5

    retention_overlap_hours: 2
    query_grace_period_ms: 5000
    validation_enabled: true
    use_hard_links: true

Explanation

  • Compaction merges small files. Size, timing, and safety settings are explicit. Validation checks output. Hard links speed transition. Keep parallelism and reads low to avoid IO spikes.

  local:
    enabled: true
    path: "./data/storage"
    max_disk_usage_percent: 80
    cleanup_interval_seconds: 3600

Explanation

  • Local housekeeping prevents full disks. Clean every hour.

  azure:
    enabled: true
    account_name: "mwbetaazureeastus"
    container_name: "mwopenwitbeta"
    access_key: "PdNPJEDb/.../o5wUw=="
    prefix: "openwit/storage"
    endpoint: "https://mwbetaazureeastus.blob.core.windows.net"
    # connection_string: ""
    # sas_token:

Explanation

  • Azure is enabled with account, container, and endpoint. The sample shows an inline access key; move it to a secrets manager, as sketched below, and rotate example credentials before any use. A connection string or a SAS token can be used instead of the access key.
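
The secrets-manager sketch for the same backend. The variable name AZURE_STORAGE_ACCESS_KEY is hypothetical; use whatever your tooling injects.

storage:
  azure:
    enabled: true
    account_name: "mwbetaazureeastus"
    container_name: "mwopenwitbeta"
    access_key: "${AZURE_STORAGE_ACCESS_KEY}"   # hypothetical variable, injected at deploy time
    prefix: "openwit/storage"
    endpoint: "https://mwbetaazureeastus.blob.core.windows.net"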

  s3:
    enabled: false
    bucket: "${S3_BUCKETNAME}"
    region: "${S3_REGION}"
    access_key_id: "${AWS_ACCESS_KEY_ID}"
    secret_access_key: "${AWS_SECRET_ACCESS_KEY}"
    prefix: "openwit/storage"
    # endpoint: ""

  gcs:
    enabled: false
    bucket: "${GCS_BUCKET_NAME}"
    service_account_key: "${GCS_SERVICE_ACCOUNT_KEY}"
    # project_id: "${GCP_PROJECT_ID}"

Explanation

  • S3 and GCS are present but disabled. Use env vars for secrets. The endpoint field is optional and is used for S3-compatible systems, as sketched below.
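
For an S3-compatible store, a hedged sketch with every secret supplied through the environment; the endpoint URL is hypothetical.

storage:
  s3:
    enabled: true
    bucket: "${S3_BUCKETNAME}"
    region: "${S3_REGION}"
    access_key_id: "${AWS_ACCESS_KEY_ID}"
    secret_access_key: "${AWS_SECRET_ACCESS_KEY}"
    prefix: "openwit/storage"
    endpoint: "https://s3.example.internal:9000"   # hypothetical S3-compatible endpoint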

  parquet:
    row_group_size: 1000000
    compression_codec: "zstd"
    compression_level: 3
    enable_statistics: true
    data_page_size_kb: 1024

  file_rotation:
    file_size_mb: 200
    file_duration_minutes: 60
    enable_active_stable_files: true
    upload_delay_minutes: 2

  local_retention:
    retention_days: 7
    cleanup_interval_hours: 1
    delete_after_upload: false

Explanation

  • Parquet writer and rotation values are tuned for scan speed, compression, and manageable object size. Align file_size_mb with the cloud split size (see the sketch below). Keep a short upload delay to batch arrivals. Local retention governs cleanup on the node.
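
One hedged way to follow the alignment advice: rotate files at the split size so each upload maps to one object. Whether your workload prefers 200 MB or 500 MB objects is something to measure.

storage:
  parquet_split_size_mb: 500
  file_rotation:
    file_size_mb: 500             # aligned with the split size above
    file_duration_minutes: 60     # still rotate on time if the size is not reached
    upload_delay_minutes: 2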

8. Networking, gossip, inter-node, memory, monitoring

networking:
  discovery:
    mode: "static"
    kubernetes:
      enabled: false

  service_endpoints:
    control_plane: "http://localhost:7019"
    storage: "http://localhost"
    ingestion: "http://localhost"
    indexer: "http://localhost"
    search: "http://localhost:8083"

  gossip:
    listen_addr: "127.0.0.1:8000"
    seed_nodes:
      - "127.0.0.1:9092"
      - "127.0.0.1:9093"
      - "127.0.0.1:9094"
      - "127.0.0.1:9095"
    heartbeat_interval_ms: 1000
    failure_detector_timeout_ms: 5000

  inter_node:
    connection_timeout_ms: 5000
    request_timeout_ms: 30000
    max_connections_per_node: 10
    keepalive_interval_ms: 10000

Explanation

  • service_endpoints give fixed URLs for each role. They are useful in monolith or static setups.
  • Gossip seeds and heartbeats form the membership fabric. A 1 s heartbeat with a 5 s failure timeout is aggressive; keep the network stable or relax the timing (see the sketch after this list).
  • inter_node.* sets timeouts, connection caps, and keepalive. Match max_connections_per_node with gRPC channel pools to avoid congestion.
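
The relaxed-timing sketch for a less stable network: fewer false failure detections at the cost of slower detection. The numbers are illustrative.

networking:
  gossip:
    heartbeat_interval_ms: 2000           # half the chatter of the 1 s sample
    failure_detector_timeout_ms: 10000    # tolerate a few missed heartbeats
  inter_node:
    max_connections_per_node: 10          # keep in line with gRPC channel pools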

memory:
  max_memory_gb: 32

monitoring:
  prometheus_port: 9090
  health_check_port: 8080
  node_health_thresholds:
    warning_cpu_percent: 70
    warning_memory_percent: 75
    warning_disk_percent: 80
    critical_cpu_percent: 90
    critical_memory_percent: 95
    critical_disk_percent: 95
    check_interval_ms: 1000
    unhealthy_duration_ms: 5000
    recovery_duration_ms: 3000

Explanation

  • Global memory cap must match container limits. Overcommitting causes OOM.
  • Monitoring exposes Prometheus and health ports. Thresholds and durations add hysteresis so status does not flap.

9. Metastore, indexing, search

metastore:
  backend: "postgres"
  sled:
    path: "./data/metastore"
    cache_size_mb: 512
  postgres:
    connection_string: "postgresql://sanjithmeta:sanjith123@localhost:5432/sanjithdb"
    max_connections: 20
    schema_name: "metastore"

Explanation

  • backend: postgres is recommended for production. Use TLS and store credentials in secrets, as sketched below. Keep the pool small. Use managed Postgres with backups and PITR. Sled is fine for monolith runs.
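
The sketch below uses the ${POSTGRES_DSN} placeholder from the note at the top of this page. TLS is requested inside the DSN itself (for example sslmode=require), which is standard Postgres client behaviour but worth verifying against your driver.

metastore:
  backend: "postgres"
  postgres:
    connection_string: "${POSTGRES_DSN}"   # e.g. postgresql://user:pass@host:5432/db?sslmode=require
    max_connections: 20                    # keep the pool small
    schema_name: "metastore"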

indexing:
  enabled: true
  batch_size: 1000
  worker_threads: 4

search:
  enabled: true
  max_results: 1000

Explanation

  • Indexing has a worker pool and a batch size per job. It is CPU and memory heavy, so keep it asynchronous. Search caps results to protect memory and network.

10. Actor system, janitor, node-local storage, alerting, security

actors:
  mailbox_size: 1000
  worker_threads: 16

janitor:
  cleanup_interval_seconds: 3600
  retention_days: 30

storage_node:
  data_dir: "./data/storage_node"

alerting:
  enabled: false

security:
  tls:
    enabled: false

Explanation

  • Actors use a mailbox for pending messages. Size too small throttles. Size too big uses memory. Match worker threads to CPU cores.
  • Janitor enforces lifecycle on a set interval. Retention must match compliance.
  • Alerting off is risky in production; enable it in real clusters. TLS is also off in the sample. Do not run production traffic without TLS or mTLS; a hardened sketch follows.
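
The hardened sketch for this group. The 90-day retention is only an example of matching janitor retention to a compliance policy; certificate material for TLS comes from your secrets manager.

janitor:
  cleanup_interval_seconds: 3600
  retention_days: 90      # hypothetical compliance-driven value

alerting:
  enabled: true           # do not run real clusters blind

security:
  tls:
    enabled: true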

11. Performance tuning and service ports

performance:
  threads:
    worker_threads: 16
    blocking_threads: 32
    io_threads: 8
  memory:
    allocator: "jemalloc"
    memory_limit_gb: 32
  network:
    tcp_nodelay: true
    tcp_keepalive_seconds: 60
    socket_buffer_size_kb: 8192

Explanation

  • Size worker threads near CPU cores. Use extra blocking threads for Parquet IO. jemalloc is recommended for high concurrency. Tune socket settings for stable traffic.
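
For example, on a hypothetical 16-core host the pools might be sized roughly as below: workers near the core count, blocking threads at about twice that for Parquet IO, and jemalloc left on.

performance:
  threads:
    worker_threads: 16      # about one per physical core
    blocking_threads: 32    # roughly 2x workers for blocking Parquet IO
    io_threads: 8
  memory:
    allocator: "jemalloc"
    memory_limit_gb: 32     # must not exceed the container limit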

service_ports:
  control_plane: { service: 7019, gossip: 9092 }
  storage:      { service: 8081, arrow_flight: 9401, gossip: 9096 }
  ingestion:    { grpc: 50051, http: 4318, arrow_flight: 8089, gossip: 9095 }
  indexer:      { service: 8085, arrow_flight: 8090, gossip: 9097 }
  proxy:        { service: 8080, gossip: 9094 }
  search:       { service: 8083, grpc: 50059, gossip: 9098 }

Explanation

  • Port map for each role. Avoid collisions. Expose only what the edge needs. Route internal ports inside the cluster network.

Safety Notes

  • Do not run local in production since HA is disabled.
  • Keep control replicas at three or more when you move to production.
  • TLS is off in the sample. Turn it on for both external and inter-node traffic.
  • The Azure access key appears inline in the sample. Replace secrets with env vars or a secrets manager. Rotate keys.