Why Openwit

Openwit is a strong choice for telemetry workloads for several reasons. This document walks through each one.

1. Purpose-built for Telemetry

Openwit targets logs, traces and metrics. These signals are high-throughput, time-ordered and semi-structured. The pipeline is tuned to accept data once, index it, and answer time-range queries with low latency. This avoids the compromises that arise when a general-purpose system is bent into a telemetry store.

What this means in practice

  • Producers can stream continuously over Kafka, gRPC or HTTP
  • Batches keep I/O predictable and efficient
  • Time-based queries behave well at scale
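
The batching idea above can be sketched in a few lines. This is a minimal illustration, not Openwit's implementation: the `Batcher` class, its thresholds and the JSON encoding are all assumptions made for the example. It flushes a batch whenever an event count or a byte budget is reached, so each write has a bounded, predictable size.

```python
import json


class Batcher:
    """Hypothetical helper: groups events into bounded batches."""

    def __init__(self, max_events, max_bytes):
        self.max_events = max_events
        self.max_bytes = max_bytes
        self.pending = []        # encoded events waiting to be flushed
        self.pending_bytes = 0   # total size of pending events
        self.flushed = []        # completed batches, oldest first

    def add(self, event):
        encoded = json.dumps(event).encode("utf-8")
        self.pending.append(encoded)
        self.pending_bytes += len(encoded)
        # Flush when either limit is hit so batch size stays bounded.
        if len(self.pending) >= self.max_events or self.pending_bytes >= self.max_bytes:
            self.flush()

    def flush(self):
        if self.pending:
            self.flushed.append(self.pending)
            self.pending = []
            self.pending_bytes = 0


batcher = Batcher(max_events=3, max_bytes=10_000)
for ts in range(10):
    batcher.add({"ts": ts, "msg": f"event {ts}"})
batcher.flush()  # flush the final partial batch

batch_sizes = [len(b) for b in batcher.flushed]
```

A real pipeline would typically also flush on a timer so that a slow producer does not hold events in memory indefinitely.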

2. Rust-native Performance

Rust gives memory safety, zero-cost abstractions and no GC pauses. Openwit uses async actors to keep latency stable during heavy ingest and parallel queries. The result is predictable tail latency under load, which suits ingestion-heavy telemetry.

What this means in practice

  • Concurrency without head-of-line blocking
  • Small, steady overhead during peaks
  • A good fit for CPU-bound vector work later in the plan

3. Columnar Everywhere

Openwit uses Apache Arrow in memory and Parquet at rest. This brings high compression, fast scans and SIMD-friendly execution. It also integrates cleanly with DataFusion, so filters are pushed down and vectorized operators run efficiently.

What this means in practice

  • Less data moved per query because columns compress well
  • Batches flow as Arrow RecordBatches through ingest and into storage
  • Parquet is the durable unit in object storage and the input for search pruning
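
To make the columnar argument concrete, here is a small sketch in plain Python. It does not use Arrow or Parquet; the sample rows and the helper names `to_columns` and `run_length_encode` are invented for illustration. It shows two effects: a query on one field only touches that column, and a low-cardinality column collapses well under run-length encoding, which is one of the encodings Parquet supports.

```python
from itertools import groupby

# Row-oriented input, as a producer might send it (sample data).
rows = [
    {"ts": 1, "service": "api", "status": 200},
    {"ts": 2, "service": "api", "status": 200},
    {"ts": 3, "service": "api", "status": 500},
    {"ts": 4, "service": "db", "status": 200},
    {"ts": 5, "service": "db", "status": 200},
]


def to_columns(rows):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Collapse consecutive repeats into (value, count) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


columns = to_columns(rows)

# Repeated values in a column compress to a handful of runs.
service_rle = run_length_encode(columns["service"])

# Counting errors reads only the "status" column, not whole rows.
error_count = sum(1 for status in columns["status"] if status >= 500)
```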

4. Simple Roles with Clear Protocols

The cluster is a set of focused nodes. Control, ingest, storage, indexer, search, proxy and cache talk over open interfaces. gRPC carries control traffic and medium-sized messages. Arrow Flight moves heavy columnar data. Gossip shares lightweight health and metadata so nodes discover each other. This keeps coordination light and data movement fast.

What this means in practice

  • Clear responsibilities and easier debugging per role
  • You scale only the roles that need it
  • Control plane stays small and observable

5. Durability and Consistency by Design

Every batch is persisted to a short term WAL before Openwit acknowledges the sender. Long term WALs and Parquet uploads enable full recovery after crashes or restarts. The catalog in Postgres records files, indexes and time ranges so state remains consistent.

How the pieces fit

  • Short term WAL for immediate durability at ./data/wal/short on local disk.
  • Long term WAL aggregated by day at ./data/wal/long for restore and heavy jobs.
  • Parquet files move from active to stable, then upload to S3, Azure or GCS via OpenDAL.
  • Metadata in Postgres as the source of truth for batches, files and indexes.
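
The "persist before acknowledging" rule can be sketched as follows. This is an illustration under stated assumptions, not Openwit's WAL: the `ShortTermWal` class, the JSON-lines record format and the `"ack"` return value are hypothetical, and the file lives in a temporary directory rather than ./data/wal/short. The point is the ordering: the batch is written and synced to disk first, and only then is the sender acknowledged, so a restart can replay everything that was acknowledged.

```python
import json
import os
import tempfile


class ShortTermWal:
    """Hypothetical append-only log, one JSON record per line."""

    def __init__(self, path):
        self.path = path

    def append(self, batch):
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(batch) + "\n")
            f.flush()
            os.fsync(f.fileno())  # force the record to disk before returning

    def replay(self):
        """Read back every persisted batch, e.g. after a restart."""
        if not os.path.exists(self.path):
            return []
        with open(self.path, encoding="utf-8") as f:
            return [json.loads(line) for line in f if line.strip()]


def ingest(wal, batch):
    wal.append(batch)  # 1. make the batch durable
    return "ack"       # 2. only then acknowledge the sender


wal_path = os.path.join(tempfile.mkdtemp(), "short.wal")
wal = ShortTermWal(wal_path)
acks = [ingest(wal, {"batch_id": i, "events": [f"e{i}"]}) for i in range(3)]

# Simulate a restart: a fresh object over the same file sees all acked batches.
recovered = ShortTermWal(wal_path).replay()
recovered_ids = [b["batch_id"] for b in recovered]
```

Syncing on every append is the simplest way to show the guarantee. A production log would likely group several batches per sync to reduce the cost.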

6. Extensible Indexing Model

Openwit lets you pick the right index for the job. Bitmap indexes help equality filters on low or medium cardinality fields. Bloom filters help membership tests. Zonemaps help numeric or time pruning. Tantivy adds full text search for log messages. Index files live next to Parquet and link back to metadata.

What this means in practice

  • Search prunes files early so queries touch less data
  • Text search blends with SQL filters in one plan
  • You tune indexes per dataset instead of relying on a one-size-fits-all index
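
File pruning with a zonemap and a Bloom filter can be sketched like this. The code is a simplified stand-in, not Openwit's index format: `BloomFilter`, `FileEntry`, `make_entry`, `prune`, the file names and the `trace_id` field are all assumptions for the example. Each file records its minimum and maximum timestamp (the zonemap) and a Bloom filter over one field. A query skips any file whose time range cannot overlap the request or whose filter says the value is definitely absent.

```python
import hashlib
from dataclasses import dataclass


class BloomFilter:
    """Small Bloom filter; a Python int serves as the bit array."""

    def __init__(self, size_bits=1024, num_hashes=3):
        self.size_bits = size_bits
        self.num_hashes = num_hashes
        self.bits = 0

    def _positions(self, item):
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.size_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def might_contain(self, item):
        # False means definitely absent; True means possibly present.
        return all(self.bits & (1 << pos) for pos in self._positions(item))


@dataclass
class FileEntry:
    name: str
    min_ts: int
    max_ts: int
    trace_ids: BloomFilter


def make_entry(name, rows):
    """Build per-file index data from (timestamp, trace_id) rows."""
    bloom = BloomFilter()
    for _, trace_id in rows:
        bloom.add(trace_id)
    timestamps = [ts for ts, _ in rows]
    return FileEntry(name, min(timestamps), max(timestamps), bloom)


def prune(files, start, end, trace_id):
    """Return names of files that may hold trace_id within [start, end]."""
    candidates = []
    for f in files:
        if f.max_ts < start or f.min_ts > end:
            continue  # zonemap: time ranges cannot overlap
        if not f.trace_ids.might_contain(trace_id):
            continue  # Bloom filter: value is definitely not in this file
        candidates.append(f.name)
    return candidates


files = [
    make_entry("a.parquet", [(100, "t1"), (150, "t2")]),
    make_entry("b.parquet", [(200, "t3"), (250, "t4")]),
    make_entry("c.parquet", [(300, "t5"), (350, "t6")]),
]
candidates = prune(files, 180, 260, "t3")
```

Here only b.parquet survives, because the zonemap rules out the other two files before the Bloom filter is consulted. Bloom filters can return false positives, so a surviving file still has to be scanned to confirm the match.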

7. Unified SQL and Full-text Search

You do not choose between a search engine and an OLAP database. The same system supports SQL for structured fields and full text for messages. A single query can filter by fields and match text. Results return as Arrow batches.

What this means in practice

  • One engine for dashboards and investigations.
  • No external sync jobs between systems.
  • Consistent security and metadata across both styles.

8. Cloud-first and Local-friendly

Run Openwit as a single monolith for local work or as a distributed cluster in production. Both modes use the same binary and the same configuration, which makes promotion from laptop to cluster straightforward.

What this means in practice

  • Quick setup for development or tests.
  • Smooth path to containers and orchestration.
  • Scale ingest, storage, search and cache independently when needed.