Getting Started — Standalone Mode

Standalone Mode, also called Monolith Mode, runs all components in a single process. It is built for local development, testing and small single-node deployments. You get the same dataflow and actor behaviors as in a cluster, with none of the external coordination.

What Standalone Mode includes

  • Control for control plane behavior, gossip simulation and metrics aggregation
  • Ingestion for Kafka, gRPC and HTTP handlers behind an ingestion gateway
  • Storage for Arrow to Parquet conversion and WAL management
  • Indexing for bitmap, bloom and zonemap indexes, plus Tantivy full-text search
  • Search using SQL with DataFusion
  • Cache that simulates RAM and disk tiers
  • Proxy as client entrypoint and lightweight router
  • Janitor that runs cleanup and maintenance tasks

All inter-node communication that would be gRPC, gossip or Arrow Flight in a cluster runs in-process through async message passing between actors. You switch to a cluster by changing config, not code.
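
As a rough sketch, the loopback transport can be modeled as a bounded tokio channel carrying the same request a gRPC handler would see. The types and actor roles below are illustrative, not the project's real API:

    // Hedged sketch: an in-process "RPC" between two actors over tokio
    // channels. `IngestRequest` and the actor roles are illustrative.
    use tokio::sync::{mpsc, oneshot};

    struct IngestRequest {
        payload: Vec<u8>,
        // The reply channel plays the role of a gRPC response.
        respond_to: oneshot::Sender<Result<(), String>>,
    }

    #[tokio::main]
    async fn main() {
        let (tx, mut rx) = mpsc::channel::<IngestRequest>(1024);

        // Storage actor: in a cluster this would sit behind gRPC/Arrow Flight.
        tokio::spawn(async move {
            while let Some(req) = rx.recv().await {
                // ... append to WAL, buffer for Parquet conversion ...
                let _ = req.respond_to.send(Ok(()));
            }
        });

        // Ingest actor sends over the loopback channel exactly as it would
        // send an RPC in cluster mode.
        let (reply_tx, reply_rx) = oneshot::channel();
        tx.send(IngestRequest { payload: b"event".to_vec(), respond_to: reply_tx })
            .await
            .unwrap();
        assert!(reply_rx.await.unwrap().is_ok());
    }

Swapping the channel for a real network client is a configuration concern, which is what keeps the cluster switch code-free.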

How it Works Internally

Config defaults for local runs

  • deployment.mode = monolith
  • environment = local
  • use_kubernetes = false
  • leader_election = false
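
A sketch of these defaults as a typed config; the field and variant names here are assumptions for illustration, while the dotted keys above are the real ones:

    // Hedged sketch: the local-run defaults expressed as a config struct.
    #[derive(Debug)]
    enum DeploymentMode { Monolith, Cluster }

    #[derive(Debug)]
    struct Config {
        deployment_mode: DeploymentMode,
        environment: String,
        use_kubernetes: bool,
        leader_election: bool,
    }

    impl Default for Config {
        fn default() -> Self {
            Config {
                deployment_mode: DeploymentMode::Monolith,
                environment: "local".into(),
                use_kubernetes: false,
                leader_election: false,
            }
        }
    }

    fn main() {
        println!("{:?}", Config::default());
    }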

Local paths auto-created

  • ./data, ./data/storage, ./data/metastore, ./data/wal
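
At boot this is just an idempotent create per path; a sketch:

    // Sketch: auto-create the default local directories at startup.
    use std::fs;

    fn main() -> std::io::Result<()> {
        for dir in ["./data", "./data/storage", "./data/metastore", "./data/wal"] {
            fs::create_dir_all(dir)?; // no-op if the directory already exists
        }
        Ok(())
    }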

Discovery and storage defaults

  • Gossip and discovery are simulated internally
  • Storage backend defaults to local filesystem
  • Metastore defaults to sled. You can point to Postgres if needed
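
A sketch of that default-with-override, assuming a simple backend enum; sled's API is real, the enum and the Postgres wiring are illustrative:

    // Hedged sketch: default to embedded sled, or Postgres when a DSN is given.
    enum Metastore {
        Sled(sled::Db),
        Postgres(String), // connection string; client wiring omitted
    }

    fn open_metastore(postgres_dsn: Option<&str>) -> sled::Result<Metastore> {
        match postgres_dsn {
            Some(dsn) => Ok(Metastore::Postgres(dsn.to_string())),
            None => Ok(Metastore::Sled(sled::open("./data/metastore")?)),
        }
    }

    fn main() -> sled::Result<()> {
        let _store = open_metastore(None)?; // local default: sled on disk
        Ok(())
    }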

Runtime model

  • Control and all actors share one thread pool and communicate over async non-blocking channels
  • Memory and buffers are capped so it runs on a laptop
    • Max heap about 4 GB
    • Buffer pool about 1 GB
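
One way to picture the buffer cap, as a sketch: a single multi-threaded runtime plus a semaphore that meters allocations against a fixed byte budget. The pool mechanism is illustrative; only the 1 GB figure comes from the defaults above:

    // Hedged sketch: one shared runtime and a byte-budgeted buffer pool.
    use std::sync::Arc;
    use tokio::sync::Semaphore;

    const BUFFER_POOL_BYTES: usize = 1 << 30; // ~1 GB budget

    fn main() {
        let rt = tokio::runtime::Builder::new_multi_thread()
            .enable_all()
            .build()
            .unwrap();

        // Each permit stands for one byte of buffer budget; acquiring N
        // permits blocks new allocations until N bytes are free again.
        let pool = Arc::new(Semaphore::new(BUFFER_POOL_BYTES));

        rt.block_on(async move {
            let n = 64 * 1024; // a 64 KiB buffer
            let permit = pool.acquire_many(n as u32).await.unwrap();
            let _buf = vec![0u8; n];
            drop(permit); // returns the budget to the pool
        });
    }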

Startup sequence

  • Control starts first and initializes a local registry of nodes
  • Ingest, storage, search, proxy, indexer and cache actors start with local channels that simulate RPC
  • Gossip still runs and shares metadata with in-process peers for uniform behavior
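
In sketch form, with illustrative actor and message names:

    // Hedged sketch: Control boots first, then each actor registers with it
    // over local channels that stand in for RPC.
    use std::collections::HashMap;
    use std::time::Duration;
    use tokio::sync::mpsc;

    #[derive(Debug)]
    enum ControlMsg {
        Register { role: &'static str },
    }

    #[tokio::main]
    async fn main() {
        // 1. Control starts first and owns the local node registry.
        let (control_tx, mut control_rx) = mpsc::channel::<ControlMsg>(64);
        tokio::spawn(async move {
            let mut registry: HashMap<&str, bool> = HashMap::new();
            while let Some(ControlMsg::Register { role }) = control_rx.recv().await {
                registry.insert(role, true);
                println!("registered: {role}");
            }
        });

        // 2. The other actors start and announce themselves over local
        //    channels, as they would over gossip/gRPC in a cluster.
        for role in ["ingest", "storage", "search", "proxy", "indexer", "cache"] {
            control_tx.send(ControlMsg::Register { role }).await.unwrap();
        }
        tokio::time::sleep(Duration::from_millis(50)).await; // let the registry drain (demo only)
    }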

Dataflow stays realistic

  • Ingest to Storage uses Arrow Flight semantics in memory
  • Storage to Indexer is asynchronous
  • Proxy to Search uses loopback async channels
  • Janitor runs periodic cleanup as if in cluster mode
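
For example, the Ingest-to-Storage hop amounts to Arrow RecordBatches on a loopback channel rather than an Arrow Flight stream; a sketch with made-up batch contents:

    // Hedged sketch: Arrow Flight "in memory" is RecordBatches on a channel.
    // Requires the `arrow` and `tokio` crates.
    use std::sync::Arc;
    use arrow::array::Int64Array;
    use arrow::datatypes::{DataType, Field, Schema};
    use arrow::record_batch::RecordBatch;
    use tokio::sync::mpsc;

    #[tokio::main]
    async fn main() {
        let (tx, mut rx) = mpsc::channel::<RecordBatch>(16);

        // Storage side: would write WAL entries and Parquet in the real flow.
        let storage = tokio::spawn(async move {
            while let Some(batch) = rx.recv().await {
                println!("storage received {} rows", batch.num_rows());
            }
        });

        // Ingest side: ship a batch over the loopback "Flight" stream.
        let schema = Arc::new(Schema::new(vec![Field::new("ts", DataType::Int64, false)]));
        let batch = RecordBatch::try_new(
            schema,
            vec![Arc::new(Int64Array::from(vec![1_i64, 2, 3]))],
        )
        .unwrap();
        tx.send(batch).await.unwrap();
        drop(tx); // close the stream so the storage task exits
        storage.await.unwrap();
    }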

Local persistence

  • Short-term WAL at ./data/wal/short/
  • Long-term WAL at ./data/wal/long/
  • Parquet at ./data/storage/
  • Indexes under ./data/metastore/indexes/ or next to Parquet
  • Batch and file metadata in sled by default, or Postgres if connected
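
A sketch of the sled path; the key and value layout is invented for illustration:

    // Hedged sketch: file metadata in the default sled metastore.
    fn main() -> sled::Result<()> {
        let db = sled::open("./data/metastore")?;
        // Map a batch id to the Parquet file it was flushed into.
        db.insert(b"batch/000001", b"./data/storage/part-000001.parquet".to_vec())?;
        if let Some(path) = db.get(b"batch/000001")? {
            println!("{}", String::from_utf8_lossy(&path));
        }
        db.flush()?; // durability point, analogous to a WAL sync
        Ok(())
    }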

Unified observability

  • Logs, metrics and tracing are consolidated locally
  • Metrics are exposed on a single port for quick debugging
  • Tracing spans show ingest to WAL to storage to index to query
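
A sketch of those stages using the tracing crate; the span names mirror the pipeline, and the subscriber setup is the stock local one:

    // Hedged sketch: one local subscriber, with spans named after the
    // pipeline stages. Requires `tracing` and `tracing-subscriber`.
    use tracing::{info, info_span};

    fn main() {
        tracing_subscriber::fmt().init();

        // One span per stage; in the real system these nest under a request
        // span so a single trace covers the whole path.
        info_span!("ingest").in_scope(|| info!("batch accepted"));
        info_span!("wal").in_scope(|| info!("entry appended"));
        info_span!("storage").in_scope(|| info!("parquet flushed"));
        info_span!("index").in_scope(|| info!("segments updated"));
        info_span!("query").in_scope(|| info!("plan executed"));
    }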

Why this mode matters

Use Standalone Mode as a sandbox to:

  • Develop and test ingestion via Kafka, OpenTelemetry and HTTP
  • Exercise schema validation and normalization rules
  • Validate WAL durability and restart behavior
  • Benchmark Arrow to Parquet conversion
  • Prototype new index types or query plans
  • Debug end to end with full local telemetry

Code paths are identical to production; only the mode and endpoint bindings change.

Advantages

  • Simple setup with no Kubernetes, no external discovery, no remote coordination
  • Same ingestion, WAL, storage and indexing logic as production
  • Rapid iteration with instant restarts and a single log stream
  • Safe laptop footprint in the 4 to 8 GB range
  • Zero external dependencies by default. Postgres is optional

Limitations

  • No horizontal scale because all roles share one machine
  • Not fault tolerant. A process crash requires a restart and recovery from the WALs
  • Discovery is simulated rather than networked
  • Postgres HA and distributed metadata are not enabled by default
  • Object storage uploads use local disk or a dummy OpenDAL backend