How to Choose the Right Delivery Mode
Every edge in a FlowDSL flow has a delivery mode. Choosing the wrong mode is either costly (unnecessary MongoDB writes for cheap transforms) or dangerous (using direct for a payment charge). This guide gives you a systematic way to make that decision.
The decision tree
flowchart TD
    Q1{"Is data loss\nunacceptable?"}
    Q2{"Expensive external call?\n(LLM, payment, SMS, API)"}
    Q3{"Fan-out to external systems\nor event sourcing?"}
    Q4{"Throughput >10k/sec\nand step is cheap?"}
    Q5{"Benefits from worker\npool smoothing?"}

    DQ[durable]
    DQ2["durable\n+ idempotencyKey"]
    EB[stream]
    D[direct]
    EQ[ephemeral]
    CP[checkpoint]

    Q1 -->|YES| DQ
    Q1 -->|NO| Q2
    Q2 -->|YES| DQ2
    Q2 -->|NO| Q3
    Q3 -->|YES| EB
    Q3 -->|NO| Q4
    Q4 -->|YES| D
    Q4 -->|NO| Q5
    Q5 -->|YES| EQ
    Q5 -->|NO| CP
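Read as code, the flowchart is a chain of checks in which the first YES decides the mode. The following is a minimal sketch in Python; `EdgeTraits` and `choose_delivery_mode` are hypothetical names invented for this illustration and are not part of FlowDSL.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class EdgeTraits:
    """Answers to Q1-Q5 for one edge. Hypothetical helper, not a FlowDSL type."""
    data_loss_unacceptable: bool = False      # Q1
    expensive_external_call: bool = False     # Q2
    external_fan_out: bool = False            # Q3
    high_throughput_cheap_step: bool = False  # Q4
    benefits_from_smoothing: bool = False     # Q5


def choose_delivery_mode(traits: EdgeTraits) -> Tuple[str, bool]:
    """Return (mode, needs_idempotency_key); questions are checked in order."""
    if traits.data_loss_unacceptable:
        return ("durable", False)
    if traits.expensive_external_call:
        return ("durable", True)
    if traits.external_fan_out:
        return ("stream", False)
    if traits.high_throughput_cheap_step:
        return ("direct", False)
    if traits.benefits_from_smoothing:
        return ("ephemeral", False)
    return ("checkpoint", False)


print(choose_delivery_mode(EdgeTraits(expensive_external_call=True)))
# prints ('durable', True)
```

Because Q1 is checked first, an edge that answers YES to both Q1 and Q2 comes back as plain durable in this sketch. That mirrors the flowchart literally; if the target node has side effects, you would likely still want an idempotencyKey on it.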
The questions
Q1: Is data loss unacceptable?
If losing this packet would cause a business problem — a payment not charged, a support ticket not created, a notification not sent — the answer is yes.
→ durable
Q2: Does this step involve an expensive external call?
LLM invocations, payment charges, email/SMS sends, calls to third-party APIs with rate limits or per-call costs — these are expensive. Re-running them unnecessarily is costly or dangerous.
→ durable + idempotencyKey
The idempotency key is essential here because durable provides at-least-once delivery. Without it, a retry would call the LLM or send the SMS twice.
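To see why the key matters under at-least-once delivery, here is a minimal sketch of key-based deduplication. `IdempotentRunner` is a hypothetical helper and the in-memory dict stands in for a persistent store; how FlowDSL itself tracks keys is not shown here.

```python
from typing import Any, Callable, Dict


class IdempotentRunner:
    """Runs a side effect at most once per idempotency key.

    Hypothetical helper; the dict stands in for a persistent store.
    """

    def __init__(self) -> None:
        self._completed: Dict[str, Any] = {}

    def run(self, key: str, side_effect: Callable[[], Any]) -> Any:
        if key in self._completed:
            # Redelivery: return the recorded result and skip the side effect.
            return self._completed[key]
        result = side_effect()
        self._completed[key] = result
        return result


sent = []
runner = IdempotentRunner()
payload = {"messageId": "msg-42"}
key = f"{payload['messageId']}-sms"  # same shape as "{{payload.messageId}}-sms"

for _attempt in range(2):  # simulate one delivery plus one retry
    runner.run(key, lambda: sent.append("SMS for msg-42"))

print(len(sent))  # prints 1
```

The retry finds the key already recorded, so the SMS is sent once. A real store would also need to handle a crash between performing the side effect and recording the key, which this sketch ignores.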
Q3: Do you need fan-out to external systems?
If multiple independent consumers need to react to this event — external services, analytics, audit logs, other teams' systems — publish to the event bus. Kafka's consumer group model handles independent consumption without flow changes.
→ stream
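The reason fan-out needs no flow changes is that each consumer group keeps its own read position over a shared log. The sketch below is a simplified in-memory model of that idea, not the Kafka client API; `EventLog` and the group names are hypothetical.

```python
from collections import defaultdict
from typing import Any, Dict, List


class EventLog:
    """Append-only log with one read offset per consumer group (toy model)."""

    def __init__(self) -> None:
        self._events: List[Any] = []
        self._offsets: Dict[str, int] = defaultdict(int)

    def publish(self, event: Any) -> None:
        # The producer only appends; it knows nothing about consumers.
        self._events.append(event)

    def poll(self, group: str) -> List[Any]:
        # Each group reads from its own offset, independently of the others.
        start = self._offsets[group]
        batch = self._events[start:]
        self._offsets[group] = len(self._events)
        return batch


log = EventLog()
log.publish({"type": "anomaly", "windowId": "w1"})
log.publish({"type": "anomaly", "windowId": "w2"})

analytics = log.poll("analytics")
audit = log.poll("audit")  # a second group still sees every event
print(len(analytics), len(audit))  # prints 2 2
```

Adding a third group later would be a matter of calling `poll` with a new name; the publishing side stays untouched.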
Q4: Is throughput >10k/sec and the step cheap?
High-volume, CPU-bound, deterministic steps (parsing, field extraction, format conversion) that run in microseconds and can be replayed from the source don't need a queue. In-process is fastest.
→ direct
Q5: Benefits from worker pool smoothing?
Medium-throughput steps with variable processing time benefit from a queue that decouples the producer and consumer rates. Redis streams provide this with little overhead, in exchange for weaker durability than durable mode.
→ ephemeral
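As an in-process analogy for this smoothing (not the Redis-backed mechanism itself), the sketch below puts a bounded queue between a bursty producer and a fixed pool of workers. `run_through_worker_pool` and its parameters are hypothetical names chosen for the example.

```python
import queue
import threading
from typing import Any, Callable, Iterable, List

_STOP = object()  # sentinel telling a worker to exit


def run_through_worker_pool(
    items: Iterable[Any],
    handler: Callable[[Any], Any],
    workers: int = 4,
    buffer_size: int = 100,
) -> List[Any]:
    """Push items through a bounded queue drained by a fixed worker pool."""
    buffer: queue.Queue = queue.Queue(maxsize=buffer_size)
    results: List[Any] = []
    lock = threading.Lock()

    def worker() -> None:
        while True:
            item = buffer.get()
            if item is _STOP:
                return
            outcome = handler(item)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for item in items:       # producer: blocks only when the buffer is full
        buffer.put(item)
    for _ in threads:        # one stop signal per worker
        buffer.put(_STOP)
    for thread in threads:
        thread.join()
    return results


doubled = run_through_worker_pool(range(20), lambda n: n * 2)
print(sorted(doubled) == [n * 2 for n in range(20)])  # prints True
```

The producer can enqueue a burst faster than any single worker finishes an item, and the pool drains it at its own pace. Result order is not guaranteed, which is why the example sorts before comparing.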
Otherwise: what remains is usually a long multi-stage pipeline where each step is expensive and you want resume-from-last-stage semantics.
→ checkpoint
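Resume-from-last-stage can be pictured as recording progress after every completed stage and starting the next attempt from that record. This is a minimal sketch under that assumption; `run_with_checkpoints`, the dict-based store, and the two stages are hypothetical and do not describe how FlowDSL persists checkpoints.

```python
from typing import Any, Callable, Dict, List, Tuple

Stage = Callable[[Any], Any]


def run_with_checkpoints(
    run_id: str,
    stages: List[Stage],
    payload: Any,
    store: Dict[str, Tuple[int, Any]],
) -> Any:
    """Run stages in order, recording progress after each completed stage."""
    next_stage, value = store.get(run_id, (0, payload))
    for index in range(next_stage, len(stages)):
        value = stages[index](value)
        store[run_id] = (index + 1, value)  # checkpoint: stage done, output saved
    return value


calls = {"parse": 0, "aggregate": 0}


def parse(text: str) -> List[int]:
    calls["parse"] += 1
    return [int(part) for part in text.split(",")]


def aggregate(numbers: List[int]) -> int:
    calls["aggregate"] += 1
    if calls["aggregate"] == 1:
        raise RuntimeError("simulated crash on first attempt")
    return sum(numbers)


store: Dict[str, Tuple[int, Any]] = {}
try:
    run_with_checkpoints("run-1", [parse, aggregate], "1,2,3", store)
except RuntimeError:
    pass  # the run stops after parse has been checkpointed

total = run_with_checkpoints("run-1", [parse, aggregate], "1,2,3", store)
print(total, calls["parse"])  # prints 6 1
```

The second attempt skips `parse` and goes straight to `aggregate`, which is the saving that matters when earlier stages are expensive.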
By workload class
Stateful business workflow
Each event is a complete unit of work with multiple external interactions. Data loss is unacceptable throughout.
# Email triage workflow: all durable
edges:
  - from: EmailFetcher
    to: LlmAnalyzer
    delivery:
      mode: durable
      idempotencyKey: "{{payload.messageId}}-analyze"

  - from: LlmAnalyzer
    to: RouteEmail
    delivery:
      mode: durable

  - from: RouteEmail.urgent
    to: SendSmsAlert
    delivery:
      mode: durable
      idempotencyKey: "{{payload.messageId}}-sms"
High-throughput data pipeline
High-volume event processing where throughput matters, stages are progressively more expensive, and the first stages can be replayed cheaply from the source.
# Telemetry pipeline: modes escalate in durability as value increases
edges:
  # Fast, cheap parse: in-process
  - from: IngestTelemetry
    to: ParseEvent
    delivery:
      mode: direct

  # Medium-throughput enrichment: worker smoothing
  - from: ParseEvent
    to: EnrichWithMeta
    delivery:
      mode: ephemeral
      stream: telemetry-enrich

  # Expensive aggregation: stage-level resume
  - from: EnrichWithMeta
    to: AggregateMetrics
    delivery:
      mode: checkpoint
      batchSize: 100

  # LLM anomaly detection: expensive, must not duplicate
  - from: AggregateMetrics
    to: DetectAnomalies
    delivery:
      mode: durable
      idempotencyKey: "{{payload.windowId}}-anomaly"

  # Publish to downstream consumers
  - from: DetectAnomalies
    to: PublishAnomalyEvent
    delivery:
      mode: stream
      topic: telemetry.anomalies
Mode comparison at a glance
| Scenario | Mode | Reason |
|---|---|---|
| JSON field extraction | direct | Cheap, in-process, deterministic |
| Log parsing at 100k/sec | direct | Throughput > durability |
| Burst absorption | ephemeral | Worker pool, variable rate |
| Long ETL pipeline | checkpoint | Stage-level resume |
| Payment charge | durable | Critical, must not lose |
| LLM call | durable + idempotencyKey | Expensive, non-deterministic |
| Fan-out to analytics | stream | Multiple independent consumers |
| Send SMS | durable + idempotencyKey | Irreversible side effect |
| Archive spam | direct | Idempotent, cheap, fast |
When in doubt
Use durable. It has strong guarantees and reasonable performance (typically <5ms overhead on MongoDB). You can always optimize to direct or ephemeral after you have real performance data. The cost of under-engineering delivery semantics is data loss; the cost of over-engineering is a few milliseconds of latency.
Default rule: Start with durable. Add idempotencyKey whenever the node has side effects. Optimize later.
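One way to keep the default rule honest is a small check over your edge definitions. The sketch below works on plain Python dicts that mirror the edge shape used in this guide; `missing_idempotency_keys` and the `SIDE_EFFECT_NODES` set are hypothetical, and which nodes count as having side effects is something you would declare yourself.

```python
from typing import Dict, List, Set

# Hypothetical: the set of target nodes you consider to have side effects.
SIDE_EFFECT_NODES: Set[str] = {"LlmAnalyzer", "SendSmsAlert"}


def missing_idempotency_keys(edges: List[Dict], side_effect_nodes: Set[str]) -> List[str]:
    """Return 'from -> to' for edges into side-effect nodes that lack a key."""
    problems = []
    for edge in edges:
        delivery = edge.get("delivery", {})
        if edge["to"] in side_effect_nodes and not delivery.get("idempotencyKey"):
            problems.append(f"{edge['from']} -> {edge['to']}")
    return problems


edges = [
    {"from": "EmailFetcher", "to": "LlmAnalyzer",
     "delivery": {"mode": "durable", "idempotencyKey": "{{payload.messageId}}-analyze"}},
    {"from": "LlmAnalyzer", "to": "RouteEmail", "delivery": {"mode": "durable"}},
    # Key deliberately left out to show what the check reports.
    {"from": "RouteEmail.urgent", "to": "SendSmsAlert", "delivery": {"mode": "durable"}},
]

print(missing_idempotency_keys(edges, SIDE_EFFECT_NODES))
# prints ['RouteEmail.urgent -> SendSmsAlert']
```

The edge into RouteEmail is not flagged because routing is not in the side-effect set; only the SMS edge without a key is reported.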
Next steps
- Delivery Modes — the full explanation of each mode
- Idempotency — implementing safe idempotent nodes
- Stateful vs Streaming — the two workload classes

