Stateless vs. Stateful Stream Processing

Unlike a traditional static dataset with fixed boundaries, streaming data is a continuous, unbounded flow of time-series events generated in real time.

State in stream processing refers to data and intermediate calculation results that are maintained across multiple events. Depending on whether state is kept, stream processing falls into two modes:

  • Stateless stream processing: Each input record is processed independently, and the output is based solely on the current input record.

  • Stateful stream processing: The system maintains and updates state derived from past events as it processes new data. The output depends not only on the current record but also on the historical context of previously processed data.
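
The difference between the two modes can be illustrated with a minimal Python sketch. It is a conceptual example only, not DolphinDB code: the `Trade` record and the functions `to_cents` and `running_max` are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator


@dataclass
class Trade:
    # Hypothetical event type used only for this illustration.
    symbol: str
    price: float


def to_cents(trades: Iterable[Trade]) -> Iterator[int]:
    """Stateless: each output depends only on the current record."""
    for t in trades:
        yield round(t.price * 100)


def running_max(trades: Iterable[Trade]) -> Iterator[float]:
    """Stateful: a per-symbol maximum is kept across records."""
    max_by_symbol: dict[str, float] = {}  # the state carried between events
    for t in trades:
        prev = max_by_symbol.get(t.symbol, t.price)
        max_by_symbol[t.symbol] = max(prev, t.price)
        yield max_by_symbol[t.symbol]


trades = [Trade("A", 10.0), Trade("B", 5.0), Trade("A", 9.5), Trade("A", 10.5)]
stateless_out = list(to_cents(trades))    # [1000, 500, 950, 1050]
stateful_out = list(running_max(trades))  # [10.0, 5.0, 10.0, 10.5]
```

Reordering or dropping earlier records would not change any individual output of `to_cents`, but it could change the outputs of `running_max`, because each of its results depends on what has already been seen.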

In DolphinDB, state is involved in streaming operations such as stateful functions in the reactive state engine and aggregate functions in the time series engine. The correctness and persistence of this state are therefore crucial to the accuracy and stability of the stream processing system.
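
Why persistence matters can be sketched in a few lines of Python. This is a conceptual illustration under stated assumptions, not DolphinDB's API or its actual persistence mechanism: the `RunningSum` operator and its `snapshot` and `restore` methods are hypothetical names.

```python
import json


class RunningSum:
    """Hypothetical stateful operator: cumulative sum per key."""

    def __init__(self) -> None:
        self.totals: dict[str, float] = {}  # state maintained across events

    def process(self, key: str, value: float) -> float:
        self.totals[key] = self.totals.get(key, 0.0) + value
        return self.totals[key]

    def snapshot(self) -> str:
        # Serialize the state so it can outlive the operator instance.
        return json.dumps(self.totals)

    @classmethod
    def restore(cls, blob: str) -> "RunningSum":
        op = cls()
        op.totals = json.loads(blob)
        return op


op = RunningSum()
op.process("A", 1.0)
op.process("A", 2.0)
saved = op.snapshot()

# Simulated restart: one operator resumes from the snapshot, one starts empty.
with_state = RunningSum.restore(saved).process("A", 3.0)  # 6.0
without_state = RunningSum().process("A", 3.0)            # 3.0
```

The same input event produces different results depending on whether the state survived the restart, which is why lost or corrupted state leads directly to incorrect output.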