An Introduction To Reactive Streams

As you may know, Java 9 is about to be released on September 21st. Among its many new features, the Reactive Streams integration is probably the one I have been expecting the most. The goal of this post is to introduce the original Reactive Streams initiative; in a follow-up post, I will describe what Java 9 brings to the table.

The original initiative

Reactive Streams is an initiative started in 2013 by companies such as Netflix, Pivotal, and Lightbend. Its core goal is to provide a standard for asynchronous stream processing with non-blocking back pressure.

To explain the concept of back pressure, let us take a concrete example. Your organization runs legacy applications. Some of them are most likely built on old technologies with limited scalability and performance capabilities. Then you put in place a new, revolutionary system capable of handling thousands of messages per second. If this new system communicates synchronously with some of your legacy applications (without any throttling mechanism provided by a service gateway in between), those applications could simply crash at some point due to the number of incoming requests sent by the new system. This scenario might seem far-fetched, but I have actually experienced it.

To tackle this problem, the core idea of back pressure is to not force a subscriber system to buffer arbitrary amounts of data sent by a publisher.

As mentioned, the back pressure problem could also be tackled with a service gateway in between, or with messaging middleware. For example, with a JMS broker between a publisher and a subscriber, request buffering is not done by the subscriber itself. Yet from my perspective, the main benefit of Reactive Streams is the capability to achieve back pressure without being forced to use any middleware.

Basically, a subscriber can indicate directly to a publisher how many messages it will be able to manage:

  • During the initial subscription
  • And each time a new message is received (received does not necessarily mean processed)
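To make this hand-shake concrete, here is a minimal sketch of a subscriber applying exactly that protocol: it requests one message at subscription time, then one more after each message it processes. I am using the `java.util.concurrent.Flow` variant of the interfaces shipped with Java 9 (the method signatures mirror the original org.reactivestreams ones) so the sketch runs without any external dependency; SubmissionPublisher is a ready-made publisher provided by the JDK.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class DemandDemo {

    // A subscriber that never allows more than one in-flight message:
    // it requests 1 at subscription time, then 1 more after each onNext.
    static class OneByOneSubscriber implements Flow.Subscriber<String> {
        final List<String> received = new ArrayList<>();
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;

        @Override public void onSubscribe(Flow.Subscription s) {
            subscription = s;
            s.request(1);               // demand declared during the initial subscription
        }
        @Override public void onNext(String message) {
            received.add(message);      // "process" the message...
            subscription.request(1);    // ...then ask for the next one
        }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    public static void main(String[] args) throws InterruptedException {
        OneByOneSubscriber subscriber = new OneByOneSubscriber();
        try (SubmissionPublisher<String> publisher = new SubmissionPublisher<>()) {
            publisher.subscribe(subscriber);
            publisher.submit("a");      // submit() honours the subscriber's demand
            publisher.submit("b");
            publisher.submit("c");
        }                               // close() triggers onComplete after delivery
        subscriber.done.await();
        System.out.println(subscriber.received);  // [a, b, c]
    }
}
```

The publisher never pushes more than the subscriber asked for; the subscriber, not the publisher, sets the pace.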

We could also argue that the buffering problem is simply shifted somewhere else: to the publisher side. For example, if the publisher is a GUI application that must manage incoming user requests, then it is up to this very application to handle the buffering. From a design point of view, it makes sense to buffer them on the publisher rather than on the subscriber. Furthermore, in the case of a synchronous interaction, some frameworks only release the initial request object in the publisher JVM (making it available for garbage collection) once the reply has been sent by the subscriber. So in that case, the buffering is even done twice.

The Reactive Streams API

The Reactive Streams working group released an API for several languages (Java and .NET; the JavaScript API has not been released yet). This API is simply composed of four interfaces (which belong to the org.reactivestreams package for Java):

Subscription

A simple interface acting as a kind of channel description and providing two methods:

  • cancel(): Used by a subscriber to cancel its subscription
  • request(long): Used by a subscriber to request n messages from a publisher

Publisher

A generic interface providing a single subscribe(Subscriber) method. At first glance, considering the traditional publish/subscribe pattern, it might seem strange that subscription goes through the publisher itself. But this is actually the strength of Reactive Streams: the capability for a publisher to adapt its flow of data to each subscriber.

Subscriber

A generic interface providing 4 methods:

  • onComplete(): Notifies the subscriber that the publisher has closed normally
  • onError(Throwable): Notifies the subscriber that the publisher has closed with an error state
  • onNext(T): Notifies the subscriber that a message was sent. If everything goes right, after having processed a message, the subscriber will usually then invoke Subscription.request(long).
  • onSubscribe(Subscription): Notifies the subscriber that a publisher just started a subscription (through the Publisher.subscribe(Subscriber) method). This is also the way for the subscriber to keep a reference to the Subscription object provided.

Processor

A simple interface extending both the Publisher and Subscriber interfaces. In other words, a Processor can act as both a Publisher and a Subscriber.
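As an illustration (again on Java 9's `java.util.concurrent.Flow` mirror of the API, so it runs without an external jar), here is a toy Processor that maps integers to strings while forwarding the downstream subscriber's demand upstream untouched. The pending counter only handles the case where the downstream side requests messages before the upstream subscription exists; real implementations such as Reactor or RxJava deal with far more concurrency concerns than this sketch does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class ProcessorDemo {

    // A toy Processor: a Subscriber of Integers upstream and a Publisher
    // of Strings downstream, passing the demand through untouched.
    static class MapProcessor implements Flow.Processor<Integer, String> {
        private Flow.Subscription upstream;
        private Flow.Subscriber<? super String> downstream;
        private long pending;  // demand received before the upstream is connected

        // Publisher side: hand the downstream subscriber a Subscription
        // that simply delegates request()/cancel() to the upstream one.
        @Override public synchronized void subscribe(Flow.Subscriber<? super String> s) {
            downstream = s;
            s.onSubscribe(new Flow.Subscription() {
                @Override public void request(long n) { forward(n); }
                @Override public void cancel() { if (upstream != null) upstream.cancel(); }
            });
        }
        private synchronized void forward(long n) {
            if (upstream == null) pending += n; else upstream.request(n);
        }

        // Subscriber side: store the upstream subscription and replay any
        // demand the downstream subscriber expressed in the meantime.
        @Override public synchronized void onSubscribe(Flow.Subscription s) {
            upstream = s;
            if (pending > 0) { s.request(pending); pending = 0; }
        }
        @Override public void onNext(Integer item) { downstream.onNext("#" + item); }
        @Override public void onError(Throwable t) { downstream.onError(t); }
        @Override public void onComplete() { downstream.onComplete(); }
    }

    public static void main(String[] args) throws InterruptedException {
        List<String> out = new ArrayList<>();
        CountDownLatch done = new CountDownLatch(1);
        MapProcessor processor = new MapProcessor();
        processor.subscribe(new Flow.Subscriber<String>() {
            public void onSubscribe(Flow.Subscription s) { s.request(Long.MAX_VALUE); }
            public void onNext(String s) { out.add(s); }
            public void onError(Throwable t) { done.countDown(); }
            public void onComplete() { done.countDown(); }
        });
        try (SubmissionPublisher<Integer> pub = new SubmissionPublisher<>()) {
            pub.subscribe(processor);
            pub.submit(1);
            pub.submit(2);
        }
        done.await();
        System.out.println(out);  // [#1, #2]
    }
}
```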

API summary

Bear in mind that a data stream does not necessarily have an end. So the sequence of calls is usually the following:

onSubscribe onNext* (onError | onComplete)?
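This sequence can be observed by recording which callbacks fire, as in the following sketch (built, once more, on the Java 9 Flow mirror of the interfaces and the JDK's SubmissionPublisher):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class SequenceDemo {
    public static void main(String[] args) throws InterruptedException {
        List<String> events = Collections.synchronizedList(new ArrayList<>());
        CountDownLatch done = new CountDownLatch(1);

        // A subscriber that only records which callback was invoked.
        Flow.Subscriber<Integer> recorder = new Flow.Subscriber<Integer>() {
            public void onSubscribe(Flow.Subscription s) {
                events.add("onSubscribe");
                s.request(Long.MAX_VALUE);  // unbounded demand, for brevity
            }
            public void onNext(Integer i)    { events.add("onNext(" + i + ")"); }
            public void onError(Throwable t) { events.add("onError"); done.countDown(); }
            public void onComplete()         { events.add("onComplete"); done.countDown(); }
        };

        try (SubmissionPublisher<Integer> pub = new SubmissionPublisher<>()) {
            pub.subscribe(recorder);
            pub.submit(1);
            pub.submit(2);
        }
        done.await();
        // Matches the grammar: onSubscribe onNext* (onError | onComplete)?
        System.out.println(events);  // [onSubscribe, onNext(1), onNext(2), onComplete]
    }
}
```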

So far, several actors have released a Reactive Streams compliant implementation: Akka, Vert.x, Reactor, RxJava etc. Nonetheless, Java 9 itself is not compatible with the original API, as the four interfaces are redefined in the java.util.concurrent package. I will describe this point in more detail in an upcoming post.

Reactive Streams IO

You may have noticed that so far I have solely been speaking about an API, not a protocol (think JMS vs AMQP, for instance). And obviously, an API alone does not bring interoperability across languages.

To tackle this, another working group is (was?) in charge of defining a protocol on top of network protocols (either uni- or bi-directional, such as TCP, WebSockets, HTTP/2 etc.). Unfortunately, there does not seem to be much activity on the GitHub project, even though the first RC was initially planned for 2015. We can easily imagine this is not an easy job at all, but I am looking forward to getting some news.

When to use Reactive Streams?

From my perspective, I can imagine two applications of Reactive Streams.

First, you have to protect a legacy system exposing a synchronous API (as described in the example above) from the rest of the world. You can implement a kind of Back Pressure Layer (BPL) acting as a facade on top of the legacy system. This is obviously not the best possible application, as the request buffering is still not kept at the publisher level. Yet it could be a smarter implementation than the dumb throttling policy you could configure in a service gateway.
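A minimal sketch of such a BPL, with a hypothetical LegacyService standing in for the synchronous legacy API (again on the Java 9 Flow mirror of the interfaces): the facade only requests the next message once the blocking legacy call has returned, so the legacy system sets the pace rather than the new one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Flow;
import java.util.concurrent.SubmissionPublisher;

public class BackPressureLayer {

    // Hypothetical legacy API: synchronous, slow, must not be flooded.
    static class LegacyService {
        final List<String> handled = new ArrayList<>();
        void process(String request) { handled.add(request); }  // imagine a blocking remote call
    }

    // The "BPL": a subscriber that keeps at most one request in flight,
    // so the legacy system dictates the pace instead of the new system.
    static class LegacyFacade implements Flow.Subscriber<String> {
        final LegacyService legacy;
        final CountDownLatch done = new CountDownLatch(1);
        private Flow.Subscription subscription;
        LegacyFacade(LegacyService legacy) { this.legacy = legacy; }

        @Override public void onSubscribe(Flow.Subscription s) { subscription = s; s.request(1); }
        @Override public void onNext(String request) {
            legacy.process(request);   // the blocking legacy call completes first...
            subscription.request(1);   // ...only then is the next message requested
        }
        @Override public void onError(Throwable t) { done.countDown(); }
        @Override public void onComplete() { done.countDown(); }
    }

    public static void main(String[] args) throws InterruptedException {
        LegacyService legacy = new LegacyService();
        LegacyFacade facade = new LegacyFacade(legacy);
        try (SubmissionPublisher<String> newSystem = new SubmissionPublisher<>()) {
            newSystem.subscribe(facade);
            for (int i = 1; i <= 5; i++) newSystem.submit("req-" + i);
        }
        facade.done.await();
        System.out.println(legacy.handled);  // [req-1, req-2, req-3, req-4, req-5]
    }
}
```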

Second, you want to implement a system based on the reactive principles (responsiveness, resilience, elasticity and message-driven communication). In that case, Reactive Streams might appear as a de facto standard for component interactions, without being forced to use a load-centric service or messaging middleware.

Conclusion

Reactive Streams does appear as a compelling solution to handle back pressure. Furthermore, the fact that Java 9 integrates this concept is definitely a step forward for the original Reactive Streams initiative.

In my next post, I will describe what Java 9 brings in more detail.

Further reading