Issue 191: Add at-least-once stream processing example #192

claudiofahey · 2019-03-14T06:11:52Z

These examples are intended to illustrate how at-least-once semantics can be achieved with Pravega. In particular, these illustrative examples do not use Apache Flink.

tkaitchuck

@claudiofahey

But consider what happens if the application crashes after commit() but before the reader group is updated with readNextEvent. Upon recovery, the reader group will resume at the previous checkpoint and duplicate records will be created. To avoid this, we still need to use the two-phase commit process of flush, persist transaction IDs, commit, update reader group. I may be able to do this with another Pravega stream instead of a shared file system but I don't see how to completely eliminate this need.

Yes. This is the reason the readerOffline() call takes a Position object. It allows you to tell Pravega exactly what data was processed and what data was not when the worker dies.

So yes, the two-phase scheme you are using works, but you could make it better by doing it per reader as opposed to per reader group.
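
The per-reader variant of the sequence discussed above (flush, persist transaction ID, commit, update position) can be sketched roughly as follows. This is an in-memory model only, not Pravega code: `PerReaderTwoPhaseCommit`, `Pending`, and the `long` positions are hypothetical stand-ins, and in a real application the recovered position would be the `Position` object passed to `readerOffline()`.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Hypothetical in-memory model of a per-reader two-phase commit. Not Pravega API. */
class PerReaderTwoPhaseCommit {
    /** What a reader persists before committing: the transaction and the position it covers. */
    static final class Pending {
        final String txnId;
        final long position;
        Pending(String txnId, long position) { this.txnId = txnId; this.position = position; }
    }

    // Stand-in for durable per-reader state (assumption: some durable store holds this).
    private final Map<String, Pending> pendingByReader = new HashMap<>();
    // Stand-in for the set of transactions committed to the output stream.
    private final Set<String> committedTxns = new HashSet<>();
    // Stand-in for the last position recorded for each reader.
    private final Map<String, Long> recordedPositions = new HashMap<>();

    /** Phase 1: after flushing the transaction, record its ID and the position read so far. */
    void prepare(String readerId, String txnId, long position) {
        pendingByReader.put(readerId, new Pending(txnId, position));
    }

    /** Phase 2a: commit. Modeled as idempotent so that recovery can safely repeat it. */
    void commit(String txnId) { committedTxns.add(txnId); }

    /** Phase 2b: record how far this reader got, then clear its pending record. */
    void updatePosition(String readerId) {
        Pending p = pendingByReader.remove(readerId);
        if (p != null) {
            recordedPositions.put(readerId, p.position);
        }
    }

    /** Recovery for a dead reader: finish any pending commit and return the position to report. */
    long recover(String readerId) {
        Pending p = pendingByReader.get(readerId);
        if (p != null) {
            commit(p.txnId);
            updatePosition(readerId);
        }
        return recordedPositions.getOrDefault(readerId, 0L);
    }

    boolean isCommitted(String txnId) { return committedTxns.contains(txnId); }
}
```

If a worker dies between `commit` and `updatePosition`, `recover` returns the position covered by the committed transaction, so only the events after it are handed to another reader.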

RaulGracia · 2019-08-12T16:35:34Z

@claudiofahey due to issues we had in previous releases merging develop into master, we have changed the release approach and the current development branch is dev. We plan to delete develop to avoid confusion, so could you please re-open this PR against dev? Thanks!

claudiofahey · 2019-08-13T22:27:07Z

This PR is ready for re-review.

claudiofahey · 2020-06-20T20:59:22Z

The latest commit should provide at-least-once semantics in a clean way using only Pravega. I believe the code is complete but it does not have any automated tests yet. We also need to update the README to describe how this works.

claudiofahey · 2020-08-04T04:27:19Z

This PR is ready for review. For a description of this PR, see https://github.com/claudiofahey/pravega-samples/blob/issue-191-streamprocessing/pravega-client-examples/src/main/java/io/pravega/example/streamprocessing/README.md.
Unit and integration tests can be run with ./gradlew pravega-client-examples:build.

fpj

It looks good; I only have a couple of small comments.

fpj · 2020-08-16T15:54:55Z

pravega-client-examples/src/main/java/io/pravega/example/streamprocessing/AppConfiguration.java

+        return getEnvVar("INSTANCE_ID", UUID.randomUUID().toString());
+    }
+
+    public String getStream1Name() {


Minor comment, but could we use input stream and output stream rather than stream 1 and 2?

AppConfiguration is shared by EventGenerator, AtLeastOnceApp, and EventDebugSink. Stream 1 is the output for EventGenerator and the input for AtLeastOnceApp. Stream 2 is the output for AtLeastOnceApp and the input for EventDebugSink. To avoid the confusion from this point of view, I chose Stream 1 and Stream 2. It is a little odd, I admit. I added comments to the code to clarify this.
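
A minimal sketch of how such a shared configuration might look. Only `getEnvVar`, `INSTANCE_ID`, and `getStream1Name` appear in the diff above; the class name `AppConfigurationSketch`, the injected environment map, `getStream2Name`, and the variable names and defaults `PRAVEGA_STREAM_1`, `PRAVEGA_STREAM_2`, `streamprocessing1`, and `streamprocessing2` are assumptions made for illustration.

```java
import java.util.Map;
import java.util.UUID;

/** Hypothetical sketch of a configuration class shared by the three applications. */
class AppConfigurationSketch {
    private final Map<String, String> env;

    /** The environment is injected so the class can be exercised without real variables. */
    AppConfigurationSketch(Map<String, String> env) { this.env = env; }

    /** Returns the variable's value, or the default when it is unset or empty. */
    String getEnvVar(String name, String defaultValue) {
        String value = env.get(name);
        return (value == null || value.isEmpty()) ? defaultValue : value;
    }

    String getInstanceId() { return getEnvVar("INSTANCE_ID", UUID.randomUUID().toString()); }

    // Stream 1: output of EventGenerator, input of AtLeastOnceApp.
    // The variable name and default value are assumptions.
    String getStream1Name() { return getEnvVar("PRAVEGA_STREAM_1", "streamprocessing1"); }

    // Stream 2: output of AtLeastOnceApp, input of EventDebugSink.
    // The variable name and default value are assumptions.
    String getStream2Name() { return getEnvVar("PRAVEGA_STREAM_2", "streamprocessing2"); }
}
```

An application would typically construct it with `new AppConfigurationSketch(System.getenv())`.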

fpj · 2020-08-16T16:24:58Z

...ega-client-examples/src/main/java/io/pravega/example/streamprocessing/ReaderGroupPruner.java

+        // We must ensure that we add this reader to the membership synchronizer before the reader group.
+        membershipSynchronizer.startAsync();
+        membershipSynchronizer.awaitRunning();
+        task = executor.scheduleAtFixedRate(new PruneRunner(), heartbeatIntervalMillis, heartbeatIntervalMillis, TimeUnit.MILLISECONDS);


I'm wondering if it would be best to add some randomization to the period between runs of the prune runner. If distinct readers run it at the same period and close to each other, then we would have them calling reader offline unnecessarily for the same offline readers.

I don't feel very strongly about the comment because:

- reader offline is idempotent
- crashes are rare
- it should only matter for larger reader groups

In any case, if you feel that you want to introduce it here, then I think it is a small improvement.

I have randomized the initial delay and the period as you have suggested.
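
One way to randomize both values is sketched below. The thread does not say which range was chosen, so the factor of 0.5 to 1.5 and the names `JitteredScheduling`, `jitter`, and `schedule` are assumptions. Each process draws its period once, so the period differs between readers but stays fixed within one reader.

```java
import java.util.Random;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Hypothetical helper that spreads out periodic runs across processes. */
class JitteredScheduling {
    /** Scales the base interval by a random factor in [0.5, 1.5), never returning less than 1 ms. */
    static long jitter(long baseMillis, Random random) {
        double factor = 0.5 + random.nextDouble();
        return Math.max(1L, (long) (baseMillis * factor));
    }

    /** Schedules the task with an independently randomized initial delay and period. */
    static ScheduledFuture<?> schedule(ScheduledExecutorService executor, Runnable task,
                                       long heartbeatIntervalMillis, Random random) {
        long initialDelay = jitter(heartbeatIntervalMillis, random);
        long period = jitter(heartbeatIntervalMillis, random);
        return executor.scheduleAtFixedRate(task, initialDelay, period, TimeUnit.MILLISECONDS);
    }
}
```

The lower bound of 1 ms matters because `scheduleAtFixedRate` rejects a period of zero.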

- Removed deleteStream calls.
- No longer waiting for transactions to go into COMMIT state after commit() call.
- Now using Callable instead of Runnable.
- Improved shutdown cleanup.
Signed-off-by: Claudio Fahey <[email protected]>

claudiofahey · 2021-03-19T18:04:53Z

@RaulGracia, I have addressed all of your comments. However, it appears that there is an intermittent timeout failure in the test kill5of6Test. I will continue to investigate.

RaulGracia · 2021-03-22T09:18:57Z

@claudiofahey maybe the failures you are seeing are related to the sporadic failures we see in the Pravega core repo in tests involving Pravega standalone: pravega/pravega#5864

RaulGracia · 2021-03-25T11:58:33Z

@claudiofahey I tested the sample and it works. Also compiled with JDK11 and JDK8 and it also works (./gradlew installDist). But I also observed the failure in test kill5of6Test. Please, let's try to get this test passing so we can merge this one soon.

claudiofahey · 2021-03-25T21:30:49Z

FYI, I ran the integration tests 20 times and all runs passed:

for i in {1..20}; do ./gradlew clean test; done |& tee -a /tmp/test.log
grep "SUCCESSFUL" /tmp/test.log
BUILD SUCCESSFUL in 3m 42s
BUILD SUCCESSFUL in 3m 31s
BUILD SUCCESSFUL in 3m 56s
BUILD SUCCESSFUL in 3m 41s
BUILD SUCCESSFUL in 3m 47s
BUILD SUCCESSFUL in 6m 25s
BUILD SUCCESSFUL in 4m 18s
BUILD SUCCESSFUL in 4m 34s
BUILD SUCCESSFUL in 4m 25s
BUILD SUCCESSFUL in 4m 32s
BUILD SUCCESSFUL in 4m 16s
BUILD SUCCESSFUL in 4m 16s
BUILD SUCCESSFUL in 6m 29s
BUILD SUCCESSFUL in 4m 5s
BUILD SUCCESSFUL in 4m 8s
BUILD SUCCESSFUL in 4m 6s
BUILD SUCCESSFUL in 4m 26s
BUILD SUCCESSFUL in 4m 29s
BUILD SUCCESSFUL in 4m 2s
BUILD SUCCESSFUL in 3m 37s

RaulGracia · 2021-03-26T08:48:09Z

@claudiofahey tests passed for me. I had to upgrade the Scala version of Spark in the Hadoop sample, as the build was broken by one of the previous PRs. But now it looks good, thanks!

RaulGracia · 2021-03-26T08:50:19Z

@tkaitchuck please, would you mind doing another review to either approve it or request more changes?

tkaitchuck · 2021-03-26T21:37:35Z

pravega-client-examples/src/main/java/io/pravega/example/streamprocessing/AtLeastOnceApp.java

+        processor.startAsync();
+
+        // Add shutdown hook for graceful shutdown.
+        Runtime.getRuntime().addShutdownHook(new Thread(() -> {


Shouldn't the processor do this itself if this is needed?

It gets messy to do this in the AtLeastOnceProcessor because then it would need to deal with removing the shutdown hook, but only when the JVM is not actually shutting down. The point of this shutdown hook is to shut down the whole application gracefully, which in this case happens to consist of only one AtLeastOnceProcessor service. But in general, an application may consist of many services that all need to be shut down in a particular order. So it seems appropriate that AtLeastOnceApp should coordinate the shutdown.
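
The application-level coordination described here can be sketched as follows. `ShutdownCoordinator` is a hypothetical class, and the services are plain `AutoCloseable` stand-ins rather than the service type used in the PR. Reverse registration order is one possible ordering policy, chosen here as an assumption.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical application-level coordinator; services are plain AutoCloseable stand-ins. */
class ShutdownCoordinator {
    private final Deque<AutoCloseable> services = new ArrayDeque<>();

    /** Registers a service; services are stopped in reverse registration order. */
    synchronized void register(AutoCloseable service) {
        services.push(service);
    }

    /** Stops every registered service, continuing past individual failures. */
    synchronized void stopAll() {
        while (!services.isEmpty()) {
            try {
                services.pop().close();
            } catch (Exception e) {
                System.err.println("Error during shutdown: " + e);
            }
        }
    }

    /** Installs one JVM shutdown hook owned by the application, not by any single service. */
    void installShutdownHook() {
        Runtime.getRuntime().addShutdownHook(new Thread(this::stopAll, "app-shutdown"));
    }
}
```

Because the hook belongs to the application, an individual service never has to register or remove a hook of its own.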

tkaitchuck · 2021-03-26T21:40:08Z

pravega-client-examples/src/main/java/io/pravega/example/streamprocessing/EventDebugSink.java

+
+    public void run() throws Exception {
+        final ClientConfig clientConfig = ClientConfig.builder().controllerURI(getConfig().getControllerURI()).build();
+        try (StreamManager streamManager = StreamManager.create(getConfig().getControllerURI())) {


I don't see the change

tkaitchuck · 2021-03-26T21:41:26Z

pravega-client-examples/src/main/java/io/pravega/example/streamprocessing/EventDebugSink.java

+                }
+            }
+        } finally {
+            readerGroupManager.deleteReaderGroup(readerGroup);


I don't think we necessarily want to delete the group because one reader shut down.

In this case we do want to delete the reader group. Each instance of EventDebugSink will read the entire stream so it creates a random reader group for itself and then it cleans up when it is done. There's no need for load balancing between EventDebugSink instances (in contrast to AtLeastOnceApp).

I also added a createStream private method to this class.
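
The create-then-always-delete pattern for a private, randomly named reader group can be sketched like this. `GroupManager` and `InMemoryGroupManager` are hypothetical stand-ins that mirror only the two operations needed here; they are not the Pravega `ReaderGroupManager` interface, and the naming scheme is an assumption.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.UUID;
import java.util.function.Consumer;

/** Hypothetical sketch of a reader group that lives only as long as one reading session. */
class PrivateReaderGroupSketch {
    /** Minimal stand-in for the two reader-group operations used here; not the Pravega interface. */
    interface GroupManager {
        void createReaderGroup(String name);
        void deleteReaderGroup(String name);
    }

    /** In-memory implementation used only to exercise the sketch. */
    static final class InMemoryGroupManager implements GroupManager {
        final Set<String> groups = new HashSet<>();
        @Override public void createReaderGroup(String name) { groups.add(name); }
        @Override public void deleteReaderGroup(String name) { groups.remove(name); }
    }

    /** Creates a uniquely named group, runs the body, and always deletes the group afterwards. */
    static String runWithPrivateGroup(GroupManager manager, Consumer<String> body) {
        String readerGroup = UUID.randomUUID().toString().replace("-", "");
        manager.createReaderGroup(readerGroup);
        try {
            body.accept(readerGroup);
        } finally {
            // Safe only because no other instance shares this group.
            manager.deleteReaderGroup(readerGroup);
        }
        return readerGroup;
    }
}
```

Deleting in `finally` is reasonable only because the group name is unique to the instance; a group shared for load balancing would not be cleaned up this way.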

tkaitchuck · 2021-03-26T21:43:00Z

pravega-client-examples/src/main/java/io/pravega/example/streamprocessing/EventGenerator.java

+
+    public void run() throws Exception {
+        final ClientConfig clientConfig = ClientConfig.builder().controllerURI(getConfig().getControllerURI()).build();
+        try (StreamManager streamManager = StreamManager.create(getConfig().getControllerURI())) {


"Same here" referred to making this a private method.

tkaitchuck · 2021-03-31T23:35:06Z

pravega-client-examples/src/main/java/io/pravega/example/streamprocessing/EventGenerator.java

+        createStreams();
+        final Random rand = new Random(42);
+        try (final EventStreamClientFactory clientFactory = EventStreamClientFactory.withScope(getConfig().getScope(), clientConfig)) {
+            try (final EventStreamWriter<SampleEvent> writer = clientFactory.createEventWriter(


These don't need to nest. These two items can be in the same try block.
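
For illustration, a single try-with-resources statement can declare both resources. The `Tracked` class below is a hypothetical stand-in for the client factory and the writer; it only records the order in which resources are opened and closed.

```java
import java.util.ArrayList;
import java.util.List;

/** Demonstrates two resources declared in one try-with-resources statement. */
class SingleTryBlock {
    /** Stand-in resource that logs when it is opened and closed. */
    static final class Tracked implements AutoCloseable {
        private final String name;
        private final List<String> log;

        Tracked(String name, List<String> log) {
            this.name = name;
            this.log = log;
            log.add("open " + name);
        }

        @Override
        public void close() {
            log.add("close " + name);
        }
    }

    static List<String> run() {
        List<String> log = new ArrayList<>();
        // Resources are closed in reverse order of declaration: writer first, then clientFactory.
        try (Tracked clientFactory = new Tracked("clientFactory", log);
             Tracked writer = new Tracked("writer", log)) {
            log.add("write");
        }
        return log;
    }
}
```

The closing order matches what the nested form would do, so flattening the two `try` statements does not change behavior.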

tkaitchuck · 2021-03-31T23:38:59Z

pravega-client-examples/src/main/java/io/pravega/example/streamprocessing/EventGenerator.java

+                long sum = 0;
+                for (; ; ) {
+                    sequenceNumber++;
+                    final String routingKey = String.format("%3d", rand.nextInt(1000));


Random locals like this don't need to be declared final.

tkaitchuck · 2021-03-31T23:42:54Z

...-client-examples/src/test/java/io/pravega/example/streamprocessing/StreamProcessingTest.java

+public class StreamProcessingTest {
+    static final Logger log = LoggerFactory.getLogger(StreamProcessingTest.class);
+
+    protected static final AtomicReference<SetupUtils> SETUP_UTILS = new AtomicReference<>();


This is bad form. Please use a resource to avoid this pattern.

RaulGracia · 2021-04-16T16:21:17Z

@claudiofahey can you address @tkaitchuck's comments so we can close this one? I think we are close.

shrids · 2021-04-19T15:39:10Z

...-client-examples/src/main/java/io/pravega/example/streamprocessing/AtLeastOnceProcessor.java

+            Position lastFlushedPosition = null;
+            try {
+                while (isRunning()) {
+                    final EventRead<T> eventRead = reader.readNextEvent(readTimeoutMillis);


readNextEvent can throw a TruncatedDataException when StreamRetention kicks in. We would need to retry reading the event in such a scenario, right?
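
A retry of the kind asked about here could look roughly like the sketch below. The `TruncatedDataException` and `Reader` types are local stand-ins so that the example compiles on its own, and `readSkippingTruncated` and `maxRetries` are hypothetical names. The sketch assumes that calling `readNextEvent` again after the exception resumes at the next available event, which should be confirmed against the client Javadoc.

```java
/** Hypothetical sketch of retrying a read after truncation. */
class TruncationRetrySketch {
    /** Local stand-in for Pravega's TruncatedDataException, so the sketch compiles alone. */
    static final class TruncatedDataException extends RuntimeException { }

    /** Local stand-in for the single reader method used here. */
    interface Reader<T> {
        T readNextEvent(long timeoutMillis);
    }

    /** Reads the next event, retrying when data at the current position was truncated. */
    static <T> T readSkippingTruncated(Reader<T> reader, long timeoutMillis, int maxRetries) {
        for (int attempt = 0; ; attempt++) {
            try {
                return reader.readNextEvent(timeoutMillis);
            } catch (TruncatedDataException e) {
                if (attempt >= maxRetries) {
                    throw e;
                }
                // The truncated events are gone; the next attempt reads whatever is still available.
            }
        }
    }
}
```

Events removed by retention before they were read cannot be recovered by retrying, so a caller may also want to log or count each truncation.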

RaulGracia · 2021-07-27T07:48:46Z

@claudiofahey any updates on addressing @tkaitchuck's feedback? This one actually looks close to being merged, so I am wondering whether it is worth a last push.

claudiofahey requested review from RaulGracia and EronWright March 14, 2019 06:13

RaulGracia requested a review from elizabethbain March 14, 2019 18:07

tkaitchuck requested changes Mar 15, 2019

claudiofahey requested a review from tkaitchuck March 17, 2019 04:25

vijikarthi previously requested changes Apr 22, 2019

tkaitchuck reviewed Apr 26, 2019

EronWright removed their request for review May 2, 2019 01:02

claudiofahey requested review from vijikarthi and tkaitchuck May 2, 2019 01:04

claudiofahey force-pushed the issue-191-streamprocessing branch from a6f9340 to f4e5b94 Compare August 13, 2019 22:24

claudiofahey changed the base branch from develop to dev August 13, 2019 22:26

tkaitchuck reviewed Oct 16, 2019

claudiofahey force-pushed the issue-191-streamprocessing branch from f4e5b94 to 0e57327 Compare June 1, 2020 00:11

tkaitchuck reviewed Jun 20, 2020

claudiofahey requested review from tkaitchuck and fpj and removed request for elizabethbain June 21, 2020 05:56

claudiofahey removed the request for review from vijikarthi August 4, 2020 04:24

fpj reviewed Aug 16, 2020

Claudio Fahey added 4 commits August 21, 2020 04:54

Issue 191: Add non-Flink stream processing example.

85ea475

Signed-off-by: Claudio Fahey <[email protected]>

Issue 191: Moved inner classes to top level. Removed obsolete classes.

666c86e

Signed-off-by: Claudio Fahey <[email protected]>

Issue 191: Processor is now stateless to avoid complications with managing state. - Added additional startup scripts. - Updated default parameters. - Updated documentation.

13c3931

Signed-off-by: Claudio Fahey <[email protected]>

Claudio Fahey added 3 commits March 19, 2021 17:30

Fix typo in streamprocessing/README

913ef6e

Signed-off-by: Claudio Fahey <[email protected]>

Fix formatting

ec39792

Signed-off-by: Claudio Fahey <[email protected]>

Removed all commented and unused code from SetupUtils.java

097fb8b

Signed-off-by: Claudio Fahey <[email protected]>

Claudio Fahey added 3 commits March 25, 2021 19:10

Decrease heartbeat interval for kill5of6Test to speed up dead reader detection.

a06dbc6

Signed-off-by: Claudio Fahey <[email protected]>

Adding logging

5c09d55

Signed-off-by: Claudio Fahey <[email protected]>

Add retry to killAndRestart1of1ForcingDuplicatesTest

bd6029c

Signed-off-by: Claudio Fahey <[email protected]>

claudiofahey requested a review from RaulGracia March 25, 2021 19:19

Fix retry in killAndRestart1of1ForcingDuplicatesTest

9215290

Signed-off-by: Claudio Fahey <[email protected]>

Claudio Fahey and others added 2 commits March 26, 2021 01:05

Merge branch 'dev' into issue-191-streamprocessing

33902d8

Fixed Scala version of Spark dependency in Hadoop samples.

843d4c2

Signed-off-by: Raúl Gracia <[email protected]>

RaulGracia approved these changes Mar 26, 2021

tkaitchuck requested changes Mar 26, 2021

Claudio Fahey added 6 commits March 27, 2021 01:25

Make methods private

8c880cd

Signed-off-by: Claudio Fahey <[email protected]>

Create private method createStream

4dc1aae

Signed-off-by: Claudio Fahey <[email protected]>

Make SampleEvent immutable. Replace @cleanup with try-with-resources.

1ad8163

Signed-off-by: Claudio Fahey <[email protected]>

Update README

afa5139

Signed-off-by: Claudio Fahey <[email protected]>

Add createStreams method to AtLeastOnceApp

40675a4

Signed-off-by: Claudio Fahey <[email protected]>

Add createStreams private method to EventGenerator

f67ea43

Signed-off-by: Claudio Fahey <[email protected]>

claudiofahey requested a review from tkaitchuck March 27, 2021 01:56

tkaitchuck reviewed Mar 31, 2021

shrids reviewed Apr 19, 2021

Merge branch 'dev' into issue-191-streamprocessing

55a8607
