I have been using this npm module with one consumer instance for a while now and have had no problems with events from AWS Kinesis being lost. However, since my application will be hosted in Kubernetes with horizontal autoscalers, I did some testing with varying numbers of consumers to see how it performed. I noticed that Kinesis events were lost when I went from 1 consumer to 2. Both consumers used the same consumer group, but were separate instances of my consumer application. Let me explain.
I started one consumer, which I'll call consumer1, and let it initialize correctly. The data stream has 4 provisioned shards, and the DynamoDB table associated with this consumer was also initialized correctly. I then used a producer program to ingest documents into the Kinesis data stream at a rate of 1 doc/sec. The documents were labeled with IDs starting from 0 all the way to 300 so that I could keep track of them. I let consumer1 consume about 50 documents and then started a second consumer (consumer2). About 2 minutes after consumer2 initialized, it started receiving documents as well. I let the producer run until the 300 documents had been successfully ingested into the Kinesis data stream, and I let the two consumers run for about 5 minutes after the 300th document was ingested to make sure they were done receiving documents.
After this trial run ended, I printed out which document IDs the two consumers received and which shard each document came in through. I did the same for the producer to make sure that each ID was successfully inserted into the Kinesis data stream and to see which shard the producer placed each document into.
When I combine the two consumer lists, many document IDs are missing. All the missing document IDs come after ID 50, meaning that documents start going missing as soon as I introduce the second consumer. When I look at the DynamoDB table, two of the shards have consumer1 as the leaseOwner and the other two shards have consumer2 as the leaseOwner. So the leases were distributed correctly; the documents simply were not received. I see no processRecord errors in my consumer applications indicating that they failed to receive some of these documents. I tried changing the leaseAcquisitionInterval, but that didn't help: I was still losing documents.
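For anyone who wants to reproduce the comparison, here is a minimal sketch of how the producer and consumer logs could be diffed. It assumes each log is an array of `{ id, shardId }` entries; the function name `findMissingByShard` and the sample data are hypothetical and are not part of the npm module or of my actual test harness.

```javascript
// Hypothetical helper for illustration only; not part of the npm module.
// produced:      array of { id, shardId } entries logged by the producer.
// consumedLists: one array of { id, shardId } entries per consumer instance.
// Returns a Map from shardId to the sorted IDs that no consumer received.
function findMissingByShard(produced, consumedLists) {
  // Collect every ID seen by any consumer.
  const seen = new Set();
  for (const list of consumedLists) {
    for (const { id } of list) {
      seen.add(id);
    }
  }

  // Group the IDs that were produced but never consumed by the shard
  // the producer reported for them.
  const missing = new Map();
  for (const { id, shardId } of produced) {
    if (seen.has(id)) continue;
    if (!missing.has(shardId)) missing.set(shardId, []);
    missing.get(shardId).push(id);
  }

  for (const ids of missing.values()) {
    ids.sort((a, b) => a - b);
  }
  return missing;
}

// Made-up sample data, not taken from the real test run.
const sampleProduced = [
  { id: 49, shardId: "shard-A" },
  { id: 50, shardId: "shard-B" },
  { id: 51, shardId: "shard-A" },
  { id: 52, shardId: "shard-B" },
];
const sampleConsumer1 = [
  { id: 49, shardId: "shard-A" },
  { id: 50, shardId: "shard-B" },
];
const sampleConsumer2 = [{ id: 52, shardId: "shard-B" }];

for (const [shardId, ids] of findMissingByShard(sampleProduced, [
  sampleConsumer1,
  sampleConsumer2,
])) {
  console.log(`${shardId}: missing IDs ${ids.join(", ")}`);
}
```

Grouping the missing IDs by shard makes it easier to check whether the losses are confined to the shards whose lease moved from consumer1 to consumer2.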
Any help on this would be greatly appreciated, as this is one of the only npm modules we can use for consuming Kinesis documents in pure Node.js.
nsnider7 changed the title from "Documents Missing When Consumers Scales from 1 to 2" to "Documents Missing When Consumers Scale from 1 to 2" on Aug 26, 2022
nsnider7 changed the title from "Documents Missing When Consumers Scale from 1 to 2" to "Events Lost When Consumers Scale from 1 to 2" on Aug 26, 2022