Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[improve][broker] LS-1263 Do-not-merge #252

Closed
wants to merge 74 commits into from
Closed

[improve][broker] LS-1263 Do-not-merge #252

wants to merge 74 commits into from

Conversation

mukesh-ctds
Copy link
Collaborator

Motivation

Explain here the context, and why you're making that change. What is the problem you're trying to solve.
This PR sync all commits from apache/branch-3.0 into 3.1_ds which are not present.

Modifications

Describe the modifications you've done.

  • Cherry-picked commits from branch-3.0 which are not present on 3.1_ds

Verifying this change

  • Make sure that the change passes the CI checks.

(Please pick either of the following options)

This change is a trivial rework / code cleanup without any test coverage.

Does this pull request potentially affect one of the following parts:

If yes was chosen, please highlight the changes

  • Dependencies (does it add or upgrade a dependency): (yes)
  • The public API: (no)
  • The schema: (no)
  • The default values of configurations: (no)
  • The wire protocol: (no)
  • The rest endpoints: (no)
  • The admin cli options: (no)
  • Anything that affects deployment: (no)

Documentation

Check the box below or label this PR directly (if you have committer privilege).

Need to update docs?

  • doc-required

    (If you need help on updating docs, create a doc issue)

  • no-need-doc

    (Please explain why)

  • doc

    (If this PR contains doc changes)

Technoboy- and others added 30 commits April 17, 2024 15:50
… many requests (apache#21216)

Motivation: The Pulsar client will close the socket if it receives a `ServiceNotReady` error when doing a lookup. The Broker will respond to the client with a `TooManyRequests` error if there are too many lookup requests in progress, but the Pulsar Proxy responds to the client with a `ServiceNotReady` error in the same scenario.

Modifications: Make Pulsar Proxy respond to the client with a `TooManyRequests` error if there are too many lookup requests in progress.
(cherry picked from commit d6c3fa4)
(cherry picked from commit 9a7c4bb)
…er id when switch ledger (apache#21201)

### Modifications
- Print a warning log if the SSL handshake error
- Print ledger ID when switching ledger

(cherry picked from commit 8485d68)
(cherry picked from commit f243925)
…ing queue is not empty (apache#21259)

Reproduce steps:
- Create a reader.
- Reader pulls messages into `incoming queue`, do not call `reader.readNext` now.
- Trim ledger task will delete the ledgers, then there is no in the topic.
- Now, you can get messages if you call `reader.readNext`, but the method `reader.hasMessageAvailable` return `false`

Note: the similar issue of `MultiTopicsConsumerImpl` has been fixed by apache#13332, current PR only trying to fix the issue of `ConsumerImpl`.

Make `reader.hasMessageAvailable` return `true` when `incoming queue` is not empty.

(cherry picked from commit 6d82b09)
(cherry picked from commit 38c3f0c)
### Motivation

After trimming ledgers, the variable `lastConfirmedEntry` of the managed ledger might rely on a deleted ledger(the latest ledger which contains data).

There is a bug that makes pulsar allow users to set the start read position to an unexisting ledger or a deleted ledger when creating a subscription. This makes the `backlog` and `markDeletedPosition` wrong.

### Modifications

Fix the bug.

(cherry picked from commit 4ee5cd7)
(cherry picked from commit dd28bb4)
…gers (apache#21250)

### Background
- But after trimming ledgers, `ml.lastConfirmedPosition` relies on a deleted ledger when the current ledger of ML is empty.
- Cursor prevents setting `markDeletedPosition` to a value larger than `ml.lastConfirmedPosition`, but there are no entries to read<sup>[1]</sup>.
- The code description in the method `advanceCursors` said: do not make `cursor.markDeletedPosition` larger than `ml.lastConfirmedPosition`<sup>[2]</sup>

### Issue
If there is no durable cursor, the `markDeletedPosition` might be set to `{current_ledger, -1}`, and `async mark delete` will be prevented by the `rule-2` above. So he `backlog`, `readPosition`, and `markDeletedPosition` of the cursor will be in an incorrect position after trimming the ledger. You can reproduce it by the test `testTrimLedgerIfNoDurableCursor`

### Modifications
Do not make `cursor.markDeletedPosition` larger than `ml.lastConfirmedPosition` when advancing non-durable cursors.

(cherry picked from commit ca77982)
(cherry picked from commit 6895919)
…ne faster (apache#21183)

There is an issue similar to the apache#21155 fixed one.

The client assumed the connection was inactive, but the Broker assumed the connection was fine. The Client tried to  use a new connection to reconnect an exclusive consumer, then got an error `Exclusive consumer is already connected`

- Check the connection of the old consumer is available when the new one tries to subscribe

(cherry picked from commit 29db8f8)
(cherry picked from commit b796f56)
…ptions caused by wrong topicName (apache#21997)

Similar to: apache#20131

The master branch has fixed the issue by apache#19841 Since it will makes users can not receive the messages which created in mistake, we did not cherry-pick apache#19841 into other branches, see detail apache#19841)

It works like this:
1. createSubscription( `tp1` )
2. is partitioned topic?
  `no`: return subscriptions
  `yes`: createSubscription(`tp1-partition-0`)....createSubscription(`tp1-partition-n`)

---

```java
String partitionedTopic = "tp1-partition-0-DLQ";

TopicName partition0 = partitionedTopic.getPartition(0);// Highlight: the partition0.toString() will be "tp1-partition-0-DLQ"(it is wrong).The correct value is "tp1-partition-0-DLQ-partition-0"
```

Therefore, if there has a partitioned topic named `tp1-partition-0-DLQ`, the method `PersistentTopics.createSubscription` will works like this:
1. call Admin API ``PersistentTopics.createSubscription("tp1-partition-0-DLQ")`
2. is partitioned topic?
3. yes, call `TopicName.getPartition(0)` to get partition 0 and will get `tp1-partition-0-DLQ` , then loop to step-1.

Then the infinite HTTP call `PersistentTopics.createSubscription` makes the broker crash.

If hits the issue which makes the topic name wrong, do not loop to step 1.

The PR apache#19841 fixes the issue which makes the topic name wrong, and this PR will create unfriendly compatibility, and PIP 263 apache#20033 will make compatibility good.

(cherry picked from commit 4386401)
… reads (apache#22295)

(cherry picked from commit 2803ba2)

(cherry picked from commit fde7c49)
…o same topics (apache#22255)

(cherry picked from commit c616b35)
(cherry picked from commit 5ab0e93)
…cher.consumerList and dispatcher.consumerSet (apache#22270)

(cherry picked from commit cba1600)
(cherry picked from commit 94edfe4)
…tcher.consumerSet (apache#22283)

(cherry picked from commit a52945b)
(cherry picked from commit bec3be2)
…ar-function-go (apache#21444)

(cherry picked from commit 2d57624)
(cherry picked from commit 3ed5ea2)
(cherry picked from commit 073bb37)
(cherry picked from commit 53306b6)
…orrectly (apache#22311)

(cherry picked from commit 5cabcac)
(cherry picked from commit d4c0543)
…ccessing multiple clusters in the same process (apache#22306)

Co-authored-by: atomchchen <[email protected]>
(cherry picked from commit 71598c1)
(cherry picked from commit bff6ea2)
Technoboy- and others added 28 commits April 17, 2024 20:15
(cherry picked from commit b777136)
(cherry picked from commit d3f0a4b)
…EntriesInStorage

- improve the test case based on suggestion from @poorbarcode

(cherry picked from commit 8f17446)
…red mode (apache#22245)

(cherry picked from commit e34ea62)
(cherry picked from commit 5b37e84)
…n keys (apache#22421)

(cherry picked from commit a1970ae)
(cherry picked from commit 4a79a26)
(cherry picked from commit f423558)
(cherry picked from commit 5b14144)
…pache#22284)

### Motivation

We should not rollover inactive ledgers when metadata service is invailable.

### Modifications

Checking metadata service is vailable when schedule `checkInactiveLedgerAndRollOver`

(cherry picked from commit b9bf0a8)
…e and fix imprecise client-side metrics: pendingMessagesUpDownCounter, pendingBytesUpDownCounter, latencyHistogram (apache#22393)

(cherry picked from commit 2469b97)
(cherry picked from commit 14b6279)
(cherry picked from commit 1c46877)
(cherry picked from commit 4a5400f)
(cherry picked from commit 5ba3e57)
…he#21274)

Co-authored-by: Baodi Shi <[email protected]>
(cherry picked from commit 5d18ff7)
(cherry picked from commit 000ee66)
…c loading wouldn't timeout (apache#22479)

(cherry picked from commit 837f8bc)
(cherry picked from commit 0fbcbb2)
(cherry picked from commit 7a4e16a)

(cherry picked from commit 93664d7)
…ks (apache#22463)

(cherry picked from commit f3d14a6)
(cherry picked from commit 976399c)
…paths or disabling it (apache#22370)

(cherry picked from commit 15ed659)
(cherry picked from commit cdc5af1)
…d fix race conditions in metricsBufferResponse mode (apache#22494)

(cherry picked from commit 7009071)
(cherry picked from commit 5f9d7c5)
…xception and testIncorrectClientClock (apache#22489)

(cherry picked from commit d9a43dd)
(cherry picked from commit c590198)
Co-authored-by: hoguni <[email protected]>
(cherry picked from commit 20915d1)

# Conflicts:
#	pom.xml
(cherry picked from commit ef9b28f)
(cherry picked from commit bbff29d)
(cherry picked from commit a0120d0)
…ntriesInStorage

- change apache#22034 is missing from branch-3.0

(cherry picked from commit e3531e8)
…EntriesInStorage

- improve the test case based on suggestion from @poorbarcode

(cherry picked from commit 8f17446)
@mukesh-ctds mukesh-ctds deleted the LS-1263 branch April 23, 2024 10:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.