Fail reconfigure cmd on invalid auto subscription #298

dorjesinpo · 2024-05-21T22:28:15Z

Fix: if the auto subscription expression is invalid, the reconfigure admin command should fail.

Signed-off-by: dorjesinpo <[email protected]>

678098 · 2024-05-22T11:42:56Z

src/groups/mqb/mqba/mqba_domainmanager.cpp

@@ -686,19 +686,25 @@ int DomainManager::processCommand(mqbcmd::DomainsResult*        result,
            }
        }

-        DecodeAndUpsertValue unused;
+        DecodeAndUpsertValue configureResult;
        d_configProvider_p->clearCache(domainSp->name());
        d_configProvider_p->getDomainConfig(


No need to do anything, just my observation here

Note that docs for getDomainConfig are incorrect. They state that the method is asynchronous:

/// Asynchronously get the configuration for the specified `domainName`
/// and invoke the specified `callback` with the result.

In fact, this method is synchronous and thread-safe: it is protected by a mutex.
This means it is safe to access configureResult right after the getDomainConfig call, because the callback is guaranteed to have already run by then.

I will update the docs later.

Note that Domain::configure calls mqbi::Queue::configure with false as the wait argument.
Even if it were true, we call it from the cluster thread and do not synchronize. There are 3 threads involved here.

The validation is synchronous, yes. The reconfiguration of queues is asynchronous.
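The point about the callback having already run can be illustrated with a minimal sketch. `SyncConfigProvider` and `fetchConfigRc` are hypothetical names invented for this illustration and do not reflect the real provider interface; the sketch only assumes that the callback is invoked on the calling thread, under a mutex, before the call returns.

```cpp
#include <functional>
#include <mutex>
#include <string>

// Hypothetical stand-in for the config provider discussed above; the real
// class has a different interface.
class SyncConfigProvider {
    std::mutex d_mutex;

  public:
    typedef std::function<void(int rc, const std::string& config)> Callback;

    // Invoke 'callback' on the calling thread, while holding the mutex,
    // before returning.
    void getDomainConfig(const std::string& domainName,
                         const Callback&    callback)
    {
        std::lock_guard<std::mutex> guard(d_mutex);
        const bool found = !domainName.empty();
        callback(found ? 0 : -1,
                 found ? "config for " + domainName : std::string());
    }
};

// Return the rc captured by the callback. Reading 'rc' right after the call
// is safe only because the callback has already run by then.
int fetchConfigRc(SyncConfigProvider& provider, const std::string& domainName)
{
    int rc = -999;  // sentinel: would survive if the callback ran later
    provider.getDomainConfig(domainName,
                             [&rc](int result, const std::string&) {
                                 rc = result;
                             });
    return rc;
}
```

If the provider were truly asynchronous, the sentinel value could still be visible when `fetchConfigRc` returns, which is why the documentation mismatch matters.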

678098 · 2024-05-22T11:44:16Z

src/groups/mqb/mqba/mqba_domainmanager.cpp

-        result->makeSuccess();
+
+        if (configureResult.isError()) {
+            result->makeError().message() = configureResult.error().d_details;


Should we return -1; in the error branch?

678098 · 2024-05-22T11:53:30Z

src/groups/mqb/mqbblp/mqbblp_domain.cpp

+    // Validate newConfig.subscriptions()
+
+    bsl::size_t size   = newConfig.subscriptions().size();
+    bool        result = true;


I would rather rename the flag to make its meaning more obvious:

Suggested change

- bool result = true;
+ bool success = true;

The problem with result is that it can mean anything: either a positive or a negative outcome.

Well, let's be more explicit then, like allSubscriptionsAreValid

678098 · 2024-05-22T11:56:21Z

src/groups/mqb/mqbblp/mqbblp_domain.cpp

+        else {
+            if (expression.text().length()) {
+                errorDescription << "Invalid Expression: [ expression: "
+                                 << expression;


Suggested change

- << expression;
+ << expression << " ]";

678098 · 2024-05-22T12:01:20Z

src/groups/mqb/mqbblp/mqbblp_domain.cpp

+        }
+        else {
+            if (expression.text().length()) {
+                errorDescription << "Invalid Expression: [ expression: "


I would rather emit a more explicit message, so our users can clearly understand why it fails here.

Something like:
This expression version expects only empty expressions: [ expression: <...>, version: <...> ]

On second thought, let's reject unsupported versions.

678098 · 2024-05-22T12:02:23Z

src/groups/mqb/mqbblp/mqbblp_domain.cpp

@@ -429,7 +464,10 @@ int Domain::configure(bsl::ostream&           errorDescription,
    }

    // Validate config. Return early if the configuration is not valid.
-    if (int rc = validateConfig(errorDescription, d_config, finalConfig)) {
+    if (int rc = validateConfig(errorDescription,


Suggested change

- if (int rc = validateConfig(errorDescription,
+ if (const int rc = validateConfig(errorDescription,

678098 · 2024-05-22T12:07:33Z

src/integration-tests/test_auto_subscriptions.py

+    def test_configure_invalid(self, cluster: Cluster):
+        """
+        Configure the priority queue to evaluate auto subscription negatively.
+        Make sure the queue does not get the message.


In both of these tests, no messages are posted.
Is it worth extending these tests with a producer?

Not much value, I think.
The goal is to check whether the queue opens or not.

Then we should probably remove mentions of messages from these tests' docs

678098 · 2024-05-22T12:08:15Z

src/integration-tests/test_auto_subscriptions.py

+        ] = "invalid expression"
+
+        cluster.reconfigure_domain(tc.DOMAIN_PRIORITY_SC, succeed=None)
+        # TODO: why not succeed=False


Is there a problem with reconfigure_domain not checking a negative result?

Yes, we want to make sure the command fails (so that whoever issued the command can roll back).
The helper, however, insists on "processed successfully":

    def command(self, command, succeed=None, timeout=BLOCK_TIMEOUT):
        """
        Send the specified 'cmd' command to the broker, prefixing it with
        'CMD '.
        """
        cmd = "CMD " + command
        self.send(cmd)
        return (
            None
            if succeed is None
            else self.capture(f"'{cmd}' processed successfully", timeout=timeout)
        )

hallfox · 2024-05-22T14:32:08Z

src/groups/mqb/mqbblp/mqbblp_domain.cpp

+            newConfig.subscriptions()[i].expression();
+
+        if (mqbconfm::ExpressionVersion::E_VERSION_1 == expression.version()) {
+            if (expression.text().length()) {


Suggested change

- if (expression.text().length()) {
+ if (!expression.text().empty()) {

I would prefer this for clarity. Also, for collection types in general, checking whether a collection is empty is usually at least as cheap as getting its length, and often much cheaper.
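As a small illustration of the point above, here is a sketch of a generic emptiness check. `hasContent` is a hypothetical helper, not part of the PR. For `std::string` both `empty()` and `length()` are constant time, so the gain there is readability; the efficiency argument applies to containers such as `std::list`, whose `size()` was allowed to be linear before C++11.

```cpp
#include <list>
#include <string>

// Hypothetical helper: true when 'container' holds at least one element.
// 'empty()' states the intent directly and is constant time for every
// standard container.
template <class CONTAINER>
bool hasContent(const CONTAINER& container)
{
    return !container.empty();
}
```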

hallfox · 2024-05-22T14:33:17Z

src/groups/mqb/mqbblp/mqbblp_domain.cpp

+                        << expression << ", rc: " << context.lastError()
+                        << ", reason: \"" << context.lastErrorMessage()
+                        << "\" ]";
+                    result = false;


Can we just return rc_INVALID_SUBSCRIPTION here?

No, I would prefer to validate and report everything

hallfox · 2024-05-22T14:45:25Z

src/groups/mqb/mqbblp/mqbblp_domain.cpp

+    for (bsl::size_t i = 0; i < size; ++i) {
+        const mqbconfm::Expression& expression =
+            newConfig.subscriptions()[i].expression();
+
+        if (mqbconfm::ExpressionVersion::E_VERSION_1 == expression.version()) {
+            if (expression.text().length()) {
+                bmqeval::CompilationContext context(allocator);
+
+                if (!bmqeval::SimpleEvaluator::validate(expression.text(),
+                                                        context)) {
+                    errorDescription
+                        << "Expression validation failed: [ expression: "
+                        << expression << ", rc: " << context.lastError()
+                        << ", reason: \"" << context.lastErrorMessage()
+                        << "\" ]";
+                    allSubscriptionsAreValid = false;
+                }
+            }
+        }
+        else {
+            errorDescription
+                << "Unsupported version: [ expression: " << expression << " ]";
+            allSubscriptionsAreValid = false;
+        }
+    }
+
+    return allSubscriptionsAreValid ? 0 : rc_INVALID_SUBSCRIPTION;


This whole block is pretty complex, can we break it down into something more like this?

for (Subscriptions::const_iterator it  = newConfig.subscriptions().cbegin(),
                                   end = newConfig.subscriptions().cend();
     it != end;
     ++it) {
    const mqbconfm::Expression& expression = it->expression();
    if (!validateExpression(errorDescription, expression, allocator)) {
        return rc_INVALID_SUBSCRIPTION;
    }
}
return 0;

Alternatively, if we're looking to log every invalid subscription expression (which I see the value of)

bool allSubscriptionsAreValid = true;
for (...) {
    const mqbconfm::Expression& expression = it->expression();
    allSubscriptionsAreValid = allSubscriptionsAreValid &&
                               validateExpression(errorDescription, expression, allocator);
}
return allSubscriptionsAreValid;

Additionally, I may even suggest figuring out which expressions are invalid first, before populating errorDescription, so that instead of getting a wall of

"Expression validation failed: [ expression: " << expression
    << ", rc: " << context.lastError()
    << ", reason: \"" << context.lastErrorMessage() << "\" ]";

for each expression, we get one error message with a list of invalid expressions.

@hallfox I believe that with this code validateExpression is no longer evaluated once the flag has been set to false, because the && operator short-circuits:
allSubscriptionsAreValid && validateExpression(errorDescription, expression, allocator);

Ah duh, good catch. It should be broken up into two lines

bool isValidExpression = validateExpression(errorDescription, expression, allocator);
allSubscriptionsAreValid = allSubscriptionsAreValid && isValidExpression;

Another reason to filter out invalid expressions first before building the error message.

I also like the iterators approach since we don't actually need the exact index of a subscription
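The short-circuit pitfall discussed in this thread can be reproduced with a self-contained sketch. The validator below is a hypothetical stand-in (it merely rejects expressions containing a space) and all function names are invented for illustration; only the `&&` behaviour is the point.

```cpp
#include <algorithm>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for the validator discussed above: an expression
// counts as valid here when it contains no space. Each failure appends one
// line to 'errorDescription'.
bool validateExpression(std::ostream&      errorDescription,
                        const std::string& expression)
{
    if (expression.find(' ') == std::string::npos) {
        return true;
    }
    errorDescription << "Invalid expression: [ " << expression << " ]\n";
    return false;
}

// Once 'allValid' is false, '&&' skips the call, so later invalid
// expressions are never reported.
bool validateAllShortCircuit(std::ostream&                   errorDescription,
                             const std::vector<std::string>& expressions)
{
    bool allValid = true;
    for (std::size_t i = 0; i < expressions.size(); ++i) {
        allValid = allValid &&
                   validateExpression(errorDescription, expressions[i]);
    }
    return allValid;
}

// Calling the validator first and combining afterwards reports every
// invalid expression.
bool validateAllReportEverything(std::ostream&                   errorDescription,
                                 const std::vector<std::string>& expressions)
{
    bool allValid = true;
    for (std::size_t i = 0; i < expressions.size(); ++i) {
        const bool isValid =
            validateExpression(errorDescription, expressions[i]);
        allValid = allValid && isValid;
    }
    return allValid;
}

// Count the reported errors (one line per error).
std::size_t countReportedErrors(const std::string& errors)
{
    return static_cast<std::size_t>(
        std::count(errors.begin(), errors.end(), '\n'));
}
```

Both variants return the same overall verdict; they differ only in how many error lines end up in the stream.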

"for each expression, we get one error message with a list of invalid expressions."

We might have details pointing out why exactly each expression is invalid, for example that the expression is too complex or that it has a syntax error. If we print just the expressions, debugging might become more difficult.

If the number of expressions is huge, though, we might use a LimitedPrinter instead, or print at most N errors
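A rough sketch of the "print at most N errors" idea follows. `formatErrors` is a hypothetical helper written for this illustration; it is not the LimitedPrinter mentioned above and makes no claim about how that type behaves.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: print at most 'maxErrors' entries of 'errors', one
// per line, followed by a summary of how many were left out.
std::string formatErrors(const std::vector<std::string>& errors,
                         std::size_t                     maxErrors)
{
    std::ostringstream out;
    const std::size_t  shown = std::min(errors.size(), maxErrors);
    for (std::size_t i = 0; i < shown; ++i) {
        out << errors[i] << '\n';
    }
    if (errors.size() > shown) {
        out << "... and " << (errors.size() - shown) << " more error(s)\n";
    }
    return out.str();
}
```

Each entry keeps its own detail (rc, reason), so capping the count does not lose the per-expression diagnostics for the entries that are shown.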

This is what we will see
single 22May2024_13:59:23.696 (139804138272448) ERROR *gmq.tsk.bmqbrkr.bmqbrkr bmqbrkr.m.cpp:324 Error processing command [rc: -2] error = 'Expression validation failed: [ expression: [ version = E_VERSION_1 text = "invalid expression" ], rc: -100, reason: "syntax error, unexpected property at offset 8" ]' config = '[ name = "bmq.test.mmap.priority.sc" mode = [ priority = ] storage = [ domainLimits = [ messages = 2000 messagesWatermarkRatio = 0.8 bytes = 2097152 bytesWatermarkRatio = 0.8 ] queueLimits = [ messages = 1000 messagesWatermarkRatio = 0.8 bytes = 1048576 bytesWatermarkRatio = 0.8 ] config = [ fileBacked = ] ] maxConsumers = 0 maxProducers = 0 maxQueues = 0 msgGroupIdConfig = NULL maxIdleTime = 0 messageTtl = 300 maxDeliveryAttempts = 0 deduplicationTimeMs = 300000 consistency = [ strong = ] subscriptions = [ [ appId = "" expression = [ version = E_VERSION_1 text = "invalid expression" ] ] ] ]'

I think this is sufficient, and there is no need to limit the output since this happens once per admin command

(there is no such type as mqbconfm::Subscriptions so involving iterator is not going to look prettier)

678098

A few comments

Also, what do you think about merging the 2 integration tests into one test? My concern here is that launching a cluster for each test takes 10-15 seconds. If we can change the test scenario so that we launch the cluster once, we can save some time.

In this PR, there are 2 scenarios:
_1:

configure invalid auto subscription
expect open queue fails
reconfigure valid auto subscription
expect queue opens

_2:

configure valid auto subscription
expect queue opens
try reconfigure invalid auto subscription - expect admin command fail
expect queue opens

These 2 scenarios could be merged into one to save cluster start time:

configure invalid auto subscription
expect open queue fails
reconfigure valid auto subscription
expect queue opens
try reconfigure invalid auto subscription - expect admin command fail
expect queue opens

So basically the test would check that the domain configuration remains in a valid state once that state is reached.
If you don't want to increase the scope of this PR, but also agree that this change is good, I can do it myself in a separate PR once this is merged.

678098 · 2024-05-22T20:46:36Z

src/groups/mqb/mqbblp/mqbblp_domain.cpp

@@ -95,18 +95,48 @@ void afterAppIdUnregisteredDispatched(mqbi::Queue*       queue,
        mqbi::Storage::AppIdKeyPair(appId, mqbu::StorageKey()));
 }

+/// Validates an application subscription.
+bool validdateSubscriptionExpression(bsl::ostream& errorDescription,


Typo: two d's in the name.

Suggested change

- bool validdateSubscriptionExpression(bsl::ostream& errorDescription,
+ bool validateSubscriptionExpression(bsl::ostream& errorDescription,

* Fail reconfigure cmd on invalid auto subscription
  Signed-off-by: dorjesinpo <[email protected]>
* Addressing review
  Signed-off-by: dorjesinpo <[email protected]>
* addressing review
  Signed-off-by: dorjesinpo <[email protected]>
* merging two tests
  Signed-off-by: dorjesinpo <[email protected]>
---------
Signed-off-by: dorjesinpo <[email protected]>

dorjesinpo requested a review from a team as a code owner May 21, 2024 22:28

dorjesinpo added the bug Something isn't working label May 21, 2024

dorjesinpo force-pushed the fix/domain-reconfigure branch 3 times, most recently from ce81937 to 33761a0 Compare May 21, 2024 23:14

Fail reconfigure cmd on invalid auto subscription

6571134

Signed-off-by: dorjesinpo <[email protected]>

dorjesinpo force-pushed the fix/domain-reconfigure branch from 33761a0 to 6571134 Compare May 21, 2024 23:21

dorjesinpo requested a review from 678098 May 22, 2024 00:00

678098 reviewed May 22, 2024

Addressing review

2eb05f8

Signed-off-by: dorjesinpo <[email protected]>

dorjesinpo force-pushed the fix/domain-reconfigure branch from 48ff5d6 to 2eb05f8 Compare May 22, 2024 14:37

hallfox reviewed May 22, 2024

addressing review

2ad0ab4

Signed-off-by: dorjesinpo <[email protected]>

678098 requested changes May 22, 2024

678098 reviewed May 24, 2024

merging two tests

0c7479a

Signed-off-by: dorjesinpo <[email protected]>

dorjesinpo force-pushed the fix/domain-reconfigure branch from 557f10e to 0c7479a Compare May 28, 2024 14:47

dorjesinpo assigned 678098 May 28, 2024

678098 approved these changes May 28, 2024

dorjesinpo merged commit c5bfaac into main May 28, 2024
17 checks passed
