-
Notifications
You must be signed in to change notification settings - Fork 141
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fail reconfigure cmd on invalid auto subscription #298
Conversation
ce81937
to
33761a0
Compare
Signed-off-by: dorjesinpo <[email protected]>
33761a0
to
6571134
Compare
@@ -686,19 +686,25 @@ int DomainManager::processCommand(mqbcmd::DomainsResult* result, | |||
} | |||
} | |||
|
|||
DecodeAndUpsertValue unused; | |||
DecodeAndUpsertValue configureResult; | |||
d_configProvider_p->clearCache(domainSp->name()); | |||
d_configProvider_p->getDomainConfig( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No need to do anything, just my observation here
Note that docs for getDomainConfig
are incorrect. They state that the method is asynchronous:
/// Asynchronously get the configuration for the specified `domainName`
/// and invoke the specified `callback` with the result.
In fact, this method is synchronous and thread safe, protected by mutex.
It means that you are safe to access configureResult
just after the getDomainConfig
call because you are guaranteed that the method was already executed.
I will update the docs later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that Domain::configure
calls mqbi::Queue::configure
with false
as the wait
argument.
Even if it was true
, we call it from the cluster thread and do not synchronize
. There are 3 threads involved here.
The validation is synchronous, yes. The reconfiguration of queues is asynchronous.
result->makeSuccess(); | ||
|
||
if (configureResult.isError()) { | ||
result->makeError().message() = configureResult.error().d_details; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we return -1;
in the error branch?
// Validate newConfig.subscriptions() | ||
|
||
bsl::size_t size = newConfig.subscriptions().size(); | ||
bool result = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather rename the flag to make its meaning more obvious:
bool result = true; | |
bool success = true; |
The problem with result
is that it can mean everything: positive and negative result.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, let's be more implicit then, like allSubscriptionsAreValid
else { | ||
if (expression.text().length()) { | ||
errorDescription << "Invalid Expression: [ expression: " | ||
<< expression; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
<< expression; | |
<< expression << " ]"; |
} | ||
else { | ||
if (expression.text().length()) { | ||
errorDescription << "Invalid Expression: [ expression: " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would rather leave a more explicit comment, so our users can clearly understand why it fails here
Something like
This expression version expects only empty expressions: [ expression: <...>, version: <...> ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
On the second thought, let's reject unsupported versions.
@@ -429,7 +464,10 @@ int Domain::configure(bsl::ostream& errorDescription, | |||
} | |||
|
|||
// Validate config. Return early if the configuration is not valid. | |||
if (int rc = validateConfig(errorDescription, d_config, finalConfig)) { | |||
if (int rc = validateConfig(errorDescription, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (int rc = validateConfig(errorDescription, | |
if (const int rc = validateConfig(errorDescription, |
def test_configure_invalid(self, cluster: Cluster): | ||
""" | ||
Configure the priority queue to evaluate auto subscription negatively. | ||
Make sure the queue does not get the message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For these both tests, there are no messages posted.
Does it worth it to extend these tests with producer?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not much of a value, I think.
The goal is to check if queue is opened or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Then we should probably remove mentions of messages from these tests' docs
] = "invalid expression" | ||
|
||
cluster.reconfigure_domain(tc.DOMAIN_PRIORITY_SC, succeed=None) | ||
# TODO: why not succeed=False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a problem with reconfigure_domain
not checking a negative result?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, we want to make sure the command fails (so the whoever has issued the command can roll back)
The code insists on "processed successfully"
def command(self, command, succeed=None, timeout=BLOCK_TIMEOUT):
"""
Send the specified 'cmd' command to the broker, prefixing it with 'CMD '.
"""
cmd = "CMD " + command
self.send(cmd)
return (
None
if succeed is None
else self.capture(f"'{cmd}' processed successfully", timeout=timeout)
)
```
Signed-off-by: dorjesinpo <[email protected]>
48ff5d6
to
2eb05f8
Compare
newConfig.subscriptions()[i].expression(); | ||
|
||
if (mqbconfm::ExpressionVersion::E_VERSION_1 == expression.version()) { | ||
if (expression.text().length()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if (expression.text().length()) { | |
if (!expression.text().empty()) { |
I would prefer this for clarity and also in general for collection types it's usually much more efficient to check if a collection is empty than get its length.
<< expression << ", rc: " << context.lastError() | ||
<< ", reason: \"" << context.lastErrorMessage() | ||
<< "\" ]"; | ||
result = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we just return rc_INVALID_SUBSCRIPTION
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, I would prefer to validate and report everything
for (bsl::size_t i = 0; i < size; ++i) { | ||
const mqbconfm::Expression& expression = | ||
newConfig.subscriptions()[i].expression(); | ||
|
||
if (mqbconfm::ExpressionVersion::E_VERSION_1 == expression.version()) { | ||
if (expression.text().length()) { | ||
bmqeval::CompilationContext context(allocator); | ||
|
||
if (!bmqeval::SimpleEvaluator::validate(expression.text(), | ||
context)) { | ||
errorDescription | ||
<< "Expression validation failed: [ expression: " | ||
<< expression << ", rc: " << context.lastError() | ||
<< ", reason: \"" << context.lastErrorMessage() | ||
<< "\" ]"; | ||
allSubscriptionsAreValid = false; | ||
} | ||
} | ||
} | ||
else { | ||
errorDescription | ||
<< "Unsupported version: [ expression: " << expression << " ]"; | ||
allSubscriptionsAreValid = false; | ||
} | ||
} | ||
|
||
return allSubscriptionsAreValid ? 0 : rc_INVALID_SUBSCRIPTION; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This whole block is pretty complex, can we break it down into something more like this?
for (Subscriptions::const_iterator it = newConfig.subscriptions().cbegin(),
end = newConfig.subscriptions().cend();
it != end;
++it)
{
const mqbconfm::Expression& expression = it->expression();
if (!validateExpression(errorDescription, expression, allocator)) {
return rc_INVALID_SUBSCRIPTION;
}
}
return 0;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternatively, if we're looking to log every invalid subscription expression (which I see the value of)
bool allSubscriptionsAreValid = true;
for (...) {
const mqbconfm::Expression& expression = it->expression();
allSubscriptionsAreValid = allSubscriptionsAreValid && validateExpression(errorDescription, expression, allocator);
}
return allSubscriptionsAreValid;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Additionally I may even suggest trying to figure out which expresssions are invalid first before populating errorDescription
so instead of getting a wall of
"Expression validation failed: [ expression: "
<< expression << ", rc: " << context.lastError()
<< ", reason: \"" << context.lastErrorMessage()
<< "\" ]";
for each expression, we get one error message with a list of invalid expressions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@hallfox I believe with this code we do not evaluate validateExpression
if the bool flag was set to false before due to short-circuit evaluation with &&
operator
allSubscriptionsAreValid && validateExpression(errorDescription, expression, allocator);
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah duh, good catch. It should be broken up into two lines
bool isValidExpression = validateExpression(errorDescription, expression, allocator);
allSubscriptionsAreValid = allSubscriptionsAreValid && isValidExpression;
Another reason to filter out invalid expressions first before building the error message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also like the iterators approach since we don't actually need the exact index of a subscription
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for each expression, we get one error message with a list of invalid expressions.
We might have details pointing out why exactly each expression is invalid. For example, if the expression is too complex, or if we have a syntax error. If we print just the expressions, it might make the debug process more difficult
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the number of expressions is huge, though, we might use a LimitedPrinter instead, or print at most N errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is what we will see
single 22May2024_13:59:23.696 (139804138272448) ERROR *gmq.tsk.bmqbrkr.bmqbrkr bmqbrkr.m.cpp:324 Error processing command [rc: -2] error = 'Expression validation failed: [ expression: [ version = E_VERSION_1 text = "invalid expression" ], rc: -100, reason: "syntax error, unexpected property at offset 8" ]' config = '[ name = "bmq.test.mmap.priority.sc" mode = [ priority = ] storage = [ domainLimits = [ messages = 2000 messagesWatermarkRatio = 0.8 bytes = 2097152 bytesWatermarkRatio = 0.8 ] queueLimits = [ messages = 1000 messagesWatermarkRatio = 0.8 bytes = 1048576 bytesWatermarkRatio = 0.8 ] config = [ fileBacked = ] ] maxConsumers = 0 maxProducers = 0 maxQueues = 0 msgGroupIdConfig = NULL maxIdleTime = 0 messageTtl = 300 maxDeliveryAttempts = 0 deduplicationTimeMs = 300000 consistency = [ strong = ] subscriptions = [ [ appId = "" expression = [ version = E_VERSION_1 text = "invalid expression" ] ] ] ]'
I think, this is sufficient enough and there is no need to limit since this per admin command
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(there is no such type as mqbconfm::Subscriptions so involving iterator is not going to look prettier)
Signed-off-by: dorjesinpo <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments
And also, what do you think about merging 2 integration tests in one test? My concern here is that launching a cluster for each test takes 10-15 seconds. If we can change the tests scenario so we can launch cluster once, we can save some time.
In this PR, there are 2 scenarios:
_1:
- configure invalid auto subscription
- expect open queue fails
- reconfigure valid auto subscription
- expect queue opens
_2:
- configure valid auto subscription
- expect queue opens
- try reconfigure invalid auto subscription - expect admin command fail
- expect queue opens
These 2 scenarios could be merged into one to save cluster start time:
- configure invalid auto subscription
- expect open queue fails
- reconfigure valid auto subscription
- expect queue opens
- try reconfigure invalid auto subscription - expect admin command fail
- expect queue opens
So basically the test might check that the domain configuration remains in valid state once this state is reached.
If you don't want to increase the scope of this PR, but also agree that this change is good, I can do it myself in a separate PR once this is merged.
@@ -95,18 +95,48 @@ void afterAppIdUnregisteredDispatched(mqbi::Queue* queue, | |||
mqbi::Storage::AppIdKeyPair(appId, mqbu::StorageKey())); | |||
} | |||
|
|||
/// Validates an application subscription. | |||
bool validdateSubscriptionExpression(bsl::ostream& errorDescription, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 d
bool validdateSubscriptionExpression(bsl::ostream& errorDescription, | |
bool validateSubscriptionExpression(bsl::ostream& errorDescription, |
Signed-off-by: dorjesinpo <[email protected]>
557f10e
to
0c7479a
Compare
* Fail reconfigure cmd on invalid auto subscription Signed-off-by: dorjesinpo <[email protected]> * Addressing review Signed-off-by: dorjesinpo <[email protected]> * addressing review Signed-off-by: dorjesinpo <[email protected]> * merging two tests Signed-off-by: dorjesinpo <[email protected]> --------- Signed-off-by: dorjesinpo <[email protected]>
* Fail reconfigure cmd on invalid auto subscription Signed-off-by: dorjesinpo <[email protected]> * Addressing review Signed-off-by: dorjesinpo <[email protected]> * addressing review Signed-off-by: dorjesinpo <[email protected]> * merging two tests Signed-off-by: dorjesinpo <[email protected]> --------- Signed-off-by: dorjesinpo <[email protected]>
* Fail reconfigure cmd on invalid auto subscription Signed-off-by: dorjesinpo <[email protected]> * Addressing review Signed-off-by: dorjesinpo <[email protected]> * addressing review Signed-off-by: dorjesinpo <[email protected]> * merging two tests Signed-off-by: dorjesinpo <[email protected]> --------- Signed-off-by: dorjesinpo <[email protected]>
Fix. If auto subscription expression is invalid, the reconfigure admin command should fail