Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve lag calculation and error handling in lagCmd #341

Merged
merged 2 commits into from
Jul 27, 2024

Conversation

jackcipher
Copy link
Contributor

This PR enhances the lagCmd function to improve its reliability and error reporting. The main changes are:

  1. Added a check to detect and report cases where the consumer offset is greater than the high watermark. This helps identify potential inconsistencies in the data.

  2. Improved error messages for better clarity when troubleshooting issues.

  3. Enhanced resource management by ensuring the admin client is properly closed using defer.

  4. Only report lag values greater than zero, reducing noise in the output.

  5. Minor code improvements for readability (e.g., renaming variables).

These changes should make the lag calculation more robust and informative, especially in edge cases or when dealing with topic compaction or message retention scenarios.

Testing:

  • Tested with normal scenarios and verified correct lag calculation
  • Simulated scenarios with consumer offsets greater than high watermarks and confirmed appropriate warnings are logged
  • Verified that only non-zero lag values are reported in the final output

Please review and let me know if any further changes are needed.

@birdayz
Copy link
Owner

birdayz commented Jul 6, 2024

Only report lag values greater than zero, reducing noise in the output.

i would argue, that this is very counter intuitive. when i just tested it locally, i was confused because my group was not listed at all.
i would strongly prefer zero lag being shown.

also, zero lag is not such a common case..in local testing or low-traffic cases, yes. however, if you do not have completely trivial scale, lag will rarely be zero (as commits happen after some time / a processed batch, not immediately).

So, i'd rather not have special treatment for zero.

i'm ok with the rest of the changes.

thanks!

@jackcipher
Copy link
Contributor Author

Thank you for the feedback. I agree that showing zero lag is important especially during local testing or in low-traffic scenarios.

so I have made the following updates:

  • Show zero lag consumer groups: The lagCmd now displays consumer groups even if their lag is zero, ensuring no groups are unintentionally hidden.
  • Include group state in output: The state of each consumer group is now included in the output for better context.

@birdayz birdayz merged commit 191bcb3 into birdayz:master Jul 27, 2024
1 check passed
@birdayz
Copy link
Owner

birdayz commented Jul 27, 2024

thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants