fix: Fix intermittent failing compaction job test in functional test #1244

Merged 4 commits into trunk from refactor_compaction_job on Feb 21, 2024

@sitaram-kalluri sitaram-kalluri commented Feb 21, 2024

- What I did

  • Fixes at_client functional tests flakiness 2023-01-15 #1207
  • Root cause: the background compaction job interferes with the compaction job started by the test, which leads to inconsistent test results.
  • Fix: stop the background compaction job while the compaction-related tests run.

- How I did it

  • Introduced startCompactionJob and stopCompactionJob in AtClientSpec.
  • AtClientImpl already had a private method that starts the compaction job; made that method public.
  • Implemented the stopCompactionJob method.
  • Updated the functional test to stop the background compaction job before running the compaction functional test.
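
For readers unfamiliar with the pattern, the start/stop pair described above can be sketched roughly as follows. This is a minimal illustration and not the at_client implementation: the class name PeriodicJobSketch, its fields, and the isRunning getter are all hypothetical, and it assumes the job is driven by a Timer from dart:async.

```dart
import 'dart:async';

/// Hypothetical sketch of a periodic background job with explicit
/// start/stop control. Not the AtClientImpl code.
class PeriodicJobSketch {
  PeriodicJobSketch(this.interval, this.task);

  /// How often the task runs once the job is started.
  final Duration interval;

  /// The work to perform on each tick (for example, a compaction pass).
  final void Function() task;

  Timer? _timer;

  /// True while a schedule is active.
  bool get isRunning => _timer?.isActive ?? false;

  void start() {
    // Cancel any earlier timer so that only one schedule is ever active.
    _timer?.cancel();
    _timer = Timer.periodic(interval, (_) => task());
  }

  void stop() {
    // Cancelling prevents future ticks; it does not interrupt a running task.
    _timer?.cancel();
    _timer = null;
  }
}
```

With a shape like this, a test can call stop() on the background job first, trigger compaction itself, and assert on the result without a second run changing the commit log underneath it.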

- How to verify it

  • The functional tests should pass

/// to the remote secondary. Only the latest commit entry of the key is retained.
/// Uncommitted entries that are duplicates will not be removed/compacted.
Future<void> startCompactionJob(
{int commitLogCompactionTimeIntervalInMins = 11});
Contributor

Please make this parameter a Duration

Member Author

Sure, Gary.
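
As a rough illustration of the requested change, a Duration-typed named parameter might look like the sketch below. The function name describeCompactionInterval and the parameter name commitLogCompactionInterval are hypothetical; the signature that was actually merged may differ. Note that a default value for a named parameter has to be a compile-time constant, hence the const.

```dart
/// Hypothetical helper showing a Duration-typed named parameter.
/// The default mirrors the 11-minute value in the snippet under review.
String describeCompactionInterval(
    {Duration commitLogCompactionInterval = const Duration(minutes: 11)}) {
  // Duration carries its own unit, so callers are not limited to whole minutes.
  return 'compaction every ${commitLogCompactionInterval.inSeconds} seconds';
}
```

One benefit of this shape for tests is that a caller can pass something like const Duration(seconds: 5) without the API needing a separate seconds-based parameter.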

sitaram-kalluri commented Feb 21, 2024

The functional and e2e tests are failing because the monitor fails to start. The log snippet is attached below; I am debugging the issue.

INFO|2024-02-21 10:29:34.710734|Monitor (@alice🛠)|status is MonitorStatus.errored : heartbeat will not be sent
INFO|2024-02-21 10:29:34.794133|Monitor (@alice🛠)|starting monitor for @alice🛠 with lastNotificationTime: 1708511364573
INFO|2024-02-21 10:29:34.813447|Monitor (@alice🛠)|monitor started for @alice🛠 with last notification time: 1708511364573
WARNING|2024-02-21 10:29:34.818178|Monitor (@alice🛠)|runZonedGuarded received socket error Converting object to an encodable object failed: Instance of 'Metadata' - calling _handleError
INFO|2024-02-21 10:29:34.818691|Monitor (@alice🛠)|socket.listen onDone called. Will destroy socket, set status stopped, call retryCallback
INFO|2024-02-21 10:29:34.818936|Monitor (@alice🛠)|Monitor error Converting object to an encodable object failed: Instance of 'Metadata' - calling the retryCallback
INFO|2024-02-21 10:29:34.819009|NotificationServiceImpl (@alice🛠)|Monitor retry already queued
INFO|2024-02-21 10:29:39.787474|Monitor (@alice🛠)|status is MonitorStatus.errored : heartbeat will not be sent
INFO|2024-02-21 10:29:39.821140|Monitor (@alice🛠)|starting monitor for @alice🛠 with lastNotificationTime: 1708511364573
INFO|2024-02-21 10:29:39.839429|Monitor (@alice🛠)|monitor started for @alice🛠 with last notification time: 1708511364573
WARNING|2024-02-21 10:29:39.843379|Monitor (@alice🛠)|runZonedGuarded received socket error Converting object to an encodable object failed: Instance of 'Metadata' - calling _handleError
INFO|2024-02-21 10:29:39.843647|Monitor (@alice🛠)|socket.listen onDone called. Will destroy socket, set status stopped, call retryCallback

Found the root cause of the issue:

When pubKeyHash is null, the call to Metadata.toJson throws, which causes jsonEncode to fail with the "Converting object to an encodable object failed" error seen in the log. The error stack trace is attached below.

#0 Metadata.toJson (package:at_commons/src/keystore/at_key.dart:687:60)
#1 _defaultToEncodable (dart:convert/json.dart:627:55)
#2 _JsonStringifier.writeObject (dart:convert/json.dart:787:36)
#3 _JsonStringifier.writeMap (dart:convert/json.dart:874:7)
#4 _JsonStringifier.writeJsonValue (dart:convert/json.dart:829:21)
#5 _JsonStringifier.writeObject (dart:convert/json.dart:784:9)
#6 _JsonStringStringifier.printOn (dart:convert/json.dart:982:17)
#7 _JsonStringStringifier.stringify (dart:convert/json.dart:967:5)
#8 JsonEncoder.convert (dart:convert/json.dart:345:30)
#9 JsonCodec.encode (dart:convert/json.dart:231:45)
#10 jsonEncode (dart:convert/json.dart:114:10)
#11 NotificationServiceImpl._internalNotificationCallback (package:at_client/src/service/notification_service_impl.dart:236:15)
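
The log message and the stack trace fit a known dart:convert behaviour: when an object's toJson throws during jsonEncode, the encoder wraps the exception in a JsonUnsupportedObjectError. The sketch below reproduces that failure mode with stand-in classes. Every class name here is hypothetical, and the assumption that the failing line dereferenced a null pubKeyHash is inferred from the comment above, not taken from the at_commons source.

```dart
import 'dart:convert';

// All names below are hypothetical stand-ins, not the at_commons classes.
class KeyHashSketch {
  KeyHashSketch(this.hash);
  final String hash;
  Map<String, dynamic> toJson() => {'hash': hash};
}

class UnsafeMetadataSketch {
  KeyHashSketch? pubKeyHash;

  // The null assertion (!) throws when pubKeyHash is null.
  Map<String, dynamic> toJson() => {'pubKeyHash': pubKeyHash!.toJson()};
}

class SafeMetadataSketch {
  KeyHashSketch? pubKeyHash;

  // The null-aware call (?.) yields null instead of throwing.
  Map<String, dynamic> toJson() => {'pubKeyHash': pubKeyHash?.toJson()};
}

/// Returns the encoded JSON, or the error text if encoding fails.
String encodeOrDescribeError(Object value) {
  try {
    return jsonEncode(value);
  } on JsonUnsupportedObjectError catch (e) {
    // jsonEncode wraps any exception thrown by toJson in this error type.
    return e.toString();
  }
}
```

Passing an UnsafeMetadataSketch with a null pubKeyHash to encodeOrDescribeError should yield an error text of the same kind as the one in the monitor log, while the null-aware variant encodes without error.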

@sitaram-kalluri sitaram-kalluri requested a review from gkc February 21, 2024 13:38

All tests passed after at_commons version 4.0.2 was retracted.

@gkc gkc merged commit 135d5f9 into trunk Feb 21, 2024
10 checks passed
@gkc gkc deleted the refactor_compaction_job branch February 21, 2024 14:04