Added the test scripts for resumption #2117

Open

shubham-yb wants to merge 17 commits into main from shubham/resumption

Contributor

shubham-yb commented Dec 25, 2024

Describe the changes in this pull request

Added a test framework for resumption tests covering import data file and offline import data.
Added test cases for import data file with a large table and with a large number of tables.
Added a test case for PG offline import data resumption covering datatypes, indexes, partitions, case sensitivity / reserved words, and multiple schemas.

Describe if there are any user-facing changes

N/A

How was this pull request tested?

Made the changes to the Jenkins pipeline as well.


          Added the test scripts for resumption

045ad30

CLAassistant commented Dec 25, 2024

Thank you for your submission! We really appreciate it. Like many open source projects, we ask that you sign our Contributor License Agreement before we can accept your contribution.
You have signed the CLA already but the status is still pending? Let us recheck it.

shubham-yb added 5 commits

December 26, 2024 12:30


          Updated the large count tables test

78209bf


          Renamed test in GH Actions

0485f4c


          Merge branch 'main' into shubham/resumption

1b2ee52


          Test: Only run GH integration tests

8fbb84f


          Added AWS region to large table test

0e5bcd0

shubham-yb marked this pull request as ready for review

December 27, 2024 11:45


          Merge branch 'main' into shubham/resumption

986a51e

shubham-yb requested review from makalaaneesh, sanyamsinghal, ShivanshGahlot and priyanshi-yb

December 27, 2024 11:47

shubham-yb added 9 commits

December 27, 2024 12:16


          Cleanup

d5b7aaa


          Reduced time between each retry for the large table test

e8bd070


          Merge branch 'main' into shubham/resumption

18a38ab


          Merge branch 'main' into shubham/resumption

61c9b1d


          Added import data resumption test framework and PG test case

d72c841


          Increased the table sizes for the PG test

b38a7f5


          Increased the table sizes for the PG test

a14fa6b


          Added conditional check while dropping the database

20c2715


          Row count optimisation and cleanup

9a6cf4c

Contributor Author

shubham-yb commented Jan 7, 2025

Does your PR have changes that can cause upgrade issues?

Component	Breaking changes?
MetaDB	No
Name registry json	No
Data File Descriptor Json	No
Export Snapshot Status Json	No
Import Data State	No
Export Status Json	No
Data .sql files of tables	No
Export and import data queue	No
Schema Dump	No
AssessmentDB	No
Sizing DB	No
Migration Assessment Report Json	No
Callhome Json	No
YugabyteD Tables	No
TargetDB Metadata Tables	No

Contributor Author

shubham-yb commented Jan 7, 2025

https://jenkins.dev.yugabyte.com/job/users/job/yb-voyager-testing/job/yb-voyager-testing-pipeline-test/1084/

makalaaneesh reviewed

migtests/scripts/resumption.py

+                  """
+                  Runs the yb-voyager command with support for resumption testing.
+                  """
+                  for attempt in range(1, resumption['max_restarts'] + 1):

Collaborator

makalaaneesh Jan 7, 2025

Let's read and define all the config values at the beginning. That makes it easier to see which configuration options are involved.

max_restarts = resumption['max_restarts']
min_interrupt_seconds = resumption['min_interrupt_seconds']
...
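
A minimal sketch of that suggestion, assuming `resumption` is a plain dict loaded from the test's config. Only `max_restarts` and `min_interrupt_seconds` appear in this thread; the `ResumptionConfig` class name and the values 3 and 20 are hypothetical, and any further options would be added the same way.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ResumptionConfig:
    """Resumption settings, read once before the restart loop starts."""
    max_restarts: int
    min_interrupt_seconds: int

    @classmethod
    def from_dict(cls, resumption):
        # A missing key raises KeyError here, up front, instead of mid-run.
        return cls(
            max_restarts=int(resumption['max_restarts']),
            min_interrupt_seconds=int(resumption['min_interrupt_seconds']),
        )


# Illustrative values only; the real ones come from the test's config file.
config = ResumptionConfig.from_dict({'max_restarts': 3, 'min_interrupt_seconds': 20})
for attempt in range(1, config.max_restarts + 1):
    print(f"attempt {attempt} of {config.max_restarts}")
```

Reading everything in one place also means a typo in a config key fails before any yb-voyager command is started.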

migtests/scripts/resumption.py

+                                  if not output:  # Exit if output is empty (end of process output)
+                                      break
+                                  full_output += output
+                              if time.time() - start_time > 5:

Collaborator

makalaaneesh Jan 7, 2025

Why break? What is 5: seconds? Minutes?
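
One way to make the unit explicit is a named constant. Since `time.time()` returns seconds, the literal 5 in the diff is a number of seconds. The constant and function names below are hypothetical, and the sketch uses `time.monotonic()` (an assumption, not what the diff does) so the check is unaffected by system clock changes.

```python
import time

# The diff compares against a bare 5; time.time() counts seconds, so this is 5 seconds.
OUTPUT_READ_WINDOW_SECONDS = 5


def read_window_elapsed(start_time, now=None, window_seconds=OUTPUT_READ_WINDOW_SECONDS):
    """Return True once more than window_seconds have passed since start_time.

    start_time must come from time.monotonic(), not time.time().
    """
    if now is None:
        now = time.monotonic()
    return now - start_time > window_seconds


start = time.monotonic()
print(read_window_elapsed(start))  # just started, so the window has not elapsed
```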

migtests/scripts/resumption.py

+                  # Final import retry logic
+                  print("\n--- Final attempt to complete the import ---")
+                  for _ in range(2):

Collaborator

makalaaneesh Jan 7, 2025

Why 2 attempts at the end?
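
Whatever the reason for the number, it could live in one named constant so that the loop and the failure message cannot drift apart. A rough sketch: `MAX_FINAL_ATTEMPTS` and `run_until_success` are hypothetical names, and `run_once` stands in for whatever starts the final import and returns its exit code.

```python
MAX_FINAL_ATTEMPTS = 2  # value taken from the diff; the reason for 2 is the open question


def run_until_success(run_once, max_attempts):
    """Call run_once() up to max_attempts times.

    Returns the 1-based number of the attempt that exited with 0, or None
    if every attempt failed.
    """
    for attempt in range(1, max_attempts + 1):
        if run_once() == 0:
            return attempt
        print(f"Attempt {attempt} of {max_attempts} failed.")
    return None


# Stand-in for the real command: fails once, then succeeds.
exit_codes = iter([1, 0])
print(run_until_success(lambda: next(exit_codes), MAX_FINAL_ATTEMPTS))
```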

migtests/scripts/resumption.py

+                      try:
+                          print("\nVoyager command output:")
+                          process = subprocess.Popen(

Collaborator

makalaaneesh Jan 7, 2025

nit: separate function for starting command (can be called in above for-loop as well)
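
A possible shape for such a helper, as a sketch rather than the PR's actual code. The name `start_command` is hypothetical, and merging stderr into stdout is an assumption made here so that callers have a single stream to read.

```python
import subprocess


def start_command(command):
    """Start command (a list of arguments) and return the Popen handle.

    stderr is merged into stdout so callers read one stream; text=True
    yields str lines, and bufsize=1 requests line buffering on our side.
    """
    return subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        bufsize=1,
    )
```

Both the restart loop and the final attempt could then call `start_command(...)` with the same argument list instead of repeating the `Popen` call.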

migtests/scripts/resumption.py

+                          )
+                          # Capture and print output
+                          for line in iter(process.stdout.readline, ''):

Collaborator

makalaaneesh Jan 7, 2025

In the for-loop above we read both stderr and stdout, but here we read only stdout. Any particular reason? It would be good to be consistent (call a common function that captures both stdout and stderr).

Collaborator

makalaaneesh Jan 7, 2025

Also, until when will you keep reading? How long will the loop run?
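
Both points could be handled by one shared reader. In the sketch below (hypothetical names `stream_output` and `run_and_capture`), stderr is redirected into stdout when the process is started, so a single loop sees both streams. The loop ends when `readline()` returns an empty string, which happens once the child process has exited and closed its end of the pipe.

```python
import subprocess
import sys


def stream_output(process):
    """Echo the process's merged output line by line until end of stream.

    Returns (full_output, exit_code).
    """
    lines = []
    # readline() returns '' only at EOF, i.e. after the pipe has been closed.
    for line in iter(process.stdout.readline, ''):
        print(line.rstrip())
        sys.stdout.flush()
        lines.append(line)
    process.stdout.close()
    return ''.join(lines), process.wait()


def run_and_capture(command):
    """Start command with stderr merged into stdout and stream its output."""
    process = subprocess.Popen(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
    )
    return stream_output(process)
```

Note that this loop has no upper time bound of its own; it runs for as long as the command keeps its output open.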

migtests/scripts/resumption.py

+                          for line in iter(process.stderr.readline, ''):
+                              print(line.strip())
+                              sys.stdout.flush()
+                          time.sleep(30)

Collaborator

makalaaneesh Jan 7, 2025

why sleep?
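
If the sleep is only there to give the process time to finish (an assumption; the thread does not say), one alternative is to wait on the process itself with an upper bound. `wait_for_exit` is a hypothetical helper name.

```python
import subprocess


def wait_for_exit(process, timeout_seconds):
    """Wait up to timeout_seconds for process to exit.

    Returns the exit code, or None if the process had to be killed
    because it was still running when the timeout expired.
    """
    try:
        return process.wait(timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        process.kill()
        process.wait()  # reap the killed process
        return None
```

Unlike a fixed `time.sleep(30)`, this returns as soon as the process exits and makes the timeout case explicit. If the process writes to a pipe, its output should be drained first, since a full pipe can block the child and keep it from exiting.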

migtests/scripts/resumption.py

+                      print("Final import failed after 2 attempts.")
+                      sys.exit(1)
+              def validate_row_counts(row_count, export_dir):

Collaborator

makalaaneesh Jan 7, 2025

Note for the future: you can create a common Python file that holds such helper functions.

migtests/tests/pg/partitions/snapshot.sh

		@@ -0,0 +1,133 @@
		#!/bin/bash

Collaborator

makalaaneesh Jan 7, 2025

Assuming that the ONLY change here is that you're specifying ROW_COUNT and essentially making generate_series dynamic.

migtests/tests/resumption/pg/resumption/config.yaml

+                schema2.Case_Sensitive_Table: 5000000
+                schema2.case: 5000000
+                schema2.Table: 5000000
+                public.boston: 2500000

Collaborator

makalaaneesh Jan 7, 2025

Where is the code that generates data for all the other tables (boston, cust, emp, etc.)? I only see code for Table, case, and Case_Sensitive_Table.


          Merge branch 'main' into shubham/resumption

b42f922

Reviewers

makalaaneesh left review comments

Awaiting requested review from sanyamsinghal

Awaiting requested review from ShivanshGahlot

Awaiting requested review from priyanshi-yb

At least 1 approving review is required to merge this pull request.
