new scenarios e2e bq #26

Open · wants to merge 2 commits into develop

Conversation

priyabhatnagar25 (Collaborator) commented Oct 6, 2023:

This PR consists of scenarios from ITN Classes.

@@ -941,4 +941,259 @@ public static void createBucketWithMultipleTestFilesWithRegex() throws IOExcepti
gcsSourceBucketName = createGCSBucketWithMultipleFiles(PluginPropertyUtils.pluginProp(
"gcsMultipleFilesFilterRegexPath"));
}

@Before(order = 1, value = "@BQ_UPSERT_SOURCE_TEST")
Collaborator:

I don't see After hooks for all of the Before hooks. Please add After hooks for all of them to clean up the created tables after test execution.

Collaborator Author:

Added After hooks for all the Before hooks.
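
For reference, a minimal sketch of such a cleanup hook, following the framework helpers visible elsewhere in this diff (the drop helper and the method name are assumptions based on that context):

@After(order = 1, value = "@BQ_UPSERT_SOURCE_TEST")
public static void deleteTempBqUpsertSourceTable() throws IOException, InterruptedException {
  // Drop the table created by the matching Before hook so repeated test
  // runs start from a clean dataset.
  BigQueryClient.dropBqQuery(PluginPropertyUtils.pluginProp("bqSourceTable"));
  BeforeActions.scenario.write("BQ source table deleted successfully");
  PluginPropertyUtils.removePluginProp("bqSourceTable");
}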

rahuldash171 (Collaborator):

Please change the description.

Then Click on the Browse button inside plugin properties
Then Select connection data row with name: "dataset"
Then Select connection data row with name: "bqSourceTable"
Then Wait till connection data loading completes with a timeout of 60 seconds
Collaborator:

Please check all the scenarios: why do we need this step? If it is not required, please remove it.

Collaborator Author:

This step verifies that the connection is created and loaded properly; it is used in all scenarios that rely on the use-connection functionality. I used this step in the earlier five BQ scenarios and it was approved.

Then Open and capture logs
Then Close the pipeline logs
Then Verify the pipeline status is "Succeeded"
Then Validate the data transferred from BigQuery to BigQuery with actual And expected file for: "bqUpdatededupeExpectedFile"
Collaborator:

Use camelCase for the parameter.

Collaborator Author:

Changed to camelCase.

Then Verify input plugin property: "dataset" contains value: "dataset"
Then Enter input plugin property: "table" with value: "bqTargetTable"
And Select radio button plugin property: "operation" with value: "update"
Then Enter Value for plugin property table key : "relationTableKey" with values: "Name"
Collaborator:

Define the value in the plugin parameter file and use it from there.

Collaborator Author:

Added the value in the plugin parameter file.
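
This change is reflected later in the diff: the literal values move into the plugin parameter file and the steps reference those keys instead, for example:

dedupeBy=DESC
TableKeyDedupe=Name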

And Select radio button plugin property: "operation" with value: "update"
Then Enter Value for plugin property table key : "relationTableKey" with values: "Name"
Then Select dropdown plugin property: "dedupeBy" with option value: "dedupeByOrder"
Then Enter key for plugin property: "dedupeBy" with values: "ID"
Collaborator:

Define the value in the plugin parameter file and use it from there.

Collaborator Author:

Done, added.

PluginPropertyUtils.addPluginProp("bqSourceTable", bqSourceTable);
BeforeActions.scenario.write("BQ source table name - " + bqSourceTable);
BigQueryClient.getSoleQueryResult("create table `" + datasetName + "." + bqSourceTable + "` " +
"(ID STRING, transaction_date DATE, Firstname STRING)");
Collaborator:

Add Datetime and Timestamp columns so the Datetime and Timestamp column scenarios are covered as well.

Collaborator Author:

Added the steps.
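
A sketch of what the extended DDL might look like, based on the create-table call above (the new column names are illustrative assumptions):

// DATETIME and TIMESTAMP columns added to cover the corresponding scenarios.
BigQueryClient.getSoleQueryResult("create table `" + datasetName + "." + bqSourceTable + "` " +
  "(ID STRING, transaction_date DATE, transaction_datetime DATETIME, " +
  "updated_on TIMESTAMP, Firstname STRING)");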

BeforeActions.scenario.write("BQ Target Table " + bqTargetTable + " updated successfully");
}

@Before(order = 1, value = "@BQ_TIME_STAMP_SOURCE_TEST")
itsmekumari (Collaborator) commented Oct 9, 2023:

Please change the hook name to @BQ_TIME_SOURCE_TEST (intended as a common hook to cover the Date/Datetime/Timestamp columns).

Collaborator Author:

Added separate steps.

Then Click on the Add Button of the property: "relationTableKey" with value:
| TableKeyDedupe |
Then Select dropdown plugin property: "dedupeBy" with option value: "dedupeBy"
Then Enter key for plugin property: "dedupeBy" with values: "Price"
Collaborator:

Define this in the parameter file and use it from there.

Collaborator Author:

Done, added to the plugin parameter file.

@@ -27,7 +27,8 @@
features = {"src/e2e-test/features"},
glue = {"io.cdap.plugin.bigquery.stepsdesign", "io.cdap.plugin.gcs.stepsdesign",
"stepsdesign", "io.cdap.plugin.common.stepsdesign"},
tags = {"@BigQuery_Sink"},
tags = {"@BigQuery_Sink not @CDAP-20830"},
Collaborator:

It should be 'and not'.

Collaborator Author:

Changed to 'and not'.
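
The corrected line, using Cucumber's boolean tag-expression syntax, would read:

tags = {"@BigQuery_Sink and not @CDAP-20830"},

This runs every @BigQuery_Sink scenario while skipping those tagged with the known-issue marker @CDAP-20830.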

rahuldash171 (Collaborator) left a comment:

Please go through all the comments and make the changes accordingly.

Then Click on the Browse button inside plugin properties
Then Select connection data row with name: "dataset"
Then Select connection data row with name: "bqSourceTable"
Then Wait till connection data loading completes with a timeout of 60 seconds
Collaborator:

Why do we need this? A default timeout of 180 seconds is already in place.

Collaborator Author:

This is already used in the use-connection scenarios for BQ.

Then Enter input plugin property: "referenceName" with value: "BQSinkReferenceName"
Then Click on the Browse button inside plugin properties
Then Click SELECT button inside connection data row with name: "dataset"
Then Wait till connection data loading completes with a timeout of 60 seconds
Collaborator:

If it is not needed, please remove it. Also, what happens if the data loads before 60 seconds: does the step wait until the full minute is over, or move on to the next step?

Collaborator Author:

This is already used in the use-connection scenarios for BQ.
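
On the timeout question itself: such wait steps are normally implemented as polling waits, so the 60 seconds is an upper bound rather than a fixed delay. A minimal sketch, assuming Selenium's WebDriverWait and a hypothetical loading-indicator locator:

import java.time.Duration;
import org.openqa.selenium.By;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

// 'driver' is the active WebDriver session managed by the test framework.
WebDriverWait wait = new WebDriverWait(driver, Duration.ofSeconds(60));
// Polls the DOM and returns as soon as the loading indicator disappears;
// the full 60 seconds is only consumed if loading never completes.
wait.until(ExpectedConditions.invisibilityOfElementLocated(By.cssSelector(".loading-indicator")));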

Then Validate the data transferred from BigQuery to BigQuery with actual And expected file for: "bqUpsertExpectedFile"

@BQ_NULL_MODE_SOURCE_TEST @BQ_SINK_TEST @EXISTING_BQ_CONNECTION
Scenario: Validate Successful record transfer from BigQuery source plugin to BigQuery sink plugin with all null values in one column and few null values in different column.
Collaborator:

Please change it to "Validate successful record transfer from BigQuery source plugin to BigQuery sink plugin having all null values in one column and few null values in another column in the source table."

Collaborator Author:

Changed as above.

Then Validate the values of records transferred to BQ sink is equal to the values from source BigQuery table

@BQ_UPDATE_SOURCE_DEDUPE_TEST @BQ_UPDATE_SINK_DEDUPE_TEST @EXISTING_BQ_CONNECTION
Scenario: Verify successful record transfer from BigQuery source to BigQuery sink using advance operation update with dedupe property.
Collaborator:

Change it to 'with Dedupe By property'.

Collaborator Author:

Changed to 'Dedupe By property'.

Then Validate "BigQuery" plugin properties
And Close the Plugin Properties page
Then Navigate to the properties page of plugin: "BigQuery2"
Then Click plugin property: "useConnection"
Collaborator:

Why are we using different values in the step definition for selecting the connection in source and sink? This appears in every scenario.

Collaborator Author:

This is already used in the use-connection scenarios for BQ.

Collaborator:

@bharatgulati If switch-useConnection is used in the sink, will it work seamlessly? I believe you have used it in both source and sink for the recently merged GCS tests.

Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Enter input plugin property: "table" with value: "bqTargetTable"
Then Enter input plugin property: "partitionByField" with value: "bqPartitionFieldTime"
Collaborator:

This value is not present in pluginParam.properties. Is it an old key/value pair that was previously used?

Collaborator Author:

This value is an old key/value pair.

Then Validate the data transferred from BigQuery to BigQuery with actual And expected file for: "bqTimeStampExpectedFile"

@BQ_UPSERT_DEDUPE_SOURCE_TEST @BQ_UPSERT_DEDUPE_SINK_TEST @EXISTING_BQ_CONNECTION
Scenario: Validate successful records transfer from BigQuery source to BigQuery sink with Upsert operation with dedupe source data and destination table already exists and update table schema is false
Collaborator:

Change to "Validate successful records transfer from BigQuery source to BigQuery sink with Upsert operation with dedupe source data and existing destination table where update table schema is set to false".

Collaborator Author:

Changed.

Then Click on the Get Schema button
Then Click on the Validate button
Then Close the Plugin Properties page
Then Navigate to the properties page of plugin: "BigQuery3"
Collaborator:

Why are we not following the above steps for BigQuery2?

Collaborator Author:

Added the BQ2 source with a hook.

Collaborator Author:

Added the BQ2 hook.

*
* @param json The JSON object to search for the ID key.
*/
private static String getIdKey(JsonObject json) {
Collaborator:

Why is this method getIdKey() needed? @bharatgulati Can you please take a look at this?

Collaborator Author:

We used this earlier as well, in the Wrangler and BQ five-scenario PRs, which were merged.
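
For context, a plausible shape of the helper under discussion, as a sketch only (the key name "ID" and the lookup logic are assumptions; the merged PRs referenced above hold the actual implementation):

private static String getIdKey(JsonObject json) {
  // Return the value of the identifying key if the record carries one,
  // so result rows can be matched against expected-file entries.
  if (json.has("ID")) {
    return json.get("ID").getAsString();
  }
  return null;
}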

Collaborator:

Delete this file from testData; please avoid these types of mistakes.

Collaborator Author:

Deleted the file.

Then Open and capture logs
Then Close the pipeline logs
Then Verify the pipeline status is "Succeeded"
Then Validate the data transferred from BigQuery to BigQuery with actual And expected file for: "bqTimeStampExpectedFile"
Collaborator:

Since this test is for the Date datatype, use a relevant date-related parameter value, and do the same for the Datetime and Timestamp tests to be added.

Collaborator Author:

Renamed the files for Date, Datetime, and Timestamp.

rahuldash171 (Collaborator) left a comment:

Reviewed; please look into the comments.

dedupeBy=DESC
TableKeyDedupe=Name
Directive_Drop=testdata/BigQuery/test_diffschema_record-cdap-data-pipeline.json
bqUpsertDedupeFile=testdata/BigQuery/BQUpsertDedupeFile
bqDifferentRecordFile=testdata/BigQuery/BQDifferentRecordNameFile
bqDateExpectedFile=testdata/BigQuery/BQTimeStampFile
Collaborator:

Use a proper naming convention: change the file name from 'BQTimeStampFile' to 'BQDateFile'.

Collaborator Author:

Resolved; changed the name to testdata/BigQuery/BQDateFile.

@BQ_TIME_STAMP_SOURCE_TEST @BQ_SINK_TEST @EXISTING_BQ_CONNECTION
Scenario: Verify successful record transfer for the Insert operation from BigQuery source plugin to BigQuery sink with partition type Time.
@BQ_TIME_SOURCE_TEST @BQ_SINK_TEST @EXISTING_BQ_CONNECTION
Scenario: Verify successful record transfer for the Insert operation from BigQuery source plugin to BigQuery sink with partition type Time and partition field is date.
Collaborator:

bqPartitionFieldTime is used in line 435, but the scenario relates to a partition field of type date. Can you add a value like 'bqPartitionFieldDate' to the pluginParam properties and use that for this scenario?

Collaborator Author:

Added the field bqPartitionFieldDate.
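
The addition to the plugin parameter file might look like this (the value shown is an assumption; transaction_date is the DATE column from the source-table DDL earlier in this diff):

bqPartitionFieldDate=transaction_date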

@@ -270,7 +271,11 @@ public static void createTempSourceBQTable() throws IOException, InterruptedExce
}

@After(order = 1, value = "@BQ_SOURCE_TEST or @BQ_PARTITIONED_SOURCE_TEST or @BQ_SOURCE_DATATYPE_TEST or " +
"@BQ_INSERT_SOURCE_TEST or @BQ_UPDATE_SINK_TEST")
"@BQ_INSERT_SOURCE_TEST or @BQ_UPDATE_SINK_TEST or @BQ_UPSERT_SOURCE_TEST or @BQ_UPSERT_SINK_TEST or " +
Collaborator:

The Before and After hook counts do not match: there are 15 After hooks but only 13 Before hooks. Can you please reverify?

Collaborator Author:

Reverified the After hooks.
