new scenarios e2e bq #26
base: develop
Conversation
@@ -941,4 +941,259 @@ public static void createBucketWithMultipleTestFilesWithRegex() throws IOException
  gcsSourceBucketName = createGCSBucketWithMultipleFiles(PluginPropertyUtils.pluginProp(
    "gcsMultipleFilesFilterRegexPath"));
}

@Before(order = 1, value = "@BQ_UPSERT_SOURCE_TEST")
I don't see After hooks for all the Before hooks. Please add After hooks for all of them to clean up the created tables after test execution.
Added After hooks for all the Before hooks.
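A minimal sketch of such a cleanup hook, reusing the getSoleQueryResult pattern the Before hooks in this diff already use (the method name is hypothetical; the tag value and datasetName field are taken from this diff):

@After(order = 1, value = "@BQ_UPSERT_SOURCE_TEST")
public static void deleteBqUpsertSourceTable() throws IOException, InterruptedException {
  String bqSourceTable = PluginPropertyUtils.pluginProp("bqSourceTable");
  // Drop the table created by the matching Before hook so repeated runs start clean.
  BigQueryClient.getSoleQueryResult("DROP TABLE `" + datasetName + "." + bqSourceTable + "`");
  BeforeActions.scenario.write("BQ source table " + bqSourceTable + " deleted successfully");
}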
Please change the description.
Then Click on the Browse button inside plugin properties
Then Select connection data row with name: "dataset"
Then Select connection data row with name: "bqSourceTable"
Then Wait till connection data loading completes with a timeout of 60 seconds
Please check all the scenarios: why do we need this step? If it is not required, please remove it.
This step verifies that the connection is created and loaded properly; it is used in all scenarios that use the connection functionality. I used this step in the earlier five BQ scenarios and it was approved.
Then Open and capture logs
Then Close the pipeline logs
Then Verify the pipeline status is "Succeeded"
Then Validate the data transferred from BigQuery to BigQuery with actual And expected file for: "bqUpdatededupeExpectedFile"
Use camelCase for the parameter.
Changed to camelCase.
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Enter input plugin property: "table" with value: "bqTargetTable"
And Select radio button plugin property: "operation" with value: "update"
Then Enter Value for plugin property table key : "relationTableKey" with values: "Name"
Mention the value in the plugin parameter file and use it from there.
Added the value in the plugin parameter file.
And Select radio button plugin property: "operation" with value: "update"
Then Enter Value for plugin property table key : "relationTableKey" with values: "Name"
Then Select dropdown plugin property: "dedupeBy" with option value: "dedupeByOrder"
Then Enter key for plugin property: "dedupeBy" with values: "ID"
Mention the value in the plugin parameter file and use it from there.
Done, added.
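For reference, a sketch of reading those values from the parameter file through PluginPropertyUtils, as the hooks in this diff already do (the keys below match the pluginParameters entries added later in this PR):

// pluginParameters.properties entries added in this PR:
//   TableKeyDedupe=Name
//   dedupeBy=DESC
String tableKey = PluginPropertyUtils.pluginProp("TableKeyDedupe");
String dedupeByOrder = PluginPropertyUtils.pluginProp("dedupeBy");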
PluginPropertyUtils.addPluginProp("bqSourceTable", bqSourceTable);
BeforeActions.scenario.write("BQ source table name - " + bqSourceTable);
BigQueryClient.getSoleQueryResult("create table `" + datasetName + "." + bqSourceTable + "` " +
  "(ID STRING, transaction_date DATE, Firstname STRING)");
Add Datetime and Timestamp columns so they can be used to cover the Datetime and Timestamp column scenarios.
Added the steps.
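A sketch of the extended DDL, keeping the getSoleQueryResult pattern of this hook (the DATETIME and TIMESTAMP column names are illustrative):

BigQueryClient.getSoleQueryResult("create table `" + datasetName + "." + bqSourceTable + "` " +
  "(ID STRING, transaction_date DATE, updated_on DATETIME, created_at TIMESTAMP, Firstname STRING)");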
BeforeActions.scenario.write("BQ Target Table " + bqTargetTable + " updated successfully");
}

@Before(order = 1, value = "@BQ_TIME_STAMP_SOURCE_TEST")
Please change the hook name to @BQ_TIME_SOURCE_TEST (intended as a common hook to cover Date/Datetime/Timestamp columns).
Added the separate steps
Then Click on the Add Button of the property: "relationTableKey" with value:
  | TableKeyDedupe |
Then Select dropdown plugin property: "dedupeBy" with option value: "dedupeBy"
Then Enter key for plugin property: "dedupeBy" with values: "Price"
Mention the value in the parameter file and use it from there.
Done, added in the plugin parameter file.
@@ -27,7 +27,8 @@
   features = {"src/e2e-test/features"},
   glue = {"io.cdap.plugin.bigquery.stepsdesign", "io.cdap.plugin.gcs.stepsdesign",
     "stepsdesign", "io.cdap.plugin.common.stepsdesign"},
-  tags = {"@BigQuery_Sink"},
+  tags = {"@BigQuery_Sink not @CDAP-20830"},
It should be 'and not'.
Changed to 'and not'.
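For reference, a sketch of the corrected runner annotation; Cucumber tag expressions require the boolean operator spelled out, so "@BigQuery_Sink not @CDAP-20830" would fail to parse (the TestRunner class name is assumed):

import io.cucumber.junit.Cucumber;
import io.cucumber.junit.CucumberOptions;
import org.junit.runner.RunWith;

@RunWith(Cucumber.class)
@CucumberOptions(
  features = {"src/e2e-test/features"},
  glue = {"io.cdap.plugin.bigquery.stepsdesign", "io.cdap.plugin.gcs.stepsdesign",
    "stepsdesign", "io.cdap.plugin.common.stepsdesign"},
  // 'and not' excludes scenarios tagged with the known-issue tag.
  tags = {"@BigQuery_Sink and not @CDAP-20830"}
)
public class TestRunner {
}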
Please go through all the comments and change accordingly.
Then Click on the Browse button inside plugin properties
Then Select connection data row with name: "dataset"
Then Select connection data row with name: "bqSourceTable"
Then Wait till connection data loading completes with a timeout of 60 seconds
Why do we need this? A default timeout of 180 seconds is already in place.
This is already used in the use-connection scenarios for BQ.
Then Enter input plugin property: "referenceName" with value: "BQSinkReferenceName"
Then Click on the Browse button inside plugin properties
Then Click SELECT button inside connection data row with name: "dataset"
Then Wait till connection data loading completes with a timeout of 60 seconds
If it is not needed, please remove it. Also, what happens if the data loads before 60 seconds: does the step wait until the full minute is over, or does it move on to the next step?
This is already used in the use-connection scenarios for BQ.
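On the timeout semantics: explicit waits in Selenium (which these step definitions presumably wrap) poll the condition and return as soon as it holds, so the 60 seconds is an upper bound, not a fixed delay. A minimal sketch, assuming a WebDriver UI layer and a hypothetical loading-indicator locator:

import java.time.Duration;
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.support.ui.ExpectedConditions;
import org.openqa.selenium.support.ui.WebDriverWait;

public class ConnectionDataWait {
  // Hypothetical locator for the connection browser's loading indicator.
  private static final By LOADING_INDICATOR = By.cssSelector("[data-cy='loading-indicator']");

  public static void waitForConnectionData(WebDriver driver) {
    // Polls until the indicator disappears; returns early once loading completes,
    // and only throws TimeoutException if the full 60 seconds elapse.
    new WebDriverWait(driver, Duration.ofSeconds(60))
      .until(ExpectedConditions.invisibilityOfElementLocated(LOADING_INDICATOR));
  }
}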
Then Validate the data transferred from BigQuery to BigQuery with actual And expected file for: "bqUpsertExpectedFile"

@BQ_NULL_MODE_SOURCE_TEST @BQ_SINK_TEST @EXISTING_BQ_CONNECTION
Scenario: Validate Successful record transfer from BigQuery source plugin to BigQuery sink plugin with all null values in one column and few null values in different column.
Please change it to "Validate successful record transfer from BigQuery source plugin to BigQuery sink plugin having all null values in one column and few null values in another column in the source table."
Changed to the above.
Then Validate the values of records transferred to BQ sink is equal to the values from source BigQuery table

@BQ_UPDATE_SOURCE_DEDUPE_TEST @BQ_UPDATE_SINK_DEDUPE_TEST @EXISTING_BQ_CONNECTION
Scenario: Verify successful record transfer from BigQuery source to BigQuery sink using advance operation update with dedupe property.
Change it to 'Dedupe By property'.
Changed to 'Dedupe By property'.
Then Validate "BigQuery" plugin properties | ||
And Close the Plugin Properties page | ||
Then Navigate to the properties page of plugin: "BigQuery2" | ||
Then Click plugin property: "useConnection" |
Why are we using different values in the step definitions for selecting the connection in source and sink? This is present in every scenario.
This is already used in the use-connection scenarios for BQ.
@bharatgulati If switch-useConnection is used in the sink, will it work seamlessly? I believe you have used these before in both source and sink for the recently merged GCS tests.
Then Wait till connection data loading completes with a timeout of 60 seconds
Then Verify input plugin property: "dataset" contains value: "dataset"
Then Enter input plugin property: "table" with value: "bqTargetTable"
Then Enter input plugin property: "partitionByField" with value: "bqPartitionFieldTime"
This value is not present in pluginParam.properties; is it an old key/value that was previously used?
This value is an old key/value.
Then Validate the data transferred from BigQuery to BigQuery with actual And expected file for: "bqTimeStampExpectedFile"

@BQ_UPSERT_DEDUPE_SOURCE_TEST @BQ_UPSERT_DEDUPE_SINK_TEST @EXISTING_BQ_CONNECTION
Scenario: Validate successful records transfer from BigQuery source to BigQuery sink with Upsert operation with dedupe source data and destination table already exists and update table schema is false
change to " Validate successful records transfer from BigQuery source to BigQuery sink with Upsert operation with dedupe source data and existing destination table where update table schema is set to false"
Changed
Then Click on the Get Schema button
Then Click on the Validate button
Then Close the Plugin Properties page
Then Navigate to the properties page of plugin: "BigQuery3"
Why are we not following the above steps for BigQuery2?
Added the BigQuery2 source with the hook.
Added the BigQuery2 hook.
*
 * @param json The JSON object to search for the ID key.
 */
private static String getIdKey(JsonObject json) {
Why is this method getIdKey() needed? @bharatgulati Can you please look at this?
We used this earlier in the Wrangler and BQ 5 scenarios as well; that PR got merged.
Delete this file from testData; please avoid these types of mistakes.
Deleted the file.
Then Open and capture logs
Then Close the pipeline logs
Then Verify the pipeline status is "Succeeded"
Then Validate the data transferred from BigQuery to BigQuery with actual And expected file for: "bqTimeStampExpectedFile"
Since the test is for the 'Date' datatype, use a relevant date-related parameter value, and similarly use Datetime and Timestamp values for the other tests to be added.
Renamed the files for Date, Datetime, and Timestamp.
Reviewed; please look into the comments.
Then Validate "BigQuery" plugin properties | ||
And Close the Plugin Properties page | ||
Then Navigate to the properties page of plugin: "BigQuery2" | ||
Then Click plugin property: "useConnection" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@bharatgulati If switch-useConnection is used in Sink will it work seamlessly ? I believe you have used these before in both source and sink for recent merged GCS tests .
dedupeBy=DESC
TableKeyDedupe=Name
Directive_Drop=testdata/BigQuery/test_diffschema_record-cdap-data-pipeline.json
bqUpsertDedupeFile=testdata/BigQuery/BQUpsertDedupeFile
bqDifferentRecordFile=testdata/BigQuery/BQDifferentRecordNameFile
bqDateExpectedFile=testdata/BigQuery/BQTimeStampFile
Use the proper naming convention; change the file name from 'BQTimeStampFile' to 'BQDateFile'.
Resolved; changed the name to testdata/BigQuery/BQDateFile.
-@BQ_TIME_STAMP_SOURCE_TEST @BQ_SINK_TEST @EXISTING_BQ_CONNECTION
-Scenario: Verify successful record transfer for the Insert operation from BigQuery source plugin to BigQuery sink with partition type Time.
+@BQ_TIME_SOURCE_TEST @BQ_SINK_TEST @EXISTING_BQ_CONNECTION
+Scenario: Verify successful record transfer for the Insert operation from BigQuery source plugin to BigQuery sink with partition type Time and partition field is date.
bqPartitionFieldTime is used in line 435, but the scenario is about the partition field being a date. Can you add a value like 'bqPartitionFieldDate' in the pluginParam properties and use it for this scenario?
Added the field bqPartitionFieldDate
@@ -270,7 +271,11 @@ public static void createTempSourceBQTable() throws IOException, InterruptedException
 }

 @After(order = 1, value = "@BQ_SOURCE_TEST or @BQ_PARTITIONED_SOURCE_TEST or @BQ_SOURCE_DATATYPE_TEST or " +
-  "@BQ_INSERT_SOURCE_TEST or @BQ_UPDATE_SINK_TEST")
+  "@BQ_INSERT_SOURCE_TEST or @BQ_UPDATE_SINK_TEST or @BQ_UPSERT_SOURCE_TEST or @BQ_UPSERT_SINK_TEST or " +
The Before and After hook counts do not match: there are 15 After hooks but only 13 Before hooks. Can you please reverify?
Reverified the After hooks.
This PR consists of scenarios from the ITN classes.