Simplify acceptance tests #106
base: main
@@ -40,7 +40,11 @@ To build the connector jar:

 ### Prerequisite

-Make sure you have the BigQuery Storage API enabled in your GCP project. Follow [these instructions](https://cloud.google.com/bigquery/docs/reference/storage/#enabling_the_api).
+Enable the BigQuery Storage API for your project:
+
+```sh
+gcloud services enable bigquerystorage.googleapis.com
+```

 ### Option 1: connectors init action

@@ -614,19 +618,19 @@ There are multiple options to override the default behavior and to provide custo

for specific users, specific groups, or for all users that run the Hive query by default using
the below properties:

- `bq.impersonation.service.account.for.user.<USER_NAME>` (not set by default)

  The service account to be impersonated for a specific user. You can specify multiple
  properties using that pattern for multiple users.

- `bq.impersonation.service.account.for.group.<GROUP_NAME>` (not set by default)

  The service account to be impersonated for a specific group. You can specify multiple
  properties using that pattern for multiple groups.

- `bq.impersonation.service.account` (not set by default)

  Default service account to be impersonated for all users.

If any of the above properties are set, then the specified service account will be impersonated by
generating short-lived credentials when accessing BigQuery.

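For illustration only, here is a minimal sketch of what these properties look like when set programmatically on a Hadoop `Configuration`. The property keys are the ones documented above; the user name, group name, and service-account emails are made-up placeholders, and in a real deployment the properties would more typically live in `hive-site.xml` or be set with Hive `SET` commands.

```java
// Sketch only: sets the impersonation properties described above on a plain Hadoop
// Configuration. The property keys come from the README text; the user, group, and
// service-account values are made-up placeholders.
import org.apache.hadoop.conf.Configuration;

public class ImpersonationPropertiesExample {
  public static void main(String[] args) {
    Configuration conf = new Configuration();

    // Impersonate a dedicated service account for the user "alice" (placeholder).
    conf.set(
        "bq.impersonation.service.account.for.user.alice",
        "alice-sa@my-project.iam.gserviceaccount.com");

    // Impersonate another service account for members of the "analysts" group (placeholder).
    conf.set(
        "bq.impersonation.service.account.for.group.analysts",
        "analysts-sa@my-project.iam.gserviceaccount.com");

    // Default service account impersonated for all other users (placeholder).
    conf.set("bq.impersonation.service.account", "default-sa@my-project.iam.gserviceaccount.com");

    System.out.println(conf.get("bq.impersonation.service.account.for.user.alice"));
  }
}
```
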
@@ -773,29 +777,26 @@ You must use Java version 8, as it's the version that Hive itself uses. Make sur

Acceptance tests create Dataproc clusters with the connector and run jobs to verify it.

The following environment variables must be set and **exported** first.

* `GOOGLE_APPLICATION_CREDENTIALS` - the full path to a credentials JSON, either a service account or the result of a
  `gcloud auth login` run
* `GOOGLE_CLOUD_PROJECT` - The Google Cloud Platform project used to test the connector
* `TEST_BUCKET` - The GCS bucket used to test writing to BigQuery during the integration tests
* `ACCEPTANCE_TEST_BUCKET` - The GCS bucket used to test writing to BigQuery during the acceptance tests

> **Review comment:** Why are they removed?

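As a point of reference for the variables listed above, the acceptance-test code validates required environment variables with Guava's `Preconditions.checkNotNull` (the same pattern that appears in `AcceptanceTestConstants` further down in this diff). The small helper below is only an illustrative sketch; the class and method names are made up.

```java
// Illustrative sketch of reading the required environment variables, mirroring the
// Preconditions.checkNotNull pattern used in AcceptanceTestConstants later in this diff.
import com.google.common.base.Preconditions;

public class RequiredEnvExample {
  static String require(String name) {
    return Preconditions.checkNotNull(
        System.getenv(name), "Please set the '%s' environment variable", name);
  }

  public static void main(String[] args) {
    String project = require("GOOGLE_CLOUD_PROJECT");
    String credentialsPath = require("GOOGLE_APPLICATION_CREDENTIALS");
    System.out.println("Using credentials at " + credentialsPath + " for project " + project);
  }
}
```
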
To run the acceptance tests:

-```sh
-./mvnw verify -Pdataproc21,acceptance
-```
+1. Enable the Dataproc API for your project:
+   ```sh
+   gcloud services enable dataproc.googleapis.com
+   ```
+2. Run the tests:
+   ```sh
+   ./mvnw verify -Pdataproc21,acceptance
+   ```

-If you want to avoid rebuilding `shaded-dependencies` and `shaded-test-dependencies` when there is no changes in these
+If you want to avoid rebuilding `shaded-dependencies` and `shaded-acceptance-tests-dependencies` when there are no changes in these
modules, you can break it down into several steps, and only rerun the necessary steps:

```sh
# Install hive-bigquery-parent/pom.xml to Maven local repo
mvn install:install-file -Dpackaging=pom -Dfile=hive-bigquery-parent/pom.xml -DpomFile=hive-bigquery-parent/pom.xml

-# Build and install shaded-dependencies and shaded-test-dependencies jars to Maven local repo
-mvn clean install -pl shaded-dependencies,shaded-test-dependencies -Pdataproc21 -DskipTests
+# Build and install shaded-deps-dataproc21 and shaded-acceptance-tests-dependencies jars to Maven local repo
+mvn clean install -pl shaded-deps-dataproc21,shaded-acceptance-tests-dependencies -Pdataproc21 -DskipTests

# Build and test connector
mvn clean verify -pl connector -Pdataproc21,acceptance
```

@@ -46,7 +46,7 @@ public class TestUtils {

   public static final String MANAGED_TEST_TABLE_NAME = "managed_test";
   public static final String FIELD_TIME_PARTITIONED_TABLE_NAME = "field_time_partitioned";
   public static final String INGESTION_TIME_PARTITIONED_TABLE_NAME = "ingestion_time_partitioned";
-  public static final String TEST_BUCKET_ENV_VAR = "TEST_BUCKET";
+  public static final String INTEGRATION_BUCKET_ENV_VAR = "INTEGRATION_BUCKET";

> **Review comment:** You might want to document this env variable in README with its default value.

   // The BigLake bucket and connection must be created before running the tests.
   // Also, the connection's service account must be given permission to access the bucket.

@@ -211,8 +211,9 @@ public static String getBigLakeBucket() {

   * Returns the name of the bucket used to store temporary Avro files when testing the indirect
   * write method. This bucket is created automatically when running the tests.
   */
-  public static String getTestBucket() {
-    return System.getenv().getOrDefault(TEST_BUCKET_ENV_VAR, getProject() + "-integration-tests");
+  public static String getIntegrationTestBucket() {
+    return System.getenv()
+        .getOrDefault(INTEGRATION_BUCKET_ENV_VAR, getProject() + "-integration-tests");
  }

  public static void createBqDataset(String dataset) {

@@ -269,7 +270,15 @@ private static Storage getStorageClient() {

  }

  public static void createBucket(String bucketName) {
-    getStorageClient().create(BucketInfo.newBuilder(bucketName).setLocation(LOCATION).build());
+    try {
+      getStorageClient().create(BucketInfo.newBuilder(bucketName).setLocation(LOCATION).build());
+    } catch (StorageException e) {
+      if (e.getCode() == 409) {
+        // The bucket already exists, which is okay.
+        return;
+      }
+      throw e;
+    }
  }

  public static void uploadBlob(String bucketName, String objectName, byte[] contents) {

@@ -15,17 +15,13 @@

  */
 package com.google.cloud.hive.bigquery.connector.acceptance;

-import com.google.common.base.Preconditions;
 import org.apache.parquet.Strings;

 public class AcceptanceTestConstants {

   public static final String REGION = "us-west1";
   public static final String DATAPROC_ENDPOINT = REGION + "-dataproc.googleapis.com:443";
-  public static final String PROJECT_ID =
-      Preconditions.checkNotNull(
-          System.getenv("GOOGLE_CLOUD_PROJECT"),
-          "Please set the 'GOOGLE_CLOUD_PROJECT' environment variable");
+  public static final String ACCEPTANCE_BUCKET_ENV_VAR = "ACCEPTANCE_BUCKET";

> **Review comment:** Ditto (document this env variable in the README with its default value as well).

   public static final boolean CLEAN_UP_CLUSTER =
       Strings.isNullOrEmpty(System.getenv("CLEAN_UP_CLUSTER"))

> **Review comment:** Many projects have a `:` in their names, e.g., `google.com:project`, which makes this default bucket name invalid. At least, consider automatically converting `:` to `_` or `-` here.
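A possible way to act on this suggestion is sketched below; the helper name, the choice of `-` as the replacement character, and where such a helper would be called from are assumptions, not part of this PR.

```java
// Sketch only: derive a default bucket name from a project ID that may contain a ':'
// (e.g. "google.com:project"), as suggested in the review comment above.
public class DefaultBucketNameExample {
  static String defaultBucketName(String projectId, String suffix) {
    // GCS bucket names may not contain ':', so replace it with '-' before appending the suffix.
    return projectId.replace(':', '-') + suffix;
  }

  public static void main(String[] args) {
    System.out.println(defaultBucketName("google.com:project", "-integration-tests"));
    // Prints: google.com-project-integration-tests
  }
}
```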