Issue 3/improve readme #9

Merged
merged 13 commits into from
Sep 19, 2018
101 changes: 98 additions & 3 deletions README.md
@@ -1,12 +1,107 @@
# Databricks Java Rest Client

-# How To Run Integration Tests
_This is a simple Java library that provides programmatic access to the [Databricks Rest Service](https://docs.databricks.com/api/latest/index.html)._


## API Overview

[![Javadocs](http://www.javadoc.io/badge/com.edmunds.databricks/databricks-rest-client.svg)](http://www.javadoc.io/doc/com.edmunds.databricks/databricks-rest-client)

This library implements only a subset of the functionality that the Databricks REST interface provides.
Functionality is added as users of this library need it.
The following endpoints are currently supported:

- Cluster Service

- Dbfs Service

- Job Service

- Library Service

- Workspace Service

Please see the Javadocs for each service for details on what
functionality is currently available.

If important functionality is missing, please create a GitHub issue.

## Examples
```java
public class MyClient {
  public static void main(String[] args) throws DatabricksRestException, IOException {
    // Construct a serviceFactory using token authentication
    DatabricksServiceFactory serviceFactory =
        DatabricksServiceFactory.Builder
            .createServiceFactoryWithTokenAuthentication("myToken", "myHost")
            .withMaxRetries(5)
            .withRetryInterval(10000L)
            .build();

    // Let's get our databricks job "myJob" and set maxRetries to 5
    JobDTO jobDTO = serviceFactory.getJobService().getJobByName("myJob");
    JobSettingsDTO jobSettingsDTO = jobDTO.getSettings();
    jobSettingsDTO.setMaxRetries(5);
    serviceFactory.getJobService().upsertJob(jobSettingsDTO, true);

    // Let's install a jar on a specific cluster
    LibraryDTO libraryDTO = new LibraryDTO();
    libraryDTO.setJar("s3://myBucket/myJar.jar");
    for (ClusterInfoDTO clusterInfoDTO : serviceFactory.getClusterService().list()) {
      if (clusterInfoDTO.getClusterName().equals("myCluster")) {
        serviceFactory.getLibraryService().install(clusterInfoDTO.getClusterId(), new LibraryDTO[]{libraryDTO});
      }
    }
  }
}
```
For more examples, take a look at the service tests.
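
The `withMaxRetries` and `withRetryInterval` settings in the example above configure how the HTTP client retries after I/O or timeout errors. As a rough illustration of that idea only, the sketch below retries a call a fixed number of times with a fixed pause between attempts. `RetrySketch` and `callWithRetries` are hypothetical names that are not part of this library, the interval is assumed to be in milliseconds, and the library's actual retry logic may differ.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

/** Hypothetical helper for illustration; not part of the library. */
final class RetrySketch {

  private RetrySketch() {
  }

  /**
   * Calls the action and retries after an IOException, up to maxRetries extra attempts,
   * sleeping retryIntervalMillis (assumed unit: milliseconds) between attempts.
   */
  static <T> T callWithRetries(Callable<T> action, int maxRetries, long retryIntervalMillis)
      throws Exception {
    if (maxRetries < 0) {
      throw new IllegalArgumentException("maxRetries must not be negative");
    }
    IOException lastFailure = null;
    for (int attempt = 0; attempt <= maxRetries; attempt++) {
      try {
        return action.call();
      } catch (IOException e) {
        lastFailure = e;
        if (attempt < maxRetries) {
          Thread.sleep(retryIntervalMillis);
        }
      }
    }
    // Every attempt failed, so surface the most recent I/O failure.
    throw lastFailure;
  }

  public static void main(String[] args) throws Exception {
    int[] calls = {0};
    String result = callWithRetries(() -> {
      calls[0]++;
      if (calls[0] < 3) {
        throw new IOException("simulated I/O failure");
      }
      return "ok";
    }, 5, 10L);
    System.out.println(result + " after " + calls[0] + " attempts");
  }
}
```

With these inputs the simulated call fails twice and succeeds on the third attempt, well within the five allowed retries.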

## Building, Installing and Running

### Getting Started and Prerequisites

- You will need Maven installed

### Building

*How to build the project locally:*
```mvn clean install```


## Unit Tests

There are currently no unit tests for this project. Our view is that the only testable
functionality is the integration between our client and an actual Databricks instance.
As such, we currently have only integration tests.


## Integration Tests
IMPORTANT: integration tests do not execute automatically as part of a build.
It is your responsibility (and that of the pull request reviewers) to make sure the integration tests
pass before merging code.

### Setup
You need to set the following environment variables in your .bash_profile:
```bash
export [email protected]
export DB_PASSWORD=mypassword
export DB_URL=my-databricks-account.databricks.com
export DB_TOKEN=my-token
```

In order for the integration tests to run, you must
have a valid token for the user in question.
Here is how to set it up: [Set up Tokens](https://docs.databricks.com/api/latest/authentication.html)
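
Before starting a test run, it can help to confirm that the required variables are actually set. The following is a minimal sketch of such a check; `IntegrationTestEnvCheck` and `findMissing` are hypothetical names, and how the integration tests themselves read these variables is not shown here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical pre-flight check; not part of the library or its tests. */
final class IntegrationTestEnvCheck {

  private IntegrationTestEnvCheck() {
  }

  /** Returns the names that are absent or blank in the given environment map. */
  static List<String> findMissing(Map<String, String> env, String... names) {
    List<String> missing = new ArrayList<>();
    for (String name : names) {
      String value = env.get(name);
      if (value == null || value.trim().isEmpty()) {
        missing.add(name);
      }
    }
    return missing;
  }

  public static void main(String[] args) {
    // DB_URL and DB_TOKEN are the variable names used in the setup block above.
    List<String> missing = findMissing(System.getenv(), "DB_URL", "DB_TOKEN");
    if (missing.isEmpty()) {
      System.out.println("Environment looks ready for integration tests.");
    } else {
      System.out.println("Missing environment variables: " + missing);
    }
  }
}
```

Passing the environment in as a map keeps the check easy to exercise without touching the real process environment.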


-# To execute the integration tests please run:
### Executing Integration Tests
    mvn clean install org.apache.maven.plugins:maven-failsafe-plugin:integration-test

## Deployment

Please see CONTRIBUTING.md for our release process.
As this is a library, there is no deployment operation needed.

## Contributing

Please read [CONTRIBUTING.md](CONTRIBUTING.md) for the process for merging code into master.
@@ -16,6 +16,9 @@

package com.edmunds.rest.databricks;

import com.edmunds.rest.databricks.restclient.DatabricksRestClient;
import com.edmunds.rest.databricks.restclient.DatabricksRestClientImpl;
import com.edmunds.rest.databricks.restclient.DatabricksRestClientImpl425;
import com.edmunds.rest.databricks.service.ClusterService;
import com.edmunds.rest.databricks.service.ClusterServiceImpl;
import com.edmunds.rest.databricks.service.DbfsService;
@@ -29,7 +32,9 @@


/**
 * Factory class for all other specific Databricks Service Wrappers.
 * This is the class that clients should interact with.
 * It provides singletons for all of the Services, as well as
 * abstracting the construction of the databricks rest client.
 */
public final class DatabricksServiceFactory {

@@ -43,16 +48,23 @@ public final class DatabricksServiceFactory {
  private JobService jobService;
  private DbfsService dbfsService;

  public DatabricksServiceFactory(DatabricksRestClient databricksRestClient) {
    this.client2dot0 = databricksRestClient;
  }

  @Deprecated
  public DatabricksServiceFactory(String username, String password, String host) {
    this(username, password, host, DEFAULT_HTTP_CLIENT_MAX_RETRY,
        DEFAULT_HTTP_CLIENT_RETRY_INTERVAL);
  }

  /**
   * Creates a Databricks Service object.
   *
   * @param maxRetry http client max retries after an I/O or timeout error
   * @param retryInterval http client retry interval after an I/O or timeout error
   */
  @Deprecated
  public DatabricksServiceFactory(String username, String password, String host, int maxRetry,
      long retryInterval) {
    this(username, password, host, maxRetry, retryInterval, false);
@@ -64,17 +76,35 @@ public DatabricksServiceFactory(String username, String password, String host, i
   *
   * @param useLegacyAPI425 if true, use the legacy (425) API-compatible HttpClient
   */
  @Deprecated
  public DatabricksServiceFactory(String username, String password, String host, int maxRetry,
      long retryInterval, boolean useLegacyAPI425) {
    if (useLegacyAPI425) {
-      client2dot0 = new DatabricksRestClientImpl425(username, password, host, "2.0", maxRetry,
-          retryInterval);
      client2dot0 = DatabricksRestClientImpl425
          .createClientWithUserPassword(username, password, host, "2.0", maxRetry,
              retryInterval);
    } else {
-      client2dot0 = new DatabricksRestClientImpl(username, password, host, "2.0", maxRetry,
-          retryInterval);
      client2dot0 = DatabricksRestClientImpl
          .createClientWithUserPassword(username, password, host, "2.0", maxRetry,
              retryInterval);
    }
  }

  /**
   * Create a databricks service factory using personal token authentication instead.
   *
   * @param personalToken your personal token
   * @param host the databricks host
   * @param maxRetry the maximum number of retries
   * @param retryInterval the retry interval between each attempt
   */
  @Deprecated
  public DatabricksServiceFactory(String personalToken, String host,
      int maxRetry, long retryInterval) {
    client2dot0 = DatabricksRestClientImpl
        .createClientWithTokenAuthentication(personalToken, host, "2.0", maxRetry, retryInterval);
  }

  /**
   * Will return a Databricks Cluster Service singleton.
   */
@@ -124,4 +154,85 @@ public DbfsService getDbfsService() {
    }
    return dbfsService;
  }

  /**
   * This is how the DatabricksServiceFactory should be constructed. This gives flexibility to add
   * more parameters later without ending up with large constructors.
   */
  public static class Builder {

    long retryInterval = DEFAULT_HTTP_CLIENT_RETRY_INTERVAL;
    int maxRetries = DEFAULT_HTTP_CLIENT_MAX_RETRY;
    String token;
    String host;
    String username;
    String password;

    private Builder() {
      //NO-OP
    }

    /**
     * Creates a DatabricksServiceFactory using token authentication.
     *
     * @param token your databricks token
     * @param host the databricks host where that token is valid
     * @return the builder object
     */
    public static Builder createServiceFactoryWithTokenAuthentication(String token, String host) {
      Builder builder = new Builder();
      builder.token = token;
      builder.host = host;
      return builder;
    }

    /**
     * Creates a DatabricksServiceFactory using username password authentication.
     *
     * @param username databricks username
     * @param password databricks password
     * @param host the databricks host
     * @return the builder object
     */
    public static Builder createServiceFactoryWithUserPasswordAuthentication(String username,
        String password, String host) {
      Builder builder = new Builder();
      builder.username = username;
      builder.password = password;
      builder.host = host;
      return builder;
    }

    public Builder withMaxRetries(int maxRetries) {
      this.maxRetries = maxRetries;
      return this;
    }

    public Builder withRetryInterval(long retryInterval) {
      this.retryInterval = retryInterval;
      return this;
    }

    /**
     * Builds a DatabricksServiceFactory. Conducts basic validation.
     *
     * @return the databricks service factory object
     */
    public DatabricksServiceFactory build() {
      if (token != null) {
        return new DatabricksServiceFactory(
            DatabricksRestClientImpl
                .createClientWithTokenAuthentication(token, host, "2.0", maxRetries, retryInterval)
        );
      } else if (username != null && password != null) {
        return new DatabricksServiceFactory(
            DatabricksRestClientImpl
                .createClientWithUserPassword(username, password, host, "2.0", maxRetries,
                    retryInterval)
        );
      } else {
        throw new IllegalArgumentException("Token or username/password must be set!");
      }
    }
  }
}
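
The branch order in `build()` above means a token takes precedence, username/password is the fallback, and anything else is rejected. The self-contained sketch below mirrors only that decision logic so it can be run without a Databricks instance. `AuthBuilderSketch` and its methods are hypothetical stand-ins, it returns a description string instead of a real factory, and its default retry count is a placeholder because the value of `DEFAULT_HTTP_CLIENT_MAX_RETRY` is not shown in this diff.

```java
/** Hypothetical stand-in that mirrors the Builder's branch logic; not the library class. */
final class AuthBuilderSketch {

  private String token;
  private String username;
  private String password;
  private String host;
  // Placeholder default; the library's real default is not shown in this diff.
  private int maxRetries = 3;

  private AuthBuilderSketch() {
  }

  static AuthBuilderSketch withToken(String token, String host) {
    AuthBuilderSketch builder = new AuthBuilderSketch();
    builder.token = token;
    builder.host = host;
    return builder;
  }

  static AuthBuilderSketch withUserPassword(String username, String password, String host) {
    AuthBuilderSketch builder = new AuthBuilderSketch();
    builder.username = username;
    builder.password = password;
    builder.host = host;
    return builder;
  }

  AuthBuilderSketch withMaxRetries(int maxRetries) {
    this.maxRetries = maxRetries;
    return this;
  }

  /** Picks the authentication mode in the same order as the build() method in the diff. */
  String build() {
    if (token != null) {
      return "token auth against " + host + " (maxRetries=" + maxRetries + ")";
    } else if (username != null && password != null) {
      return "user/password auth against " + host + " (maxRetries=" + maxRetries + ")";
    } else {
      throw new IllegalArgumentException("Token or username/password must be set!");
    }
  }

  public static void main(String[] args) {
    System.out.println(withToken("myToken", "myHost").withMaxRetries(5).build());
    try {
      // A null token falls through both branches and is rejected.
      withToken(null, "myHost").build();
    } catch (IllegalArgumentException e) {
      System.out.println("rejected: " + e.getMessage());
    }
  }
}
```

As in the `build()` method shown in the diff, the sketch checks only the credentials; `host` is passed through without validation.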
5 changes: 4 additions & 1 deletion src/main/java/com/edmunds/rest/databricks/JobRunner.java
@@ -83,7 +83,10 @@ private JobService getService() {
    String password = parser.getPassword();
    String hostname = parser.getHostname();

-    DatabricksServiceFactory factory = new DatabricksServiceFactory(username, password, hostname);
    DatabricksServiceFactory factory =
        DatabricksServiceFactory.Builder
            .createServiceFactoryWithUserPasswordAuthentication(username, password, hostname)
            .build();
    return factory.getJobService();
  }

@@ -25,9 +25,8 @@

/**
 * A Cluster Request object.
- * Should be deprecated in favor of using DTO objects.
 * TODO Should be deprecated in favor of using DTO objects.
 */
-@Deprecated
public class CreateClusterRequest extends DatabricksRestRequest {

  private CreateClusterRequest(Map<String, Object> data) {
@@ -21,7 +21,6 @@
/**
 * Base class for Request Objects.
 */
-@Deprecated
public abstract class DatabricksRestRequest {

  private Map<String, Object> data;
@@ -25,9 +25,8 @@

/**
 * Edit Cluster Request object.
- * Should be deprecated in favor of using DTOs.
 * TODO Should be deprecated in favor of using DTOs.
 */
-@Deprecated
public class EditClusterRequest extends DatabricksRestRequest {

  private EditClusterRequest(Map<String, Object> data) {
@@ -22,9 +22,8 @@

/**
 * Export Workspace Request object.
- * Should be deprecated in favor of using DTOs.
 * TODO Should be deprecated in favor of using DTOs.
 */
-@Deprecated
public class ExportWorkspaceRequest extends DatabricksRestRequest {

  private ExportWorkspaceRequest(Map<String, Object> data) {
@@ -23,9 +23,8 @@

/**
 * Import Workspace Request object.
- * Deprecated in favor of using DTOs
 * TODO Should be deprecated in favor of using DTOs
 */
-@Deprecated
public class ImportWorkspaceRequest extends DatabricksRestRequest {

  private ImportWorkspaceRequest(Map<String, Object> data) {