
Refactor java/scala templates to maven/sbt instead
GezimSejdiu committed Nov 23, 2021
1 parent a248a2f commit 020bcc3
Showing 11 changed files with 24 additions and 23 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/build.yml
@@ -130,7 +130,7 @@ jobs:
strategy:
fail-fast: false
matrix:
template: [java, scala, python]
template: [maven, sbt, python]

needs: 'submit'
steps:
8 changes: 4 additions & 4 deletions README.md
@@ -90,18 +90,18 @@ Make sure to fill in the `INIT_DAEMON_STEP` as configured in your pipeline.
### Spark Master
To start a Spark master:

docker run --name spark-master -h spark-master -e ENABLE_INIT_DAEMON=false -d bde2020/spark-master:3.1.1-hadoop3.2
docker run --name spark-master -h spark-master -d bde2020/spark-master:3.1.1-hadoop3.2

### Spark Worker
To start a Spark worker:

docker run --name spark-worker-1 --link spark-master:spark-master -e ENABLE_INIT_DAEMON=false -d bde2020/spark-worker:3.1.1-hadoop3.2
docker run --name spark-worker-1 --link spark-master:spark-master -d bde2020/spark-worker:3.1.1-hadoop3.2
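
With both containers defined, the same pair can also be run via Docker Compose. The sketch below is an illustration only, not a file from this repository: the published ports and the `SPARK_MASTER` variable are assumptions about these images, noted in the comments.

```
# Hypothetical docker-compose.yml mirroring the two docker run commands above.
version: "3"
services:
  spark-master:
    image: bde2020/spark-master:3.1.1-hadoop3.2
    container_name: spark-master
    hostname: spark-master          # same as "-h spark-master" above
    ports:
      - "8080:8080"                 # master web UI (standard Spark port, assumed)
      - "7077:7077"                 # master RPC port the workers connect to
  spark-worker-1:
    image: bde2020/spark-worker:3.1.1-hadoop3.2
    container_name: spark-worker-1
    depends_on:
      - spark-master                # stands in for the --link used above
    environment:
      - "SPARK_MASTER=spark://spark-master:7077"   # assumed variable name
```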

## Launch a Spark application
Building and running your Spark application on top of the Spark cluster is as simple as extending a template Docker image. Check the template's README for further documentation.
* [Java template](template/java)
* [Maven template](template/maven)
* [Python template](template/python)
* [Scala template](template/scala)
* [Sbt template](template/sbt)

## Kubernetes deployment
The BDE Spark images can also be used in a Kubernetes environment.
4 changes: 2 additions & 2 deletions build.sh
@@ -20,8 +20,8 @@ if [ $# -eq 0 ]
build worker
build history-server
build submit
build java-template template/java
build scala-template template/scala
build maven-template template/maven
build sbt-template template/sbt
build python-template template/python

build python-example examples/python
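
The `build` helper these lines call is defined earlier in build.sh, outside this hunk. A rough, hypothetical sketch of what such a helper typically does is shown below; the tag prefix, version suffix and directory handling are assumptions, not the script's actual code.

```
# Hypothetical sketch of the build helper used above -- not the real definition.
build() {
    local name=$1          # e.g. maven-template
    local dir=${2:-$1}     # optional build context, defaults to ./<name>
    docker build --rm=true -t "bde2020/spark-$name:3.1.1-hadoop3.2" "$dir"
}
```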
1 change: 1 addition & 0 deletions template/java/Dockerfile → template/maven/Dockerfile
@@ -3,6 +3,7 @@ FROM bde2020/spark-submit:3.1.1-hadoop3.2
LABEL maintainer="Gezim Sejdiu <[email protected]>, Giannis Mouchakis <[email protected]>"

ENV SPARK_APPLICATION_JAR_NAME application-1.0
ENV SPARK_APPLICATION_JAR_LOCATION /app/application.jar

COPY template.sh /

18 changes: 9 additions & 9 deletions template/java/README.md → template/maven/README.md
@@ -1,17 +1,17 @@
# Spark Java template
# Spark Maven template

The Spark Java template image serves as a base image to build your own Java application to run on a Spark cluster. See [big-data-europe/docker-spark README](https://github.com/big-data-europe/docker-spark) for a description of how to set up a Spark cluster.
The Spark Maven template image serves as a base image to build your own Maven application to run on a Spark cluster. See [big-data-europe/docker-spark README](https://github.com/big-data-europe/docker-spark) for a description of how to set up a Spark cluster.

### Package your application using Maven
You can build and launch your Java application on a Spark cluster by extending this image with your sources. The template uses [Maven](https://maven.apache.org/) as its build tool, so make sure you have a `pom.xml` file for your application specifying all the dependencies.
You can build and launch your Maven application on a Spark cluster by extending this image with your sources. The template uses [Maven](https://maven.apache.org/) as its build tool, so make sure you have a `pom.xml` file for your application specifying all the dependencies.

The Maven `package` command must create an assembly JAR (or 'uber' JAR) containing your code and its dependencies. Spark and Hadoop dependencies should be listed as `provided`. The [Maven shade plugin](http://maven.apache.org/plugins/maven-shade-plugin/) can be used to build such assembly JARs.
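
For orientation, a minimal sketch of the relevant `pom.xml` fragment is shown below. It is an assumption-based illustration, not the template's actual build file: the Scala suffix, versions and plugin wiring are placeholders; only the `application-1.0` final name mirrors the `SPARK_APPLICATION_JAR_NAME` default set in the template's Dockerfile.

```
<!-- Hypothetical pom.xml fragment: Spark as provided, shade plugin builds the uber JAR. -->
<dependencies>
  <dependency>
    <groupId>org.apache.spark</groupId>
    <artifactId>spark-core_2.12</artifactId>
    <version>3.1.1</version>
    <scope>provided</scope> <!-- supplied by the cluster at runtime -->
  </dependency>
</dependencies>
<build>
  <finalName>application-1.0</finalName> <!-- matches SPARK_APPLICATION_JAR_NAME -->
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-shade-plugin</artifactId>
      <version>3.2.4</version>
      <executions>
        <execution>
          <phase>package</phase>
          <goals><goal>shade</goal></goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```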

### Extending the Spark Java template with your application
### Extending the Spark Maven template with your application

#### Steps to extend the Spark Java template
#### Steps to extend the Spark Maven template
1. Create a Dockerfile in the root folder of your project (which also contains a `pom.xml`)
2. Extend the Spark Java template Docker image
2. Extend the Spark Maven template Docker image
3. Configure the following environment variables (unless the default value suffices):
* `SPARK_MASTER_NAME` (default: spark-master)
* `SPARK_MASTER_PORT` (default: 7077)
@@ -21,10 +21,10 @@ The Maven `package` command must create an assembly JAR (or 'uber' JAR) containi
4. Build and run the image
```
docker build --rm=true -t bde/spark-app .
docker run --name my-spark-app -e ENABLE_INIT_DAEMON=false --link spark-master:spark-master -d bde/spark-app
docker run --name my-spark-app --link spark-master:spark-master -d bde/spark-app
```

The sources in the project folder will be automatically added to `/usr/src/app` if you directly extend the Spark Java template image. Otherwise you will have to add and package the sources yourself in your Dockerfile with the following commands:
The sources in the project folder will be automatically added to `/usr/src/app` if you directly extend the Spark Maven template image. Otherwise you will have to add and package the sources yourself in your Dockerfile with the following commands:

COPY . /usr/src/app
RUN cd /usr/src/app \
@@ -34,7 +34,7 @@ If you overwrite the template's `CMD` in your Dockerfile, make sure to execute the `/template.sh` script at the end.

#### Example Dockerfile
```
FROM bde2020/spark-java-template:3.1.1-hadoop3.2
FROM bde2020/spark-maven-template:3.1.1-hadoop3.2
MAINTAINER Erika Pauwels <[email protected]>
MAINTAINER Gezim Sejdiu <[email protected]>
File renamed without changes.
File renamed without changes.
14 changes: 7 additions & 7 deletions template/scala/README.md → template/sbt/README.md
@@ -1,6 +1,6 @@
# Spark Scala template
# Spark SBT template

The Spark Scala template image serves as a base image to build your own Scala
The Spark SBT template image serves as a base image to build your own Scala
application to run on a Spark cluster. See
[big-data-europe/docker-spark README](https://github.com/big-data-europe/docker-spark)
for a description of how to set up a Spark cluster.
@@ -11,7 +11,7 @@ for a description of how to set up a Spark cluster.
spark-shell:

```
docker run -it --rm bde2020/spark-scala-template sbt console
docker run -it --rm bde2020/spark-sbt-template sbt console
```

You can also use your Docker image directly and test your own code that way.
@@ -29,9 +29,9 @@ When the Docker image is built using this template, you should get a Docker
image that includes a fat JAR containing your application and all its
dependencies.
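
As a rough illustration of such a build definition (the versions, the assembly-plugin wiring and the main class below are assumptions, not the template's actual `build.sbt`), Spark is typically marked `provided` and a plugin such as [sbt-assembly](https://github.com/sbt/sbt-assembly) produces the fat JAR:

```
// Hypothetical build.sbt sketch: Spark marked "provided", sbt-assembly builds the fat JAR.
name := "application"
version := "1.0"
scalaVersion := "2.12.12"

libraryDependencies ++= Seq(
  "org.apache.spark" %% "spark-core" % "3.1.1" % "provided",
  "org.apache.spark" %% "spark-sql"  % "3.1.1" % "provided"
)

// project/plugins.sbt would additionally contain something like:
// addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "0.15.0")
assembly / mainClass := Some("com.example.Main")  // placeholder main class
```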

### Extending the Spark Scala template with your application
### Extending the Spark SBT template with your application

#### Steps to extend the Spark Scala template
#### Steps to extend the Spark SBT template

1. Create a Dockerfile in the root folder of your project (which also contains
a `build.sbt`)
@@ -45,7 +45,7 @@ dependencies.
4. Build and run the image:
```
docker build --rm=true -t bde/spark-app .
docker run --name my-spark-app -e ENABLE_INIT_DAEMON=false --link spark-master:spark-master -d bde/spark-app
docker run --name my-spark-app --link spark-master:spark-master -d bde/spark-app
```

The sources in the project folder will be automatically added to `/usr/src/app`
@@ -62,7 +62,7 @@ the `/template.sh` script at the end.
#### Example Dockerfile

```
FROM bde2020/spark-scala-template:3.1.1-hadoop3.2
FROM bde2020/spark-sbt-template:3.1.1-hadoop3.2
MAINTAINER Cecile Tonglet <[email protected]>
File renamed without changes.
File renamed without changes.
File renamed without changes.
