-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
accumulo-proxy #22 - Reduce Docker image size #23
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -13,20 +13,18 @@ | |
# See the License for the specific language governing permissions and | ||
# limitations under the License. | ||
|
||
FROM openjdk:8 | ||
FROM openjdk:8-alpine3.9 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. How did you select alpine? I was looking at the openjdk tags on dockerhub to see what else was small and saw There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't think I spotted it to be fair, it's about 3M smaller so might be worth it and as you say it seems to be getting kept up to date. Note, all values here unless specified otherwise are from the Docker command. I did a quick rebuild with the new one and this is the sizes:
So it was actually larger to use the smaller base image, not what I expected, I did have to add wget and removed the installation of bash from the current Dockerfile as it was already present in 8-jre-slim On disk the pulled images were different to dockerhub values
Looking into it, it seems that dockerhub reports the compressed size, not the size on disk once you pull the image. I did the comparison for the final image sizes:
I tested both builds so they both function the same, perhaps taking the 80MB increase is worth it for being up to date? What are your thoughts, stick with what we have or update to 8-jre-slim? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I am in favor of 8-jre-slim because its more up to date and its based on Debian which I think a larger base of contributors and users may be familiar with. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Very nice analysis and write up of the analysis. |
||
|
||
EXPOSE 42424 | ||
|
||
WORKDIR /opt/accumulo-proxy | ||
|
||
ARG HADOOP_VERSION=3.2.1 | ||
ARG ZOOKEEPER_VERSION=3.5.7 | ||
ARG ACCUMULO_VERSION=2.0.0 | ||
ARG ACCUMULO_PROXY_VERSION=2.0.0-SNAPSHOT | ||
|
||
ARG HADOOP_HASH=a57962a24d178193349917730bf95cdc99bde9df | ||
ARG ZOOKEEPER_HASH=619928c8553b62775119e3d7d143a4714a160365 | ||
ARG ACCUMULO_HASH=b72bf5c3dcaa25387933a032925046234f30e17a | ||
ARG HADOOP_SHA512_HASH=d62709c3d7144fcaafc60e18d0fa03d7d477cc813e45526f3646030cd87dbf010aeccf3f4ce795b57b08d2884b3a55f91fe9d74ac144992d2dfe444a4bbf34ee | ||
ARG ZOOKEEPER_SHA512_HASH=b9baa1ecb3d4dc0ef648ce7c74da4c5267ee89534c7614b8f27d3b0bc52004dcfbb8cecec810ffb7c8c45053daf8a5e849ce60ba241280fa1e2ab1d3b4990494 | ||
ARG ACCUMULO_SHA512_HASH=1e2b822e0fd6ba5293b09203eb0c5cc230e9f111361634b4d5665b0ddd2b28f42d76699cb08aaeff9b3242efd5fe369bfc871a7dc361e935980889bcb7b4568f | ||
|
||
# Download from Apache mirrors instead of archive #9 | ||
ENV APACHE_DIST_URLS \ | ||
|
@@ -36,43 +34,55 @@ ENV APACHE_DIST_URLS \ | |
https://www.apache.org/dist/ \ | ||
https://archive.apache.org/dist/ | ||
|
||
ENV HADOOP_HOME /opt/hadoop | ||
ENV ZOOKEEPER_HOME /opt/apache-zookeeper | ||
ENV ACCUMULO_HOME /opt/accumulo | ||
|
||
RUN set -eux; \ | ||
download_bin() { \ | ||
local f="$1"; shift; \ | ||
local hash="$1"; shift; \ | ||
local distFile="$1"; shift; \ | ||
download_verify_and_extract() { \ | ||
local expectedHash="$1"; \ | ||
local distFile="$2"; \ | ||
local extractPath="$3"; \ | ||
local symlinkPath="$4"; \ | ||
local success=; \ | ||
local distUrl=; \ | ||
for distUrl in ${APACHE_DIST_URLS}; do \ | ||
if wget -nv -O "/tmp/${f}" "${distUrl}${distFile}"; then \ | ||
success=1; \ | ||
if wget -nv -O "/tmp/download.tar.gz" "${distUrl}${distFile}"; then \ | ||
# Checksum the download | ||
echo "${hash}" "/tmp/${f}" | sha1sum -c -; \ | ||
echo "${expectedHash} /tmp/download.tar.gz" | sha512sum -c -; \ | ||
# Extract the download | ||
mkdir "${extractPath}"; \ | ||
tar xzf "/tmp/download.tar.gz" -C "${extractPath}" --strip 1;\ | ||
# Symlink | ||
ln -s "${extractPath}" "${symlinkPath}"; \ | ||
# Tidy up the download | ||
rm -f "/tmp/download.tar.gz"; \ | ||
# Set success now we've done all our checks and tidied up | ||
success=1; \ | ||
break; \ | ||
fi; \ | ||
done; \ | ||
[ -n "${success}" ]; \ | ||
};\ | ||
\ | ||
download_bin "apache-zookeeper.tar.gz" "${ZOOKEEPER_HASH}" "zookeeper/zookeeper-${ZOOKEEPER_VERSION}/apache-zookeeper-${ZOOKEEPER_VERSION}-bin.tar.gz"; \ | ||
download_bin "hadoop.tar.gz" "$HADOOP_HASH" "hadoop/core/hadoop-${HADOOP_VERSION}/hadoop-$HADOOP_VERSION.tar.gz"; \ | ||
download_bin "accumulo.tar.gz" "${ACCUMULO_HASH}" "accumulo/${ACCUMULO_VERSION}/accumulo-${ACCUMULO_VERSION}-bin.tar.gz"; | ||
download_verify_and_extract "${ZOOKEEPER_SHA512_HASH}" "zookeeper/zookeeper-${ZOOKEEPER_VERSION}/apache-zookeeper-${ZOOKEEPER_VERSION}-bin.tar.gz" "/opt/apache-zookeeper-${ZOOKEEPER_VERSION}" "${ZOOKEEPER_HOME}"; \ | ||
download_verify_and_extract "${HADOOP_SHA512_HASH}" "hadoop/core/hadoop-${HADOOP_VERSION}/hadoop-$HADOOP_VERSION.tar.gz" "/opt/hadoop-${HADOOP_VERSION}" "${HADOOP_HOME}"; \ | ||
download_verify_and_extract "${ACCUMULO_SHA512_HASH}" "accumulo/${ACCUMULO_VERSION}/accumulo-${ACCUMULO_VERSION}-bin.tar.gz" "/opt/accumulo-${ACCUMULO_VERSION}" "${ACCUMULO_HOME}" ; | ||
|
||
# Install the dependencies into /opt/ | ||
RUN tar xzf /tmp/hadoop.tar.gz -C /opt/ && ln -s /opt/hadoop-${HADOOP_VERSION} /opt/hadoop | ||
RUN tar xzf /tmp/apache-zookeeper.tar.gz -C /opt/ && ln -s /opt/apache-zookeeper-${ZOOKEEPER_VERSION}-bin /opt/apache-zookeeper | ||
RUN tar xzf /tmp/accumulo.tar.gz -C /opt/ && ln -s /opt/accumulo-${ACCUMULO_VERSION} /opt/accumulo && sed -i 's/\${ZOOKEEPER_HOME}\/\*/\${ZOOKEEPER_HOME}\/\*\:\${ZOOKEEPER_HOME}\/lib\/\*/g' /opt/accumulo/conf/accumulo-env.sh | ||
# Fix the ZooKeeper classpath for Accumulo | ||
RUN sed -i 's/\${ZOOKEEPER_HOME}\/\*/\${ZOOKEEPER_HOME}\/\*\:\${ZOOKEEPER_HOME}\/lib\/\*/g' /opt/accumulo/conf/accumulo-env.sh | ||
|
||
ENV HADOOP_HOME /opt/hadoop | ||
ENV ZOOKEEPER_HOME /opt/apache-zookeeper | ||
ENV ACCUMULO_HOME /opt/accumulo | ||
# Add bash as a dependency for accumulo-proxy and accumulo shell scripts | ||
RUN apk --no-cache add bash | ||
|
||
# Add the proxy binary | ||
COPY target/accumulo-proxy-${ACCUMULO_PROXY_VERSION}-bin.tar.gz /tmp/ | ||
RUN tar xzf /tmp/accumulo-proxy-${ACCUMULO_PROXY_VERSION}-bin.tar.gz -C /opt/accumulo-proxy --strip 1 | ||
ENV ACCUMULO_PROXY_HOME /opt/accumulo-proxy | ||
ADD target/accumulo-proxy-${ACCUMULO_PROXY_VERSION}-bin.tar.gz /opt/ | ||
RUN ln -s "/opt/accumulo-proxy-${ACCUMULO_PROXY_VERSION}/" "${ACCUMULO_PROXY_HOME}" | ||
|
||
# Ensure Accumulo is on the path. | ||
ENV PATH "${PATH}:${ACCUMULO_HOME}/bin" | ||
|
||
WORKDIR ${ACCUMULO_PROXY_HOME} | ||
|
||
CMD ["/opt/accumulo-proxy/bin/accumulo-proxy", "-p", "/opt/accumulo-proxy/conf/proxy.properties"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure the impact of this, but I do know a lot of Accumulo scripts use bash specifically.