Julia build in container #45

Merged 72 commits on Aug 23, 2023
Commits
219fafc
test: try build rather than instantiate
tdivoll Jul 21, 2023
66b816c
Merge branch 'main' into feat-docker-build
tdivoll Jul 21, 2023
b6b11c2
test: activate IFTPipeline module
tdivoll Jul 24, 2023
f35aa64
test: activate IFTPipeline module
tdivoll Jul 24, 2023
c316020
test: add quotes to module call
tdivoll Jul 24, 2023
44acad7
test: add quotes to module call
tdivoll Jul 24, 2023
974473c
test: julia import IFTP
tdivoll Jul 24, 2023
a59913b
test: using IFTP
tdivoll Jul 24, 2023
dd6a02d
test: rearrange order of calls
tdivoll Jul 24, 2023
93044c1
test: add deliberate build for PyCall
tdivoll Jul 25, 2023
9d85834
test: copy script to opt
Jul 25, 2023
f036ccd
fix: path to copied script
Jul 25, 2023
026560a
fix: path to copied script
Jul 25, 2023
79d675a
test: try cp to usr local bin
Jul 25, 2023
717ef03
test: try cp to usr local bin
Jul 25, 2023
d92e28d
test: try cp to usr local bin
Jul 25, 2023
b225460
test: try cp to usr local bin
Jul 25, 2023
8e30f68
test: try full path for cp to usr local bin
Jul 25, 2023
f0b65e6
test: switch order of cp
Jul 25, 2023
9ed5fee
test: remove cp
Jul 25, 2023
4819e40
fix: chmod path
Jul 25, 2023
3139a54
fix: chmod path
Jul 25, 2023
d96ba54
feat: add arm64 platform to build
Jul 26, 2023
719146c
test: arm64 only build
Jul 26, 2023
8c8510d
fix: revert to amd64 only
Jul 26, 2023
887f494
test: update to julia 1.9.2-bookworm
Jul 26, 2023
c50b1d6
fix: add extra update to include git
Jul 26, 2023
32b3fa0
test: try setting depot path
Jul 26, 2023
70571a0
test: try adding user as root to dockerfile
Jul 26, 2023
388cb8c
test: try chmod for julia-conda in container
Jul 26, 2023
a39785b
test: try chmod for julia inside container for conda to add packages
Jul 26, 2023
0399586
test: try changing order of chmod commands
Jul 26, 2023
7b6edc1
test: reorder chmod command
Jul 27, 2023
6b000e6
fix syntax in landmask task call
Jul 31, 2023
c016b3a
Merge branch 'main' into feat-docker-build
Jul 31, 2023
0bbd8af
test: activate in script
Jul 31, 2023
ef45ff3
test: soft link for julia inside container
Jul 31, 2023
f003ecc
test: try root user in container
Jul 31, 2023
76a9b67
test: add full path to julia for build
Jul 31, 2023
74e1a16
test: copy workflow dir to /tmp
Jul 31, 2023
d590425
test: turn off logging
Jul 31, 2023
267e6ff
test: copy cli script to tmp
Aug 1, 2023
c1b43d3
test: copy cli script to local bin
Aug 1, 2023
a0ba3c8
fix: update all tasks with path to cli script
Aug 1, 2023
6a61f12
feat: add tasks to pull docker images
Aug 1, 2023
3d96134
feat: update docker container for local
tdivoll Aug 7, 2023
491f3ac
Merge branch 'main' into feat-docker-build
Aug 16, 2023
d040a67
add pyproj and rasterio to container
Aug 16, 2023
cf30c5c
set path to python
Aug 16, 2023
72c9896
merge changes from pr-53
Aug 16, 2023
ce08096
test: remove deliberate PyCall build from Dockerfile
Aug 16, 2023
36f7972
update python path in Dockerfile
Aug 16, 2023
2698fe8
revert python path to default
Aug 17, 2023
4db6b29
Merge branch 'main' into feat-docker-build
Aug 17, 2023
a262133
test: make h5 dir outside rather than in container
Aug 18, 2023
7d0b467
test: rm py deps from container after IFT refactor
Aug 18, 2023
b60aae4
Merge branch '62-deprecate-getiftversion-in-favor-of-iftversion' into…
Aug 21, 2023
86cba59
reset h5 script
Aug 21, 2023
1aca8b9
Merge branch 'main' into feat-docker-build
Aug 21, 2023
82e3802
feat: final cylc pipeline test in hpc
Aug 21, 2023
9b48017
fix: indentation of tasks in flow file
Aug 21, 2023
c5b722e
fix: filename paths in makefilenames
Aug 21, 2023
9e45a97
fix: last path to fix for makefilenames
Aug 21, 2023
51ed16b
fix: revert julia commands to call from local julia install
tdivoll Aug 22, 2023
a8aa62b
fix: add project activation to last two cylc tasks
tdivoll Aug 22, 2023
4e156a3
feat: add compat to project toml
tdivoll Aug 22, 2023
7663c42
docs: clean up line ending and missing readme
tdivoll Aug 22, 2023
a6fffad
docs: add line ending
tdivoll Aug 22, 2023
9a5e6e9
test: try logger now that binding paths are working
Aug 22, 2023
0b5944e
feat: add binding for logging report directory
Aug 23, 2023
b4c124b
fix: indentation error
Aug 23, 2023
2596540
Update config/cylc_hpc/flow.cylc
tdivoll Aug 23, 2023
12 changes: 7 additions & 5 deletions Dockerfile
@@ -5,17 +5,19 @@ ENV JULIA_PROJECT=/opt/ice-floe-tracker-pipeline
ENV JULIA_DEPOT_PATH=/opt/julia
ENV JULIA_PKGDIR=/opt/julia

RUN apt-get clean && apt-get update && \
apt-get install -y wget python3-pip git python3.10 && \
rm -rf /var/lib/apt/lists/*
RUN apt-get -y update && \
apt-get install -y git python3.10 && \
rm -rf /var/lib/apt/lists/*

WORKDIR /opt

RUN git clone https://github.com/WilhelmusLab/ice-floe-tracker-pipeline.git

RUN julia --project="/opt/ice-floe-tracker-pipeline" -e 'ENV["PYTHON"]=""; using Pkg; Pkg.instantiate(); Pkg.precompile(); Pkg.build("PyCall")'
RUN /usr/local/julia/bin/julia --project="/opt/ice-floe-tracker-pipeline" -e 'ENV["PYTHON"]=""; using Pkg; Pkg.build()'

RUN chmod a+x /opt/ice-floe-tracker-pipeline/workflow/scripts/ice-floe-tracker.jl
COPY workflow/scripts/ice-floe-tracker.jl /usr/local/bin/ice-floe-tracker.jl

RUN chmod a+x /usr/local/bin/ice-floe-tracker.jl
Member

Is this really required? I see that the command used to run the ice-floe-tracker.jl script includes 'julia' which is probably required to pass the -t auto flag to enable multithreading.

Collaborator Author

I think it has to do with permissions in a Singularity container. The container permissions work when we copy the script to /usr/local/bin. I was running into issues when running it directly from the repo.


ENV JULIA_DEPOT_PATH="$HOME/.julia:$JULIA_DEPOT_PATH"

11 changes: 11 additions & 0 deletions Project.toml
@@ -13,3 +13,14 @@ LoggingExtras = "e6f89c97-d47a-5376-807f-9c37f3926c36"
Pkg = "44cfe95a-1eb2-52ea-b672-e2afdf69b78f"
PyCall = "438e738f-606a-5dbb-bf0a-cddfbfd45ab0"
TOML = "fa267f1f-6049-4f14-aa54-33bafae1ed76"

[compat]
ArgParse = "1.1.4"
Folds = "0.2.8"
HDF5 = "0.16.15"
IceFloeTracker = "0.2.1"
LoggingExtras = "1.0.1"
PyCall = "1.96.1"
Pkg = "1.9.0"
TOML = "1.0.3"

Comment on lines +17 to +26
Member

👍

28 changes: 11 additions & 17 deletions README.md
@@ -22,25 +22,22 @@ Cylc is used to encode the entire pipeline from start to finish and relies on th
* this will start a compute session for 1 day with 32 GB memory and 20 cores
* see [here](https://docs.ccv.brown.edu/oscar/submitting-jobs/interact) for more options

3. Load the Julia module
- [ ] `module load julia/1.9.0`

4. Build a virtual environment and install Cylc
3. Build a virtual environment and install Cylc
- [ ] `cd <your-project-path>/ice-floe-tracker-pipeline`
- [ ] `conda env create -f ./config/ift-env.yaml`
- [ ] `conda activate ift-env`

5. Register an account with [space-track.org](https://www.space-track.org/) for SOIT
4. Register an account with [space-track.org](https://www.space-track.org/) for SOIT

6. Export SOIT username/password to environment variable
5. Export SOIT username/password to environment variable
- [ ] From your home directory `nano .bash_profile`
- [ ] add `export HISTCONTROL=ignoreboth` to the bottom of your .bash_profile
* this will ensure that your username/password are not stored in history
* when exporting the following environment variables, there __must__ be a space in front of each command
- [ ] ` export SPACEUSER=<firstname>_<lastname>@brown.edu`
- [ ] ` export SPACEPSWD=<password>`
Comment on lines +32 to 38
Member

From Josh's talk yesterday, I was thinking maybe secrets could be used for this. Not sure if it's possible.

Collaborator Author

I'll make an issue for looking into this. That might be a good way to do it. Talking with @broarr, it seemed like this use case was best to not store anything in a file.
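
A minimal sketch of a pre-flight check for the credentials discussed above, assuming the variable names `SPACEUSER` and `SPACEPSWD` from the README. The helper name `check_soit_credentials` is hypothetical and is not part of the pipeline. It only reports which variables are unset and never prints their values.

```shell
# Hypothetical helper (not part of the pipeline): verify that the SOIT
# credentials are set before starting the workflow.
check_soit_credentials() {
    missing=0
    for name in SPACEUSER SPACEPSWD; do
        # Indirect expansion via eval keeps the function POSIX sh compatible.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            # Report only the variable name, never its value.
            echo "missing environment variable: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_soit_credentials || exit 1` at the top of a launch script would stop the run before the `soit` task is started with empty credentials.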


7. Prepare the runtime environment
6. Prepare the runtime environment

Cylc will use software dependencies inside a Singularity container to fetch images and satellite times from external APIs.
- [ ] It is a good idea to reset the Singularity cache dir as specified [here](https://docs.ccv.brown.edu/oscar/singularity-containers/building-images)
@@ -56,10 +53,7 @@ Cylc is used to encode the entire pipeline from start to finish and relies on th
- maxfloearea
- project_dir
**Note:** bounding box format = top_left_x top_left_y bottom_right_x bottom_right_y (x = lat(wgs84) or easting(epsg3413), y = lon(wgs84) or northing(epsg3413))

- [ ] run `singularity build fetchdata.simg docker://brownccv/icefloetracker-fetchdata:main`
* This will pull the image containing all the dependencies and make them accessible to Cylc
- [ ] then, build the workflow, run it, and open the terminal-based user interface (TUI) to monitor the progress of each task.
- [ ] then, build the workflow, run it, and open the Terminal-based User Interface (TUI) to monitor the progress of each task.
![TUI example](./tui-example.png)

```
@@ -83,9 +77,9 @@ Cylc is used to encode the entire pipeline from start to finish and relies on th

### Running the Cylc pipeline locally

When running locally, make sure you have at least Julia 1.9.0 installed with the correct architecture for your local machine. (https://julialang.org/downloads/)
To use the Cylc pipeline locally, also make sure Docker Desktop client is running in the background. (https://www.docker.com/products/docker-desktop/)

#### Prerequisites
__Julia:__ When running locally, make sure you have at least Julia 1.9.0 installed with the correct architecture for your local machine. (https://julialang.org/downloads/)
__Docker Desktop:__ Also make sure Docker Desktop client is running in the background to use the Cylc pipeline locally. (https://www.docker.com/products/docker-desktop/)
cpaniaguam marked this conversation as resolved.

1. Build a virtual environment and install Cylc
- [ ] `cd <your-project-path>/ice-floe-tracker-pipeline`
@@ -123,7 +117,7 @@ To use the Cylc pipeline locally, also make sure Docker Desktop client is runnin
- [ ] `cylc play <workflow-name>`
- [ ] `cylc tui <workflow-name>`

The Terminal-based user interface provides a simple way to watch the status of each task called in the `flow.cylc` workflow. Use arrow keys to investigate each task (see more [here](https://cylc.github.io/cylc-doc/latest/html/7-to-8/major-changes/ui.html#cylc-tui).
The Terminal-based User Interface provides a simple way to watch the status of each task called in the `flow.cylc` workflow. Use arrow keys to investigate each task (see more [here](https://cylc.github.io/cylc-doc/latest/html/7-to-8/major-changes/ui.html#cylc-tui).
![TUI](tui-example.png)).

If you need to change parameters and re-run a workflow, first do:
@@ -151,10 +145,10 @@ Open a Julia REPL and build the package
Enter Pkg mode and precompile
- [ ] `]`
- [ ] `activate .`
- [ ] `precompile`
- [ ] `build`

Use the backspace to go back to the Julia REPL and start running Julia code!
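
The Pkg-mode steps above can also be run non-interactively. This is a minimal sketch, assuming `julia` is on the `PATH` and the argument is the project root. The function name `build_project` and the `JULIA_BIN` override are hypothetical conveniences, while the `Pkg.build()` call mirrors the one in this PR's Dockerfile.

```shell
# Hypothetical wrapper: activate a project and build its dependencies
# without opening the REPL. JULIA_BIN can point at a specific Julia install.
build_project() {
    project_dir="${1:-.}"
    "${JULIA_BIN:-julia}" --project="$project_dir" -e 'using Pkg; Pkg.build()'
}
```

From the repository root, `build_project .` would be the equivalent of `activate .` followed by `build` in Pkg mode.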

__Note__ Use the help for wrapper scripts to learn about available options in each wrapper function
For example, from a bash prompt:
`julia --project=. ice-floe-tracker-pipeline/workflow/scripts/ice-floe-tracker.jl extractfeatures --help`
`julia --project=. ./workflow/scripts/ice-floe-tracker.jl extractfeatures --help`
27 changes: 20 additions & 7 deletions config/cylc_hpc/flow.cylc
@@ -5,7 +5,7 @@
initial cycle point = 1
[[graph]]
R1 = """
mkpaths => fetchdata => soit & landmask => preprocess => extractfeatures => tracking & exportH5
mkpaths & pullfetchimage & pulljuliaimage => fetchdata & soit => landmask => preprocess => extractfeatures => tracking & exportH5
"""
[runtime]
[[root]]
@@ -22,6 +22,8 @@
project_dir = "~/ice-floe-tracker-pipeline"

# Recommend using these default paths for output
julia_exec = "/usr/local/julia/bin/julia"
report_dir = $project_dir/"workflow/report"
results_dir = $project_dir/"results"
fetchdata_dir = $project_dir/"resources"
truecolor_dir = $fetchdata_dir/"truecolor"
@@ -30,13 +32,19 @@
preprocess_dir = $results_dir/"preprocess"
soit_dir = $results_dir/"soit"
tracker_dir = $results_dir/"tracker"
h5_dir = $preprocess_dir/"hdf5-files"

[[mkpaths]]
script = """
mkdir -p $soit_dir
mkdir -p $landmask_dir
mkdir -p $preprocess_dir
mkdir -p $tracker_dir
mkdir -p $h5_dir
"""
[[pullfetchimage]]
script = """
apptainer build --force $project_dir/fetchdata.simg docker://brownccv/icefloetracker-fetchdata:main
"""
[[fetchdata]]
script = """
@@ -46,23 +54,28 @@
script = """
singularity exec --bind $soit_dir:/tmp $project_dir/fetchdata.simg python3 /usr/local/bin/pass_time_cylc.py --startdate $startdate --enddate $enddate --csvoutpath /tmp --centroid_x $centroid_x --centroid_y $centroid_y --SPACEUSER $SPACEUSER --SPACEPSWD $SPACEPSWD
"""
[[pulljuliaimage]]
script = """
apptainer build --force $project_dir/icefloetracker-julia.simg docker://brownccv/icefloetracker-julia:pr-45
"""
Collaborator Author

After merging, I'll have to start a new PR and change the image tag to main rather than pr-45.
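
One way to avoid hard-coding the tag mentioned above is to build the image URI from a variable. This is a sketch only: `IMAGE_TAG` and `julia_image_uri` are hypothetical names, and the repository name is taken from the `pulljuliaimage` task.

```shell
# Hypothetical helper: compose the container image URI from a tag so that
# switching from a PR tag to main is a one-variable change.
julia_image_uri() {
    # Precedence: explicit argument, then IMAGE_TAG, then "main".
    tag="${1:-${IMAGE_TAG:-main}}"
    printf 'docker://brownccv/icefloetracker-julia:%s\n' "$tag"
}
```

The task script could then call `apptainer build --force $project_dir/icefloetracker-julia.simg "$(julia_image_uri)"`, with `IMAGE_TAG=pr-45` set only while the PR is under test.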

[[landmask]]
script = """
singularity exec $project_dir/icefloetracker-julia.simg julia --project="/opt/ice-floe-tracker-pipeline" -t auto $project_dir/workflow/scripts/ice-floe-tracker.jl landmask $fetchdata_dir $landmask_dir
singularity exec --bind $landmask_dir:/tmp,$report_dir:/usr/local/bin/../report $project_dir/icefloetracker-julia.simg $julia_exec -t auto /usr/local/bin/ice-floe-tracker.jl landmask $fetchdata_dir /tmp
"""
[[preprocess]]
script = """
julia -t auto $project_dir/workflow/scripts/ice-floe-tracker.jl preprocess -t $fetchdata_dir/truecolor -r $fetchdata_dir/reflectance -l $landmask_dir -p $soit_dir -o $preprocess_dir
singularity exec --bind $preprocess_dir:/tmp,$report_dir:/usr/local/bin/../report $project_dir/icefloetracker-julia.simg $julia_exec -t auto /usr/local/bin/ice-floe-tracker.jl preprocess -t $fetchdata_dir/truecolor -r $fetchdata_dir/reflectance -l $landmask_dir -p $soit_dir -o /tmp
"""
[[extractfeatures]]
script = """
julia -t auto $project_dir/workflow/scripts/ice-floe-tracker.jl extractfeatures -i $preprocess_dir -o $preprocess_dir --minarea $minfloearea --maxarea $maxfloearea
singularity exec --bind $preprocess_dir:/tmp,$report_dir:/usr/local/bin/../report $project_dir/icefloetracker-julia.simg $julia_exec -t auto /usr/local/bin/ice-floe-tracker.jl extractfeatures -i $preprocess_dir -o /tmp --minarea $minfloearea --maxarea $maxfloearea
"""
[[tracking]]
script = """
julia -t auto $project_dir/workflow/scripts/ice-floe-tracker.jl track --imgs $preprocess_dir --props $preprocess_dir --deltat $preprocess_dir --output $tracker_dir
singularity exec --bind $tracker_dir:/tmp,$report_dir:/usr/local/bin/../report $project_dir/icefloetracker-julia.simg $julia_exec -t auto /usr/local/bin/ice-floe-tracker.jl track --imgs $preprocess_dir --props $preprocess_dir --deltat $preprocess_dir --output /tmp
"""
[[exportH5]]
script = """
Comment on lines +63 to 78
Member

Great work figuring out the bindings!

julia -t auto $project_dir/workflow/scripts/ice-floe-tracker.jl makeh5files --pathtosampleimg $fetchdata_dir/truecolor/$(ls $fetchdata_dir/truecolor | head -1) --resdir $preprocess_dir
"""
singularity exec --bind $preprocess_dir:/tmp,$report_dir:/usr/local/bin/../report $project_dir/icefloetracker-julia.simg $julia_exec -t auto /usr/local/bin/ice-floe-tracker.jl makeh5files --pathtosampleimg $fetchdata_dir/truecolor/$(ls $fetchdata_dir/truecolor | head -1) --resdir /tmp
"""
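
The `--bind` arguments used in the tasks above pair a host directory with a path inside the container, separated by a colon, with multiple pairs joined by commas. Below is a minimal sketch of assembling that argument. `bind_arg` is a hypothetical helper, not something the workflow defines, and the `/data/...` paths in the usage note are placeholders.

```shell
# Hypothetical helper: join host:container pairs into one comma-separated
# value suitable for `singularity exec --bind`.
bind_arg() {
    result=""
    for pair in "$@"; do
        if [ -z "$result" ]; then
            result="$pair"
        else
            result="$result,$pair"
        fi
    done
    printf '%s\n' "$result"
}
```

For example, `bind_arg /data/preprocess:/tmp /data/report:/usr/local/bin/../report` produces the same shape of value that the `preprocess` task passes to `--bind`.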

10 changes: 5 additions & 5 deletions config/cylc_local/flow.cylc
@@ -5,14 +5,14 @@
initial cycle point = 1
[[graph]]
R1 = """
mkpaths => fetchdata => soit & landmask => preprocess => extractfeatures => tracking & exportH5
mkpaths => fetchdata & soit => landmask => preprocess => extractfeatures => tracking & exportH5
"""
[runtime]
[[root]]
[[[environment]]]
# Update these variables with your run parameters
startdate = "2022-05-04"
enddate = "2022-05-08"
enddate = "2022-05-06"
crs = "wgs84" #epsg3413 for polar stereographic
bounding_box = "78.186394 38.250605 70.749318 15.32373"
centroid_x = "75"
@@ -30,25 +30,25 @@
preprocess_dir = $results_dir/"preprocess"
soit_dir = $results_dir/"soit"
tracker_dir = $results_dir/"tracker"
h5_dir = $preprocess_dir/"hdf5-files"

[[mkpaths]]
script = """
mkdir -p $soit_dir
mkdir -p $landmask_dir
mkdir -p $preprocess_dir
mkdir -p $tracker_dir
mkdir -p $h5_dir
"""

[[fetchdata]]
script = """
docker run --mount type=bind,source=$fetchdata_dir,target=/tmp brownccv/icefloetracker-fetchdata:main fetchdata.sh -o /tmp -s $startdate -e $enddate -c $crs $bounding_box
docker run --mount type=bind,source=$fetchdata_dir,target=/tmp brownccv/icefloetracker-fetchdata:main /usr/local/bin/fetchdata.sh -o /tmp -s $startdate -e $enddate -c $crs $bounding_box
"""

[[soit]]
script = """
docker run --env SPACEUSER --env SPACEPSWD --mount type=bind,source=$soit_dir,target=/tmp brownccv/icefloetracker-fetchdata:main python3 /usr/local/bin/pass_time_cylc.py --startdate $startdate --enddate $enddate --csvoutpath /tmp --centroid_x $centroid_x --centroid_y $centroid_y --SPACEUSER $SPACEUSER --SPACEPSWD $SPACEPSWD
"""

[[landmask]]
script = """
julia --project=$project_dir -t auto $project_dir/workflow/scripts/ice-floe-tracker.jl landmask $fetchdata_dir $landmask_dir
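
The directory layout that `mkpaths` creates can be reproduced outside Cylc for a quick check. This is a sketch under assumptions: `make_paths` is a hypothetical function, and it covers only the directories whose definitions are visible in this diff (the `landmask_dir` definition falls outside the shown hunk, so it is omitted).

```shell
# Sketch of the output layout created by the mkpaths task. Only the
# directories whose definitions are visible in this diff are included.
make_paths() {
    project_dir="$1"
    results_dir="$project_dir/results"
    preprocess_dir="$results_dir/preprocess"
    soit_dir="$results_dir/soit"
    tracker_dir="$results_dir/tracker"
    h5_dir="$preprocess_dir/hdf5-files"
    # mkdir -p creates missing parents and succeeds if a directory already
    # exists, so re-running the task is harmless.
    mkdir -p "$soit_dir" "$preprocess_dir" "$tracker_dir" "$h5_dir"
}
```

Running `make_paths "$(mktemp -d)"` gives a throwaway tree for inspecting the layout before pointing the workflow at a real project directory.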
2 changes: 1 addition & 1 deletion config/ift-env.yaml
@@ -4,4 +4,4 @@ channels:
- conda-forge

dependencies:
- cylc-flow=8.2
- cylc-flow=8.3
2 changes: 1 addition & 1 deletion resources/README.md
@@ -1,3 +1,3 @@
# Resources Directory

This directory contains any retrieved resources used in the pipeline. This includes the initial satellite images and land mask.
This directory contains any retrieved resources used in the pipeline. This includes the initial satellite images and land mask.
1 change: 1 addition & 0 deletions src/h5.jl
@@ -67,6 +67,7 @@ This function expects the following files to be present in `resdir`: `filenames.

* `pathtosampleimg`: Path to a sample image in the truecolor resource folder. This is used to extract the coordinate reference system (CRS) and the latitude and longitude coordinates of the image pixels.
* `resdir`: Path to the directory containing the results of the IceFloeTracker pipeline.
* `iftversion`: This is automatically pulled into the function from an environment variable.

# File structure
Each HDF5 file has the following structure: