This repository lets users access the map-manipulation library Pixell and run the Jupyter notebook tutorials associated with ACT Data Releases 4 and 5 (DR4 and DR5).

ACT's DR4 and DR5 include intensity and polarization maps covering close to half the sky, as well as a variety of other data products. These data products are described in some detail in the Python notebook tutorials presented here. The tutorials also introduce users to the Plate Carrée maps used for the ACT data products and to Pixell, the Python library used to handle the maps.

The full list of ACT DR4 and DR5 data products can be found on LAMBDA here. For questions or comments pertaining to these notebooks, please reach out to our help desk at [email protected]. Bugs can also be filed as an "Issue" on this GitHub repo.

Installing and Running the Notebooks

There are three options for building and running this repo: a conda environment, a completely local installation, or a fully containerized installation via Docker, which installs all dependencies. We recommend the conda environment setup if possible. In addition to the notebooks and dependencies, you will need to download the data products following the instructions below. We provide instructions for each case and assume that the default Python version in your environment is 3.6 or later.
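
As a quick way to confirm the Python requirement stated above, the following minimal sketch compares the running interpreter against version 3.6. The function name `meets_minimum` is purely illustrative and is not part of this repository.

```python
import sys

MINIMUM_VERSION = (3, 6)  # minimum Python version stated in this README


def meets_minimum(version=None, minimum=MINIMUM_VERSION):
    """Return True if `version` (default: the running interpreter) is at least `minimum`."""
    if version is None:
        version = sys.version_info
    # Compare only (major, minor); tuples compare element by element.
    return tuple(version[:2]) >= tuple(minimum)


if __name__ == "__main__":
    current = "{}.{}".format(*sys.version_info[:2])
    if meets_minimum():
        print("Python " + current + " meets the minimum requirement.")
    else:
        print("Python " + current + " is older than 3.6; please upgrade.")
```

Run it with the same `python` executable you plan to use for the notebooks, since different environments can have different default interpreters.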


For the Conda Environment or the Local Installation

Download the necessary data products

The links to all of the products used in these notebooks have been compiled in pull_data bash scripts, which make it simple to download the data products using wget or curl. Feel free to pull any other data products you'd like by adding them to the relevant pull_data file, following the pre-existing format.

In addition to providing pull_data scripts for the entire notebook set, we've also provided scripts that pull products for a subset of notebooks as a space-saving measure. If you only want to run a few of the notebooks, and would like to only pull data corresponding to those notebooks, simply replace the filename in the next two commands with the filename corresponding to the desired notebook(s). The full dataset is around 7.3GB in size: it should take you ~30 - 45 min to download everything (LAMBDA can be slow, but the other servers should permit a faster download speed).

First set up a directory to hold the data products you will download. Then, copy the pull_data script you wish to run into that directory. Finally, execute either of the following commands in that directory (e.g. to download all the data products):

sh pull_data_wget.sh

Or, if you are working on a Mac and don't have wget set up, you can either install wget with Homebrew or use the curl version of the script instead:

sh pull_data_curl.sh

The above will pull all of the data products except the full-resolution, full-size coadded maps, which are omitted because of their size. For the coadded map we instead provide two smaller versions: a full-resolution cutout with a smaller footprint, and a low-resolution version with the full footprint. You can also get the full map from the LAMBDA website if you wish to use it; the file you need is:

"act_planck_dr5.01_s08s18_AA_f150_night_map.fits"

Make sure you change the variable path in Section 1 to reflect the directory holding your local data!
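
Before opening the notebooks, it can help to confirm that the directory you intend to use as `path` actually contains the files you expect. The sketch below is one possible check. The helper name `missing_products`, the example directory `~/act_data`, and the list of expected files are all assumptions made for illustration, so adjust them to match what you downloaded.

```python
from pathlib import Path


def missing_products(data_dir, expected_names):
    """Return the names in `expected_names` that are not files inside `data_dir`."""
    data_dir = Path(data_dir).expanduser()
    return [name for name in expected_names if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Assumption: replace with the directory you set as `path` in the notebooks.
    path = "~/act_data"
    # Hypothetical list: only the optional full-resolution coadd is named in this
    # README; add the file names pulled by the script you ran.
    expected = ["act_planck_dr5.01_s08s18_AA_f150_night_map.fits"]
    missing = missing_products(path, expected)
    if missing:
        print("Not found in " + path + ": " + ", ".join(missing))
    else:
        print("All expected files are present.")
```

Since the full-resolution coadd is optional, a report that it is missing is only a problem if you intended to download it.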

If using the conda environment:

This is the most straightforward option and takes the least work, as it uses a custom environment we've created for these notebooks. If you prefer to set up your own environment, please see the local installation instructions below.

In this repository we have included the file act_notebooks.yml, which can be used to set up a conda environment. To do so, navigate in your terminal to the directory that contains the file and run:

conda env create -f act_notebooks.yml

Once the environment has been created you can enter it using :

conda activate act_notebooks

From there launch jupyter notebook and you should be able to run the notebooks.

Note: If you are setting this environment up on NERSC or another computing system it may be necessary to go in and change the syntax of the yml file slightly. If you get the following error: ERROR: Invalid requirement: 'pixell=0.12.0', substitute pixell=0.12.0 with pixell==0.12.0.
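
If you would rather script the substitution described in the note above than edit the file by hand, a minimal sketch could look like the following. The function name `fix_pip_pin` is hypothetical. The sketch rewrites every single-`=` pin of the named package, so it assumes that pixell appears in the yml file only as a pip requirement.

```python
import re
from pathlib import Path


def fix_pip_pin(yml_text, package="pixell"):
    """Rewrite 'package=X' as 'package==X', leaving existing '==' pins untouched."""
    # The negative lookahead (?!=) skips pins that already use '=='.
    pattern = re.compile(r"\b" + re.escape(package) + r"=(?!=)")
    return pattern.sub(package + "==", yml_text)


if __name__ == "__main__":
    yml_path = Path("act_notebooks.yml")  # file name taken from this README
    if yml_path.exists():
        yml_path.write_text(fix_pip_pin(yml_path.read_text()))
        print("Updated " + str(yml_path))
    else:
        print("act_notebooks.yml not found in the current directory.")
```

Running it twice is harmless, because pins that already use `==` are left alone.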

If you are using a local installation:

We highly recommend working entirely within a conda environment to manage packages. If that is not possible for you, you should still be able to install all dependencies.

Most of the packages required to run the notebooks have well-documented installation procedures available on their websites (Healpy, getdist, astropy, CAMB, matplotlib, numpy, pandas, plotly, scipy). For the more specialized packages (Pixell, pyactlike, nawrapper), you can navigate to the ReadMe of each repository and follow its installation documentation, or use the steps we have reproduced here:

  1. Pixell

    • Install all Pixell dependencies listed here

    • Run the following two lines:

      pip install pixell --user

      test-pixell

    You may need to restart your environment for test-pixell to work.

  2. pyactlike

    • Navigate to a location in your environment where you are comfortable installing a GitHub repo

    • Download the repo files located here:

      https://lambda.gsfc.nasa.gov/data/suborbital/ACT/ACT_dr4/likelihoods/actpollite_python_dr4.01.tar.gz

    • Extract the files from the tar.gz zipped file. There should be a new folder made in your current directory called actpollite_python_dr4.01.

    • Navigate into actpollite_python_dr4.01 and run the following:

      pip install . --user

    • You can test the install with pytest in the base directory of the cloned pyactlike repo (the actpollite_python_dr4.01 folder)

  3. nawrapper

    • The recommended installation procedure involves working in a conda environment. If this is something you are comfortable with, or prefer, follow the clear and detailed steps outlined here

    • Otherwise, install namaster. The easiest way to do this is to install pymaster:

      pip install pymaster --user

    • Install nawrapper dependencies not already mentioned above: cython and pillow

    • Install nawrapper:

      git clone git@github.com:xzackli/nawrapper.git

      cd nawrapper

      pip install -e . --user
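
Once the packages above are installed, a short script can confirm that Python is able to locate them. This is only a sketch: the names in `REQUIRED` are assumed to be the import names of the packages listed in this section, and `find_missing` is an illustrative helper rather than anything shipped with this repository.

```python
import importlib.util

# Assumption: these are the import names of the packages listed in this section.
REQUIRED = [
    "healpy", "getdist", "astropy", "camb", "matplotlib", "numpy",
    "pandas", "plotly", "scipy", "pixell", "pyactlike", "pymaster", "nawrapper",
]


def find_missing(module_names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = find_missing(REQUIRED)
    if missing:
        print("Could not locate: " + ", ".join(missing))
    else:
        print("All listed packages were found.")
```

The check only locates each package without importing it, so it will not catch a package that is present but fails at import time. The packages' own tests (test-pixell, pytest) remain the more thorough check.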


Docker Installation

We now walk through the Docker installation procedure. The initial setup should be reasonably fast, with the exception of the step that downloads the data. After setting up the container once, it's easy to relaunch it with a single command at any time.

  1. Install and run docker:

    • Create a Docker account and then sign in
    • Docker is set up to limit the memory available to your container. Some notebooks are CPU and memory intensive, so you should adjust this! Go into Preferences -> Resources and set Memory to 10GB and CPUs to 4. You can increase them at any point if you need to.
  2. Pull the Docker image:

    • open your terminal or command line and run:

        docker run -d -it -p 8888:8888 --name dr4_tutorials  --rm actcollaboration/dr4_tutorials
      

    This command runs the container in the background with the -d flag, allocates an interactive terminal with the -it flags, connects the container's port to the local port with the -p flag, names the container with the --name flag, and tells your system to remove the container once the session has ended with the --rm flag. Finally, it points to the image you want to pull, which is called actcollaboration/dr4_tutorials.

  3. Move the container content to a local directory:

    • We now want to move the data in the container to somewhere that's easy to find on your local machine. We suggest creating a folder on your computer somewhere where you want to store the data for this tutorial. The path to that folder should replace [local_path] in the lines below.

        docker cp dr4_tutorials:/usr/home/workspace/. [local_path] && cd [local_path]/Data
      

    This command uses docker cp to copy the contents of the container's workspace to your local machine, and then changes into the Data folder of the copied repository.

  4. Download the data:

    • In order to run the notebooks you'll need to download the relevant data products. In the data folder of this repo you'll notice a few different scripts that have been set up to pull the correct products. If you're on a Mac you will want to use the files that have 'curl' in the name, unless you already have wget set up. You can choose to pull all of the data products or just a subset, depending on which file you choose (run ls on a Mac or dir on Windows to check which files are available). From there you just need to run that file using:

        sh [pull_data_curl].sh
      

    Just replace the [pull_data_curl] part with the name of the file you wish to run. This procedure is the same as the local installation/download instructions.

  5. Relaunch the container with the new data:

    • Now that we have the data we just need to relaunch our container and we're ready to go. To do so we first stop the container (the part of this command before the && can be used to stop the container whenever you wish to do so in the future) and then relaunch it :

        docker container stop dr4_tutorials && docker run -it -p 8888:8888 -v [local_path]:/usr/home/workspace --name dr4_tutorials --rm actcollaboration/dr4_tutorials
      
    • Again you need to replace [local_path] with the path to the folder you created earlier. Here the -v flag mounts your local folder onto the container so that you can easily access the data and save any changes you make to the notebooks.

    • If you're on a Windows machine you may need to replace the backslashes in the path name with forward slashes (/). If the command fails on the path name the first time, just run the second half of it with the corrected path name:

        docker run -it -p 8888:8888 -v [local_path]:/usr/home/workspace --name dr4_tutorials --rm actcollaboration/dr4_tutorials
      
    • For future use of this container you can relaunch it using just the above command and you can stop it using docker container stop dr4_tutorials

  6. Launch Jupyter Notebook:

    • You will now be in the container and should be able to launch the jupyter notebooks by just running

        jupyter notebook --ip 0.0.0.0
      
    • In the terminal you should now see a link that you can copy and paste into a browser. The link will open up jupyter notebook and you'll be able to navigate to the notebooks and run them in the container.

  7. Run Tutorials:

    • To check that your data has linked correctly, open the data directory; you should see a list of the relevant files.

    • Navigate to the Tutorials folder and start with the first notebook, which serves as an introduction and provides an overview of the tutorials.
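
For the Windows path issue mentioned in step 5, the slash conversion can be done with Python's standard library rather than by hand. The sketch below is illustrative only: `to_forward_slashes` and the example path are made up for this example, and the printed `-v` argument simply mirrors the mount used in the commands above.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(windows_path):
    """Return `windows_path` with backslashes replaced by forward slashes."""
    return PureWindowsPath(windows_path).as_posix()


if __name__ == "__main__":
    # Hypothetical example path; substitute the folder you created in step 3.
    local_path = to_forward_slashes(r"C:\Users\me\act_data")
    print("-v " + local_path + ":/usr/home/workspace")
```

Paste the printed argument in place of the `-v [local_path]:/usr/home/workspace` part of the docker run command.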

Troubleshooting the Docker Jupyter notebooks

This step can occasionally cause problems if port 8888 is already in use on your computer (e.g. you have another notebook running somewhere). Here are some troubleshooting steps you can try.

  • Try explicitly navigating to the 8888 port by opening your browser and entering: localhost:8888/

    • When prompted for a token copy and paste the token from the url or find it using the terminal by typing:

       jupyter notebook list
      
    • This will give a list of running jupyter notebooks that should look like this:

      Currently running servers:

      http://localhost:8888/?token=0d66c7b877535a9511ebe70d230f5ed65df1e9a0ac4f1144 :: /Users/.... Folder Path

    • Copy the text after 'token=' and before the ' :: /Users...' into the token request box and that should launch the notebook.

  • You can map the notebook onto a different port

    • Close your container by typing exit then run:

       docker run -it -p 8889:8888 -v [local_path]:/usr/home/workspace --name dr4_tutorials --rm actcollaboration/dr4_tutorials
      
    • Now rerun jupyter notebook as before and copy the link, but change the port number in the URL from 8888 to 8889.
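
To find out whether port 8888 is already taken before launching the container, you could try a small check like the one below. It is a sketch under stated assumptions: `port_is_free` is a hypothetical helper, and it only tests whether something on your own machine accepts connections on that port.

```python
import socket


def port_is_free(port, host="127.0.0.1"):
    """Return True if nothing is accepting connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 when a connection succeeds, i.e. the port is in use.
        return sock.connect_ex((host, port)) != 0


if __name__ == "__main__":
    for candidate in (8888, 8889):
        state = "free" if port_is_free(candidate) else "in use"
        print("Port " + str(candidate) + " is " + state)
```

If 8888 is reported as in use, map the container onto whichever port is free, as in the 8889 example above.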

Note: You can also create new notebooks or add other data sets to the local directory that will automatically become available in the container.

When in the container if you wish to save work or data locally simply save them to the 'data/' folder that you linked with your local data when launching the container.

Dependencies

References: