Simon Felix edited this page Jan 21, 2022 · 5 revisions

Outline

The following paragraphs sketch how the complete system could work in the future. The specific details are not important; the focus is on the outlined functionality and the feel of the interaction.

Imaging Algorithm Development

A pip install OurAwesomePipeline installs everything needed on a fresh machine. I can execute python3 imaging-development.py (or run the file in my IDE of choice), which takes a minute or two. The script benchmarks the included sample "backprojection" algorithm against other known algorithms, such as CLEAN and MS-CLEAN. It produces a nicely formatted report that describes the performance of the sample algorithm, including plots, statistics and various metrics. The imaging algorithms are tested with a default set of synthetic sources and real data, at varying noise levels, with RFI sources, and with different kinds of sources. If I want to test my algorithm with a different set, I can run python3 imaging-development.py --benchmarks=sdc2.
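As a rough illustration of how the --benchmarks option could be handled, here is a minimal sketch using Python's standard argparse module. The function name parse_args, the AVAILABLE_BENCHMARKS table, the "default" set name and the comma-separated syntax are all assumptions made for this example; only the sdc2 value appears in the text above.

```python
import argparse

# Hypothetical table of benchmark sets; only "sdc2" is named in the text.
AVAILABLE_BENCHMARKS = {
    "default": "synthetic sources and real data at several noise levels",
    "sdc2": "Science Data Challenge 2 test set",
}


def parse_args(argv=None):
    """Return the list of benchmark set names requested on the command line."""
    parser = argparse.ArgumentParser(description="Benchmark an imaging algorithm.")
    parser.add_argument(
        "--benchmarks",
        default="default",
        help="comma-separated list of benchmark sets to run",
    )
    args = parser.parse_args(argv)

    # Split the comma-separated value and drop empty entries.
    names = [name.strip() for name in args.benchmarks.split(",") if name.strip()]

    # Reject names that are not registered, with a readable error message.
    unknown = [name for name in names if name not in AVAILABLE_BENCHMARKS]
    if unknown:
        parser.error("unknown benchmark set(s): " + ", ".join(unknown))
    return names
```

With this sketch, parse_args(["--benchmarks=sdc2"]) selects only the SDC2 set, and calling it without the option falls back to the default set.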

To work on my own imaging algorithm, I can just replace the sample "backprojection" implementation and re-run the evaluation script. Because my algorithm is slow, I have to switch to CSCS infrastructure. I run my project there and it works immediately, using multiple cores and GPUs. The framework lets me run parameter sweeps on my algorithm, and the report shows the results for each value of the swept parameter. The framework also contains commonly used functionality to easily parallelize my algorithms, convert between common data formats, and visualize intermediate results.
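A parameter sweep of the kind described above could look roughly like the following sketch. The sweep helper, the toy_score function and the parameter names gain and iterations are hypothetical stand-ins, not part of any existing framework. Threads are used only to keep the example short; a real framework would more likely dispatch to processes, GPUs or a cluster scheduler.

```python
import itertools
from concurrent.futures import ThreadPoolExecutor


def sweep(evaluate, grid, max_workers=4):
    """Call evaluate(**params) for every combination of values in grid.

    grid maps parameter names to lists of candidate values. Returns a list
    of (params, score) pairs in the order the combinations were generated.
    """
    names = list(grid)
    combos = [dict(zip(names, values)) for values in itertools.product(*grid.values())]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        scores = list(pool.map(lambda params: evaluate(**params), combos))
    return list(zip(combos, scores))


def toy_score(gain, iterations):
    """Hypothetical stand-in for 'run my algorithm and score it' (lower is better)."""
    return round(abs(gain - 0.1) + 1.0 / iterations, 4)


results = sweep(toy_score, {"gain": [0.05, 0.1, 0.2], "iterations": [10, 100]})
best_params, best_score = min(results, key=lambda item: item[1])
```

The results list holds one entry per parameter combination, which is the kind of table a generated report could plot.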

As soon as my algorithm works, I create a pull request to add it to the main repository. I can also use the automatically generated plots for the paper I'm publishing.

Integration of the Science Data Challenge 2 (SDC2) as test set

A pip install OurAwesomePipeline installs everything needed on a fresh machine. In the test-sets folder of the project, the readme.md contains step-by-step instructions on how to add new test data. I copy new-test-set.py to sdc2.py and implement the required functionality. For this, I preprocess the dataset and upload the results to a publicly available folder on the CSCS infrastructure. Next, I implement code that downloads the data from the public folder and stores it locally. This is very easy, because most of the required functions are already implemented in the framework. By writing 3 lines of code, I tell the framework how this data can be used as a sky model. Similarly, I implement a function that makes the known sources in the SDC2 data set available. The framework will use these two pieces in imaging and source detection benchmarks. Finally, I hook up the SDC2 challenge in test-sets.py. I add some new unit tests to the project to make sure that the new data source works as intended. I create a pull request, which is accepted after 2-3 days. From this point forward, everybody can test source finding algorithms and imaging algorithms with the SDC2 data set.
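One way the pieces described above (download, sky model, known sources, registration) might fit together is sketched below. Every name here (Source, BenchmarkDataset, REGISTRY, register, TinyExample) is an assumption made for illustration; the text does not specify what new-test-set.py or test-sets.py actually contain.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A known source: position in degrees and flux in arbitrary units."""
    ra: float
    dec: float
    flux: float


class BenchmarkDataset(ABC):
    """Hypothetical interface a new test set would implement."""
    name: str

    @abstractmethod
    def download(self, target_dir):
        """Fetch the preprocessed files and return their local paths."""

    @abstractmethod
    def sky_model(self):
        """Return the data in a form usable as a sky model."""

    @abstractmethod
    def known_sources(self):
        """Return the ground-truth sources for source-finding benchmarks."""


# Hypothetical registry, playing the role of the hook-up in test-sets.py.
REGISTRY = {}


def register(dataset_cls):
    """Class decorator that makes a test set discoverable by name."""
    REGISTRY[dataset_cls.name] = dataset_cls
    return dataset_cls


@register
class TinyExample(BenchmarkDataset):
    name = "tiny-example"

    def download(self, target_dir):
        # A real test set would fetch files from the public folder here.
        return []

    def known_sources(self):
        return [Source(ra=12.5, dec=-9.5, flux=1.0)]

    def sky_model(self):
        # Simplest possible sky model: one (ra, dec, flux) tuple per source.
        return [(s.ra, s.dec, s.flux) for s in self.known_sources()]
```

Under these assumptions, a benchmark runner could look up REGISTRY["tiny-example"], instantiate it, and use sky_model() for imaging tests and known_sources() for source detection tests.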

Working on a new Quicklook Algorithm

I am working on a new idea to compute quick-look images for small image regions from visibilities. A pip install OurAwesomePipeline installs everything needed on a fresh machine. For my experiments, I want to use certain parts of the pipeline:

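# Simulate visibilities from the small SDC2 sky model and calibrate them.
uv = vis.generate(testsets.skymodels.SDC2small)
calibrated_uv = calibrate(uv)

# Reference: CLEAN image of the field, cropped to the region of interest.
image = imaging.clean(calibrated_uv, lat=(-10,-9), long=(12,13), resolution=0.001)
region1clean = imaging.crop(image, (100,100,200,200))

# Candidate: the same region produced by my own quick-look algorithm.
quickLookAlgorithm = myOwnQuickLookAlgorithm(calibrated_uv, ...)
region1ql = imaging.create(quickLookAlgorithm)

imaging.compare(region1clean, region1ql)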
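To make the crop and compare steps more concrete, the sketch below shows one plausible interpretation using plain nested lists. The functions crop and rmse are stand-ins written for this example; the text does not say which metric imaging.compare would use, so the root-mean-square error here is an assumption.

```python
import math


def crop(image, box):
    """Return the sub-image inside box = (x0, y0, x1, y1), end-exclusive."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in image[y0:y1]]


def rmse(a, b):
    """Root-mean-square difference of two equally sized 2-D images."""
    if len(a) != len(b) or any(len(ra) != len(rb) for ra, rb in zip(a, b)):
        raise ValueError("images must have the same shape")
    diffs = [(x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb)]
    return math.sqrt(sum(diffs) / len(diffs))


# Toy 4x4 reference image and a 2x2 quick-look result for its central region.
reference = [[0.0, 0.0, 0.0, 0.0],
             [0.0, 1.0, 2.0, 0.0],
             [0.0, 3.0, 4.0, 0.0],
             [0.0, 0.0, 0.0, 0.0]]
quicklook = [[1.0, 2.0],
             [3.0, 2.0]]

region = crop(reference, (1, 1, 3, 3))  # central 2x2 block of the reference
error = rmse(region, quicklook)         # only one pixel differs, by 2.0
```

A single number like this is easy to track across experiments, although a real comparison would probably also report per-pixel residual images.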

Working on a morphological classification model

A pip install OurAwesomePipeline installs everything needed on my CSCS account. I can easily create a lot of training & ground truth data for my classification project with galaxies, their type, and a small cutout:

galaxies = []
for g in testsets.skymodels.SDC2large.galaxies:
    galaxies.append((g.id, g.galaxyClass, imaging.quicklook(g, 40, 40)))