Simon Felix edited this page Jan 21, 2022 · 5 revisions

Outline

The following paragraphs sketch how the complete system could work in the future. The details are not important; the focus is on the outlined functionality and the feel of the interaction.

Imaging Algorithm Development

Running pip install OurAwesomePipeline installs everything needed on a fresh machine. I can then execute python3 imaging-development.py (or run the file in my IDE of choice), which takes a minute or two. The script benchmarks the included sample "backprojection" algorithm against other known algorithms, such as CLEAN and MS-CLEAN. It produces a nicely formatted report that describes the performance of the sample algorithm, including plots, statistics and various metrics. The imaging algorithms are tested with a default set of synthetic sources and real data, at varying noise levels and with varying sources. To test my algorithm with a different set, I could run python3 imaging-development.py --benchmarks=sdc2.
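As a rough illustration of the --benchmarks option, the following sketch shows one way the script could map the flag to a list of test sets. Everything here except the flag name and the value sdc2 is an assumption made for illustration: the BENCHMARK_SETS registry, the "default" key, the test-set names in it, and both function names are hypothetical.

```python
import argparse

# Hypothetical registry: maps a --benchmarks value to the test sets it runs.
# Only "sdc2" is named in the text; the other names are placeholders.
BENCHMARK_SETS = {
    "default": ["synthetic-sources", "real-data"],
    "sdc2": ["sdc2"],
}


def parse_args(argv=None):
    """Parse the command line of the (hypothetical) evaluation script."""
    parser = argparse.ArgumentParser(
        description="Benchmark an imaging algorithm against reference algorithms."
    )
    parser.add_argument(
        "--benchmarks",
        default="default",
        choices=sorted(BENCHMARK_SETS),
        help="Name of the benchmark set to run.",
    )
    return parser.parse_args(argv)


def select_test_sets(argv=None):
    """Return the list of test sets selected on the command line."""
    args = parse_args(argv)
    return BENCHMARK_SETS[args.benchmarks]
```

Calling select_test_sets(["--benchmarks=sdc2"]) selects only the SDC2 set, while an empty argument list falls back to the default set. Restricting the flag with choices means a mistyped set name fails early with a usage message instead of partway through a long run.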

To work on my own imaging algorithm, I can simply replace the sample "backprojection" implementation and re-run the evaluation script. Because my algorithm is slow, I can also run the same evaluation on CSCS infrastructure. The framework lets me run parameter sweeps on my algorithm, and the report shows the results for each value of the swept parameter. The framework also contains commonly used functionality to easily parallelize my algorithms, convert between common data formats, and visualize intermediate results.
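A parameter sweep of this kind could look roughly like the sketch below. It is not the framework's API: toy_imaging_algorithm is a stand-in for a real imaging algorithm, and parameter_sweep, rmse, and the parameter names gain and iterations are hypothetical names chosen for the example.

```python
import itertools
import math


def toy_imaging_algorithm(data, gain, iterations):
    """Stand-in for a user's algorithm: move an estimate toward the data in damped steps."""
    estimate = [0.0] * len(data)
    for _ in range(iterations):
        estimate = [e + gain * (d - e) for e, d in zip(estimate, data)]
    return estimate


def rmse(estimate, truth):
    """Root-mean-square error between an estimate and the known truth."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimate, truth)) / len(truth))


def parameter_sweep(algorithm, data, truth, grid):
    """Evaluate the algorithm on every combination of the parameter values in grid.

    grid maps a parameter name to the list of values to try.
    Returns a list of (params, error) pairs, one per combination.
    """
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[name] for name in names)):
        params = dict(zip(names, values))
        error = rmse(algorithm(data, **params), truth)
        results.append((params, error))
    return results
```

For example, parameter_sweep(toy_imaging_algorithm, data, truth, {"gain": [0.5, 1.0], "iterations": [1, 2]}) evaluates four combinations. Returning plain (params, error) pairs keeps the sweep independent of how the report later plots or tabulates them.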

Integration of the Science Data Challenge 2 (SDC2) as test set

Running pip install OurAwesomePipeline installs everything needed on a fresh machine. In the test-sets folder of the project, the readme.md contains step-by-step instructions on how to add new test data. I copy new-test-set.py to sdc2.py and implement the required functionality. For this, I preprocess the dataset and upload the results to a publicly available folder on the CSCS infrastructure. Next, I implement code that downloads the data from the public folder and stores it locally. This is very easy, because most of the required functions are already implemented in the framework. By writing 3 lines of code, I tell the framework how this data can be used as a sky model. Similarly, I implement a function that makes the known sources in the SDC2 data set available. The framework will use these two pieces in imaging and source detection benchmarks. Finally, I hook up the SDC2 challenge in test-sets.py. I add some new unit tests to the project to make sure that the new data source works as intended. I create a pull request, which is accepted after 2-3 days. From this point forward, everybody can test source finding algorithms and imaging algorithms with the SDC2 data set.
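The steps above might translate into something like the following sketch of sdc2.py. The whole interface is assumed for illustration: BenchmarkDataSet, Source, the fetch callable, the comma-separated file layout, and the TEST_SETS registry are hypothetical and do not describe an existing framework API. The download is injected as a function so the example runs without network access.

```python
from dataclasses import dataclass
from pathlib import Path


@dataclass(frozen=True)
class Source:
    """A known source: position in degrees and flux in jansky (assumed fields)."""
    ra_deg: float
    dec_deg: float
    flux_jy: float


class BenchmarkDataSet:
    """Hypothetical base class that a new test set would derive from."""

    name = "unnamed"

    def __init__(self, cache_dir, fetch):
        # fetch(name) returns the preprocessed data as text; in a real setup
        # it would download from the public folder mentioned above.
        self.cache_dir = Path(cache_dir)
        self.fetch = fetch

    def local_file(self):
        """Download the data once and reuse the local copy afterwards."""
        path = self.cache_dir / f"{self.name}.csv"
        if not path.exists():
            self.cache_dir.mkdir(parents=True, exist_ok=True)
            path.write_text(self.fetch(self.name))
        return path

    def known_sources(self):
        raise NotImplementedError


class Sdc2DataSet(BenchmarkDataSet):
    """The SDC2 test set, assuming one 'ra,dec,flux' row per known source."""

    name = "sdc2"

    def known_sources(self):
        rows = self.local_file().read_text().strip().splitlines()
        return [Source(*map(float, row.split(","))) for row in rows]

    def sky_model(self):
        # The "few lines" that expose the data as a sky model: here simply
        # the known sources as (ra, dec, flux) tuples.
        return [(s.ra_deg, s.dec_deg, s.flux_jy) for s in self.known_sources()]


# Hypothetical counterpart of the hook-up step in test-sets.py.
TEST_SETS = {"sdc2": Sdc2DataSet}
```

A unit test for the new data source could pass a fake fetch function that returns two rows, then check that known_sources() parses them and that a second call reads the cached file instead of fetching again.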
