afgaron/Portfolio

Portfolio of Avery Garon

My name is Avery Garon. I was a research assistant in the Minnesota Institute for Astrophysics from 2013 to 2020, during my undergraduate and graduate education. During this time, I worked on several data analysis research projects and wrote two data processing pipelines. This repository contains the major files I produced during these projects. My work is primarily written in Python 2.7, although I include code samples in C++ and R as well.

Abell 2255 (A2255)

As part of my Master's thesis work under the supervision of Dr. Lawrence Rudnick, I used observations taken by the NRAO Very Large Array to produce a radio image of the galaxy cluster Abell 2255. Initial data processing was performed using the specialized software package AIPS, and imaging was then performed in Python using the CASA software package. One step in the imaging process, known as "peeling," consists of several intermediate steps that must be performed on each subset of the data. To expedite this process, I wrote a script that generates the code for each step and each subset, along with the batch files that submit them to the NRAO cluster. This script is shown in peeling_code_generator.py.
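The generator idea can be sketched as follows. This is a minimal illustration in Python 3 and not the contents of peeling_code_generator.py: the step names, template text, file naming scheme, and batch-file format are all hypothetical placeholders.

```python
import os
from string import Template

# Hypothetical step templates; the real peeling steps and CASA calls differ.
STEP_TEMPLATES = {
    "step1": Template("# step1 for subset ${subset}\nprint('processing ${subset}')\n"),
    "step2": Template("# step2 for subset ${subset}\nprint('finishing ${subset}')\n"),
}

# Hypothetical batch-file template; the real cluster submission format is not shown here.
BATCH_TEMPLATE = Template("run ${script}\n")


def generate_scripts(subsets):
    """Return a dict mapping file name -> contents for every (subset, step) pair."""
    files = {}
    for subset in subsets:
        for step, template in STEP_TEMPLATES.items():
            script_name = "{}_{}.py".format(subset, step)
            files[script_name] = template.substitute(subset=subset)
            batch_name = "{}_{}.batch".format(subset, step)
            files[batch_name] = BATCH_TEMPLATE.substitute(script=script_name)
    return files


def write_files(files, directory="."):
    """Write each generated file into the given directory."""
    for name, contents in files.items():
        with open(os.path.join(directory, name), "w") as handle:
            handle.write(contents)
```

Keeping generation (`generate_scripts`) separate from writing (`write_files`) makes it easy to inspect the generated text before anything is submitted.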

The analysis itself was done using relic_spectra.py. This script is not meant to be run from the command line; rather, it is a collection of functions that I ran interactively as I progressed through the stages of my research. For example, I used the functions color_I() and color_color() while performing early data exploration, and then bg_subtract_manual() for the quantitative analysis presented in my thesis.

Radio Galaxy Zoo (RGZ)

Prior to my work on Abell 2255, I was a member of the science team of the Radio Galaxy Zoo citizen science collaboration and helped develop the data processing pipeline. The pipeline consists of two parts: the first part consolidates the individual user-submitted responses and determines a consensus classification for each image ("source") presented to users, and the second part takes these consensus values, cross-references them to other astronomical catalogs, and calculates physical parameters. The majority of the first part was written by my predecessor, although I added functionality to weight the input data based on the reliability of the user who performed the classification.
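A reliability-weighted consensus can be sketched along these lines. This is only an illustration of the general idea, not the RGZ consensus code: the function name, the input layout, and the weight scale are assumptions.

```python
from collections import defaultdict


def weighted_consensus(classifications, user_weights, default_weight=1.0):
    """Pick the label with the largest total weight.

    classifications: list of (user_id, label) pairs for one source.
    user_weights: dict mapping user_id -> reliability weight (hypothetical scale).
    Returns (label, fraction of the total weight supporting that label).
    """
    totals = defaultdict(float)
    for user_id, label in classifications:
        totals[label] += user_weights.get(user_id, default_weight)
    total_weight = sum(totals.values())
    if total_weight == 0:
        # No votes, or every vote carried zero weight.
        return None, 0.0
    best_label = max(totals, key=totals.get)
    return best_label, totals[best_label] / total_weight
```

With equal weights this reduces to a simple majority vote; a single highly reliable user can outweigh several unreliable ones.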

I present the second part of the pipeline in RGZcatalog.py. It takes as input the MongoDB collection containing the consensus data from part one, and iterates over each source. Due to the long runtime, the pipeline records its place in the collection at each step in case it gets interrupted. I included error handling to detect connection interruptions so that the pipeline can pause and resume running when the connection is restored.
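The checkpoint-and-resume pattern described above might look roughly like this. It is a sketch under stated assumptions, not the code in RGZcatalog.py: it uses a JSON file rather than MongoDB for the checkpoint, the function names are hypothetical, and it relies on Python 3's built-in `ConnectionError`.

```python
import json
import os
import time


def load_checkpoint(path):
    """Return the index of the next item to process (0 if no checkpoint exists)."""
    if not os.path.exists(path):
        return 0
    with open(path) as handle:
        return json.load(handle)["next_index"]


def save_checkpoint(path, next_index):
    """Record the index of the next item to process."""
    with open(path, "w") as handle:
        json.dump({"next_index": next_index}, handle)


def run_pipeline(sources, process, checkpoint_path, retry_delay=60.0, max_retries=5):
    """Process sources in order, resuming from the checkpoint and retrying on ConnectionError."""
    index = load_checkpoint(checkpoint_path)
    while index < len(sources):
        for _ in range(max_retries):
            try:
                process(sources[index])
                break
            except ConnectionError:
                # Pause, then retry the same source in case the connection is back.
                time.sleep(retry_delay)
        else:
            raise RuntimeError("giving up on source at index {}".format(index))
        index += 1
        save_checkpoint(checkpoint_path, index)
    return index
```

Because the checkpoint is written only after a source finishes, an interrupted run repeats at most one source when it restarts.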

For each source, it cross-references two external catalogs: the AllWISE catalog through its official API, and the SDSS catalog by submitting SQL queries to its interactive search webpage. These lookups are shown in processing.py. That file also includes the analysis of the radio contour images presented to the users; the contours are described by a tree data structure (in contour_node.py). After the pipeline collects all this data for each source, it saves the processed source into a new MongoDB collection, as well as exporting a compressed version as a CSV for other members of the science team to use.
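To illustrate why a tree suits nested contours, here is a minimal sketch. It is not the class in contour_node.py: the class name `ContourNode`, its attributes, and its methods are assumptions chosen for the example.

```python
class ContourNode(object):
    """One closed contour; children are the contours nested directly inside it."""

    def __init__(self, level, children=None):
        self.level = level              # flux level of this contour (arbitrary units)
        self.children = children or []  # contours enclosed by this one

    def leaves(self):
        """Return the innermost contours, i.e. nodes with no children (local peaks)."""
        if not self.children:
            return [self]
        found = []
        for child in self.children:
            found.extend(child.leaves())
        return found

    def depth(self):
        """Number of nested contour levels from this node down to its deepest leaf."""
        if not self.children:
            return 1
        return 1 + max(child.depth() for child in self.children)


# An outer contour enclosing two components, one of which has a brighter core.
root = ContourNode(1.0, [ContourNode(2.0, [ContourNode(4.0)]), ContourNode(2.0)])
peak_count = len(root.leaves())
```

In this toy example the number of leaves counts the distinct peaks inside the outermost contour.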

In addition to writing this pipeline, I also performed an original research project using the RGZ data. The results of this research are presented in Garon et al. (2019) and are reproduced in my Master's thesis. The code I wrote for this research is in bending_analysis.py. This code serves two roles. When run from the command line, it automatically searches the RGZ database for sources whose bending angle can be determined, cross-matches the sources to another catalog of galaxy clusters, and calculates the physical parameters discussed in Garon et al. (2019). Otherwise, it serves as a collection of interactive functions similar to what I discussed in the previous section for A2255.
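As a rough illustration of a bending-angle calculation, the sketch below assumes the bending angle is the deviation from 180 degrees of the angle between the two lobes as seen from the host position, on a flat image plane. That definition, the function name, and the coordinate convention are assumptions made for this example; see Garon et al. (2019) for the definition actually used.

```python
import math


def bending_angle(host, lobe_a, lobe_b):
    """Return 180 degrees minus the opening angle between two lobes, in degrees.

    host, lobe_a, lobe_b: (x, y) positions in the same flat image coordinates.
    A perfectly straight source returns 0.
    """
    ax, ay = lobe_a[0] - host[0], lobe_a[1] - host[1]
    bx, by = lobe_b[0] - host[0], lobe_b[1] - host[1]
    norm_a = math.hypot(ax, ay)
    norm_b = math.hypot(bx, by)
    if norm_a == 0 or norm_b == 0:
        raise ValueError("lobe position coincides with the host")
    cos_opening = (ax * bx + ay * by) / (norm_a * norm_b)
    # Guard against rounding slightly outside the valid range of acos.
    cos_opening = max(-1.0, min(1.0, cos_opening))
    return 180.0 - math.degrees(math.acos(cos_opening))
```

For example, lobes at right angles to each other as seen from the host give a value of about 90 degrees under this definition.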

WFC3 Infrared Spectroscopic Parallel Survey (WISP)

During my undergraduate studies, I wrote an honors thesis under the supervision of Dr. Claudia Scarlata. The goal of this work was to take grism images from the WISP survey and disentangle overlapping spectra in the images. The final code I presented for this project is in grismProcessing.py. In summary, the code takes the individual cut-outs of sources in each image, collapses them into one dimension, and measures the profile across that. It uses these profiles to simultaneously fit profiles across each spectrum in the grism image, and subtracts the profiles corresponding to contaminating sources. Due to the finite resolution of the images, it uses splines to interpolate the profiles for sub-pixel alignment during subtraction.
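The collapse, fit, and subtract steps can be sketched in simplified form. This is not the code in grismProcessing.py: it handles only one target and one contaminant, it omits the spline-based sub-pixel alignment entirely, and all function names are hypothetical.

```python
def collapse_profile(cutout):
    """Sum a 2D cutout along each row and normalize the result to unit sum.

    cutout: list of rows, where each row is one cross-dispersion pixel.
    """
    sums = [sum(row) for row in cutout]
    total = sum(sums)
    if total == 0:
        raise ValueError("cutout has zero total flux")
    return [value / total for value in sums]


def fit_two_profiles(column, target_profile, contaminant_profile):
    """Least-squares amplitudes (a, b) such that column ~ a*target + b*contaminant."""
    tt = sum(t * t for t in target_profile)
    cc = sum(c * c for c in contaminant_profile)
    tc = sum(t * c for t, c in zip(target_profile, contaminant_profile))
    ty = sum(t * y for t, y in zip(target_profile, column))
    cy = sum(c * y for c, y in zip(contaminant_profile, column))
    det = tt * cc - tc * tc
    if det == 0:
        raise ValueError("profiles are degenerate and cannot be separated")
    # Solve the 2x2 normal equations with Cramer's rule.
    a = (ty * cc - cy * tc) / det
    b = (cy * tt - ty * tc) / det
    return a, b


def subtract_contaminant(column, contaminant_profile, amplitude):
    """Remove the scaled contaminant profile from one column of the grism image."""
    return [y - amplitude * c for y, c in zip(column, contaminant_profile)]
```

Fitting both amplitudes at once matters when the profiles overlap, since fitting each source independently would assign the shared flux to both.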

Cluster matching

As an undergraduate, I performed another galaxy cluster research project with Dr. Rudnick, this time using C++. As with one of the steps of the RGZ bending analysis, the goal was to match radio galaxies to their host clusters based on proximity, this time allowing the user to input different definitions of proximity. The main file for this code is matching.cpp.
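Matching with a user-supplied definition of proximity can be sketched as below. The original is written in C++ (matching.cpp); this sketch uses Python to stay consistent with the rest of the repository, and the function names, the dictionary-based input layout, and the `max_distance` cutoff are all assumptions.

```python
import math


def angular_separation(ra1, dec1, ra2, dec2):
    """Great-circle separation in degrees (haversine formula); inputs in degrees."""
    ra1, dec1, ra2, dec2 = (math.radians(v) for v in (ra1, dec1, ra2, dec2))
    h = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(min(1.0, math.sqrt(h))))


def match_to_clusters(galaxies, clusters, proximity=None, max_distance=1.0):
    """Assign each galaxy to its closest cluster under the given proximity measure.

    galaxies, clusters: lists of dicts with 'name', 'ra', and 'dec' keys (degrees).
    proximity: callable(galaxy, cluster) -> float, where lower means closer.
               Defaults to angular separation in degrees.
    Returns a dict mapping galaxy name -> cluster name, or None if nothing
    lies within max_distance.
    """
    if proximity is None:
        def proximity(galaxy, cluster):
            return angular_separation(galaxy["ra"], galaxy["dec"],
                                      cluster["ra"], cluster["dec"])
    matches = {}
    for galaxy in galaxies:
        best_name, best_value = None, max_distance
        for cluster in clusters:
            value = proximity(galaxy, cluster)
            if value <= best_value:
                best_name, best_value = cluster["name"], value
        matches[galaxy["name"]] = best_name
    return matches
```

Passing a different `proximity` callable, for instance separation divided by a per-cluster radius, changes the definition of "closest" without touching the matching loop.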

Information and Decision Sciences (IDSC)

During my final year of graduate school, I took several classes in the Information and Decision Sciences department, part of the Carlson School of Management. I present several products I wrote for these classes. lab4.R and lab5.R show example R snippets to address homework questions. final_project.twbx contains two example Tableau dashboards for a local marathon company, one showing how registrations for the marathons are affected by gender, marketing strategy, and pricing tier, and the other showing how volunteering to work the races is affected by age, occupation, and pricing tier. Finally, Homework4 contains a RapidMiner Studio repository in which I tested different predictive modeling techniques to determine which customers would make a purchase. My analysis of the results is presented in hw4.docx.