Several machine learning techniques were used in this module, including multiple linear regression, polynomial regression, and a random forest regressor. The main goal was to gain more experience with machine learning by predicting rainfall rate from several polarimetric radar parameters. Several error statistics were calculated to determine which model performed best, alongside a baseline model calculation which used the formula
This module is completed in Python using scikit-learn.
The package can be installed with conda:
conda install -c conda-forge scikit-learn
Run the Module5.ipynb file. The necessary dataset is already provided.
The radar_parameters.csv dataset is in the homework folder and is used to train and test the models. These polarimetric radar parameters were calculated from disdrometer data in Huntsville, Alabama.
Features (radar measurements):
- Zh - radar reflectivity factor (dBZ) - use the formula
- Zdr - differential reflectivity
- Ldr - linear depolarization ratio
- Kdp - specific differential phase
- Ah - specific attenuation
- Adp - differential attenuation
Target:
- R - rain rate
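The feature and target columns above could be separated and split into training and test sets roughly as follows. This is a minimal sketch, not the notebook's actual code: the helper names, the 75/25 split, and the synthetic stand-in data are all assumptions made so the example runs without the CSV file. The column names are assumed to match the list above.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Column names taken from the feature list above (assumed to match the CSV header).
FEATURES = ["Zh", "Zdr", "Ldr", "Kdp", "Ah", "Adp"]
TARGET = "R"


def make_synthetic_radar_frame(n_rows=200, seed=0):
    """Hypothetical stand-in for radar_parameters.csv; values are random, not real data."""
    rng = np.random.default_rng(seed)
    frame = pd.DataFrame({name: rng.random(n_rows) for name in FEATURES})
    # Made-up target: a noisy sum of the features, for illustration only.
    frame[TARGET] = frame[FEATURES].sum(axis=1) + 0.1 * rng.standard_normal(n_rows)
    return frame


def split_features_target(frame, test_size=0.25, seed=42):
    """Separate the six radar features from the rain rate and split into train/test."""
    X = frame[FEATURES]
    y = frame[TARGET]
    return train_test_split(X, y, test_size=test_size, random_state=seed)


# With the real dataset, the frame would instead come from pd.read_csv(...)
# pointed at radar_parameters.csv in the homework folder (exact path assumed).
frame = make_synthetic_radar_frame()
X_train, X_test, y_train, y_test = split_features_target(frame)
```

Fixing random_state keeps the split reproducible, so every model is scored against the same held-out rows.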
- Multiple linear regression
- Polynomial regression with a grid search
- Random forest regressor with a grid and randomized search
- Baseline model calculation
- Calculation of r-squared and root mean square error statistics
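The steps listed above can be sketched with scikit-learn as shown here. This is an illustrative outline under stated assumptions, not the notebook's implementation: the data are synthetic, and the hyperparameter grids, fold counts, and the evaluate helper are hypothetical choices. The baseline model is omitted because it depends on the formula referenced earlier, and only the randomized search is shown for the random forest.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import GridSearchCV, RandomizedSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Synthetic stand-in for the six radar features and the rain rate (not real data).
rng = np.random.default_rng(0)
X = rng.random((200, 6))
y = (X @ np.array([1.0, 0.5, -0.3, 2.0, 0.8, -0.1])
     + 0.5 * X[:, 0] ** 2
     + 0.05 * rng.standard_normal(200))
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)


def evaluate(model, X_eval, y_eval):
    """Return r-squared and root mean square error on the given data."""
    pred = model.predict(X_eval)
    rmse = float(np.sqrt(mean_squared_error(y_eval, pred)))
    return {"r2": float(r2_score(y_eval, pred)), "rmse": rmse}


# Multiple linear regression on all six features.
linear = LinearRegression().fit(X_train, y_train)

# Polynomial regression: grid search over the polynomial degree.
poly_search = GridSearchCV(
    Pipeline([
        ("poly", PolynomialFeatures(include_bias=False)),
        ("reg", LinearRegression()),
    ]),
    param_grid={"poly__degree": [1, 2, 3]},  # assumed candidate degrees
    scoring="neg_root_mean_squared_error",
    cv=5,
).fit(X_train, y_train)

# Random forest: randomized search over a small, assumed parameter space.
forest_search = RandomizedSearchCV(
    RandomForestRegressor(random_state=0),
    param_distributions={
        "n_estimators": [25, 50, 100],
        "max_depth": [None, 4, 8],
        "min_samples_leaf": [1, 2, 4],
    },
    n_iter=5,
    cv=3,
    scoring="neg_root_mean_squared_error",
    random_state=0,
).fit(X_train, y_train)

# Score every fitted model on the same held-out test set.
results = {
    "linear": evaluate(linear, X_test, y_test),
    "polynomial": evaluate(poly_search, X_test, y_test),
    "random_forest": evaluate(forest_search, X_test, y_test),
}
for name, stats in results.items():
    print(f"{name}: r2={stats['r2']:.3f}, rmse={stats['rmse']:.3f}")
```

Both search objects refit the best configuration on the full training set by default, so they can be passed to evaluate like any fitted model. The selected settings are available afterwards as poly_search.best_params_ and forest_search.best_params_.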
Scikit-learn: Machine Learning in Python, Pedregosa et al., JMLR 12, pp. 2825-2830, 2011.