Track Particle

This project investigates machine learning techniques for particle track reconstruction problems in high-energy physics (HEP). It is part of SPRACE and is sponsored by Serrapilheira. This README describes the workflow of the proposal.

We use different machine learning techniques to tackle this major problem in the physics community. If you want to reproduce our results, follow the general steps below. If you have ideas or suggestions, please let us know.

Setup

Environment

Install Miniconda on a Linux system first.

Then create your conda environment from the env.yml file:

$ conda env create -f env.yml
$ conda activate trackml

Installation

To run the project, clone the repository:

$ git clone https://github.com/SPRACE/track-ml.git

Then go to the newly created track-ml directory.

You will need a GPU or a decent CPU.

Dataset

We divided the detector into three kinematic regions so that the models can be trained with different datasets.

The first region is the internal barrel, with $\eta$ between $-1.0$ and $1.0$.
The second region is the intermediate barrel (overlap), with $\eta$ from $-2.0$ to $-1.0$ or from $1.0$ to $2.0$.
The last region is the external one, with $\eta$ from $-3.0$ to $-2.0$ or from $2.0$ to $3.0$.

Considering these regions and the symmetry of the detector, each dataset was filtered to contain only high-energy particles ($p_T > 1.0$ GeV) with $\phi$ between $-0.5$ and $0.5$. This selects tracks with a larger curvature radius, which makes the initial training easier.
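The region definitions and the filter above can be sketched in a few lines of Python. This is only an illustration, not the code used to build our datasets: the function names, the dictionary keys (`pt`, `eta`, `phi`) and the handling of the region boundaries (which side of $|\eta| = 1.0$ or $2.0$ a particle falls on) are assumptions.

```python
def eta_region(eta):
    """Map a pseudorapidity value to one of the three regions (hypothetical helper).

    Boundary handling is an assumption: |eta| == 1.0 goes to the
    intermediate region and |eta| == 2.0 to the external region.
    """
    abs_eta = abs(eta)
    if abs_eta < 1.0:
        return "internal"
    if abs_eta < 2.0:
        return "intermediate"
    if abs_eta <= 3.0:
        return "external"
    return None  # outside the regions described above


def keep_particle(pt, phi, pt_min=1.0, phi_max=0.5):
    """Apply the pT > 1.0 GeV and -0.5 < phi < 0.5 selection."""
    return pt > pt_min and -phi_max < phi < phi_max


# Toy particles, invented for illustration only.
particles = [
    {"pt": 1.5, "eta": 0.3, "phi": 0.1},
    {"pt": 0.4, "eta": 0.2, "phi": 0.0},    # rejected: pT too low
    {"pt": 2.0, "eta": -1.7, "phi": -0.4},
    {"pt": 3.1, "eta": 2.5, "phi": 0.9},    # rejected: phi out of range
]

selected = [
    (eta_region(p["eta"]), p)
    for p in particles
    if keep_particle(p["pt"], p["phi"])
]

for region, p in selected:
    print(region, p)
```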

Small sample datasets are in the dataset directory.

Running

Training

There are some predefined config files to train different models (MLP, CNN, LSTM, CNN-parallel and others). To change the parameters, edit the corresponding config_*.json file. We used the internal barrel as the dataset; it has already been transformed and is referenced in the JSON file:

$ python main_train.py --config config_lstm_parallel_internal.json

There are other configurations, for example a CNN model:

$ python main_train.py --config config_cnn_parallel_internal.json

If you want to watch the training process while adjusting any parameter of the .json file, open the main_train.ipynb notebook, for example with Jupyter:

$ jupyter notebook main_train.ipynb

To launch several training and test runs with scripts, use the following with the default configuration:

$ ./run_trains.sh
$ ./run_tests.sh
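If you prefer to drive several runs from Python, the idea can be sketched as below. This is a hypothetical equivalent, not the contents of run_trains.sh or run_tests.sh: the helper name `build_commands` is invented here, and it only reuses the `--config` option and the config file names shown in this README. The sketch prints the commands instead of executing them.

```python
import shlex


def build_commands(configs):
    """Build one training and one inference command per config file."""
    commands = []
    for cfg in configs:
        commands.append(["python", "main_train.py", "--config", cfg])
        commands.append(["python", "main_inference.py", "--config", cfg])
    return commands


configs = [
    "config_lstm_parallel_internal.json",
    "config_cnn_parallel_internal.json",
]

for cmd in build_commands(configs):
    # To actually run a command, pass the list to subprocess.run(cmd, check=True).
    print(shlex.join(cmd))
```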

Inference

You can run inference on the test data:

$ python main_inference.py --config config_lstm_parallel_internal.json

This will produce a results/encrypt_name/results-test.txt file.

Auxiliary Scripts

Performance

Accuracy of Algorithm

We use two groups of metrics to measure the accuracy of the models.

The main metric is a score. Scoring counts how many correct hits were found per layer by comparing the predicted hits with the original truth hits. Finally, we count the number of reconstructed tracks.
The other metrics are regression metrics: we measure the error between the real and the predicted hits per layer.
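The scoring idea can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation in this repository: the function name `score_tracks` is invented, a hit counts as correct when its distance to the truth hit is within `tolerance`, and a track counts as reconstructed only when every one of its layers is correct.

```python
import math


def score_tracks(true_tracks, pred_tracks, tolerance=0.0):
    """Count correct hits per layer and fully matched tracks.

    Each track is a list of (x, y, z) hits, one per layer.
    Assumption: a track is "reconstructed" when all its layers match.
    """
    n_layers = len(true_tracks[0])
    correct_per_layer = [0] * n_layers
    reconstructed = 0
    for true_track, pred_track in zip(true_tracks, pred_tracks):
        matches = [
            math.dist(true_hit, pred_hit) <= tolerance
            for true_hit, pred_hit in zip(true_track, pred_track)
        ]
        for layer, is_match in enumerate(matches):
            correct_per_layer[layer] += int(is_match)
        reconstructed += int(all(matches))
    return correct_per_layer, reconstructed


# Toy data with two tracks and two layers, invented for illustration.
true_tracks = [[(0, 0, 0), (1, 1, 1)], [(0, 1, 0), (2, 2, 2)]]
pred_tracks = [[(0, 0, 0), (1, 1, 1)], [(0, 1, 0), (2, 2, 3)]]

correct, n_reco = score_tracks(true_tracks, pred_tracks)
print(correct, n_reco)
```

With a tolerance of 0.0 a predicted hit must coincide exactly with the truth hit, which only makes sense when predictions are snapped to existing hits before scoring.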

For example, to see the accuracy of the training, open the results/encrypt_name/results-train.txt file. To see the score of correct hits and the number of reconstructed tracks, open the results/encrypt_name/results-test.txt file.

Output test file:

[Output] Results 
---Parameters--- 
         Model Name    :  lstm
         Dataset       :  phi025-025_eta025-025_train1_lasthit_20200219.csv
         Tracks        :  528
         Model saved   :  /compiled/model-lstm-DCtuvkiXn32hugVsTaokcp-coord-xyz-normalise-true-epochs-21-batch-6.h5
         Test date     :  10/06/2020 12:09:34
         Coordenates   :  xyz
         Model Scaled   :  True
         Model Optimizer :  adam
         Prediction Opt  :  nearest
         Total correct hits per layer  [256. 251. 213. 194. 157. 126.] of 528 tracks tolerance=0.0: 
         Total porcentage correct hits : ['48.48%', '47.54%', '40.34%', '36.74%', '29.73%', '23.86%']
         Reconstructed tracks: 74 of 528 tracks

The output above shows the score per layer. For example, 256 hits (48.48%) were matched at the first layer, and 74 of the 528 tracks were reconstructed (this is only a small dataset). We also write other information, such as the type of coordinates, whether the nearest optimization was used, the number of epochs, the batch size, the optimizer and the model name.
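As a quick check of the arithmetic, the percentages in the output are the per-layer counts divided by the number of tracks. The sketch below recomputes them from the numbers printed above; the variable names are ours and are not taken from the repository code.

```python
# Values copied from the test output above.
correct_per_layer = [256, 251, 213, 194, 157, 126]
n_tracks = 528
reconstructed = 74

# Fraction of tracks with a correct hit at each layer.
percentages = [f"{count / n_tracks:.2%}" for count in correct_per_layer]

# Fraction of tracks that were fully reconstructed.
track_fraction = f"{reconstructed / n_tracks:.2%}"

print(percentages)
print(track_fraction)
```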

The regression metrics, overall and per layer, are:

---Regression Scores--- 
        R_2 statistics        (R2)  = 0.992
        Mean Square Error     (MSE) = 882.525
        Root Mean Square Error(RMSE) = 29.707
        Mean Absolute Error   (MAE) = 9.858

layer  5
---Regression Scores--- 
        R_2 statistics        (R2)  = 1.0
        Mean Square Error     (MSE) = 6.818
        Root Mean Square Error(RMSE) = 2.611
        Mean Absolute Error   (MAE) = 1.325

layer  6
---Regression Scores--- 
        R_2 statistics        (R2)  = 0.999
        Mean Square Error     (MSE) = 27.603
        Root Mean Square Error(RMSE) = 5.254
        Mean Absolute Error   (MAE) = 2.541

layer  7
---Regression Scores--- 
        R_2 statistics        (R2)  = 0.998
        Mean Square Error     (MSE) = 141.074
        Root Mean Square Error(RMSE) = 11.877
        Mean Absolute Error   (MAE) = 5.285

The output above shows one general set of metrics for all hits, followed by the four metrics ($R^2$, MSE, RMSE, MAE) for each layer.
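For reference, the four metrics follow their standard definitions, which can be sketched in plain Python as below. This is an illustration only: the function name `regression_metrics` and the toy values are assumptions, and the repository may compute these numbers with a library instead.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return R2, MSE, RMSE and MAE for two equally long sequences."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    mse = ss_res / n
    return {
        "R2": 1.0 - ss_res / ss_tot,
        "MSE": mse,
        "RMSE": math.sqrt(mse),
        "MAE": sum(abs(e) for e in errors) / n,
    }


# Toy values, invented for illustration.
metrics = regression_metrics([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0])
print(metrics)
```

Since RMSE is the square root of MSE, the reported values can be cross-checked: the square root of 882.525 is about 29.707, which matches the general block above.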

Visualization

If you want to see the results as plots, open the plot_prediction.ipynb notebook in the notebooks directory.

This plot shows 10 reconstructed tracks.

The next plot shows all hits.

The next plot shows the predictions for all hits.
