diff --git a/README.md b/README.md old mode 100644 new mode 100755 index e06d5f8..364cd57 --- a/README.md +++ b/README.md @@ -21,7 +21,23 @@ Salient features: 6) Multi-Class training possible 7) Ability to customize thresholds -### Setting up +### What will be the output of the trainer: + +1) Feature distributions +2) Statistics in training and testing +3) ROCs, loss plots, MVA scores +4) Confusion Matrices +5) Correlation plots +6) Trained models (h5 or pkl files) + +#### Optional outputs + +1) Threshold values of scores for chosen working points +2) Efficiency vs pT and Efficiency vs eta plots for all classes +3) Reweighting plots for pT and eta +4) Comparison of new ID performance with benchmark ID flags + +# Setting up #### Clone ``` @@ -35,11 +51,15 @@ In principle, you can set this up on your local computer by installing packages Use LCG 97python3 and you will have all the dependencies! (Tested at lxplus and SWAN) `source /cvmfs/sft.cern.ch/lcg/views/LCG_97python3/x86_64-centos7-gcc8-opt/setup.sh` -#### Run on CPUs and GPUs +#### Run on GPUs -The code can also transparently use a GPU, if a GPU card is available. The cvmfs release to use in that case is: +The code can also transparently use a GPU if a GPU card is available, provided all packages are set up correctly. +For GPU support in TensorFlow, a cvmfs release is available: `source /cvmfs/sft.cern.ch/lcg/views/LCG_97py3cu10/x86_64-centos7-gcc7-opt/setup.sh` +For XGBoost, the code will use a GPU automatically, but it needs a GPU-compiled XGBoost with CUDA >10.0. This is currently not possible with any cvmfs release. +You can certainly set up the packages locally. 
+ ### Running the trainer @@ -54,9 +74,11 @@ The Trainer will read the settings from the config file and run training Projects where the framework has been helpful -1) Run-3 Ele MVA ID -2) Close photon analysis -3) H->eeg analysis +1) Run-3 Electron MVA ID +2) Run-3 PF Electron ID +3) Run-3 PF Photon ID +4) Close photon analysis +5) H->eeg analysis ########################################## @@ -74,8 +96,7 @@ from tensorflow.keras.callbacks import EarlyStopping ``` - -#### All the Parameters +# All the Parameters | Parameters |Type| Description| | --------------- | ----------------| ---------------- | @@ -83,10 +104,10 @@ from tensorflow.keras.callbacks import EarlyStopping | `branches` |list of strings| Branches to read (Should be in the input root files). Only these branches can be later used for any purpose. The '\*' is useful for selecting pattern-based branches. In principle one can do ``` branches=["*"] ```, but remember that the data loading time increases, if you select more branches| |`SaveDataFrameCSV`|boolean| If True, this will save the data frame as a parquet file and the next time you run the same training with different parameters, it will be much faster| |`loadfromsaved`|boolean| If root files and branches are the same as previous training and SaveDataFrameCSV was True, you can assign this as `True`, and data loading time will reduce significantly. Remember that this will use the same output directory as mentioned using `OutputDirName`, so the data frame should be present there| -|`Classes` | list of strings | Two or more classes possible. For two classes the code will do a binary classification. For more than two classes Can be anything but samples will be later loaded under this scheme. Example: `Classes=['DY','TTBar']` or `Classes=['Class1','Class2','Class3']`. The order is important if you want to make an ID. In case of two classes, the first class has to be Signal of interest. 
The second has to be background.| +|`Classes` | list of strings | Two or more classes possible. For two classes the code will do a binary classification. For more than two classes Can be anything but samples will be later loaded under this scheme. Example: `Classes=['DY','TTBar']` or `Classes=['Class1','Class2','Class3']`. The order is important if you want to make an ID. In case of two classes, the first class has to be a Signal of interest. The second has to be a background. In multiclass, it does not matter which order one is using, but it is highly recommended that the first class is signal, if it is known. | |`ClassColors`|list of strings|Colors for `Classes` to use in plots. Standard python colors work!| |`Tree`| string |Location of the tree inside the root file| -|`processes`| list of dictionaries| You can add as many process files as you like and assign them to a specific class. For example WZ.root and TTBar.root could be 'Background' class and DY.root could be 'Signal' or both 'Signal and 'background' can come from the same root file. In fact you can have, as an example: 4 classes and 5 root files. The Trainer will take care of it at the backend. Look at the sample config below to see how processes are added. It is a list of dictionaries, with one example dictionary looking like this ` {'Class':'IsolatedSignal','path':['./DY.root','./Zee.root'], 'xsecwt': 1, 'selection':'(ele_pt > 5) & (abs(scl_eta) < 1.442) & (abs(scl_eta) < 2.5) & (matchedToGenEle==1)'} ` | +|`processes`| list of dictionaries| You can add as many process files as you like and assign them to a specific class. For example WZ.root and TTBar.root could be 'Background' class and DY.root could be 'Signal' or both 'Signal and 'background' can come from the same root file. In fact you can have, as an example: 4 classes and 5 root files. The Trainer will take care of it at the backend. Look at the sample config below to see how processes are added. 
It is a list of dictionaries, with one example dictionary looking like this ` {'Class':'IsolatedSignal','path':['./DY.root','./Zee.root'], 'xsecwt': 1, 'selection':'(ele_pt > 5) & (abs(scl_eta) < 1.442) & (abs(scl_eta) < 2.5) & (matchedToGenEle==1)'} ` | |`MVAs`|list of dictionaries| MVAs to use. You can add as many as you like: MVAtypes XGB and DNN are keywords, so names can be XGB_new, DNN_old etc, but keep XGB and DNN in the names (That is how the framework identifies which algo to run). Look at the sample config below to see how MVAs are added. | #### Optional Parameters diff --git a/Tools/readData.py b/Tools/readData.py index 9e2fdb7..af4b31b 100644 --- a/Tools/readData.py +++ b/Tools/readData.py @@ -9,7 +9,7 @@ import gc def daskframe_from_rootfiles(processes, treepath,branches,flatten='False',debug=False): - def get_df(Class,file, xsecwt, selection, treepath=None,branches=['ele*']): + def get_df(Class,file, xsecwt, selection, treepath=None,branches=['ele*'],multfactor=1): tree = uproot.open(file)[treepath] if debug: ddd=tree.pandas.df(branches=branches,flatten=flatten,entrystop=1000).query(selection) @@ -17,14 +17,16 @@ def get_df(Class,file, xsecwt, selection, treepath=None,branches=['ele*']): ddd=tree.pandas.df(branches=branches,flatten=flatten).query(selection) #ddd["Category"]=Category ddd["Class"]=Class - if type(xsecwt) == type("hello"): + if type(xsecwt) == type(('xsec',2)): + ddd["xsecwt"]=ddd[xsecwt[0]]*xsecwt[1] + elif type(xsecwt) == type("hello"): ddd["xsecwt"]=ddd[xsecwt] elif type(xsecwt) == type(0.1): ddd["xsecwt"]=xsecwt elif type(xsecwt) == type(1): ddd["xsecwt"]=xsecwt else: - print("CAUTION: xsecwt should be a branch name or a number... Assigning the weight as 1") + print("CAUTION: xsecwt should be a branch name or a number or a tuple... 
Assigning the weight as 1") print(file) return ddd diff --git a/Trainer.ipynb b/Trainer.ipynb index 101894c..6d37ef8 100644 --- a/Trainer.ipynb +++ b/Trainer.ipynb @@ -333,9 +333,11 @@ "metadata": {}, "outputs": [], "source": [ - "if hasattr(Conf,'modifydf'):\n", - " if callable(getattr(Conf,'modifydf')):\n", - " Conf.modifydf(df_final)" + "try:\n", + " Conf.modifydf(df_final)\n", + " print(\"Dataframe modification is done using modifydf\")\n", + "except:\n", + " print(\"Looks fine\")" ] }, { @@ -579,7 +581,7 @@ { "cell_type": "code", "execution_count": 26, - "id": "114ef58a", + "id": "9b520a4d", "metadata": {}, "outputs": [], "source": [ @@ -1170,6 +1172,10 @@ " df_final.loc[TestIndices,MVA[\"MVAtype\"]+\"_pred\"]=np.sum([modelDNN.predict(X_test,batch_size=5000)[:, 0],modelDNN.predict(X_test,batch_size=5000)[:, 1]],axis=0)\n", " \n", " ###############DNN#######################################\n", + " \n", + " plotwt_train=np.asarray(df_final.loc[TrainIndices,'xsecwt'])\n", + " plotwt_test=np.asarray(df_final.loc[TestIndices,'xsecwt'])\n", + " \n", " from sklearn.metrics import confusion_matrix\n", " fig, axes = plt.subplots(1, 1, figsize=(len(Conf.Classes)*2, len(Conf.Classes)*2))\n", " cm = confusion_matrix(Y_test.argmax(axis=1), y_test_pred.argmax(axis=1))\n", @@ -1198,10 +1204,10 @@ " ax=axes[i]\n", " for k in range(n_classes):\n", " axMVA.hist(y_test_pred[:, i][Y_test[:, k]==1],bins=np.linspace(0, 1, 21),label=Conf.Classes[k]+'_test',\n", - " weights=Wt_test[Y_test[:, k]==1]/np.sum(Wt_test[Y_test[:, k]==1]),\n", + " weights=plotwt_test[Y_test[:, k]==1]/np.sum(plotwt_test[Y_test[:, k]==1]),\n", " histtype='step',linewidth=2,color=Conf.ClassColors[k])\n", " axMVA.hist(y_train_pred[:, i][Y_train[:, k]==1],bins=np.linspace(0, 1, 21),label=Conf.Classes[k]+'_train',\n", - " weights=Wt_train[Y_train[:, k]==1]/np.sum(Wt_train[Y_train[:, k]==1]),\n", + " weights=plotwt_train[Y_train[:, k]==1]/np.sum(plotwt_train[Y_train[:, k]==1]),\n", " 
histtype='stepfilled',alpha=0.3,linewidth=2,color=Conf.ClassColors[k])\n", " axMVA.set_title(MVA[\"MVAtype\"]+' Score: Node '+str(i+1),fontsize=10)\n", " axMVA.set_xlabel('Score',fontsize=10)\n", @@ -1210,8 +1216,8 @@ " if Conf.MVAlogplot:\n", " axMVA.set_xscale('log')\n", "\n", - " fpr, tpr, th = roc_curve(Y_test[:, i], y_test_pred[:, i])\n", - " fpr_tr, tpr_tr, th_tr = roc_curve(Y_train[:, i], y_train_pred[:, i])\n", + " fpr, tpr, th = roc_curve(Y_test[:, i], y_test_pred[:, i],sample_weight=plotwt_test)\n", + " fpr_tr, tpr_tr, th_tr = roc_curve(Y_train[:, i], y_train_pred[:, i],sample_weight=plotwt_train)\n", " mask = tpr > 0.0\n", " fpr, tpr = fpr[mask], tpr[mask]\n", "\n", @@ -1270,8 +1276,8 @@ " plot_single_roc_point(df_final.query('TrainDataset==0'), var=OverlayWpi, ax=axes, color=color, marker='o', markersize=8, label=OverlayWpi+\" Test dataset\", cat=cat,Wt=weight)\n", " if len(Conf.MVAs)>0:\n", " for MVAi in Conf.MVAs:\n", - " plot_roc_curve(df_final.query('TrainDataset==0'),MVAi[\"MVAtype\"]+\"_pred\", tpr_threshold=0.0, ax=axes, color=MVAi[\"Color\"], linestyle='--', label=MVAi[\"Label\"]+' Testing',cat=cat,Wt=weight)\n", - " plot_roc_curve(df_final.query('TrainDataset==1'),MVAi[\"MVAtype\"]+\"_pred\", tpr_threshold=0.0, ax=axes, color=MVAi[\"Color\"], linestyle='-', label=MVAi[\"Label\"]+' Training',cat=cat,Wt=weight)\n", + " plot_roc_curve(df_final.query('TrainDataset==0'),MVAi[\"MVAtype\"]+\"_pred\", tpr_threshold=0.0, ax=axes, color=MVAi[\"Color\"], linestyle='--', label=MVAi[\"Label\"]+' Testing',cat=cat,Wt='xsecwt')\n", + " plot_roc_curve(df_final.query('TrainDataset==1'),MVAi[\"MVAtype\"]+\"_pred\", tpr_threshold=0.0, ax=axes, color=MVAi[\"Color\"], linestyle='-', label=MVAi[\"Label\"]+' Training',cat=cat,Wt='xsecwt')\n", " axes.set_ylabel(\"Background efficiency (%)\")\n", " axes.set_xlabel(\"Signal efficiency (%)\")\n", " axes.set_title(\"Final\")\n", @@ -1528,7 +1534,7 @@ "name": "python", "nbconvert_exporter": "python", "pygments_lexer": 
"ipython3", - "version": "3.6.13" + "version": "3.9.5" } }, "nbformat": 4, diff --git a/Trainer.py b/Trainer.py index 30501e5..6ae5b7f 100644 --- a/Trainer.py +++ b/Trainer.py @@ -181,9 +181,11 @@ def modify(df): # In[16]: -if hasattr(Conf,'modifydf'): - if callable(getattr(Conf,'modifydf')): - Conf.modifydf(df_final) +try: + Conf.modifydf(df_final) + print("Dataframe modification is done using modifydf") +except: + print("Looks fine") # In[17]: @@ -490,6 +492,10 @@ def corre(df,Classes=[''],MVA={}): df_final.loc[TestIndices,MVA["MVAtype"]+"_pred"]=np.sum([modelDNN.predict(X_test,batch_size=5000)[:, 0],modelDNN.predict(X_test,batch_size=5000)[:, 1]],axis=0) ###############DNN####################################### + + plotwt_train=np.asarray(df_final.loc[TrainIndices,'xsecwt']) + plotwt_test=np.asarray(df_final.loc[TestIndices,'xsecwt']) + from sklearn.metrics import confusion_matrix fig, axes = plt.subplots(1, 1, figsize=(len(Conf.Classes)*2, len(Conf.Classes)*2)) cm = confusion_matrix(Y_test.argmax(axis=1), y_test_pred.argmax(axis=1)) @@ -518,10 +524,10 @@ def corre(df,Classes=[''],MVA={}): ax=axes[i] for k in range(n_classes): axMVA.hist(y_test_pred[:, i][Y_test[:, k]==1],bins=np.linspace(0, 1, 21),label=Conf.Classes[k]+'_test', - weights=Wt_test[Y_test[:, k]==1]/np.sum(Wt_test[Y_test[:, k]==1]), + weights=plotwt_test[Y_test[:, k]==1]/np.sum(plotwt_test[Y_test[:, k]==1]), histtype='step',linewidth=2,color=Conf.ClassColors[k]) axMVA.hist(y_train_pred[:, i][Y_train[:, k]==1],bins=np.linspace(0, 1, 21),label=Conf.Classes[k]+'_train', - weights=Wt_train[Y_train[:, k]==1]/np.sum(Wt_train[Y_train[:, k]==1]), + weights=plotwt_train[Y_train[:, k]==1]/np.sum(plotwt_train[Y_train[:, k]==1]), histtype='stepfilled',alpha=0.3,linewidth=2,color=Conf.ClassColors[k]) axMVA.set_title(MVA["MVAtype"]+' Score: Node '+str(i+1),fontsize=10) axMVA.set_xlabel('Score',fontsize=10) @@ -530,8 +536,8 @@ def corre(df,Classes=[''],MVA={}): if Conf.MVAlogplot: axMVA.set_xscale('log') - fpr, 
tpr, th = roc_curve(Y_test[:, i], y_test_pred[:, i]) - fpr_tr, tpr_tr, th_tr = roc_curve(Y_train[:, i], y_train_pred[:, i]) + fpr, tpr, th = roc_curve(Y_test[:, i], y_test_pred[:, i],sample_weight=plotwt_test) + fpr_tr, tpr_tr, th_tr = roc_curve(Y_train[:, i], y_train_pred[:, i],sample_weight=plotwt_train) mask = tpr > 0.0 fpr, tpr = fpr[mask], tpr[mask] @@ -582,8 +588,8 @@ def corre(df,Classes=[''],MVA={}): plot_single_roc_point(df_final.query('TrainDataset==0'), var=OverlayWpi, ax=axes, color=color, marker='o', markersize=8, label=OverlayWpi+" Test dataset", cat=cat,Wt=weight) if len(Conf.MVAs)>0: for MVAi in Conf.MVAs: - plot_roc_curve(df_final.query('TrainDataset==0'),MVAi["MVAtype"]+"_pred", tpr_threshold=0.0, ax=axes, color=MVAi["Color"], linestyle='--', label=MVAi["Label"]+' Testing',cat=cat,Wt=weight) - plot_roc_curve(df_final.query('TrainDataset==1'),MVAi["MVAtype"]+"_pred", tpr_threshold=0.0, ax=axes, color=MVAi["Color"], linestyle='-', label=MVAi["Label"]+' Training',cat=cat,Wt=weight) + plot_roc_curve(df_final.query('TrainDataset==0'),MVAi["MVAtype"]+"_pred", tpr_threshold=0.0, ax=axes, color=MVAi["Color"], linestyle='--', label=MVAi["Label"]+' Testing',cat=cat,Wt='xsecwt') + plot_roc_curve(df_final.query('TrainDataset==1'),MVAi["MVAtype"]+"_pred", tpr_threshold=0.0, ax=axes, color=MVAi["Color"], linestyle='-', label=MVAi["Label"]+' Training',cat=cat,Wt='xsecwt') axes.set_ylabel("Background efficiency (%)") axes.set_xlabel("Signal efficiency (%)") axes.set_title("Final") diff --git a/archive/packagesforGPU.txt b/archive/packagesforGPU.txt new file mode 100644 index 0000000..e641a5c --- /dev/null +++ b/archive/packagesforGPU.txt @@ -0,0 +1,249 @@ +# Name Version Build Channel +_libgcc_mutex 0.1 main +_tflow_select 2.1.0 gpu anaconda +absl-py 0.12.0 py39h06a4308_0 +anyio 3.1.0 py39hf3d152e_0 conda-forge +argon2-cffi 20.1.0 py39hbd71b63_2 conda-forge +astunparse 1.6.3 py_0 anaconda +async_generator 1.10 py_0 conda-forge +attrs 21.2.0 pyhd8ed1ab_0 
conda-forge +awkward0 0.15.5 pypi_0 pypi +babel 2.9.1 pyh44b312d_0 conda-forge +backcall 0.2.0 pyh9f0ad1d_0 conda-forge +backports 1.0 py_2 conda-forge +backports.functools_lru_cache 1.6.4 pyhd8ed1ab_0 conda-forge +blas 1.0 mkl anaconda +bleach 3.3.0 pyh44b312d_0 conda-forge +blinker 1.4 py39h06a4308_0 +bokeh 2.3.2 py39h06a4308_0 +brotlipy 0.7.0 py39h27cfd23_1003 +c-ares 1.17.1 h27cfd23_0 +ca-certificates 2020.10.14 0 anaconda +cachetools 4.1.1 py_0 anaconda +cairo 1.16.0 h18b612c_1001 conda-forge +certifi 2021.5.30 py39h06a4308_0 +cffi 1.14.5 py39h261ae71_0 +chardet 3.0.4 py39h06a4308_1003 +click 7.1.2 py_0 anaconda +cloudpickle 1.6.0 py_0 +colorama 0.4.4 pyh9f0ad1d_0 conda-forge +commonmark 0.9.1 py_0 conda-forge +correctionlib 2.0.0 pypi_0 pypi +coverage 5.5 py39h27cfd23_2 +cryptography 3.4.7 py39hd23ed53_0 +cudatoolkit 10.1.243 h6bb024c_0 anaconda +cudnn 7.6.5 cuda10.1_0 anaconda +cupti 10.1.168 0 anaconda +cycler 0.10.0 py_2 conda-forge +cytoolz 0.11.0 py39h27cfd23_0 +daal4py 2021.2.2 py39ha9443f7_0 +dal 2021.2.2 h06a4308_389 +dask 2021.6.0 pyhd3eb1b0_0 +dask-core 2021.6.0 pyhd3eb1b0_0 +dbus 1.13.6 he372182_0 conda-forge +decorator 5.0.9 pyhd8ed1ab_0 conda-forge +defusedxml 0.7.1 pyhd8ed1ab_0 conda-forge +distributed 2021.6.0 py39h06a4308_0 +eli5 0.11.0 pyhd8ed1ab_0 conda-forge +entrypoints 0.3 pyhd8ed1ab_1003 conda-forge +expat 2.4.1 h2531618_2 +fontconfig 2.13.1 he4413a7_1000 conda-forge +freetype 2.10.4 h7ca028e_0 conda-forge +fribidi 1.0.10 h36c2ea0_0 conda-forge +fsspec 2021.6.0 pyhd3eb1b0_0 +future 0.18.2 py39hf3d152e_3 conda-forge +gast 0.4.0 py_0 anaconda +gettext 0.19.8.1 h5e8e0c9_1 conda-forge +glib 2.68.2 h36276a3_0 +google-auth 1.21.3 py_0 anaconda +google-auth-oauthlib 0.4.1 py_2 anaconda +google-pasta 0.2.0 py_0 anaconda +graphite2 1.3.13 h58526e2_1001 conda-forge +graphviz 2.42.3 h0511662_0 conda-forge +grpcio 1.35.0 py39hce63b2e_0 +gst-plugins-base 1.14.0 hbbd80ab_1 +gstreamer 1.14.0 h28cd5cc_2 +h5py 2.10.0 py39hec9cf62_0 +harfbuzz 2.4.0 
h37c48d4_1 conda-forge +hdf5 1.10.6 hb1b8bf9_0 anaconda +heapdict 1.0.1 py_0 +icu 58.2 hf484d3e_1000 conda-forge +idna 2.10 py_0 anaconda +importlib-metadata 2.0.0 py_1 anaconda +intel-openmp 2020.2 254 anaconda +ipykernel 5.5.5 py39hef51801_0 conda-forge +ipython 7.24.1 py39hef51801_0 conda-forge +ipython_genutils 0.2.0 py_1 conda-forge +jedi 0.18.0 py39hf3d152e_2 conda-forge +jinja2 2.11.3 pyh44b312d_0 conda-forge +joblib 1.0.1 pyhd3eb1b0_0 +jpeg 9d h36c2ea0_0 conda-forge +json5 0.9.5 pyh9f0ad1d_0 conda-forge +jsonschema 3.2.0 pyhd8ed1ab_3 conda-forge +jupyter_client 6.1.12 pyhd8ed1ab_0 conda-forge +jupyter_core 4.7.1 py39hf3d152e_0 conda-forge +jupyter_server 1.8.0 pyhd8ed1ab_0 conda-forge +jupyterlab 3.0.16 pyhd8ed1ab_0 conda-forge +jupyterlab_pygments 0.1.2 pyh9f0ad1d_0 conda-forge +jupyterlab_server 2.6.0 pyhd8ed1ab_0 conda-forge +keras-preprocessing 1.1.2 pyhd3eb1b0_0 +keras-visualizer 2.4 pypi_0 pypi +kernel_driver 0.0.6 pyhd8ed1ab_0 conda-forge +kiwisolver 1.3.1 py39h081fc7a_0 conda-forge +lcms2 2.11 hcbb858e_1 conda-forge +ld_impl_linux-64 2.33.1 h53a641e_7 +libffi 3.3 he6710b0_2 +libgcc-ng 9.1.0 hdf63c60_0 +libgfortran-ng 7.3.0 hdf63c60_0 anaconda +libpng 1.6.37 h21135ba_2 conda-forge +libprotobuf 3.14.0 h8c45485_0 +libsodium 1.0.18 h36c2ea0_1 conda-forge +libstdcxx-ng 9.1.0 hdf63c60_0 +libtiff 4.1.0 h2733197_1 +libtool 2.4.6 h58526e2_1007 conda-forge +libuuid 2.32.1 h14c3975_1000 conda-forge +libxcb 1.13 h14c3975_1002 conda-forge +libxml2 2.9.10 hb55368b_3 +llvmlite 0.36.0 pypi_0 pypi +locket 0.2.1 py39h06a4308_1 +lz4-c 1.9.2 he1b5a44_3 conda-forge +markdown 3.3.4 py39h06a4308_0 +markupsafe 1.1.1 py39h38d8fee_2 conda-forge +matplotlib 3.3.4 py39h06a4308_0 +matplotlib-base 3.3.4 py39h62a2d02_0 +matplotlib-inline 0.1.2 pyhd8ed1ab_2 conda-forge +mistune 0.8.4 py39hbd71b63_1002 conda-forge +mkl 2020.2 256 anaconda +mkl-service 2.3.0 py39he8ac12f_0 +mkl_fft 1.3.0 py39h54f3939_0 +mkl_random 1.0.2 py39h63df603_0 +mpi 1.0 mpich +mpich 3.3.2 hc856adb_0 
+msgpack-python 1.0.2 py39hff7bd54_1 +nbclassic 0.3.1 pyhd8ed1ab_1 conda-forge +nbclient 0.5.3 pyhd8ed1ab_0 conda-forge +nbconvert 6.0.7 py39hf3d152e_3 conda-forge +nbformat 5.1.3 pyhd8ed1ab_0 conda-forge +nbterm 0.0.11 pyhd8ed1ab_1 conda-forge +ncurses 6.2 he6710b0_1 +nest-asyncio 1.5.1 pyhd8ed1ab_0 conda-forge +notebook 6.4.0 pyha770c72_0 conda-forge +numba 0.53.1 pypi_0 pypi +numpy 1.19.2 py39h89c1606_0 +numpy-base 1.19.2 py39h2ae0177_0 +oauthlib 3.1.0 py_0 anaconda +olefile 0.46 pyh9f0ad1d_1 conda-forge +openssl 1.1.1k h27cfd23_0 +opt_einsum 3.1.0 py_0 anaconda +packaging 20.9 pyh44b312d_0 conda-forge +pandas 1.2.4 py39h2531618_0 +pandoc 2.14.0.2 h7f98852_0 conda-forge +pandocfilters 1.4.2 py_1 conda-forge +pango 1.42.4 h7062337_4 conda-forge +parso 0.8.2 pyhd8ed1ab_0 conda-forge +partd 1.2.0 pyhd3eb1b0_0 +pcre 8.44 he6710b0_0 +pexpect 4.8.0 pyh9f0ad1d_2 conda-forge +pickleshare 0.7.5 py_1003 conda-forge +pillow 7.2.0 py39h6f3857e_2 conda-forge +pip 21.1.1 py39h06a4308_0 +pixman 0.38.0 h516909a_1003 conda-forge +prometheus_client 0.11.0 pyhd8ed1ab_0 conda-forge +prompt-toolkit 3.0.19 pyha770c72_0 conda-forge +protobuf 3.14.0 py39h2531618_1 +psutil 5.8.0 py39h27cfd23_1 +pthread-stubs 0.4 h36c2ea0_1001 conda-forge +ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge +pyarrow 4.0.1 pypi_0 pypi +pyasn1 0.4.8 py_0 anaconda +pyasn1-modules 0.2.8 py_0 anaconda +pycparser 2.20 py_2 anaconda +pydantic 1.8.2 pypi_0 pypi +pygments 2.9.0 pyhd8ed1ab_0 conda-forge +pyjwt 2.1.0 py39h06a4308_0 +pyopenssl 19.1.0 py_1 anaconda +pyparsing 2.4.7 pyh9f0ad1d_0 conda-forge +pyqt 5.9.2 py39h2531618_6 +pyrsistent 0.17.3 py39hbd71b63_1 conda-forge +pysocks 1.7.1 py39h06a4308_0 +python 3.9.5 hdb3f193_3 +python-dateutil 2.8.1 pyhd3eb1b0_0 +python-flatbuffers 1.12 pyhd3eb1b0_0 +python-graphviz 0.16 pyhd3deb0d_1 conda-forge +python_abi 3.9 1_cp39 conda-forge +pytz 2021.1 pyhd3eb1b0_0 +pyyaml 5.4.1 py39h27cfd23_1 +pyzmq 19.0.2 py39hb69f2a1_2 conda-forge +qt 5.9.7 h5867ecd_1 +readline 8.1 h27cfd23_0 
+requests 2.24.0 py_0 anaconda +requests-oauthlib 1.3.0 py_0 anaconda +rich 10.4.0 py39hf3d152e_0 conda-forge +rsa 4.6 py_0 anaconda +scikit-learn 0.24.2 py39ha9443f7_0 +scikit-learn-intelex 2021.2.2 py39h06a4308_0 +scipy 1.6.2 py39h91f5cce_0 +seaborn 0.11.0 py_0 anaconda +send2trash 1.5.0 py_0 conda-forge +setuptools 52.0.0 py39h06a4308_0 +shap 0.39.0 pypi_0 pypi +shellingham 1.4.0 pyh44b312d_0 conda-forge +singledispatch 3.6.1 pyh44b312d_0 conda-forge +sip 4.19.13 py39h2531618_0 +six 1.15.0 py_0 anaconda +slicer 0.0.7 pypi_0 pypi +sniffio 1.2.0 py39hf3d152e_1 conda-forge +sortedcontainers 2.3.0 pyhd3eb1b0_0 +sqlite 3.35.4 hdfb4753_0 +tabulate 0.8.9 pyhd8ed1ab_0 conda-forge +tbb 2021.2.0 hff7bd54_0 +tblib 1.7.0 py_0 +tensorboard 2.5.0 py_0 +tensorboard-plugin-wit 1.6.0 py_0 anaconda +tensorflow 2.4.1 gpu_py39h8236f22_0 +tensorflow-base 2.4.1 gpu_py39h29c2da4_0 +tensorflow-estimator 2.4.1 pyheb71bc4_0 +tensorflow-gpu 2.4.1 h30adc30_0 +termcolor 1.1.0 py39h06a4308_1 +terminado 0.10.1 py39hf3d152e_0 conda-forge +testpath 0.5.0 pyhd8ed1ab_0 conda-forge +threadpoolctl 2.1.0 pyh5ca1d4c_0 +tk 0.1.0 pypi_0 pypi +toolz 0.11.1 pyhd3eb1b0_0 +tornado 6.1 py39hbd71b63_0 conda-forge +tqdm 4.61.1 pypi_0 pypi +traitlets 5.0.5 py_0 conda-forge +typer 0.3.2 pyhd8ed1ab_0 conda-forge +typing_extensions 3.7.4.3 pyha847dfd_0 +tzdata 2020f h52ac0ba_0 +uncertainties 2.4.4 pypi_0 pypi +uproot3 3.14.4 pypi_0 pypi +uproot3-methods 0.10.1 pypi_0 pypi +urllib3 1.25.11 py_0 anaconda +wcwidth 0.2.5 pyh9f0ad1d_2 conda-forge +webencodings 0.5.1 py_1 conda-forge +websocket-client 0.57.0 py39hf3d152e_4 conda-forge +werkzeug 1.0.1 py_0 anaconda +wheel 0.36.2 pyhd3eb1b0_0 +wrapt 1.12.1 py39he8ac12f_1 +xgboost 0.90 pypi_0 pypi +xorg-kbproto 1.0.7 h14c3975_1002 conda-forge +xorg-libice 1.0.10 h516909a_0 conda-forge +xorg-libsm 1.2.3 h84519dc_1000 conda-forge +xorg-libx11 1.6.12 h36c2ea0_0 conda-forge +xorg-libxau 1.0.9 h14c3975_0 conda-forge +xorg-libxdmcp 1.1.3 h516909a_0 conda-forge +xorg-libxext 
1.3.4 h516909a_0 conda-forge +xorg-libxpm 3.5.13 h516909a_0 conda-forge +xorg-libxrender 0.9.10 h516909a_1002 conda-forge +xorg-libxt 1.1.5 h516909a_1003 conda-forge +xorg-renderproto 0.11.1 h14c3975_1002 conda-forge +xorg-xextproto 7.3.0 h14c3975_1002 conda-forge +xorg-xproto 7.0.31 h14c3975_1007 conda-forge +xz 5.2.5 h7b6447c_0 +yaml 0.2.5 h7b6447c_0 +zeromq 4.3.4 h2531618_0 +zict 2.0.0 pyhd3eb1b0_0 +zipp 3.3.1 py_0 anaconda +zlib 1.2.11 h7b6447c_3 +zstd 1.4.5 h9ceee32_0 diff --git a/yamlCode/Config.yaml b/archive/yamlCode/Config.yaml similarity index 100% rename from yamlCode/Config.yaml rename to archive/yamlCode/Config.yaml diff --git a/yamlCode/Tools/PlotTools.py b/archive/yamlCode/Tools/PlotTools.py similarity index 100% rename from yamlCode/Tools/PlotTools.py rename to archive/yamlCode/Tools/PlotTools.py diff --git a/yamlCode/Trainer-yamltest.ipynb b/archive/yamlCode/Trainer-yamltest.ipynb similarity index 100% rename from yamlCode/Trainer-yamltest.ipynb rename to archive/yamlCode/Trainer-yamltest.ipynb diff --git a/yamlCode/Traineryaml.ipynb b/archive/yamlCode/Traineryaml.ipynb similarity index 100% rename from yamlCode/Traineryaml.ipynb rename to archive/yamlCode/Traineryaml.ipynb diff --git a/job.sh b/condor_example/job.sh similarity index 61% rename from job.sh rename to condor_example/job.sh index f887a7c..27300f0 100644 --- a/job.sh +++ b/condor_example/job.sh @@ -1,7 +1,7 @@ universe = vanilla +JobFlavour = "workday" executable = train.sh -arguments = "PFElectronConfig_lowpT" +arguments = "Config" log = test.log output = condor_ouput/outfile.$(Cluster).$(Process).out error = condor_ouput/errors.$(Cluster).$(Process).err @@ -9,5 +9,3 @@ request_GPUs = 1 request_CPUs = 4 +testJob = True queue - -#PFElectronConfig_EB_highpT.py PFElectronConfig_EB_lowpT.py PFElectronConfig_EE_highpT.py PFElectronConfig_EE_lowpT.py diff --git a/condor_example/train.sh b/condor_example/train.sh new file mode 100644 index 0000000..2892baf --- /dev/null +++ 
b/condor_example/train.sh @@ -0,0 +1,12 @@ +#!/bin/bash +cd /afs/cern.ch/user/a/akapoor/workspace/2020/IDTRainer/ID-Trainer + +#use only one of the source commands + +## For GPU +#source /cvmfs/sft.cern.ch/lcg/views/LCG_97py3cu10/x86_64-centos7-gcc7-opt/setup.sh + +## For only CPU +source /cvmfs/sft.cern.ch/lcg/views/LCG_97python3/x86_64-centos7-gcc8-opt/setup.sh + +python Trainer.py $1 diff --git a/train.sh b/train.sh deleted file mode 100644 index 10fdf73..0000000 --- a/train.sh +++ /dev/null @@ -1,5 +0,0 @@ -#!/bin/bash -cd /afs/cern.ch/user/a/akapoor/workspace/2020/IDTRainer/ID-Trainer -source /cvmfs/sft.cern.ch/lcg/views/LCG_97py3cu10/x86_64-centos7-gcc7-opt/setup.sh -#source /cvmfs/sft.cern.ch/lcg/views/dev3cuda/latest/x86_64-centos7-gcc8-opt/setup.sh -python Trainer.py $1
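
A note on the `modifydf` hunks in Trainer.py and Trainer.ipynb in this diff: the bare `try`/`except` handles a config that has no `modifydf` and a `modifydf` that raises partway through in the same way, printing "Looks fine" in both cases. The sketch below shows one possible alternative that keeps the hook optional but lets errors raised inside it surface. The helper name `apply_modifydf`, the example hook `add_abs_eta`, and the stand-in config objects are hypothetical and not part of the repository; a plain dict stands in for `df_final` so the example runs without extra packages.

```python
from types import SimpleNamespace


def apply_modifydf(conf, df):
    """Call conf.modifydf(df) if the config defines it (hypothetical helper).

    Returns True when the hook ran and False when the config has no callable
    modifydf. Exceptions raised inside the hook are not caught, so a mistake
    in the user's modifydf is reported instead of being silently skipped.
    """
    hook = getattr(conf, "modifydf", None)
    if not callable(hook):
        return False
    hook(df)
    print("Dataframe modification is done using modifydf")
    return True


def add_abs_eta(df):
    # Example hook: derive a new column in place from an existing one.
    df["abs_eta"] = [abs(x) for x in df["scl_eta"]]


# Stand-ins for the Conf module (assumption: only the modifydf attribute matters here).
conf_with_hook = SimpleNamespace(modifydf=add_abs_eta)
conf_without_hook = SimpleNamespace()

frame = {"scl_eta": [-1.2, 0.4, 2.1]}
ran = apply_modifydf(conf_with_hook, frame)
skipped = apply_modifydf(conf_without_hook, {"scl_eta": [0.0]})
```

With this shape a config that omits `modifydf` is still accepted, while a failure inside the hook stops the run with a traceback rather than continuing with an unmodified dataframe.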