Tuesday, July 30, 2019

Relegator ideas...

For now, work on simple two-class moons dataset (2 features) with an added third feature: peaking signal and exponentially distributed background.
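A sketch of how this toy dataset could be generated (the peak location, peak width, and exponential scale below are placeholder choices, and sklearn's make_moons is assumed for the two-moon features):

```python
import numpy as np
from sklearn.datasets import make_moons

def make_toy_data(n=10000, noise=0.2, seed=0):
    """Two-moons features plus a third 'mass-like' feature:
    Gaussian peak for signal, exponential falloff for background."""
    rng = np.random.default_rng(seed)
    X2, y = make_moons(n_samples=n, noise=noise, random_state=seed)
    # third feature: peaking for signal (y == 1), exponentially
    # distributed for background (y == 0); parameters are placeholders
    x3 = np.where(y == 1,
                  rng.normal(loc=1.0, scale=0.1, size=n),
                  rng.exponential(scale=1.0, size=n))
    return np.column_stack([X2, x3]), y
```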

Compare:
1. NN with single output node, logistic regression, with optimal cut based on ROC
2. NN with two output nodes, cat. cross-entropy, optimal cut based on S/sqrt(S+B)
3. relegator NN with three output nodes, tuned with S/sqrt(S+B) in loss function
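For option 2, the "optimal cut" step could be sketched as follows (a hypothetical helper, not the actual analysis code: it scans thresholds on the predicted signal probability and keeps the one maximizing S/sqrt(S+B)):

```python
import numpy as np

def best_significance_cut(p_sig, y_true, n_cuts=101):
    """Scan decision cuts on the signal probability and return the
    (cut, significance) pair that maximizes S/sqrt(S+B)."""
    best_cut, best_sig = 0.0, -np.inf
    for c in np.linspace(0.0, 1.0, n_cuts):
        sel = p_sig >= c
        S = np.sum(sel & (y_true == 1))   # true signal passing the cut
        B = np.sum(sel & (y_true == 0))   # true background passing the cut
        if S + B == 0:
            continue
        sig = S / np.sqrt(S + B)
        if sig > best_sig:
            best_cut, best_sig = c, sig
    return best_cut, best_sig
```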



Wednesday, July 24, 2019

Relegation classifier thoughts

Thinking about investigating a NN classifier for separating signal from a pernicious background.  The typical approach is to train the classifier and then use the decision cut value that gives the best analysis power, often judged from the ROC AUC.  I suspect, though, that the NN could train differently if we "build in" that we want to optimize analysis power.

So, my idea is to investigate a binary classification problem with a NN that predicts probabilities for THREE classes: signal, background, and RELEGATION.  The idea is that events which are too difficult to characterize correctly will be placed in the relegation class.  Relegating events carries a penalty: the loss function will contain a term (or terms) that pushes S/sqrt(S+B) as high as possible.

For multiclass classification, we would use the categorical cross-entropy loss function:

$$\mathcal{L} = -\displaystyle\sum_{c=1}^{M} y_{o,c} \ln (p_{o,c})$$
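A quick numeric check of the categorical cross-entropy (with its conventional overall minus sign) for one event with M = 3 classes -- signal, background, relegation:

```python
import numpy as np

def categorical_cross_entropy(y_onehot, p):
    """Cross-entropy for a single event: -sum_c y_c * ln(p_c)."""
    return -np.sum(y_onehot * np.log(p))

y = np.array([1.0, 0.0, 0.0])   # true class: signal
p = np.array([0.7, 0.2, 0.1])   # predicted class probabilities
loss = categorical_cross_entropy(y, p)
print(round(loss, 4))  # -ln(0.7) -> 0.3567
```

Only the true class contributes, so the loss reduces to -ln(p) for the correct class.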

We will add to this a term that rewards keeping S/sqrt(S+B) high.

A potential problem is that the total S and B can only be accurately calculated once per epoch, BUT the tuning of the network depends on the change in S and B (the derivatives needed for backprop).  These derivatives can be calculated on a per-event basis.
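One possible way around this (my assumption, not a worked-out solution) is to form *soft* S and B counts per batch from the predicted probabilities instead of hard cut-based counts, so the significance term stays differentiable event by event:

```python
import numpy as np

def soft_significance(p_sig, is_sig, is_bkg, eps=1e-8):
    """Differentiable stand-in for S/sqrt(S+B): S and B are sums of the
    predicted signal-class probability over true signal / background
    events, so each event contributes smoothly to the gradient."""
    S = np.sum(p_sig * is_sig)   # soft signal count
    B = np.sum(p_sig * is_bkg)   # soft background count
    return S / np.sqrt(S + B + eps)
```

The negative of this quantity could then be added to the cross-entropy so that minimizing the loss pushes the batch-level significance up.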

Wednesday, July 17, 2019

ODD: mum_ndf_dedx is ZERO

Getting a weird crash when trying to calculate the chisq/ndf for the mum dedx info.  All of the mum_ndf_dedx values are zero in my csv files, and they are zero as generated by the DSelector (I am not setting this branch manually in the DSelector).  Might need to ask why this is.

For the time being, I'm going to NOT calculate the chi2/ndf in the sld_pipeline, though it will presumably be important for mu-/pi-/e- separation...


Monday, July 15, 2019

Next steps for Mikey

Adding the electronic SLD to the multiclass classifier.  The MC is nearly done.  Have to turn it into ascii files and add the capability to the tf code.

BDT code.  Look into multi-class BDT.

Add the following functions to dnn_tools.py:
1. fcn that reads in data files and returns pandas dfs
2. fcn that sets up the dfs after being read in
3. fcn that generates all of the labels once the data frames are read in (this is all vague rn)
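A rough sketch of what those three dnn_tools.py helpers might look like (the file format, column names, and label scheme below are all placeholders):

```python
import pandas as pd

def read_feature_files(paths):
    """Read whitespace-delimited ascii feature files, one df per file."""
    return [pd.read_csv(p, sep=r"\s+") for p in paths]

def setup_dfs(dfs, class_names):
    """Tag each df with its class name and concatenate into one df."""
    for df, name in zip(dfs, class_names):
        df["class"] = name
    return pd.concat(dfs, ignore_index=True)

def make_labels(df, class_names):
    """One-hot label matrix, columns ordered as in class_names."""
    return pd.get_dummies(df["class"])[class_names].to_numpy()
```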


Sunday, July 14, 2019

Next steps for Megan

Megan showed some plots of kinematic quantities for the "raw-raw" and "raw" muonic sld MC on Friday.  These look good.

Btw, what we mean by "raw-raw" is the generated p4 and x4 quantities with x4 for the primary vertex set to (0,0,0,0).  We'll call this "generated" from now on.

The "raw" MC is the same p4 vectors, but with the primary x4 set to some position in the target with some physical event time.  The Lambda vertex (and any other decay vertices) are fixed by this, too.  This step is taken care of by hdgeant4, and all of these quantities are taken from the hddm files that hdgeant spits out.

See /w/halld-scifs17exp/home/mmccrack/mc_processing/gen_raw_vert_files for the code to generate these files.

So far, Megan has looked at 5 files worth (10k events) of this MC.

NOW it's time for Megan to start looking at some "accepted" MC files, meaning events after the detector simulation.  So!  I have to generate some files that will work for her, and align with the information that she already has.  The sld_mu raw files that she's using were generated on May 22, and I haven't generated anything new for this reaction since.

I THINK that I can get away with modifying the protonTRUTH DSelector, and then using my TTree to ascii scripts to get Megan the info that she needs.  She should get measured and KF p4, post-KF x4, AND it would be good to have the track ID for each particle (so that we can subtract out any K+ or mu decays that would screw up vertexing).  Said DSelector is here:
/w/halld-scifs17exp/home/mmccrack/dsel_protonTRUTH

Actually, I have to do some digging to figure out the track ID stuff, so I'll do that later.

My sld to ascii script is here: /Users/mmccracken/office_comp/lambda_sld/jun2019/sld_ttree_2_megan.py

ACTUALLY, I'm just going to give Megan the same files that I'm working with, but cut all events from file numbers above 1004 (i.e., remove all events with event number greater than or equal to 100500000).
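The event-number cut could be as simple as the following (the column name "event" is a guess at what the ascii files actually use):

```python
import pandas as pd

def cut_high_files(df, max_event=100500000):
    """Keep only events with event number below max_event, which drops
    everything from file numbers above 1004."""
    return df[df["event"] < max_event].reset_index(drop=True)
```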

Megan is going to look into differences between raw and accepted MC of the following quantities:
beam photons: energy
K+: px, py, pz, magnitude of p, energy
proton: same as K+
mu-: same as K+
primary vertex: x, y, z, t, and distance between raw and accepted vertex
Lambda (secondary) vertex: same as primary vertex.

For now, Megan will use the kinfit quantities in the accepted files.

Files are on Google Drive:
kL_acc_fastpi_1000-1004.ascii
kL_acc_ppim_1000-1004.ascii
kL_acc_sl_mu_1000-1004.ascii


Saturday, July 6, 2019

Features files woes

Looks like I did not add the kin fit confidence level to the features files.  The feats files DO include kinfit chi2 and ndf, which I will use to calculate chi2/ndf (adding to pipeline now...), but this might not give the same separation as cl...  Will check back.
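The chi2/ndf feature being added to the pipeline could look like this (the column names are placeholders, not the actual feats-file branch names):

```python
import pandas as pd

def add_chisq_per_ndf(df, chisq_col="kf_chisq", ndf_col="kf_ndf"):
    """Add a chi2/ndf column; events with ndf == 0 (like the odd
    mum_ndf_dedx values above) become NaN instead of dividing by zero."""
    df = df.copy()
    ndf = df[ndf_col].where(df[ndf_col] != 0)  # 0 -> NaN
    df["kf_chisq_ndf"] = df[chisq_col] / ndf
    return df
```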

UPDATE: chisq/ndf seems to do the job.  If neither it nor cl is included, the fits do not converge in 200 epochs -- only overfitting happens.  INTERESTING.

Thursday, July 4, 2019

wintermute GPU working!

Fixed it!  I had to get cuda 10.0 (specifically).  Used deb packages.  Follow the instructions here, but with cuda 10.0:

https://www.tensorflow.org/install/gpu

Batches run faster (I think), but there is a longer delay between batches, presumably while the job is parallelized???

wintermute update

SLD nn code is copied to wintermute.  It runs, but it doesn't seem to run on the GPU.  There are errors:

initializing nn classifier...

2019-07-04 12:30:42.146473: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-07-04 12:30:42.234671: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-04 12:30:42.235215: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.68
pciBusID: 0000:01:00.0
2019-07-04 12:30:42.235378: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.235492: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.235602: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.235709: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.235814: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.235919: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.236025: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.236040: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2019-07-04 12:30:42.236337: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-04 12:30:42.262152: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 4008000000 Hz
2019-07-04 12:30:42.262686: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5623663f1d30 executing computations on platform Host. Devices:
2019-07-04 12:30:42.262750: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2019-07-04 12:30:42.263222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-04 12:30:42.263281: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]
2019-07-04 12:30:42.369886: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-04 12:30:42.370218: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5623663f3180 executing computations on platform CUDA. Devices:
2019-07-04 12:30:42.370232: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce RTX 2060, Compute Capability 7.5
Model: "sequential"


Looks like I might need to install CUDA 10... working on that now.

Further update: TF still can't find cuda 10 libs...

Relegator update

Kripa has produced some really nice plots of significance vs. decision-function threshold for the relegator.  NICE.  We also have plots of a...