Tuesday, May 21, 2019

mc seems to be running correctly now... [NOPE]

Changed the mcsmear call to use -PJANA:MAX_RELAUNCH_THREADS=20 (i.e., up from 10 to 20).  Ten files are running now, with minimal difference between wall and CPU times.

Update: Nope, they failed... Still failing at the mcsmear stage.

On the cluster: 10/10 sl_mu jobs (only 1000 events each) crashed.

Checking output files.  One difference is the value of JANA_CALIB_URL.
On ifarm   -->   mysql://ccdb_user@hallddb.jlab.org/ccdb         [works...]
On cluster -->   sqlite:////work/halld/ccdb_sqlite/9/ccdb.sqlite    [doesn't work???]
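
A quick way to spot this kind of mismatch is to print which backend each environment actually points at. The following is only a minimal sketch: the helper name calib_backend is hypothetical (not part of JANA or CCDB), and it assumes the backend can be read off the URL scheme, as in the two values above.

```python
import os
from urllib.parse import urlparse


def calib_backend(url):
    """Return the scheme of a JANA_CALIB_URL value, e.g. 'mysql' or 'sqlite'.

    Hypothetical helper for illustration; returns 'unset' for an empty value.
    """
    return urlparse(url).scheme or "unset"


if __name__ == "__main__":
    # Read the variable from the current environment (ifarm shell or batch job).
    url = os.environ.get("JANA_CALIB_URL", "")
    print(f"JANA_CALIB_URL={url!r} -> backend: {calib_backend(url)}")
```

Running this at the top of a job script, and again in an interactive ifarm shell, would show whether the two are reading calibrations from the same kind of source.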

Adding to the sl_mu auger file:

setenv JANA_CALIB_URL mysql://ccdb_user@hallddb.jlab.org/ccdb

Rerunning 10 files.

OK, this seems to be the problem.  I have no idea why it is intermittent, but I gave up on learning mysql looooong ago (2006).

Upping to 10k events and rerunning all mc.

