Thursday, July 4, 2019

wintermute update

SLD nn code is copied to wintermute.  It runs, but it doesn't seem to run on the GPU.  There are errors:

initializing nn classifier...

2019-07-04 12:30:42.146473: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2019-07-04 12:30:42.234671: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-04 12:30:42.235215: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1618] Found device 0 with properties:
name: GeForce RTX 2060 major: 7 minor: 5 memoryClockRate(GHz): 1.68
pciBusID: 0000:01:00.0
2019-07-04 12:30:42.235378: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudart.so.10.0'; dlerror: libcudart.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.235492: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcublas.so.10.0'; dlerror: libcublas.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.235602: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcufft.so.10.0'; dlerror: libcufft.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.235709: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcurand.so.10.0'; dlerror: libcurand.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.235814: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusolver.so.10.0'; dlerror: libcusolver.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.235919: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcusparse.so.10.0'; dlerror: libcusparse.so.10.0: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.236025: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcudnn.so.7'; dlerror: libcudnn.so.7: cannot open shared object file: No such file or directory; LD_LIBRARY_PATH: /usr/lib:/usr/local/lib
2019-07-04 12:30:42.236040: W tensorflow/core/common_runtime/gpu/gpu_device.cc:1641] Cannot dlopen some GPU libraries. Please make sure the missing libraries mentioned above are installed properly if you would like to use GPU. Follow the guide at https://www.tensorflow.org/install/gpu for how to download and setup the required libraries for your platform.
Skipping registering GPU devices...
2019-07-04 12:30:42.236337: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2019-07-04 12:30:42.262152: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 4008000000 Hz
2019-07-04 12:30:42.262686: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5623663f1d30 executing computations on platform Host. Devices:
2019-07-04 12:30:42.262750: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): <undefined>, <undefined>
2019-07-04 12:30:42.263222: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1159] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-04 12:30:42.263281: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1165]
2019-07-04 12:30:42.369886: I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:1006] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
2019-07-04 12:30:42.370218: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x5623663f3180 executing computations on platform CUDA. Devices:
2019-07-04 12:30:42.370232: I tensorflow/compiler/xla/service/service.cc:175]   StreamExecutor device (0): GeForce RTX 2060, Compute Capability 7.5
Model: "sequential"


Looks like I might need to install CUDA 10... working on that now.

Further update: TF still can't find cuda 10 libs...

No comments:

Post a Comment

Relegator update

Kripa has produced some really nice plots of significance vs decision function threshold for the regressor.  NICE. We also have plots of a...