Fails CpG module training  #36

@yoramzarai

Hi

I tried to run the cells in example/notebooks/basics/index.ipynb interactively, and training the CpG model fails. Here is the output:

#################################
dcpg_train.py ./data/c1_000000-001000.h5 --val_files ./data/c13_000000-001000.h5 --cpg_model RnnL1 --out_dir ./models/cpg --nb_epoch 1 --nb_train_sample 1000 --nb_val_sample 1000
#################################
Using TensorFlow backend.
INFO (2020-04-22 15:04:14,822): Building model ...
Replicate names:
BS27_1_SER, BS27_3_SER, BS27_5_SER, BS27_6_SER, BS27_8_SER

INFO (2020-04-22 15:04:14,834): Building CpG model ...
WARNING:tensorflow:From /tamir1/yoramzar/Projects/Models/NN/deepcpg/venv37/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
WARNING (2020-04-22 15:04:14,846): From /tamir1/yoramzar/Projects/Models/NN/deepcpg/venv37/lib/python3.7/site-packages/tensorflow_core/python/ops/resource_variable_ops.py:1630: calling BaseResourceVariable.__init__ (from tensorflow.python.ops.resource_variable_ops) with constraint is deprecated and will be removed in a future version.
Instructions for updating:
If using Keras pass *_constraint arguments to layers.
Model: "RnnL1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
==================================================================================================
cpg/state (InputLayer)          (None, 5, 50)        0                                            
__________________________________________________________________________________________________
cpg/dist (InputLayer)           (None, 5, 50)        0                                            
__________________________________________________________________________________________________
cpg/concatenate_1 (Concatenate) (None, 5, 100)       0           cpg/state[0][0]                  
                                                                 cpg/dist[0][0]                   
__________________________________________________________________________________________________
cpg/time_distributed_1 (TimeDis (None, 5, 256)       25856       cpg/concatenate_1[0][0]          
__________________________________________________________________________________________________
cpg/bidirectional_1 (Bidirectio (None, 512)          787968      cpg/time_distributed_1[0][0]     
__________________________________________________________________________________________________
cpg/dropout_1 (Dropout)         (None, 512)          0           cpg/bidirectional_1[0][0]        
__________________________________________________________________________________________________
cpg/BS27_1_SER (Dense)          (None, 1)            513         cpg/dropout_1[0][0]              
__________________________________________________________________________________________________
cpg/BS27_3_SER (Dense)          (None, 1)            513         cpg/dropout_1[0][0]              
__________________________________________________________________________________________________
cpg/BS27_5_SER (Dense)          (None, 1)            513         cpg/dropout_1[0][0]              
__________________________________________________________________________________________________
cpg/BS27_6_SER (Dense)          (None, 1)            513         cpg/dropout_1[0][0]              
__________________________________________________________________________________________________
cpg/BS27_8_SER (Dense)          (None, 1)            513         cpg/dropout_1[0][0]              
==================================================================================================
Total params: 816,389
Trainable params: 816,389
Non-trainable params: 0
__________________________________________________________________________________________________
INFO (2020-04-22 15:04:15,272): Computing output statistics ...
Output statistics:
          name | nb_tot | nb_obs | frac_obs | mean |  var
---------------------------------------------------------
cpg/BS27_1_SER |   1000 |    193 |     0.19 | 0.84 | 0.13
cpg/BS27_3_SER |   1000 |    209 |     0.21 | 0.77 | 0.18
cpg/BS27_5_SER |   1000 |    196 |     0.20 | 0.75 | 0.19
cpg/BS27_6_SER |   1000 |    203 |     0.20 | 0.62 | 0.24
cpg/BS27_8_SER |   1000 |    200 |     0.20 | 0.81 | 0.15

Class weights:
cpg/BS27_1_SER | cpg/BS27_3_SER | cpg/BS27_5_SER | cpg/BS27_6_SER | cpg/BS27_8_SER
----------------------------------------------------------------------------------
        0=0.84 |         0=0.77 |         0=0.75 |         0=0.62 |         0=0.81
        1=0.16 |         1=0.23 |         1=0.25 |         1=0.38 |         1=0.19
WARNING:tensorflow:From /tamir1/yoramzar/Projects/Models/NN/deepcpg/venv37/lib/python3.7/site-packages/tensorflow_core/python/ops/nn_impl.py:183: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
WARNING (2020-04-22 15:04:15,502): From /tamir1/yoramzar/Projects/Models/NN/deepcpg/venv37/lib/python3.7/site-packages/tensorflow_core/python/ops/nn_impl.py:183: where (from tensorflow.python.ops.array_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.where in 2.0, which has the same broadcast rule as np.where
INFO (2020-04-22 15:04:15,568): Loading data ...
INFO (2020-04-22 15:04:15,586): Initializing callbacks ...
INFO (2020-04-22 15:04:23,654): Training model ...

Training samples: 1000
Validation samples: 1000
WARNING:tensorflow:OMP_NUM_THREADS is no longer used by the default Keras config. To configure the number of threads, use tf.config.threading APIs.
WARNING (2020-04-22 15:04:24,831): OMP_NUM_THREADS is no longer used by the default Keras config. To configure the number of threads, use tf.config.threading APIs.
2020-04-22 15:04:24.831482: W tensorflow/stream_executor/platform/default/dso_loader.cc:55] Could not load dynamic library 'libcuda.so.1'; dlerror: libcuda.so.1: cannot open shared object file: No such file or directory
2020-04-22 15:04:24.831508: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: UNKNOWN ERROR (303)
2020-04-22 15:04:24.831531: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (compute-0-254.power5): /proc/driver/nvidia/version does not exist
2020-04-22 15:04:24.831851: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2020-04-22 15:04:24.841381: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2600000000 Hz
cannot allocate memory for thread-local data: ABORT
cannot allocate memory for thread-local data: ABORT
: 127

Why is it pointing to libcuda? I am assuming this notebook can be run on a CPU, right?
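
To help rule the GPU probing in or out, here is a minimal sketch of launching the same training command with all CUDA devices explicitly hidden. It assumes that an empty `CUDA_VISIBLE_DEVICES` is enough to keep TensorFlow on the CPU. It is not expected to silence the `libcuda.so.1` warning, and it may be unrelated to the final `cannot allocate memory for thread-local data` abort. `cpu_only_env` and `train_argv` are hypothetical helper names, not part of DeepCpG.

```python
import os
import shlex

# Training command copied verbatim from the log above.
TRAIN_CMD = (
    "dcpg_train.py ./data/c1_000000-001000.h5 "
    "--val_files ./data/c13_000000-001000.h5 "
    "--cpg_model RnnL1 --out_dir ./models/cpg "
    "--nb_epoch 1 --nb_train_sample 1000 --nb_val_sample 1000"
)


def cpu_only_env(base_env=None):
    """Return a copy of the environment with all CUDA devices hidden.

    Assumption: an empty CUDA_VISIBLE_DEVICES makes the CUDA runtime
    report no GPUs, so TensorFlow stays on the CPU.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["CUDA_VISIBLE_DEVICES"] = ""
    return env


def train_argv():
    """Split the training command into an argument list for subprocess."""
    return shlex.split(TRAIN_CMD)
```

With these helpers, the run could be reproduced outside the notebook with `subprocess.run(train_argv(), env=cpu_only_env(), check=True)`. If the abort still occurs, the GPU lookup is probably not the cause.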
