Reshape error for 26th sample for inference of ResNet50v15 over imagenet #4

EnriqueGlv · 2024-02-28T09:30:22Z

Hi,

I am trying to reproduce the performance measurements from the article "Performance–energy trade-offs of deep learning convolution algorithms on ARM processors".

However, when I run pydtnn_benchmark.py with the following options:

python3 -Ou pydtnn_benchmark.py \
--model=resnet50v15_imagenet \
--dataset=imagenet \
--dataset_train_path=datasets/imagenet_test/ \
--dataset_test_path=datasets/imagenet_test/ \
--weights_and_bias_filename=utils/resnet50_weights_pydtnn_kernels.npz \
--evaluate_only=True \
--test_as_validation=False \
--batch_size=1

I get the following output:

**** resnet50v15_imagenet model...
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| Layer |           Type           | #Params | Output shape  |   Weights shape   |             Parameters              |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|   0   |         InputCPU         |    0    | (224, 224, 3) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|   1   |        Conv2DCPU         |  9472   |(112, 112, 64) |   (3, 7, 7, 64)   |padd=(3,3), stride=(2,2), dilat=(1,1)|
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|   2   |  BatchNormalizationCPU   |   256   |(112, 112, 64) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|   3   |         ReluCPU          |    0    |(112, 112, 64) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|   4   |       MaxPool2DCPU       |    0    | (56, 56, 64)  |      (3, 3)       |padd=(1,1), stride=(2,2), dilat=(1,1)|
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|   5   |AdditionBlockCPU (2-path) |         | (56, 56, 256) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|   6   |        Conv2DCPU         |  4160   | (56, 56, 64)  |  (64, 1, 1, 64)   |padd=(0,0), stride=(1,1), dilat=(1,1)|
|   7   |  BatchNormalizationCPU   |   256   | (56, 56, 64)  |                   |                                     |
|   8   |         ReluCPU          |    0    | (56, 56, 64)  |                   |                                     |
|   9   |        Conv2DCPU         |  36928  | (56, 56, 64)  |  (64, 3, 3, 64)   |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  10   |  BatchNormalizationCPU   |   256   | (56, 56, 64)  |                   |                                     |
|  11   |         ReluCPU          |    0    | (56, 56, 64)  |                   |                                     |
|  12   |        Conv2DCPU         |  16640  | (56, 56, 256) |  (64, 1, 1, 256)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  13   |  BatchNormalizationCPU   |  1024   | (56, 56, 256) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
|  14   |        Conv2DCPU         |  16640  | (56, 56, 256) |  (64, 1, 1, 256)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  15   |  BatchNormalizationCPU   |  1024   | (56, 56, 256) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  16   |         ReluCPU          |    0    | (56, 56, 256) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  17   |AdditionBlockCPU (2-path) |         | (56, 56, 256) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  18   |        Conv2DCPU         |  16448  | (56, 56, 64)  |  (256, 1, 1, 64)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  19   |  BatchNormalizationCPU   |   256   | (56, 56, 64)  |                   |                                     |
|  20   |         ReluCPU          |    0    | (56, 56, 64)  |                   |                                     |
|  21   |        Conv2DCPU         |  36928  | (56, 56, 64)  |  (64, 3, 3, 64)   |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  22   |  BatchNormalizationCPU   |   256   | (56, 56, 64)  |                   |                                     |
|  23   |         ReluCPU          |    0    | (56, 56, 64)  |                   |                                     |
|  24   |        Conv2DCPU         |  16640  | (56, 56, 256) |  (64, 1, 1, 256)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  25   |  BatchNormalizationCPU   |  1024   | (56, 56, 256) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  26   |         ReluCPU          |    0    | (56, 56, 256) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  27   |AdditionBlockCPU (2-path) |         | (56, 56, 256) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  28   |        Conv2DCPU         |  16448  | (56, 56, 64)  |  (256, 1, 1, 64)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  29   |  BatchNormalizationCPU   |   256   | (56, 56, 64)  |                   |                                     |
|  30   |         ReluCPU          |    0    | (56, 56, 64)  |                   |                                     |
|  31   |        Conv2DCPU         |  36928  | (56, 56, 64)  |  (64, 3, 3, 64)   |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  32   |  BatchNormalizationCPU   |   256   | (56, 56, 64)  |                   |                                     |
|  33   |         ReluCPU          |    0    | (56, 56, 64)  |                   |                                     |
|  34   |        Conv2DCPU         |  16640  | (56, 56, 256) |  (64, 1, 1, 256)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  35   |  BatchNormalizationCPU   |  1024   | (56, 56, 256) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  36   |         ReluCPU          |    0    | (56, 56, 256) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  37   |AdditionBlockCPU (2-path) |         | (28, 28, 512) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  38   |        Conv2DCPU         |  32896  | (56, 56, 128) | (256, 1, 1, 128)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  39   |  BatchNormalizationCPU   |   512   | (56, 56, 128) |                   |                                     |
|  40   |         ReluCPU          |    0    | (56, 56, 128) |                   |                                     |
|  41   |        Conv2DCPU         | 147584  | (28, 28, 128) | (128, 3, 3, 128)  |padd=(1,1), stride=(2,2), dilat=(1,1)|
|  42   |  BatchNormalizationCPU   |   512   | (28, 28, 128) |                   |                                     |
|  43   |         ReluCPU          |    0    | (28, 28, 128) |                   |                                     |
|  44   |        Conv2DCPU         |  66048  | (28, 28, 512) | (128, 1, 1, 512)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  45   |  BatchNormalizationCPU   |  2048   | (28, 28, 512) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
|  46   |        Conv2DCPU         | 131584  | (28, 28, 512) | (256, 1, 1, 512)  |padd=(0,0), stride=(2,2), dilat=(1,1)|
|  47   |  BatchNormalizationCPU   |  2048   | (28, 28, 512) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  48   |         ReluCPU          |    0    | (28, 28, 512) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  49   |AdditionBlockCPU (2-path) |         | (28, 28, 512) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  50   |        Conv2DCPU         |  65664  | (28, 28, 128) | (512, 1, 1, 128)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  51   |  BatchNormalizationCPU   |   512   | (28, 28, 128) |                   |                                     |
|  52   |         ReluCPU          |    0    | (28, 28, 128) |                   |                                     |
|  53   |        Conv2DCPU         | 147584  | (28, 28, 128) | (128, 3, 3, 128)  |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  54   |  BatchNormalizationCPU   |   512   | (28, 28, 128) |                   |                                     |
|  55   |         ReluCPU          |    0    | (28, 28, 128) |                   |                                     |
|  56   |        Conv2DCPU         |  66048  | (28, 28, 512) | (128, 1, 1, 512)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  57   |  BatchNormalizationCPU   |  2048   | (28, 28, 512) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  58   |         ReluCPU          |    0    | (28, 28, 512) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  59   |AdditionBlockCPU (2-path) |         | (28, 28, 512) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  60   |        Conv2DCPU         |  65664  | (28, 28, 128) | (512, 1, 1, 128)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  61   |  BatchNormalizationCPU   |   512   | (28, 28, 128) |                   |                                     |
|  62   |         ReluCPU          |    0    | (28, 28, 128) |                   |                                     |
|  63   |        Conv2DCPU         | 147584  | (28, 28, 128) | (128, 3, 3, 128)  |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  64   |  BatchNormalizationCPU   |   512   | (28, 28, 128) |                   |                                     |
|  65   |         ReluCPU          |    0    | (28, 28, 128) |                   |                                     |
|  66   |        Conv2DCPU         |  66048  | (28, 28, 512) | (128, 1, 1, 512)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  67   |  BatchNormalizationCPU   |  2048   | (28, 28, 512) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  68   |         ReluCPU          |    0    | (28, 28, 512) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  69   |AdditionBlockCPU (2-path) |         | (28, 28, 512) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  70   |        Conv2DCPU         |  65664  | (28, 28, 128) | (512, 1, 1, 128)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  71   |  BatchNormalizationCPU   |   512   | (28, 28, 128) |                   |                                     |
|  72   |         ReluCPU          |    0    | (28, 28, 128) |                   |                                     |
|  73   |        Conv2DCPU         | 147584  | (28, 28, 128) | (128, 3, 3, 128)  |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  74   |  BatchNormalizationCPU   |   512   | (28, 28, 128) |                   |                                     |
|  75   |         ReluCPU          |    0    | (28, 28, 128) |                   |                                     |
|  76   |        Conv2DCPU         |  66048  | (28, 28, 512) | (128, 1, 1, 512)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  77   |  BatchNormalizationCPU   |  2048   | (28, 28, 512) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  78   |         ReluCPU          |    0    | (28, 28, 512) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  79   |AdditionBlockCPU (2-path) |         |(14, 14, 1024) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  80   |        Conv2DCPU         | 131328  | (28, 28, 256) | (512, 1, 1, 256)  |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  81   |  BatchNormalizationCPU   |  1024   | (28, 28, 256) |                   |                                     |
|  82   |         ReluCPU          |    0    | (28, 28, 256) |                   |                                     |
|  83   |        Conv2DCPU         | 590080  | (14, 14, 256) | (256, 3, 3, 256)  |padd=(1,1), stride=(2,2), dilat=(1,1)|
|  84   |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  85   |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  86   |        Conv2DCPU         | 263168  |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  87   |  BatchNormalizationCPU   |  4096   |(14, 14, 1024) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
|  88   |        Conv2DCPU         | 525312  |(14, 14, 1024) | (512, 1, 1, 1024) |padd=(0,0), stride=(2,2), dilat=(1,1)|
|  89   |  BatchNormalizationCPU   |  4096   |(14, 14, 1024) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  90   |         ReluCPU          |    0    |(14, 14, 1024) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  91   |AdditionBlockCPU (2-path) |         |(14, 14, 1024) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  92   |        Conv2DCPU         | 262400  | (14, 14, 256) | (1024, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  93   |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  94   |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  95   |        Conv2DCPU         | 590080  | (14, 14, 256) | (256, 3, 3, 256)  |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  96   |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  97   |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  98   |        Conv2DCPU         | 263168  |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  99   |  BatchNormalizationCPU   |  4096   |(14, 14, 1024) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  100  |         ReluCPU          |    0    |(14, 14, 1024) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  101  |AdditionBlockCPU (2-path) |         |(14, 14, 1024) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  102  |        Conv2DCPU         | 262400  | (14, 14, 256) | (1024, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  103  |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  104  |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  105  |        Conv2DCPU         | 590080  | (14, 14, 256) | (256, 3, 3, 256)  |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  106  |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  107  |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  108  |        Conv2DCPU         | 263168  |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  109  |  BatchNormalizationCPU   |  4096   |(14, 14, 1024) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  110  |         ReluCPU          |    0    |(14, 14, 1024) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  111  |AdditionBlockCPU (2-path) |         |(14, 14, 1024) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  112  |        Conv2DCPU         | 262400  | (14, 14, 256) | (1024, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  113  |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  114  |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  115  |        Conv2DCPU         | 590080  | (14, 14, 256) | (256, 3, 3, 256)  |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  116  |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  117  |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  118  |        Conv2DCPU         | 263168  |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  119  |  BatchNormalizationCPU   |  4096   |(14, 14, 1024) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  120  |         ReluCPU          |    0    |(14, 14, 1024) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  121  |AdditionBlockCPU (2-path) |         |(14, 14, 1024) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  122  |        Conv2DCPU         | 262400  | (14, 14, 256) | (1024, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  123  |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  124  |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  125  |        Conv2DCPU         | 590080  | (14, 14, 256) | (256, 3, 3, 256)  |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  126  |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  127  |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  128  |        Conv2DCPU         | 263168  |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  129  |  BatchNormalizationCPU   |  4096   |(14, 14, 1024) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  130  |         ReluCPU          |    0    |(14, 14, 1024) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  131  |AdditionBlockCPU (2-path) |         |(14, 14, 1024) |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  132  |        Conv2DCPU         | 262400  | (14, 14, 256) | (1024, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  133  |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  134  |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  135  |        Conv2DCPU         | 590080  | (14, 14, 256) | (256, 3, 3, 256)  |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  136  |  BatchNormalizationCPU   |  1024   | (14, 14, 256) |                   |                                     |
|  137  |         ReluCPU          |    0    | (14, 14, 256) |                   |                                     |
|  138  |        Conv2DCPU         | 263168  |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  139  |  BatchNormalizationCPU   |  4096   |(14, 14, 1024) |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  140  |         ReluCPU          |    0    |(14, 14, 1024) |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  141  |AdditionBlockCPU (2-path) |         | (7, 7, 2048)  |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  142  |        Conv2DCPU         | 524800  | (14, 14, 512) | (1024, 1, 1, 512) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  143  |  BatchNormalizationCPU   |  2048   | (14, 14, 512) |                   |                                     |
|  144  |         ReluCPU          |    0    | (14, 14, 512) |                   |                                     |
|  145  |        Conv2DCPU         | 2359808 |  (7, 7, 512)  | (512, 3, 3, 512)  |padd=(1,1), stride=(2,2), dilat=(1,1)|
|  146  |  BatchNormalizationCPU   |  2048   |  (7, 7, 512)  |                   |                                     |
|  147  |         ReluCPU          |    0    |  (7, 7, 512)  |                   |                                     |
|  148  |        Conv2DCPU         | 1050624 | (7, 7, 2048)  | (512, 1, 1, 2048) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  149  |  BatchNormalizationCPU   |  8192   | (7, 7, 2048)  |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
|  150  |        Conv2DCPU         | 2099200 | (7, 7, 2048)  |(1024, 1, 1, 2048) |padd=(0,0), stride=(2,2), dilat=(1,1)|
|  151  |  BatchNormalizationCPU   |  8192   | (7, 7, 2048)  |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  152  |         ReluCPU          |    0    | (7, 7, 2048)  |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  153  |AdditionBlockCPU (2-path) |         | (7, 7, 2048)  |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  154  |        Conv2DCPU         | 1049088 |  (7, 7, 512)  | (2048, 1, 1, 512) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  155  |  BatchNormalizationCPU   |  2048   |  (7, 7, 512)  |                   |                                     |
|  156  |         ReluCPU          |    0    |  (7, 7, 512)  |                   |                                     |
|  157  |        Conv2DCPU         | 2359808 |  (7, 7, 512)  | (512, 3, 3, 512)  |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  158  |  BatchNormalizationCPU   |  2048   |  (7, 7, 512)  |                   |                                     |
|  159  |         ReluCPU          |    0    |  (7, 7, 512)  |                   |                                     |
|  160  |        Conv2DCPU         | 1050624 | (7, 7, 2048)  | (512, 1, 1, 2048) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  161  |  BatchNormalizationCPU   |  8192   | (7, 7, 2048)  |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  162  |         ReluCPU          |    0    | (7, 7, 2048)  |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  163  |AdditionBlockCPU (2-path) |         | (7, 7, 2048)  |                   |                                     |
|Path 0 |                          |         |               |                   |                                     |
|  164  |        Conv2DCPU         | 1049088 |  (7, 7, 512)  | (2048, 1, 1, 512) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  165  |  BatchNormalizationCPU   |  2048   |  (7, 7, 512)  |                   |                                     |
|  166  |         ReluCPU          |    0    |  (7, 7, 512)  |                   |                                     |
|  167  |        Conv2DCPU         | 2359808 |  (7, 7, 512)  | (512, 3, 3, 512)  |padd=(1,1), stride=(1,1), dilat=(1,1)|
|  168  |  BatchNormalizationCPU   |  2048   |  (7, 7, 512)  |                   |                                     |
|  169  |         ReluCPU          |    0    |  (7, 7, 512)  |                   |                                     |
|  170  |        Conv2DCPU         | 1050624 | (7, 7, 2048)  | (512, 1, 1, 2048) |padd=(0,0), stride=(1,1), dilat=(1,1)|
|  171  |  BatchNormalizationCPU   |  8192   | (7, 7, 2048)  |                   |                                     |
|Path 1 |                          |         |               |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  172  |         ReluCPU          |    0    | (7, 7, 2048)  |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  173  |     AveragePool2DCPU     |    0    | (1, 1, 2048)  |      (7, 7)       |padd=(0,0), stride=(1,1), dilat=(1,1)|
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  174  |        FlattenCPU        |    0    |    (2048,)    |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  175  |          FCCPU           | 2049000 |    (1000,)    |   (2048, 1000)    |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|  176  |        SoftmaxCPU        |    0    |    (1000,)    |                   |                                     |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
|             Total parameters      25636712    97.8 MBytes                                                            |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
**** Loading imagenet dataset...
**** Parameters:
  model_name                     : resnet50v15_imagenet
  batch_size                     : 1
  global_batch_size              : None
  dtype                          : <class 'numpy.float32'>
  num_epochs                     : 1
  steps_per_epoch                : 0
  evaluate_on_train              : False
  evaluate_only                  : True
  weights_and_bias_filename      : utils/resnet50_weights_pydtnn_kernels.npz
  history_file                   : None
  shared_storage                 : False
  enable_fused_bn_relu           : False
  enable_fused_conv_relu         : False
  enable_fused_conv_bn           : False
  enable_fused_conv_bn_relu      : False
  tensor_format                  : NHWC
  enable_best_of                 : False
  dataset_name                   : imagenet
  use_synthetic_data             : False
  dataset_train_path             : datasets/imagenet_test/
  dataset_test_path              : datasets/imagenet_test/
  test_as_validation             : False
  flip_images                    : False
  flip_images_prob               : 0.5
  crop_images                    : False
  crop_images_size               : 16
  crop_images_prob               : 0.5
  validation_split               : 0.0
  optimizer_name                 : sgd
  learning_rate                  : 0.01
  learning_rate_scaling          : True
  momentum                       : 0.9
  decay                          : 0.0
  nesterov                       : False
  beta1                          : 0.99
  beta2                          : 0.999
  epsilon                        : 1e-07
  rho                            : 0.9
  loss_func                      : categorical_cross_entropy
  metrics                        : categorical_accuracy
  lr_schedulers_names            : early_stopping,reduce_lr_on_plateau,model_checkpoint
  warm_up_epochs                 : 5
  early_stopping_metric          : val_categorical_cross_entropy
  early_stopping_patience        : 10
  reduce_lr_on_plateau_metric    : val_categorical_cross_entropy
  reduce_lr_on_plateau_factor    : 0.1
  reduce_lr_on_plateau_patience  : 5
  reduce_lr_on_plateau_min_lr    : 0
  reduce_lr_every_nepochs_factor : 0.1
  reduce_lr_every_nepochs_nepochs: 5
  reduce_lr_every_nepochs_min_lr : 0
  stop_at_loss_metric            : val_accuracy
  stop_at_loss_threshold         : 0
  model_checkpoint_metric        : val_categorical_cross_entropy
  model_checkpoint_save_freq     : 2
  enable_conv_gemm               : False
  enable_memory_cache            : True
  enable_conv_winograd           : False
  mpi_processes                  : 1
  threads_per_process            : 12
  parallel                       : sequential
  non_blocking_mpi               : False
  gpus_per_node                  : 0
  enable_gpu                     : False
  enable_gpudirect               : False
  enable_nccl                    : False
  enable_cudnn_auto_conv_alg     : True
  tracing                        : False
  tracer_output                  : 
  profile                        : False
  cpu_speed                      : 4000000000000.0
  memory_bw                      : 50000000000.0
  network_bw                     : 1000000000.0
  network_lat                    : 5e-07
  network_alg                    : vdg
**** Evaluating on test dataset...
Testing:   0%|                 | 25/39000000 [00:14<5693:59:38,  1.90 samples/s, test_cce: 18.4206810, test_acc:  0.00%]
Traceback (most recent call last):
  File "pydtnn_benchmark.py", line 221, in <module>
    main()
  File "pydtnn_benchmark.py", line 131, in main
    _ = model.evaluate_dataset(dataset, model.batch_size, model.loss_func, metrics_list)
  File "/home/test-pydtnn/PyDTNN/pydtnn/model.py", line 677, in evaluate_dataset
    test_batch_loss = self.__evaluate_batch(x_batch, y_batch, self.batch_size, batch_size,
  File "/home/test-pydtnn/PyDTNN/pydtnn/model.py", line 638, in __evaluate_batch
    x = self.layers[i].forward(x)
  File "/home/test-pydtnn/PyDTNN/pydtnn/backends/cpu/layers/conv_2d_cpu.py", line 207, in _forward_nhwc_i2c
    y = y.reshape(-1, self.ho, self.wo, self.co)
ValueError: cannot reshape array of size 680960 into shape (112,112,64)
Testing:   0%|                 | 25/39000000 [00:15<6518:47:14,  1.66 samples/s, test_cce: 18.4206810, test_acc:  0.00%]

Do you have any idea how to solve this error?
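
For reference, the numbers in the `ValueError` can be checked with plain arithmetic. The sketch below does not call PyDTNN; it only uses the sizes from the traceback and the first `Conv2DCPU` row of the model table (7x7 kernel, padding 3, stride 2). The helper `conv_out` is a hypothetical name for the standard convolution output-size formula, and the 224 x 190 input is an assumption that happens to match the numbers, not something the log confirms.

```python
# Sizes copied from the ValueError; nothing here touches PyDTNN itself.
got = 680960                 # elements the convolution actually produced
ho, wo, co = 112, 112, 64    # output shape expected by the first Conv2DCPU layer

expected = ho * wo * co      # elements needed for one 224x224 sample
positions = got // co        # spatial positions actually produced


def conv_out(size, kernel=7, pad=3, stride=2):
    """Standard conv output size (hypothetical helper, layer-1 parameters)."""
    return (size + 2 * pad - kernel) // stride + 1


print("expected elements:", expected)
print("remainder over channels:", got % co, "positions:", positions)
print("224 ->", conv_out(224), "| 190 ->", conv_out(190))

# ASSUMPTION: a 224 x 190 input is one shape consistent with the error.
print("112 x 95 matches:", conv_out(224) * conv_out(190) == positions)
```

The produced size divides evenly by the 64 output channels, so the channel count looks right and the mismatch seems to be in the spatial dimensions. That would be consistent with the 26th image not having been resized to 224 x 224 before reaching the first layer, although other input shapes would fit the same element count.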
