Hi,
I am trying to reproduce the performance measurements from the article "Performance–energy trade-offs of deep learning convolution algorithms on ARM processors".
However, when I try to run pydtnn_benchmark with the following options:
python3 -Ou pydtnn_benchmark.py \
    --model=resnet50v15_imagenet \
    --dataset=imagenet \
    --dataset_train_path=datasets/imagenet_test/ \
    --dataset_test_path=datasets/imagenet_test/ \
    --weights_and_bias_filename=utils/resnet50_weights_pydtnn_kernels.npz \
    --evaluate_only=True \
    --test_as_validation=False \
    --batch_size=1
I get the following output:
**** resnet50v15_imagenet model...
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| Layer | Type | #Params | Output shape | Weights shape | Parameters |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 0 | InputCPU | 0 | (224, 224, 3) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 1 | Conv2DCPU | 9472 |(112, 112, 64) | (3, 7, 7, 64) |padd=(3,3), stride=(2,2), dilat=(1,1)|
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 2 | BatchNormalizationCPU | 256 |(112, 112, 64) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 3 | ReluCPU | 0 |(112, 112, 64) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 4 | MaxPool2DCPU | 0 | (56, 56, 64) | (3, 3) |padd=(1,1), stride=(2,2), dilat=(1,1)|
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 5 |AdditionBlockCPU (2-path) | | (56, 56, 256) | | |
|Path 0 | | | | | |
| 6 | Conv2DCPU | 4160 | (56, 56, 64) | (64, 1, 1, 64) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 7 | BatchNormalizationCPU | 256 | (56, 56, 64) | | |
| 8 | ReluCPU | 0 | (56, 56, 64) | | |
| 9 | Conv2DCPU | 36928 | (56, 56, 64) | (64, 3, 3, 64) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 10 | BatchNormalizationCPU | 256 | (56, 56, 64) | | |
| 11 | ReluCPU | 0 | (56, 56, 64) | | |
| 12 | Conv2DCPU | 16640 | (56, 56, 256) | (64, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 13 | BatchNormalizationCPU | 1024 | (56, 56, 256) | | |
|Path 1 | | | | | |
| 14 | Conv2DCPU | 16640 | (56, 56, 256) | (64, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 15 | BatchNormalizationCPU | 1024 | (56, 56, 256) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 16 | ReluCPU | 0 | (56, 56, 256) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 17 |AdditionBlockCPU (2-path) | | (56, 56, 256) | | |
|Path 0 | | | | | |
| 18 | Conv2DCPU | 16448 | (56, 56, 64) | (256, 1, 1, 64) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 19 | BatchNormalizationCPU | 256 | (56, 56, 64) | | |
| 20 | ReluCPU | 0 | (56, 56, 64) | | |
| 21 | Conv2DCPU | 36928 | (56, 56, 64) | (64, 3, 3, 64) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 22 | BatchNormalizationCPU | 256 | (56, 56, 64) | | |
| 23 | ReluCPU | 0 | (56, 56, 64) | | |
| 24 | Conv2DCPU | 16640 | (56, 56, 256) | (64, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 25 | BatchNormalizationCPU | 1024 | (56, 56, 256) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 26 | ReluCPU | 0 | (56, 56, 256) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 27 |AdditionBlockCPU (2-path) | | (56, 56, 256) | | |
|Path 0 | | | | | |
| 28 | Conv2DCPU | 16448 | (56, 56, 64) | (256, 1, 1, 64) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 29 | BatchNormalizationCPU | 256 | (56, 56, 64) | | |
| 30 | ReluCPU | 0 | (56, 56, 64) | | |
| 31 | Conv2DCPU | 36928 | (56, 56, 64) | (64, 3, 3, 64) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 32 | BatchNormalizationCPU | 256 | (56, 56, 64) | | |
| 33 | ReluCPU | 0 | (56, 56, 64) | | |
| 34 | Conv2DCPU | 16640 | (56, 56, 256) | (64, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 35 | BatchNormalizationCPU | 1024 | (56, 56, 256) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 36 | ReluCPU | 0 | (56, 56, 256) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 37 |AdditionBlockCPU (2-path) | | (28, 28, 512) | | |
|Path 0 | | | | | |
| 38 | Conv2DCPU | 32896 | (56, 56, 128) | (256, 1, 1, 128) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 39 | BatchNormalizationCPU | 512 | (56, 56, 128) | | |
| 40 | ReluCPU | 0 | (56, 56, 128) | | |
| 41 | Conv2DCPU | 147584 | (28, 28, 128) | (128, 3, 3, 128) |padd=(1,1), stride=(2,2), dilat=(1,1)|
| 42 | BatchNormalizationCPU | 512 | (28, 28, 128) | | |
| 43 | ReluCPU | 0 | (28, 28, 128) | | |
| 44 | Conv2DCPU | 66048 | (28, 28, 512) | (128, 1, 1, 512) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 45 | BatchNormalizationCPU | 2048 | (28, 28, 512) | | |
|Path 1 | | | | | |
| 46 | Conv2DCPU | 131584 | (28, 28, 512) | (256, 1, 1, 512) |padd=(0,0), stride=(2,2), dilat=(1,1)|
| 47 | BatchNormalizationCPU | 2048 | (28, 28, 512) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 48 | ReluCPU | 0 | (28, 28, 512) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 49 |AdditionBlockCPU (2-path) | | (28, 28, 512) | | |
|Path 0 | | | | | |
| 50 | Conv2DCPU | 65664 | (28, 28, 128) | (512, 1, 1, 128) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 51 | BatchNormalizationCPU | 512 | (28, 28, 128) | | |
| 52 | ReluCPU | 0 | (28, 28, 128) | | |
| 53 | Conv2DCPU | 147584 | (28, 28, 128) | (128, 3, 3, 128) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 54 | BatchNormalizationCPU | 512 | (28, 28, 128) | | |
| 55 | ReluCPU | 0 | (28, 28, 128) | | |
| 56 | Conv2DCPU | 66048 | (28, 28, 512) | (128, 1, 1, 512) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 57 | BatchNormalizationCPU | 2048 | (28, 28, 512) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 58 | ReluCPU | 0 | (28, 28, 512) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 59 |AdditionBlockCPU (2-path) | | (28, 28, 512) | | |
|Path 0 | | | | | |
| 60 | Conv2DCPU | 65664 | (28, 28, 128) | (512, 1, 1, 128) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 61 | BatchNormalizationCPU | 512 | (28, 28, 128) | | |
| 62 | ReluCPU | 0 | (28, 28, 128) | | |
| 63 | Conv2DCPU | 147584 | (28, 28, 128) | (128, 3, 3, 128) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 64 | BatchNormalizationCPU | 512 | (28, 28, 128) | | |
| 65 | ReluCPU | 0 | (28, 28, 128) | | |
| 66 | Conv2DCPU | 66048 | (28, 28, 512) | (128, 1, 1, 512) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 67 | BatchNormalizationCPU | 2048 | (28, 28, 512) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 68 | ReluCPU | 0 | (28, 28, 512) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 69 |AdditionBlockCPU (2-path) | | (28, 28, 512) | | |
|Path 0 | | | | | |
| 70 | Conv2DCPU | 65664 | (28, 28, 128) | (512, 1, 1, 128) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 71 | BatchNormalizationCPU | 512 | (28, 28, 128) | | |
| 72 | ReluCPU | 0 | (28, 28, 128) | | |
| 73 | Conv2DCPU | 147584 | (28, 28, 128) | (128, 3, 3, 128) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 74 | BatchNormalizationCPU | 512 | (28, 28, 128) | | |
| 75 | ReluCPU | 0 | (28, 28, 128) | | |
| 76 | Conv2DCPU | 66048 | (28, 28, 512) | (128, 1, 1, 512) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 77 | BatchNormalizationCPU | 2048 | (28, 28, 512) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 78 | ReluCPU | 0 | (28, 28, 512) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 79 |AdditionBlockCPU (2-path) | |(14, 14, 1024) | | |
|Path 0 | | | | | |
| 80 | Conv2DCPU | 131328 | (28, 28, 256) | (512, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 81 | BatchNormalizationCPU | 1024 | (28, 28, 256) | | |
| 82 | ReluCPU | 0 | (28, 28, 256) | | |
| 83 | Conv2DCPU | 590080 | (14, 14, 256) | (256, 3, 3, 256) |padd=(1,1), stride=(2,2), dilat=(1,1)|
| 84 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 85 | ReluCPU | 0 | (14, 14, 256) | | |
| 86 | Conv2DCPU | 263168 |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 87 | BatchNormalizationCPU | 4096 |(14, 14, 1024) | | |
|Path 1 | | | | | |
| 88 | Conv2DCPU | 525312 |(14, 14, 1024) | (512, 1, 1, 1024) |padd=(0,0), stride=(2,2), dilat=(1,1)|
| 89 | BatchNormalizationCPU | 4096 |(14, 14, 1024) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 90 | ReluCPU | 0 |(14, 14, 1024) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 91 |AdditionBlockCPU (2-path) | |(14, 14, 1024) | | |
|Path 0 | | | | | |
| 92 | Conv2DCPU | 262400 | (14, 14, 256) | (1024, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 93 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 94 | ReluCPU | 0 | (14, 14, 256) | | |
| 95 | Conv2DCPU | 590080 | (14, 14, 256) | (256, 3, 3, 256) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 96 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 97 | ReluCPU | 0 | (14, 14, 256) | | |
| 98 | Conv2DCPU | 263168 |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 99 | BatchNormalizationCPU | 4096 |(14, 14, 1024) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 100 | ReluCPU | 0 |(14, 14, 1024) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 101 |AdditionBlockCPU (2-path) | |(14, 14, 1024) | | |
|Path 0 | | | | | |
| 102 | Conv2DCPU | 262400 | (14, 14, 256) | (1024, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 103 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 104 | ReluCPU | 0 | (14, 14, 256) | | |
| 105 | Conv2DCPU | 590080 | (14, 14, 256) | (256, 3, 3, 256) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 106 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 107 | ReluCPU | 0 | (14, 14, 256) | | |
| 108 | Conv2DCPU | 263168 |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 109 | BatchNormalizationCPU | 4096 |(14, 14, 1024) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 110 | ReluCPU | 0 |(14, 14, 1024) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 111 |AdditionBlockCPU (2-path) | |(14, 14, 1024) | | |
|Path 0 | | | | | |
| 112 | Conv2DCPU | 262400 | (14, 14, 256) | (1024, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 113 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 114 | ReluCPU | 0 | (14, 14, 256) | | |
| 115 | Conv2DCPU | 590080 | (14, 14, 256) | (256, 3, 3, 256) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 116 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 117 | ReluCPU | 0 | (14, 14, 256) | | |
| 118 | Conv2DCPU | 263168 |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 119 | BatchNormalizationCPU | 4096 |(14, 14, 1024) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 120 | ReluCPU | 0 |(14, 14, 1024) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 121 |AdditionBlockCPU (2-path) | |(14, 14, 1024) | | |
|Path 0 | | | | | |
| 122 | Conv2DCPU | 262400 | (14, 14, 256) | (1024, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 123 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 124 | ReluCPU | 0 | (14, 14, 256) | | |
| 125 | Conv2DCPU | 590080 | (14, 14, 256) | (256, 3, 3, 256) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 126 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 127 | ReluCPU | 0 | (14, 14, 256) | | |
| 128 | Conv2DCPU | 263168 |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 129 | BatchNormalizationCPU | 4096 |(14, 14, 1024) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 130 | ReluCPU | 0 |(14, 14, 1024) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 131 |AdditionBlockCPU (2-path) | |(14, 14, 1024) | | |
|Path 0 | | | | | |
| 132 | Conv2DCPU | 262400 | (14, 14, 256) | (1024, 1, 1, 256) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 133 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 134 | ReluCPU | 0 | (14, 14, 256) | | |
| 135 | Conv2DCPU | 590080 | (14, 14, 256) | (256, 3, 3, 256) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 136 | BatchNormalizationCPU | 1024 | (14, 14, 256) | | |
| 137 | ReluCPU | 0 | (14, 14, 256) | | |
| 138 | Conv2DCPU | 263168 |(14, 14, 1024) | (256, 1, 1, 1024) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 139 | BatchNormalizationCPU | 4096 |(14, 14, 1024) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 140 | ReluCPU | 0 |(14, 14, 1024) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 141 |AdditionBlockCPU (2-path) | | (7, 7, 2048) | | |
|Path 0 | | | | | |
| 142 | Conv2DCPU | 524800 | (14, 14, 512) | (1024, 1, 1, 512) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 143 | BatchNormalizationCPU | 2048 | (14, 14, 512) | | |
| 144 | ReluCPU | 0 | (14, 14, 512) | | |
| 145 | Conv2DCPU | 2359808 | (7, 7, 512) | (512, 3, 3, 512) |padd=(1,1), stride=(2,2), dilat=(1,1)|
| 146 | BatchNormalizationCPU | 2048 | (7, 7, 512) | | |
| 147 | ReluCPU | 0 | (7, 7, 512) | | |
| 148 | Conv2DCPU | 1050624 | (7, 7, 2048) | (512, 1, 1, 2048) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 149 | BatchNormalizationCPU | 8192 | (7, 7, 2048) | | |
|Path 1 | | | | | |
| 150 | Conv2DCPU | 2099200 | (7, 7, 2048) |(1024, 1, 1, 2048) |padd=(0,0), stride=(2,2), dilat=(1,1)|
| 151 | BatchNormalizationCPU | 8192 | (7, 7, 2048) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 152 | ReluCPU | 0 | (7, 7, 2048) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 153 |AdditionBlockCPU (2-path) | | (7, 7, 2048) | | |
|Path 0 | | | | | |
| 154 | Conv2DCPU | 1049088 | (7, 7, 512) | (2048, 1, 1, 512) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 155 | BatchNormalizationCPU | 2048 | (7, 7, 512) | | |
| 156 | ReluCPU | 0 | (7, 7, 512) | | |
| 157 | Conv2DCPU | 2359808 | (7, 7, 512) | (512, 3, 3, 512) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 158 | BatchNormalizationCPU | 2048 | (7, 7, 512) | | |
| 159 | ReluCPU | 0 | (7, 7, 512) | | |
| 160 | Conv2DCPU | 1050624 | (7, 7, 2048) | (512, 1, 1, 2048) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 161 | BatchNormalizationCPU | 8192 | (7, 7, 2048) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 162 | ReluCPU | 0 | (7, 7, 2048) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 163 |AdditionBlockCPU (2-path) | | (7, 7, 2048) | | |
|Path 0 | | | | | |
| 164 | Conv2DCPU | 1049088 | (7, 7, 512) | (2048, 1, 1, 512) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 165 | BatchNormalizationCPU | 2048 | (7, 7, 512) | | |
| 166 | ReluCPU | 0 | (7, 7, 512) | | |
| 167 | Conv2DCPU | 2359808 | (7, 7, 512) | (512, 3, 3, 512) |padd=(1,1), stride=(1,1), dilat=(1,1)|
| 168 | BatchNormalizationCPU | 2048 | (7, 7, 512) | | |
| 169 | ReluCPU | 0 | (7, 7, 512) | | |
| 170 | Conv2DCPU | 1050624 | (7, 7, 2048) | (512, 1, 1, 2048) |padd=(0,0), stride=(1,1), dilat=(1,1)|
| 171 | BatchNormalizationCPU | 8192 | (7, 7, 2048) | | |
|Path 1 | | | | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 172 | ReluCPU | 0 | (7, 7, 2048) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 173 | AveragePool2DCPU | 0 | (1, 1, 2048) | (7, 7) |padd=(0,0), stride=(1,1), dilat=(1,1)|
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 174 | FlattenCPU | 0 | (2048,) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 175 | FCCPU | 2049000 | (1000,) | (2048, 1000) | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| 176 | SoftmaxCPU | 0 | (1000,) | | |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
| Total parameters 25636712 97.8 MBytes |
+-------+--------------------------+---------+---------------+-------------------+-------------------------------------+
**** Loading imagenet dataset...
**** Parameters:
  model_name : resnet50v15_imagenet
  batch_size : 1
  global_batch_size : None
  dtype : <class 'numpy.float32'>
  num_epochs : 1
  steps_per_epoch : 0
  evaluate_on_train : False
  evaluate_only : True
  weights_and_bias_filename : utils/resnet50_weights_pydtnn_kernels.npz
  history_file : None
  shared_storage : False
  enable_fused_bn_relu : False
  enable_fused_conv_relu : False
  enable_fused_conv_bn : False
  enable_fused_conv_bn_relu : False
  tensor_format : NHWC
  enable_best_of : False
  dataset_name : imagenet
  use_synthetic_data : False
  dataset_train_path : datasets/imagenet_test/
  dataset_test_path : datasets/imagenet_test/
  test_as_validation : False
  flip_images : False
  flip_images_prob : 0.5
  crop_images : False
  crop_images_size : 16
  crop_images_prob : 0.5
  validation_split : 0.0
  optimizer_name : sgd
  learning_rate : 0.01
  learning_rate_scaling : True
  momentum : 0.9
  decay : 0.0
  nesterov : False
  beta1 : 0.99
  beta2 : 0.999
  epsilon : 1e-07
  rho : 0.9
  loss_func : categorical_cross_entropy
  metrics : categorical_accuracy
  lr_schedulers_names : early_stopping,reduce_lr_on_plateau,model_checkpoint
  warm_up_epochs : 5
  early_stopping_metric : val_categorical_cross_entropy
  early_stopping_patience : 10
  reduce_lr_on_plateau_metric : val_categorical_cross_entropy
  reduce_lr_on_plateau_factor : 0.1
  reduce_lr_on_plateau_patience : 5
  reduce_lr_on_plateau_min_lr : 0
  reduce_lr_every_nepochs_factor : 0.1
  reduce_lr_every_nepochs_nepochs: 5
  reduce_lr_every_nepochs_min_lr : 0
  stop_at_loss_metric : val_accuracy
  stop_at_loss_threshold : 0
  model_checkpoint_metric : val_categorical_cross_entropy
  model_checkpoint_save_freq : 2
  enable_conv_gemm : False
  enable_memory_cache : True
  enable_conv_winograd : False
  mpi_processes : 1
  threads_per_process : 12
  parallel : sequential
  non_blocking_mpi : False
  gpus_per_node : 0
  enable_gpu : False
  enable_gpudirect : False
  enable_nccl : False
  enable_cudnn_auto_conv_alg : True
  tracing : False
  tracer_output :
  profile : False
  cpu_speed : 4000000000000.0
  memory_bw : 50000000000.0
  network_bw : 1000000000.0
  network_lat : 5e-07
  network_alg : vdg
**** Evaluating on test dataset...
Testing: 0%| | 25/39000000 [00:14<5693:59:38, 1.90 samples/s, test_cce: 18.4206810, test_acc: 0.00%]
Traceback (most recent call last):
  File "pydtnn_benchmark.py", line 221, in <module>
    main()
  File "pydtnn_benchmark.py", line 131, in main
    _ = model.evaluate_dataset(dataset, model.batch_size, model.loss_func, metrics_list)
  File "/home/test-pydtnn/PyDTNN/pydtnn/model.py", line 677, in evaluate_dataset
    test_batch_loss = self.__evaluate_batch(x_batch, y_batch, self.batch_size, batch_size,
  File "/home/test-pydtnn/PyDTNN/pydtnn/model.py", line 638, in __evaluate_batch
    x = self.layers[i].forward(x)
  File "/home/test-pydtnn/PyDTNN/pydtnn/backends/cpu/layers/conv_2d_cpu.py", line 207, in _forward_nhwc_i2c
    y = y.reshape(-1, self.ho, self.wo, self.co)
ValueError: cannot reshape array of size 680960 into shape (112,112,64)
Testing: 0%| | 25/39000000 [00:15<6518:47:14, 1.66 samples/s, test_cce: 18.4206810, test_acc: 0.00%]
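For reference, here is a quick sanity check of the numbers in the ValueError. It is only a sketch: it assumes the standard convolution output-size formula applies to layer 1 (7x7 kernel, padd=(3,3), stride=(2,2), 64 filters, as printed in the model summary), and that exactly one spatial dimension of the offending sample differs from 224. The helper `conv_out` and the variable names are mine, not part of PyDTNN, and other factorizations of the size are possible.

```python
def conv_out(size, kernel, pad, stride, dilation=1):
    """Standard output-size formula for one spatial dimension of a convolution."""
    return (size + 2 * pad - dilation * (kernel - 1) - 1) // stride + 1

# Layer 1 as printed in the model summary above (hypothetical constant names).
KERNEL, PAD, STRIDE, CO = 7, 3, 2, 64

# What the layer expects for a 224-pixel side.
expected = conv_out(224, KERNEL, PAD, STRIDE)

# What the reshape actually received (size taken from the ValueError, batch_size=1).
actual_size = 680960
positions = actual_size // CO      # number of output positions (ho * wo)
other_side = positions // expected # the other side, if one side is still `expected`

# Input sizes that would produce `other_side` along one dimension.
candidates = [s for s in range(1, 513)
              if conv_out(s, KERNEL, PAD, STRIDE) == other_side]

print(expected, positions, other_side, candidates)
```

Under those assumptions, the failing batch looks like a sample whose height or width is not 224, but I have not confirmed this against the dataset files.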
Do you have any idea how to solve this error?