18 Jan 02:21

tprimak

v0.21.3

4f5d024

v0.21.3

This is a patch release containing following changes to v0.21.2:

Reduced the upper-bound of memory requirement for gemm-based convolution to reduce the probability of OOM error (cd99749)
Significantly reduced the size required for 1x1 convolution (5643445)
Added new dummy stream (cba5823)

Assets 2

17 Jan 22:50

tprimak

v1.1.3

e1fe0eb

v1.1.3

This is a patch release containing following changes to v1.1.2:

Fixed the mean and variance memory descriptors in layer normalization (65f1908)
Fixed the layer normalization formula (c176ceb)

Assets 10

13 Jan 18:01

anita-intel

v1.2-rc

4ea278b

v1.2-rc Pre-release

Pre-release

This is a release candidate for DNNL v1.2. Please provide feedback and report bugs in Github issues.

Assets 2

24 Dec 21:06

tprimak

v1.1.2

cb2cc7a

v1.1.2

This is a patch release containing following changes to v1.1.1:

Fixed threading over the spatial in bfloat16 batched normalization (017b6c9)
Fixed read past end-of-buffer error for int8 convolution (7d6f45e)
Fixed condition for dispatching optimized channel blocking in fp32 backward convolution on Intel Xeon Phi(TM) processor (846eba1)
Fixed fp32 backward convolution for shapes with spatial strides over the depth dimension (002e3ab)
Fixed softmax with zero sizes on GPU (936bff4)
Fixed int8 deconvolution with dilation when ih <= dh (3e3bacb)
Enabled back fp32 -> u8 reorder for RNN (a2c2507)
Fixed segmentation fault in bfloat16 backward convolution from kd_padding=0 computation (52d476c)
Fixed segmentation fault in bfloat16 forward convolution due to push/pop imbalance (4f6e3d5)
Fixed library version for OS X build (0d85005)
Fixed padding by channels in concat (a265c7d)
Added full text of third party licenses and copyright notices to LICENSE file (79f204c)
Added separate README for binary packages (28f4c96)
Fixed computing per-oc mask in RNN (ff3ffab)
Added workaround for number of cores calculation in Xbyak (301b088)

Assets 10

11 Dec 21:46

tprimak

v2.0-beta03

9ff7335

v2.0-beta03 Pre-release

Pre-release

This is a preview release for oneDNN v2.0. The release is based on oneDNN v1.1 and the release notes below include incremental changes.

Binary distribution of this software is available as Intel(R) oneAPI Deep Neural Network Library in Intel(R) oneAPI.

New functionality

SYCL API extensions and interoperability with SYCL code
Support for Intel DPC++ compiler and runtime

Usability

SYCL interoperability examples

Known Limitations

Some f32/f16 convolutions with non-square spatial shape of filters may produce incorrect results on GPU.
Some bf16 backward convolutions with 3D spatial and negative padding may produce segfault on CPU.
Non-Intel GPUs are not supported. The library API allows to create a DNNL engine by index (the order of devices is determined by the SYCL runtime), and there is no check for GPU devices being non-Intel. To have more control, users can create a DNNL engine passing SYCL device and context explicitly.
RNN primitive may hang on GPU if the number of recurrent cells is bigger than 40.
int8 RNN may produce incorrect results on GPU.
Backward propagation of Layer Normalization primitive produces incorrect results.
Intel Processor Graphics Gen11 is not supported.
When running GPU kernels that take longer than a certain time (it depends on OS and system settings) you may face a situation resulting in apparent hang of the application. Configure driver to disable this timeout and avoid hanging of DPC++ or OpenCL programs, including DNNL examples.

On Linux:

$ sudo bash -c 'echo N > /sys/module/i915/parameters/enable_hangcheck'

On Windows increase TdrDelay and TdrDdiDelay values using registry.

Assets 12

24 Oct 22:42

vpirogov

v1.0.4

a0a87d6

v1.0.4

This is a patch release containing following changes to v1.0.3:

Resolved int8 batch normalization performance degradation in comparison to v0.21 (ec19118)

Assets 10

22 Oct 19:58

vpirogov

v1.1.1

52c3052

v1.1.1

This is a patch release containing following changes to v1.1:

Fixed zero padding for memory formats with rank 3 and below (f97e174)
Fixed 'deprecated std::copy' warning with Microsoft C++ Compiler (ee276af)
Fixed tail scaling for int8 inner product (f2b68c7)
Fixed correctness issue for int8 GEMM with N=1 (0dd5c13)
Sum does not override the data type for destination memory descriptor when used with any (5301981)
Addressed following corner cases in CPU convolution implementation:
- Fixed tail processing in int8 depthwise convolution (7711b77)
- Fixed bias padding in bfloat16 depthwise convolution (0696ba6)
- Fixed correctness issue in s8s8 flavor of depthwise convolution (b614482)
- Fixed correctness issue in dilated convolution weight gradient implementation (c6ec0f9)

Assets 10

22 Oct 19:57

vpirogov

v1.0.3

0221032

v1.0.3

This is a patch release containing following changes to v1.0.2:

Fixed zero padding for memory formats with rank 3 and below (4d78aaf)
Fixed tail scaling for int8 inner product (41b5a7e)
Sum does not override the data type for destination memory descriptor when used with any (e979eda)
Improved s8s8 GEMM and inner product performance (4b44aa5)
Reduced memory consumption of GEMM-based algorithm for convolution weight gradient (f46b044)
Fixed negative padding processing in pooling (48ba96a)
Addressed memory leak in GPU deconvolution (686fc41)
Addressed memory leak in GPU stream (1206b2f)
Fixed fp16 GEMM correctness on GPU (c2425d4)
Fixed GEMM correctness on GPU for the case of small M dimension (ac2683f)
Addressed following corner cases in CPU convolution implementation:
- Fixed tail processing in int8 depthwise convolution (3a0943b)
- Fixed bias padding in bfloat16 depthwise convolution (3d9af7c)
- Fixed correctness issue in s8s8 flavor of depthwise convolution (e4d9049)
- Fixed correctness issue in GEMM-based algorithm for 3D convolutions (161ac40)
- Fixed corner case issues in Intel AVX512 implementation of convolution weight gradient (68f5124)
- Disabled not supported cases for depthwise convolution weight gradient (5e6e6c8)
- Convolution with 1x1 filter returns unimplemented for cases that have padding in spatial dimensions (9d7cc77)
- Fixed negative padding support in general convolution kernel (b1c602a)
- Fixed padding handling in depthwise convolution backpropagation (04712f6)
- Added support for negative padding in h and d spatial dimensions (7ddce82)
- Fixed segfault in strided convolution backpropagation (b04f3f5)
- Fixed memory corruption in convolution backpropagation (8877bc9)

Assets 10

09 Oct 20:29

vpirogov

v0.20.6

2a2dd8a

v0.20.6

This is a patch release containing following changes to v0.20.5:

Fixed performance regression in GEMM (cfc5c3d)

Assets 2

08 Oct 19:42

tprimak

v0.21.2

e356595

v0.21.2

This is a patch release containing following changes to v0.21.1:

Fixed performance regression in GEMM (9534621)
Fixed int8 dilated convolution for some shapes with input heights <= dilation over the heights dimension (e68f151)
Addressed static initialization order issue in bf16 converters (ae8efde)
Fixed fast reference backward convolution dispatching for 3D-spatial case (5994d63)

Assets 2

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

New functionality

Usability

Known Limitations

Releases: oneapi-src/oneDNN

v0.21.3

v1.1.3

v1.2-rc

v1.1.2

v2.0-beta03

New functionality

Usability

Known Limitations

v1.0.4

v1.1.1

v1.0.3

v0.20.6

v0.21.2