This is a network for the handwritten Japanese text recognition scenario. It consists of a VGG16-like backbone, a reshape layer, and a fully connected layer. The network can recognize Japanese text composed of the characters found in the Kondate and Nakayosi datasets.
Metric | Value |
---|---|
GFlops | 117.136 |
MParams | 15.31 |
Accuracy on the Kondate test set and a test set generated from Nakayosi | 98.16% |
Source framework | PyTorch* |
This demo adopts label error rate as the metric for accuracy.
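
The formula behind the metric is not spelled out here. As a rough sketch, assuming the common definition of label error rate (total character-level edit distance divided by total reference length), it could be computed as follows. The function names `edit_distance` and `label_error_rate` are hypothetical and not part of this model's tooling.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def label_error_rate(references, hypotheses):
    """Assumed definition: summed edit distance over summed reference length."""
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_length = sum(len(r) for r in references)
    return total_edits / total_length
```

Under this assumed definition, one wrong character in a four-character reference gives a label error rate of 0.25.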
Shape: [1x1x96x2000] - An input image in the format [BxCxHxW], where:
- B - batch size
- C - number of channels
- H - image height
- W - image width
Note that the source image should be converted to grayscale, resized to a specific height (such as 96) while keeping the aspect ratio, normalized to [-1, 1], and padded on the right and bottom.
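
The preprocessing steps in the note above can be sketched with NumPy alone. This is a minimal illustration, not the reference pipeline. The function name `preprocess`, the nearest-neighbour resize, the 8-bit input range, and the `pad_value` default are all assumptions, and a real pipeline would likely use a proper image library for resizing.

```python
import numpy as np


def preprocess(gray, target_h=96, target_w=2000, pad_value=0.0):
    """Turn a 2-D uint8 grayscale image into a [1, 1, target_h, target_w] blob."""
    h, w = gray.shape
    # Keep the aspect ratio; shrink further if the width would not fit.
    scale = min(target_h / h, target_w / w)
    new_h = max(1, min(target_h, round(h * scale)))
    new_w = max(1, min(target_w, round(w * scale)))

    # Nearest-neighbour resize (assumption: chosen only to avoid extra dependencies).
    rows = np.minimum((np.arange(new_h) * h / new_h).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) * w / new_w).astype(int), w - 1)
    resized = gray[rows][:, cols].astype(np.float32)

    # Map [0, 255] to [-1, 1] (assumption: 8-bit input).
    normalized = resized / 127.5 - 1.0

    # Pad on the right and bottom; the fill value is an assumption.
    canvas = np.full((target_h, target_w), pad_value, dtype=np.float32)
    canvas[:new_h, :new_w] = normalized

    # Add batch and channel axes: [H, W] -> [B, C, H, W].
    return canvas[np.newaxis, np.newaxis]
```

For a 48x100 input, the image is scaled by 2 to 96x200 and the remaining 1800 columns on the right are filled with `pad_value`.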
The net outputs a blob with the shape [186, 1, 1161] in the format [WxBxL], where:
- W - output sequence length
- B - batch size
- L - confidence distribution across the supported symbols in Kondate and Nakayosi.
The network output can be decoded with a CTC greedy decoder.
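
As an illustration, greedy CTC decoding of a [WxBxL] output takes the most likely symbol at each time step, collapses consecutive repeats, and drops the blank symbol. The sketch below is a minimal version under stated assumptions: the function name `ctc_greedy_decode` is hypothetical, the blank index defaults to 0, and `charset` is a list whose positions match the model's symbol indices. Check the actual symbol table shipped with the model before relying on either assumption.

```python
import numpy as np


def ctc_greedy_decode(output, charset, blank=0):
    """Decode a [W, B, L] array of per-step symbol scores into B strings."""
    best = np.argmax(output, axis=2)  # [W, B]: best symbol index per time step
    texts = []
    for b in range(best.shape[1]):
        chars = []
        prev = None
        for idx in best[:, b]:
            # Emit a symbol only when it differs from the previous step
            # and is not the blank (assumption: blank index is `blank`).
            if idx != prev and idx != blank:
                chars.append(charset[idx])
            prev = idx
        texts.append("".join(chars))
    return texts
```

With a hypothetical three-symbol charset `["<blank>", "a", "b"]`, the per-step indices `1, 1, 0, 1, 2, 2, 0` decode to `"aab"`: the repeated `1` collapses to one `a`, and the blank between the two `a` runs keeps them as separate characters.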
[*] Other names and brands may be claimed as the property of others.