UpResNet10 Model Conversion #2

Open
tcyrus opened this issue Oct 18, 2020 · 6 comments

@tcyrus
Owner

tcyrus commented Oct 18, 2020

When trying to convert the UpResNet10 model to CoreML, I get an error similar to the one from the CUnet model conversion.

Unfortunately, I didn't log the output at the time.

@tcyrus
Owner Author

tcyrus commented Oct 18, 2020

I was able to convert the UpResNet10 models for Chainer (model source) to ONNX.

The Chainer models look slightly different from the Caffe models when viewed in Netron.

  1. Model is Horizontally Mirrored
    This is probably ok, but it's confusing

  2. Unusual Input and Output Labels
    Again, probably ok, but it's still confusing

  3. ReLU is sometimes replaced with LeakyReLU
    This is because Caffe uses the same layer type (ReLU) for both ReLU and LeakyReLU, distinguished only by the layer's negative_slope parameter (docs)

  4. Split layers are merged with the input layer.
    This should be fine as the Split doesn't have any specific attributes (docs)

  5. CropCenter is replaced with Slice
    This mainly confuses me because I don't know the difference between the two

  6. Modified step
    This is probably the trickiest part to explain since I know next to nothing about ML.
    Based on my limited understanding, parts of the UpResNet10 model can be broken up into smaller groups of layers, which I'm going to call "steps" since I don't know the real name for them. These steps probably have different weight values attached to them, but they're structurally identical.

    Step for Caffe Model:
    [image: caffe_upres10_step]

    Step for Chainer Model:
    [image: chainer_upres10_step]

    This can be broken down into smaller differences:
    a. Pooling is replaced with ReduceMean (😱)
    This is really confusing for me since I believe the proper operation should be GlobalAveragePool
    b. InnerProduct is replaced with Gemm
    This is a documented conversion for CoreML (link) if that helps
    c. Axpy is replaced with a combination of Unsqueeze, Expand, Mul, and Add (😱)

  7. Eltwise is replaced with Add at the end
    I found some code doing this conversion here
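
For item 3 above, a minimal NumPy sketch of how a single Caffe-style ReLU definition covers both activations. The function name `caffe_relu` and the slope value 0.1 are illustrative assumptions, not values read from the UpResNet10 models.

```python
import numpy as np

def caffe_relu(x, negative_slope=0.0):
    """Caffe-style ReLU: y = max(x, 0) + negative_slope * min(x, 0)."""
    return np.maximum(x, 0.0) + negative_slope * np.minimum(x, 0.0)

x = np.array([-2.0, -0.5, 0.0, 1.5])

# negative_slope == 0 behaves like an ordinary ReLU.
plain = caffe_relu(x)

# A non-zero negative_slope behaves like LeakyReLU with alpha = negative_slope.
# The value 0.1 is only an example.
leaky = caffe_relu(x, negative_slope=0.1)
```

If this reading is right, an exporter that sees a non-zero slope would reasonably emit LeakyReLU, which would explain why the two graphs show different op names for the same layer type.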

@tcyrus
Owner Author

tcyrus commented Oct 18, 2020

The steps to resolution depend on the reason for the differences:

  • If the conversion is one-to-one, then ignore it
  • If the conversion is suboptimal or has different behavior (ReduceMean), then fix it
  • If the conversion is many-to-one (Axpy), make note of it as a potential ONNX Backend Optimisation

@tcyrus
Owner Author

tcyrus commented Oct 18, 2020

Found confirmation that the correct conversion is to GlobalAveragePool (link)

I need to correct the model by editing the ONNX file once I have confirmed the input / output dimensions.
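
A small NumPy sketch of why the input / output dimensions matter here. It assumes the exported ReduceMean reduces over the two spatial axes (2 and 3) of an NCHW tensor, which the issue does not state. The helper names are hypothetical.

```python
import numpy as np

def reduce_mean_hw(x, keepdims=True):
    """Mean over the spatial axes of an NCHW tensor (ReduceMean with axes 2 and 3)."""
    return x.mean(axis=(2, 3), keepdims=keepdims)

def global_average_pool(x):
    """GlobalAveragePool semantics: average each channel, output shape (N, C, 1, 1)."""
    n, c, h, w = x.shape
    return x.reshape(n, c, h * w).mean(axis=2).reshape(n, c, 1, 1)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 3, 4, 5))

# With keepdims=True the two operations agree in both values and shape.
same_values = np.allclose(reduce_mean_hw(x), global_average_pool(x))

# With keepdims=False the spatial axes are dropped, so the shape is (N, C)
# and a direct swap to GlobalAveragePool would change downstream shapes.
squeezed_shape = reduce_mean_hw(x, keepdims=False).shape
```

Under that assumption the values are identical, so a swap would only be safe as-is when the ReduceMean node keeps the reduced dimensions.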

@tcyrus
Owner Author

tcyrus commented Oct 21, 2020

After looking into this in more detail, Mul + Add seems to be the proper replacement for Axpy (link). This could also be the reason for the mirroring of the model.

I am still unclear on the reason behind Unsqueeze + Expand, but this could be related to the input or output shape of Gemm.
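
One possible explanation, sketched in NumPy. It assumes Axpy computes `a * x + y` with one scale per channel, and that the Gemm output `a` has shape (N, C) while `x` and `y` are (N, C, H, W). Neither shape is confirmed in this issue, and the function names are hypothetical.

```python
import numpy as np

def axpy(a, x, y):
    """Assumed Axpy semantics: per-channel scale a times x, plus y."""
    return a[:, :, None, None] * x + y

def axpy_decomposed(a, x, y):
    """The same computation written as Unsqueeze, Expand, Mul, Add."""
    a4 = a[:, :, np.newaxis, np.newaxis]   # Unsqueeze: (N, C) -> (N, C, 1, 1)
    a_full = np.broadcast_to(a4, x.shape)  # Expand: (N, C, 1, 1) -> (N, C, H, W)
    scaled = np.multiply(a_full, x)        # Mul
    return np.add(scaled, y)               # Add

rng = np.random.default_rng(0)
a = rng.standard_normal((2, 3))
x = rng.standard_normal((2, 3, 4, 5))
y = rng.standard_normal((2, 3, 4, 5))

out_ref = axpy(a, x, y)
out_dec = axpy_decomposed(a, x, y)
```

If the shapes are as assumed, Unsqueeze + Expand would only be lifting the 2-D Gemm output to the 4-D shape of the feature map before the multiply.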

@tcyrus
Owner Author

tcyrus commented Jan 15, 2021

I took a break from this for a while. I'll probably try passing the resulting model through onnxsim and/or onnxoptimizer to see whether they simplify the graph by replacing redundant operations.

@tcyrus
Owner Author

tcyrus commented Jan 17, 2021

If I'm reading this commit correctly, the current UpResNet10 models for waifu2x-caffe are generated using the models from waifu2x-chainer. Based on this knowledge (also, I don't understand any of this), I have decided to use the resulting model from the conversion.
I'll close this issue once I upload the models, but I do not recommend using them unless you know what you're doing.

tcyrus added a commit that referenced this issue Jan 17, 2021
More info can be found in Issue #2
tcyrus self-assigned this Jan 17, 2021