Support of transposed convolutions through pixel padding #835
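The technique in the PR title can be illustrated outside of FINN: a transposed convolution is equivalent to inserting (stride - 1) zeros between neighbouring input pixels ("pixel padding"), zero-padding by (kernel - 1), and then running an ordinary convolution with the spatially flipped kernel. Below is a minimal single-channel NumPy sketch of that equivalence; the function names are made up for illustration and this is not the FINN/HLS implementation:

```python
import numpy as np

def conv2d_valid(img, k):
    # plain "valid" cross-correlation, single channel
    kh, kw = k.shape
    oh, ow = img.shape[0] - kh + 1, img.shape[1] - kw + 1
    out = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * k)
    return out

def conv_transpose2d(img, k, stride):
    # reference scatter-add definition of a transposed convolution
    kh, kw = k.shape
    h, w = img.shape
    out = np.zeros(((h - 1) * stride + kh, (w - 1) * stride + kw))
    for i in range(h):
        for j in range(w):
            out[i * stride:i * stride + kh,
                j * stride:j * stride + kw] += img[i, j] * k
    return out

def pixel_pad(img, stride):
    # insert (stride - 1) zeros between neighbouring pixels
    h, w = img.shape
    out = np.zeros((h + (h - 1) * (stride - 1),
                    w + (w - 1) * (stride - 1)))
    out[::stride, ::stride] = img
    return out

img = np.arange(9.0).reshape(3, 3)
k = np.arange(4.0).reshape(2, 2)
ref = conv_transpose2d(img, k, stride=2)
# pixel-pad, zero-pad by (kernel - 1) = 1, convolve with the flipped kernel
alt = conv2d_valid(np.pad(pixel_pad(img, 2), 1), k[::-1, ::-1])
assert np.allclose(ref, alt)
```

The zero-inserted image has spatial size idim + (idim - 1) * (stride - 1), which is exactly what the new FMPadding_Pixel op in this PR produces before the downstream convolution.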

Merged Jan 4, 2024 (33 commits)
ff78c24
[PixelPadding] Created custom_op and unit test, which is passing both…
hleblevec May 16, 2023
dff63d3
Updating headers
hleblevec Jun 5, 2023
e49d7a3
Merge branch 'dev' into feature/pixelpadding
hleblevec Jun 5, 2023
c4c6f8b
[Deconvolution] Creating fpgadataflow test with full deconv (pixel pa…
hleblevec Jun 12, 2023
fb09f04
[QONNX conversion] Add handling for ConvTranspose
auphelia Jun 15, 2023
983cdc3
[Tests] Add Brevitas export test for deconv
auphelia Jun 15, 2023
bad0613
Changing test to use ConvTranspose from standard ONNX as a reference …
hleblevec Jun 28, 2023
3db0e2c
Creating (ConvTranspose -> FMPixelPadding + ConvInpGen + MVAU) transf…
hleblevec Jun 28, 2023
eb5f89e
Swapping weights tensors dimension to fit usual transposed conv repr…
hleblevec Jun 28, 2023
73037cb
[PixelPaddingDeconv] Updating test to use the InferPixelPaddingDeconv…
hleblevec Jul 19, 2023
5942839
Merge remote-tracking branch 'upstream/dev' into feature/pixelpadding
auphelia Aug 2, 2023
a4c15df
[Linting] Run pre-commit on files
auphelia Aug 2, 2023
c024c39
[MoveScalarMulPastConvTranspose] Creating new streamlining transforma…
Aug 3, 2023
7486217
Updating QONNX and FINN-HLSIB repo versions to match dependencies
hleblevec Aug 17, 2023
9a411ec
Updating QONNX and FINN-HLSIB repo versions to match dependencies
hleblevec Aug 17, 2023
b368ee0
[InferPixelPaddingDeconv] Adding safety check to verify that groups=1…
hleblevec Aug 25, 2023
bdd8d4b
Merging dev
hleblevec Sep 18, 2023
457400b
Updating InferPixelPaddingDeconv to just do the lowering and not the …
hleblevec Oct 5, 2023
aba05c5
Updating QONNX commit hash
hleblevec Oct 5, 2023
75212fc
Merge remote-tracking branch 'upstream/dev' into feature/pixelpadding
auphelia Oct 11, 2023
d454d60
Rewriting the test script to account for the change in InferConvInpGe…
hleblevec Oct 13, 2023
892c73e
Merge branch 'feature/pixelpadding' of https://github.com/hleblevec/f…
hleblevec Oct 20, 2023
5a5f078
Correcting typo and removing unnecessary RTL option from InferPixelPad…
hleblevec Oct 20, 2023
e8ea7dd
Merge remote-tracking branch 'upstream/dev' into feature/pixelpadding
auphelia Nov 8, 2023
f86c16b
[Tests] Fix Deconv Brevitas export test
auphelia Nov 8, 2023
6c51d5e
Update Brevitas commit hash to a version that contains ESPCN code
hleblevec Dec 19, 2023
514e9b9
Merge upstream dev and resolve merge conflict
auphelia Jan 2, 2024
86b883a
[Tests] Fix copyright header in test case
auphelia Jan 2, 2024
d8fbfd9
[Tests] Change assertion to pytest skip in brevitas export test
auphelia Jan 2, 2024
2c2f387
[pixelpad] Omit redundant code and reduce test cases
auphelia Jan 2, 2024
c8b80df
[Tests] Refactoring of deconv test
auphelia Jan 3, 2024
c741fae
[PixelPadding] Add batchsize for expected cycles calc
auphelia Jan 3, 2024
14929f4
[Deconv] Update test and add comments to transformation
auphelia Jan 3, 2024
6 changes: 3 additions & 3 deletions fetch-repos.sh
@@ -27,12 +27,12 @@
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

-QONNX_COMMIT="04e24583fb5c1895744801480db3ced8a5b6a914"
+QONNX_COMMIT="47e4357faf66b5b0d1bf77bf908bb47752421e5b"
FINN_EXP_COMMIT="de99347e936d51715f5356a1b6c64e37b91c23c2"
-BREVITAS_COMMIT="9bb26bf2798de210a267d1e4aed4c20087e0e8a5"
+BREVITAS_COMMIT="84f42259ec869eb151af4cb8a8b23ad925f493db"
PYVERILATOR_COMMIT="766e457465f5c0dd315490d7b9cc5d74f9a76f4f"
CNPY_COMMIT="4e8810b1a8637695171ed346ce68f6984e585ef4"
-HLSLIB_COMMIT="c17aa478ae574971d115afa9fa4d9c215857d1ac"
+HLSLIB_COMMIT="16e5847a5e3ef76cffe84c8fad2f010d593457d3"
OMX_COMMIT="0b59762f9e4c4f7e5aa535ee9bc29f292434ca7a"
AVNET_BDF_COMMIT="2d49cfc25766f07792c0b314489f21fe916b639b"
XIL_BDF_COMMIT="8cf4bb674a919ac34e3d99d8d71a9e60af93d14e"
2 changes: 2 additions & 0 deletions src/finn/custom_op/fpgadataflow/__init__.py
@@ -43,6 +43,7 @@
from finn.custom_op.fpgadataflow.duplicatestreams_batch import DuplicateStreams_Batch
from finn.custom_op.fpgadataflow.eltwise import StreamingEltwise
from finn.custom_op.fpgadataflow.fmpadding_batch import FMPadding_Batch
+from finn.custom_op.fpgadataflow.fmpadding_pixel import FMPadding_Pixel
from finn.custom_op.fpgadataflow.fmpadding_rtl import FMPadding_rtl
from finn.custom_op.fpgadataflow.globalaccpool_batch import GlobalAccPool_Batch
from finn.custom_op.fpgadataflow.iodma import IODMA
@@ -83,6 +84,7 @@
custom_op["GlobalAccPool_Batch"] = GlobalAccPool_Batch
custom_op["Pool_Batch"] = Pool_Batch
custom_op["FMPadding_Batch"] = FMPadding_Batch
+custom_op["FMPadding_Pixel"] = FMPadding_Pixel
custom_op["Thresholding_Batch"] = Thresholding_Batch
custom_op["AddStreams_Batch"] = AddStreams_Batch
custom_op["LabelSelect_Batch"] = LabelSelect_Batch
335 changes: 335 additions & 0 deletions src/finn/custom_op/fpgadataflow/fmpadding_pixel.py
@@ -0,0 +1,335 @@
# Copyright (c) 2023, Advanced Micro Devices, Inc.
# All rights reserved.
#
# Redistribution and use in source and binary forms, with or without
# modification, are permitted provided that the following conditions are met:
#
# * Redistributions of source code must retain the above copyright notice, this
# list of conditions and the following disclaimer.
#
# * Redistributions in binary form must reproduce the above copyright notice,
# this list of conditions and the following disclaimer in the documentation
# and/or other materials provided with the distribution.
#
# * Neither the name of Xilinx nor the names of its
# contributors may be used to endorse or promote products derived from
# this software without specific prior written permission.
#
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
# AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
# IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
# DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
# FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
# DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
# SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
# CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
# OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.


import numpy as np
import os
import warnings
from qonnx.core.datatype import DataType

from finn.custom_op.fpgadataflow.hlscustomop import HLSCustomOp
from finn.util.data_packing import npy_to_rtlsim_input, rtlsim_output_to_npy


class FMPadding_Pixel(HLSCustomOp):
def __init__(self, onnx_node, **kwargs):
super().__init__(onnx_node, **kwargs)

def get_nodeattr_types(self):
my_attrs = {
# spatial size of input images
"ImgDim": ("ints", True, []),
# stride to apply, can be non-square
"Stride": ("ints", True, []),
# number of channels in input image
"NumChannels": ("i", True, 0),
# SIMD Input parallelism
"SIMD": ("i", False, 1),
# FINN input datatype
"inputDataType": ("s", True, ""),
# shape describing input vecs per execution
"numInputVectors": ("i", False, 1),
}
my_attrs.update(super().get_nodeattr_types())
return my_attrs

def get_padded_odim(self):
"Return the padded spatial size of the output."
idim_h, idim_w = self.get_nodeattr("ImgDim")
stride_h, stride_w = self.get_nodeattr("Stride")
odim_h = idim_h + (idim_h - 1) * (stride_h - 1)
odim_w = idim_w + (idim_w - 1) * (stride_w - 1)
return [odim_h, odim_w]

def get_exp_cycles(self):
odim_h, odim_w = self.get_padded_odim()
channels = self.get_nodeattr("NumChannels")
simd = self.get_nodeattr("SIMD")
batch_size = self.get_nodeattr("numInputVectors")
exp_cycles = (channels / simd) * batch_size * odim_h * odim_w
return int(exp_cycles)

def get_normal_input_shape(self, ind=0):
idim_h, idim_w = self.get_nodeattr("ImgDim")
num_ch = self.get_nodeattr("NumChannels")
ishape = (1, idim_h, idim_w, num_ch)
return ishape

def get_normal_output_shape(self, ind=0):
odim_h, odim_w = self.get_padded_odim()
num_ch = self.get_nodeattr("NumChannels")
oshape = (1, odim_h, odim_w, num_ch)
return oshape

def get_folded_input_shape(self, ind=0):
normal_ishape = list(self.get_normal_input_shape())
ifm_ch = self.get_nodeattr("NumChannels")
simd = self.get_nodeattr("SIMD")
assert ifm_ch % simd == 0, "SIMD must divide input channels"
fold = int(normal_ishape[-1] / simd)
folded_ishape = normal_ishape[:-1] + [fold, simd]
return tuple(folded_ishape)

def get_folded_output_shape(self, ind=0):
normal_oshape = list(self.get_normal_output_shape())
ifm_ch = self.get_nodeattr("NumChannels")
simd = self.get_nodeattr("SIMD")
assert ifm_ch % simd == 0, "SIMD must divide input channels"
fold = int(normal_oshape[-1] / simd)
folded_oshape = normal_oshape[:-1] + [fold, simd]
return tuple(folded_oshape)

def make_shape_compatible_op(self, model):
exp_ishape = self.get_normal_input_shape()
oshape = self.get_normal_output_shape()
ishape = tuple(model.get_tensor_shape(self.onnx_node.input[0]))
        assert ishape == exp_ishape, "Unexpected input shape for FMPadding_Pixel."
return super().make_const_shape_op(oshape)

def infer_node_datatype(self, model):
node = self.onnx_node
idt = model.get_tensor_datatype(node.input[0])
if idt != self.get_input_datatype():
warn_str = "inputDataType changing for %s: %s -> %s " % (
node.name,
str(self.get_input_datatype()),
str(idt),
)
warnings.warn(warn_str)
self.set_nodeattr("inputDataType", idt.name)
model.set_tensor_datatype(node.output[0], idt)

def verify_node(self):
pass

def get_input_datatype(self, ind=0):
"""Returns FINN DataType of input."""
ret = DataType[self.get_nodeattr("inputDataType")]
# the hlslib op always pads with zeros, so ensure that the DataType
# is able to represent zeros
assert ret.allowed(0), "FMPadding_Pixel DataType must support zero"
return ret

def get_output_datatype(self, ind=0):
"""Returns FINN DataType of output. (Same as input datatype)"""
return self.get_input_datatype()

def get_instream_width(self, ind=0):
ibits = self.get_input_datatype().bitwidth()
simd = self.get_nodeattr("SIMD")
return ibits * simd

def get_outstream_width(self, ind=0):
obits = self.get_output_datatype().bitwidth()
simd = self.get_nodeattr("SIMD")
return obits * simd

def get_number_output_values(self):
folded_oshape = self.get_folded_output_shape()
return np.prod(folded_oshape[:-1])

def global_includes(self):
self.code_gen_dict["$GLOBALS$"] = ['#include "streamtools.h"']

def defines(self, var):
odim_h, odim_w = self.get_padded_odim()
stride_h, stride_w = self.get_nodeattr("Stride")
self.code_gen_dict["$DEFINES$"] = [
"""
#define OutputDim_x {}\n
#define OutputDim_y {}\n
#define Stride_x {}\n
#define Stride_y {}\n
#define NumChannels {}\n
#define SIMD {}\n
""".format(
odim_w,
odim_h,
stride_w,
stride_h,
self.get_nodeattr("NumChannels"),
self.get_nodeattr("SIMD"),
)
]

def read_npy_data(self):
code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
dtype = self.get_input_datatype()
if dtype == DataType["BIPOLAR"]:
# use binary for bipolar storage
dtype = DataType["BINARY"]
elem_bits = dtype.bitwidth()
packed_bits = self.get_instream_width()
packed_hls_type = "ap_uint<%d>" % packed_bits
elem_hls_type = dtype.get_hls_datatype_str()
npy_type = "float"
npy_in = "%s/input_0.npy" % code_gen_dir
self.code_gen_dict["$READNPYDATA$"] = []
self.code_gen_dict["$READNPYDATA$"].append(
'npy2apintstream<%s, %s, %d, %s>("%s", in0);'
% (packed_hls_type, elem_hls_type, elem_bits, npy_type, npy_in)
)

def strm_decl(self):
self.code_gen_dict["$STREAMDECLARATIONS$"] = []
self.code_gen_dict["$STREAMDECLARATIONS$"].append(
'hls::stream<ap_uint<{}>> in0 ("in0");'.format(self.get_instream_width())
)
self.code_gen_dict["$STREAMDECLARATIONS$"].append(
'hls::stream<ap_uint<{}>> out ("out");'.format(self.get_outstream_width())
)

def docompute(self):
in_t = self.get_input_datatype().get_hls_datatype_str()
odim_h, odim_w = self.get_padded_odim()
stride_h, stride_w = self.get_nodeattr("Stride")
hls_call = "FMPadding_Pixel_Nonsquare"
self.code_gen_dict["$DOCOMPUTE$"] = [
"""{}<OutputDim_x, OutputDim_y, Stride_x, Stride_y, NumChannels,
SIMD, {}> (in0, out);""".format(
hls_call, in_t
)
]

def dataoutstrm(self):
code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
dtype = self.get_output_datatype()
if dtype == DataType["BIPOLAR"]:
# use binary for bipolar storage
dtype = DataType["BINARY"]
elem_bits = dtype.bitwidth()
packed_bits = self.get_outstream_width()
packed_hls_type = "ap_uint<%d>" % packed_bits
elem_hls_type = dtype.get_hls_datatype_str()
npy_type = "float"
npy_out = "%s/output.npy" % code_gen_dir
oshape = self.get_folded_output_shape()
oshape_cpp_str = str(oshape).replace("(", "{").replace(")", "}")

self.code_gen_dict["$DATAOUTSTREAM$"] = [
'apintstream2npy<%s, %s, %d, %s>(out, %s, "%s");'
% (
packed_hls_type,
elem_hls_type,
elem_bits,
npy_type,
oshape_cpp_str,
npy_out,
)
]

def save_as_npy(self):
self.code_gen_dict["$SAVEASCNPY$"] = []

def blackboxfunction(self):
packed_bits = self.get_instream_width()
packed_hls_type = "ap_uint<%d>" % packed_bits
self.code_gen_dict["$BLACKBOXFUNCTION$"] = [
"void %s(hls::stream<%s > &in0, hls::stream<%s > &out)"
% (self.onnx_node.name, packed_hls_type, packed_hls_type)
]

def pragmas(self):
self.code_gen_dict["$PRAGMAS$"] = [
"#pragma HLS INTERFACE axis port=in0 name=in0_" + self.hls_sname()
]
self.code_gen_dict["$PRAGMAS$"].append(
"#pragma HLS INTERFACE axis port=out name=out_" + self.hls_sname()
)
self.code_gen_dict["$PRAGMAS$"].append("#pragma HLS INTERFACE ap_ctrl_none port=return")

def execute_node(self, context, graph):
mode = self.get_nodeattr("exec_mode")
node = self.onnx_node
exp_ishape = self.get_normal_input_shape()
exp_oshape = self.get_normal_output_shape()
folded_ishape = self.get_folded_input_shape()

if mode == "cppsim":
code_gen_dir = self.get_nodeattr("code_gen_dir_cppsim")
elif mode == "rtlsim":
code_gen_dir = self.get_nodeattr("code_gen_dir_ipgen")
else:
raise Exception(
"""Invalid value for attribute exec_mode! Is currently set to: {}
            has to be set to one of the following values ("cppsim", "rtlsim")""".format(
mode
)
)

inp = context[node.input[0]]
assert str(inp.dtype) == "float32", "Input datatype is not float32"
assert (
inp.shape == exp_ishape
), """Input shape doesn't
match expected shape (1, ImgDim_h, ImgDim_w, NumChannels)."""
export_idt = self.get_input_datatype()

reshaped_input = inp.reshape(folded_ishape)
np.save(os.path.join(code_gen_dir, "input_0.npy"), reshaped_input)

if mode == "cppsim":
# execute the precompiled model
super().exec_precompiled_singlenode_model()
# load output npy file
super().npy_to_dynamic_output(context)
assert (
context[node.output[0]].shape == exp_oshape
), "cppsim did not produce expected output shape"
elif mode == "rtlsim":
sim = self.get_rtlsim()
nbits = self.get_instream_width()
rtlsim_inp = npy_to_rtlsim_input(
"{}/input_0.npy".format(code_gen_dir), export_idt, nbits
)
super().reset_rtlsim(sim)
super().toggle_clk(sim)
rtlsim_output = self.rtlsim(sim, rtlsim_inp)
odt = export_idt
target_bits = odt.bitwidth()
packed_bits = self.get_outstream_width()
out_npy_path = "{}/output.npy".format(code_gen_dir)
out_shape = self.get_folded_output_shape()
rtlsim_output_to_npy(
rtlsim_output, out_npy_path, odt, out_shape, packed_bits, target_bits
)
# load and reshape output
output = np.load(out_npy_path)
output = np.asarray([output], dtype=np.float32).reshape(*exp_oshape)
context[node.output[0]] = output
else:
raise Exception(
"""Invalid value for attribute exec_mode! Is currently set to: {}
            has to be set to one of the following values ("cppsim", "rtlsim")""".format(
mode
)
)
assert (
context[node.output[0]].shape == exp_oshape
), """Output shape doesn't match expected shape
(1, OutputDim_H, OutputDim_W, NumChannels)."""
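The shape and cycle-count formulas in fmpadding_pixel.py above (get_padded_odim and get_exp_cycles) can be sanity-checked with a standalone sketch; the parameter values below are illustrative, not taken from the PR:

```python
def padded_odim(idim, stride):
    # zero-insertion output size, as in get_padded_odim:
    # idim + (idim - 1) * (stride - 1)
    return idim + (idim - 1) * (stride - 1)

def exp_cycles(idim, stride, channels, simd, batch=1):
    # one folded word per cycle, channels folded by SIMD,
    # mirroring get_exp_cycles
    odim_h = padded_odim(idim[0], stride[0])
    odim_w = padded_odim(idim[1], stride[1])
    return int((channels / simd) * batch * odim_h * odim_w)

# a 4x4 input with stride 2 becomes 7x7 after pixel padding
assert padded_odim(4, 2) == 7
# 8 channels at SIMD=2 -> 4 folds per output pixel, 4 * 7 * 7 = 196 cycles
assert exp_cycles((4, 4), (2, 2), channels=8, simd=2) == 196
```

Note that SIMD must divide NumChannels for the folding to be exact, which is what the asserts in get_folded_input_shape and get_folded_output_shape enforce.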