Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EnableSignal, useful for making Instances compatible with EnableInserter #285

Open
Fatsie opened this issue Dec 25, 2019 · 21 comments
Open
Labels
Milestone

Comments

@Fatsie
Copy link
Contributor

Fatsie commented Dec 25, 2019

I am making wrappers around existing RTL code using Instance. These block have a clock input. This structure seems to not be compatible with EnableInserter. This issue is to see if there is a way to make it compatibl or alternatively at least have a warning or error when using EnableInserter.

Reduced example code is this:

#!/bin/env python3
from nmigen import (
    Signal, Elaboratable, Module, EnableInserter, ClockSignal, Instance
)
from nmigen.build import Platform
from nmigen.back.verilog import convert

class dff(Elaboratable):
    def __init__(self):
        self.d = Signal()
        self.q = Signal()

    def elaborate(self, platform):
        m = Module()

        m.submodules.inst = Instance(
            "DFF",
            i_clk=ClockSignal(), i_d=self.d, o_q=self.q,
        )

        return m

class dffce(Elaboratable):
    def __init__(self):
        self.ce = Signal()
        self._dff = _dff = dff()
        self.d = _dff.d
        self.q = _dff.q

    def elaborate(self, platform):
        m = Module()

        m.submodules.dff = EnableInserter(self.ce)(self._dff)

        return m


top = dffce()
top_code = convert(top, ports=[top.ce, top.d, top.q])
with open("top.v", "w") as f:
    f.write(top_code)

Giving the output:

/* Generated by Yosys 0.9+932 (git sha1 ff8529a, gcc 4.8.5 -fPIC -Os) */

(* generator = "nMigen" *)
(* \nmigen.hierarchy  = "top.dff" *)
module dff(d, q, clk);
  (* src = "/home/verhaegs/anaconda2/envs/nmigen/lib/python3.7/site-packages/nmigen/hdl/ir.py:538" *)
  input clk;
  (* src = "./ceinserter.py:10" *)
  input d;
  (* src = "./ceinserter.py:11" *)
  output q;
  DFF inst (
    .clk(clk),
    .d(d),
    .q(q)
  );
endmodule

(* generator = "nMigen" *)
(* top =  1  *)
(* \nmigen.hierarchy  = "top" *)
module top(d, q, clk, rst, ce);
  (* src = "./ceinserter.py:25" *)
  input ce;
  (* src = "/home/verhaegs/anaconda2/envs/nmigen/lib/python3.7/site-packages/nmigen/hdl/ir.py:538" *)
  input clk;
  (* src = "./ceinserter.py:10" *)
  input d;
  (* src = "./ceinserter.py:11" *)
  output q;
  (* src = "/home/verhaegs/anaconda2/envs/nmigen/lib/python3.7/site-packages/nmigen/hdl/ir.py:538" *)
  input rst;
  dff dff (
    .clk(clk),
    .d(d),
    .q(q)
  );

The ce signal is not used.

@whitequark
Copy link
Contributor

Unfortunately it is not possible to write an equivalent of EnableInserter without it either (a) being able to understand the inner structure of the modules it mutates, or (b) inserting a vendor-specific clock gating primitive.

For (a), consider that it would have to know to replace a DFF with a DFFE and hook up the E input, or perhaps have to know that a DFF is a storage element and wire up a mux to the input. For (b), you could certainly do that (by injecting the dependency on the primitive through platform), but I find it very unlikely that you'll be happy with the output netlist, due to timing, routing, etc issues.

@whitequark
Copy link
Contributor

Of course, if you are asking for an ability to customize the action of EnableInserter on your own Elaboratable, that's definitely technically feasible. It is fairly niche though and requires a fair bit of design work.

@Fatsie
Copy link
Contributor Author

Fatsie commented Dec 25, 2019

I do think the (b) option to have platform specific clock gating primitives is a good solution. One can then also error when the platform does not provide this feature and a domain ClockSignal() is used inside fragments from EnableInserter.
It's then the task of the platform to handle the ugliness; for ASICs I know clock gating is a standard function handled by the (proprietary) clock tree synthesis tools and static timing analysis. Support from yosys is likely needed to handle this properly.
In the mean time it would be good if user could be informed when support is lacking. It should not be silently ignored as is done in my example.

@whitequark
Copy link
Contributor

I do think the (b) option to have platform specific clock gating primitives is a good solution.

I'm not convinced here at all. I think it's worth exploring, but I see many ways this can potentially go wrong, and in general, EnableInserter's usefulness primarily stems from its fully synchronous nature.

In the mean time it would be good if user could be informed when support is lacking. It should not be silently ignored as is done in my example.

I agree that emitting a diagnostic when EnableInserter encounters an Instance could work. Unfortunately the same cannot be said about ResetInserter; there is no option but to keep it silently skip any Instances, or you couldn't use Instances at all.

@Fatsie
Copy link
Contributor Author

Fatsie commented Dec 25, 2019

Couldn't you change ResetSignal() to get proper output taking into account ResetInserter() ?
Of course this likely assumes synchronous reset as otherwise the reset logic has to be glitch free.

@Fatsie
Copy link
Contributor Author

Fatsie commented Dec 25, 2019

And maybe also a EnableSignal() can be introduced next to ClockSignal() and it is then the task of the code that does use Instance to properly handle the EnableSignal().

@whitequark
Copy link
Contributor

it is then the task of the code that does use Instance to properly handle the EnableSignal().

Ah I see, so your complaint isn't that the code that uses Instances, by default, doesn't behave correctly in presence of EnableInserter. The complaint is that currently it's not possible to make it behave correctly.

Yes, I think changing the semantics of ResetSignal and adding EnableSignal is reasonable. Though care needs to be taken to make sure that won't have unintentional consequences.

@Fatsie
Copy link
Contributor Author

Fatsie commented Dec 26, 2019

Ah I see, so your complaint isn't that the code that uses Instances, by default, doesn't behave correctly in presence of EnableInserter.

What I want is that if a nMigen Elaboratable subobject uses Instance inside it still supports same features as pure nMigen code, including use of EnableInserter on the object. I prefer solution that does this by default but I can live with a solution where user of Instance has to include extra boilerplate to make is compatible with EnableInserter.
But in the latter case if the class that uses Instance does not include the extra boilerplate code people who use EnableInserter on the class may be confused and only find why it does not do what one wants to do after long debugging time.

@whitequark
Copy link
Contributor

I prefer solution that does this by default

nMigen has no idea which ports of the instance are clocks, resets, or enables, nor does it have any idea that it should e.g. turn DFFs into DFFEs. I don't understand how this feature can work without any additional effort spent while defining an Instance.

@Fatsie
Copy link
Contributor Author

Fatsie commented Dec 26, 2019

nMigen does know that output of ClockSignal() is a clock though but from technical point it is easier to have extra EnableSignal() than trying to handle Enable from ClockSignal().

@whitequark
Copy link
Contributor

Sure, I'm fine with adding EnableSignal in principle, even though it's a bit of a strange addition. The problem can be better illustrated with a reset. You're concerned about a module using Instance not being reset with the clock domain. That's fair; but if you have (for example) an Instance of DFF, you simply cannot get it to reset. You have to use a DFFSR and wire the ResetSignal manually to it, which I would consider "extra boilerplate". I don't know of a solution that avoids this boilerplate.

@Fatsie
Copy link
Contributor Author

Fatsie commented Dec 26, 2019

To me the DFF ignoring ResetInserter is similar to a Signal(reset_less=True). So signals ignoring reset is something people using nMigen are familiar with. This is not the case for synchronous code ignoring EnableInserter.

@whitequark
Copy link
Contributor

I see. Do I understand it correctly that you're suggesting that EnableInserter look for EnableSignal(), and if it can't find one, insert a clock gating primitive?

@Fatsie
Copy link
Contributor Author

Fatsie commented Dec 26, 2019

I would only introduce EnableSignal() if you don't want to introduce a clock gating primitive, otherwise I would always have ClockSignal() insert a clock gating primitive if it is in a block on which EnableInserter() is called.
If technically feasible I would like it to be handled by the platform and not by the one using Instance.

@whitequark
Copy link
Contributor

whitequark commented Dec 26, 2019

There are a few issues here.

First, many FPGAs do not have a clock gating primitive. What should iCE40 do here, for example? There is no vendor-sanctioned way to gate a clock, and even worse, I don't think either the FOSS tools or the proprietary tools would be able to cope with the timing consequences of you abusing a LUT as one, even if you managed to find a truth table where it would not glitch.

Second, FPGAs that do have clock gating primitives typically have severe restrictions on their use (e.g.: you only get one per chip and nMigen already uses it by default, or there are placement restrictions, or there are timing restrictions, or they aren't glitchless, etc). So effectively this auto-gating feature for ClockSignal will be ASIC-only, since I cannot imagine any practical FPGA design where I would actually want to use it.

Third, I don't think your suggestion:

I would always have ClockSignal() insert a clock gating primitive if it is in a block on which EnableInserter() is called.

is viable even on ASICs. Consider that not every input of an Instance feeds synchronous logic inside that instance. It could be as well a combinatorial input--say the instance is for a clock mux, or simply an output buffer feeding a pad. There is no way to tell by looking at the Instance which inputs should be gated and which should not.

@Fatsie
Copy link
Contributor Author

Fatsie commented Dec 26, 2019

First, many FPGAs do not have a clock gating primitive. What should iCE40 do here, for example? There is no vendor-sanctioned way to gate a clock, and even worse, I don't think either the FOSS tools or the proprietary tools would provide meaningful results if you abused a LUT as one, even if you managed to find a truth table for that LUT where it would not glitch.

Second, FPGAs that do have clock gating primitives typically have severe restrictions on their use (e.g.: you only get one per chip and nMigen already uses it by default, or there are placement restrictions, or there are timing restrictions, or they aren't glitchless, etc). So effectively this auto-gating feature for ClockSignal will be ASIC-only, since I cannot imagine any practical FPGA design where I would actually want to use it.

I do think yosys needs to be extended to handle the clock gating primitive and the primitive does not necessarily have to translate directly to a real physical cell on the FPGA. I'm not familiar with ice40 primitives but if the registers have an enable input the clock gating primitive could translate to an extra condition on the enable signals on all registers connected to the output of the clock gate. Worst case the enable has to be implemented as a mux on the input as you now do for the nMigen handling of EnableInserter.
In the mean time until this sEven fupport is there an option could be provided by the platform to implement clock gate primitives as pass-through; but then at least the implementer knows.

I would always have ClockSignal() insert a clock gating primitive if it is in a block on which EnableInserter() is called.

is viable even on ASICs. Consider that not every input of an Instance feeds synchronous logic inside that instance. It could be as well a combinatorial input--say the instance is for a clock mux, or simply an output buffer feeding a pad. There is no way to tell by looking at the Instance which inputs should be gated and which should not.

How is this different from EnableInserter not interfering with combinatorial signals in nMigen either ?
For the clock mux example one of the inputs will now come from the clock gate primitive and not be the original clock signal; nothing has to change on the select input. Clock muxes are already in the clock so clock tree synthesis needs to handle them properly in conjunction with clock gates if that is present in the design.

@whitequark
Copy link
Contributor

I do think yosys needs to be extended to handle the clock gating primitive and the primitive does not necessarily have to translate directly to a real physical cell on the FPGA

Remember that nMigen does not only target Yosys. When used with non-FOSS toolchains, you cannot emit this primitive, since it'll just be an unknown Verilog cell that e.g. Lattice iCECube will fail on.

In the future I would also like to target other IRs like FIRRTL (since Yosys RTLIL does not have a strict specification with a strong compatibility guarantee), which introduces the same issue.

In the mean time until this support is there an option could be provided by the platform to implement clock gate primitives as pass-through; but then at least the implementer knows.

I am completely opposed to adding a core primitive that will be a no-op on all currently supported FPGAs and therefore introduce a sim/synth mismatch.

I am also strongly opposed to adding any core primitive that requires platform support at all (i.e. does not translate to portable synthesizable Verilog), and you have not shown why this case is so important that it requires doing so. I do recognize that it might unfortunately be necessary in some other circumstances, though.

How is this different from EnableInserter not interfering with combinatorial signals in nMigen either ?

The existing EnableInserter implementation looks into the internal structure of a Module. You cannot look into the internal structure of an Instance or you wouldn't need one in first place. If there was some way to mark an input of an Instance as "this feeds only clocks in specific domain" then it could add gating on that input, but there is no such way.

Clock muxes are already in the clock so clock tree synthesis needs to handle them properly

nMigen does not currently have a lot of special handling for clocks themselves; other than when adding logic to a clock domain, ClockSignal and ResetSignal are just late-bound signals. It does not currently have any clock tree handling at all.

@Fatsie
Copy link
Contributor Author

Fatsie commented Dec 28, 2019

and you have not shown why this case is so important that it requires doing so.

I have wrapped some retro cores like motorola 68000, Z80, mos6502 implemented in VHDL & Verilog in nMigen. Plan is to distribute them through pypi and conda. For users they should behave as other nmigen block including EnableInserter functionality.
Some of the blocks provide Wishbone interface and have synchronous nMigen statements on the same clock domain as the wrapped cores. This will now malfunction if EnableInserter is applied on it and I don't see a way to even warn or error when that happens.
In the end, if one wants to do low power designs Yosys, or any other tool chain will need to provide specific support for clock gating. Supporting this will be platform dependent so if it is not formalized in nMigen platform infrastructure + underlying toolchain I'll need to do it in a separate library that is used in all places where Instance is used on asynchronous blocks. Not my preferred solution but something I'll have to live with when needed.

@whitequark
Copy link
Contributor

whitequark commented Dec 28, 2019

For users they should behave as other nmigen block including EnableInserter functionality.

But you don't need a core primitive that automatically introduces clock gating to make such instances work with EnableInserter. In fact, even if such a core primitive was added, it is not the case that these cores would transparently work with EnableInserter: on every FPGA currently supported, you would have to undo the effect of automatic clock gating, because it is either not synthesizable at all, or synthesizable in a likely undesirable way.

Adding a EnableSignal primitive would make such cores work with EnableInserter. Both EnableSignal and automatic clock gating require manual work, but the former requires less manual work, so it is clearly preferable.

(That's leaving aside my view that it is impossible to robustly implement your suggestion as described because there is no way to determine which inputs should be gated and which should not.)

Supporting this will be platform dependent so if it is not formalized in nMigen platform infrastructure + underlying toolchain I'll need to do it in a separate library that is used in all places where Instance is used on asynchronous blocks.

I think nMigen should certainly have support for clock gating, similar to how it already has support for e.g. DDR I/O. I do not think this support should be a part of the core language; rather, it should be a part of the platform support.

@whitequark whitequark changed the title Making Instance with clock compatible with EnableInserter Add EnableSignal, useful for making Instances compatible with EnableInserter Dec 28, 2019
@whitequark
Copy link
Contributor

I think adding EnableSignal is reasonable and I will do so. A good approach for implementing it would be to ditch fragment transformation completely as a concept; instead of mutating the fragment AST, Inserters and Renamers should rather mutate the fragment environment, where the domain names and various control signals are bound. This would also speed up conversion to RTLIL by up to a factor of 5.

@whitequark whitequark added this to the 0.2 milestone Jan 3, 2020
@whitequark
Copy link
Contributor

@Fatsie By the way, I especially like your suggestion of adding EnableSignal because, as it turns out, EnableInserter already has something very much like it, but as an ad-hoc transformation that I added to fix #154. Thanks for raising this issue to make sure it is handled well and in a generic way.

Also, the approach I described above (using an environment for late bound signals) will let you easily implement a ClockGateInserter yourself.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants