Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

slang-netlist new feature: Detect combinatorial loops #984

Merged
merged 33 commits into from
May 17, 2024

Conversation

udif
Copy link
Contributor

@udif udif commented May 5, 2024

The following PR adds combinatorial loop detection to slang-netlist.
The original algorithm can be found here:

https://epubs.siam.org/doi/10.1137/0204007
Finding All the Elementary Circuits of a Directed Graph
Johnson, Donald B
SIAM Journal on Computing Vol. 4, Issue. 1 Mar 1975

I have taken a Java implementation from https://github.com/josch/cycles_johnson_meyer , forked it to modernize the Java, then ported the code to C++ and finally modified my generic C++ code to work with @jameshanlon 's code.

While the basic code works, there are lots of things that can be improved:

  • Report combinatorial path in a more standardized way, using slang's SourceLocation and SourceManager classes.
  • improve memory usage.
  • Integrate with slang's diagnostic subsystem, if possible.

Here is a sample input and output:

% cat test.v
module test (input clk);
wire a;
wire b;
wire c;
wire d;

t2 t2(
  .clk(clk),
  .x(a),
  .y(b),
  .z(c)
);
assign a = b & c;
endmodule

module t2(input clk, input x, output y, output reg z);
assign y = x;
always @(posedge clk)
  z <= x;
endmodule
% build/bin/slang-netlist --comb_loops test.v
Top level design units:
    test

Build succeeded: 0 errors, 0 warnings
Nodes: 29
Detected 1 combinatorial loop:
Variable declaration: test.a => a (Assigned to) => Variable alias test.t2.x => x => y (Assigned to) => Variable alias test.t2.y => b => Variable declaration: test.b => b => a (Assigned to)

Copy link

codecov bot commented May 5, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.21%. Comparing base (ddc950d) to head (3bed0dc).
Report is 77 commits behind head on master.

Additional details and impacted files

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #984      +/-   ##
==========================================
+ Coverage   94.01%   94.21%   +0.20%     
==========================================
  Files         189      191       +2     
  Lines       48479    46903    -1576     
==========================================
- Hits        45576    44189    -1387     
+ Misses       2903     2714     -189     

see 147 files with indirect coverage changes


Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 4f37bea...3bed0dc. Read the comment docs.

Copy link
Contributor

@jameshanlon jameshanlon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good.

I'm not sure about including code with a different license that's not explicitly 'third party' but @MikePopoloski can comment.

Also - can you add a unit test for the change?

@@ -3,7 +3,8 @@
# SPDX-License-Identifier: MIT
# ~~~

add_executable(slang_netlist netlist.cpp source/Netlist.cpp)
add_executable(slang_netlist netlist.cpp source/Netlist.cpp
source/CombLoops.cpp)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be better to include this source in the netlist library, so that the functionality is not only bound to the command-line tool.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure I understand what you mean.
The file is already under tools\netlist\source, the same place Netlist.cpp is in, and all its functionality is only included in the command-line tools, not part of libsvlang.a, and I don't see you're building any standalone Netlist library either.
Feel free to fix this file, or show me what you mean.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ignore me, I thought that I had set up a library!

Comment on lines 83 to 84
auto& edge = netlist.addEdge(sourceVarNode, targetVarNode);
targetVarNode.edgeKind = edgeKind;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be neater to extend the method to include edgeKind as an argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

addEdge is currently a DirectedGraph method, that has nothing to do with EdgeKind.
Do you want me to override this with a new Netlist::addEdge(NetlistNode& sourceNode, NetlistNode& targetNode, ast::EdgeKind edgeKind) ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a minor point so it's up to you. But that sounds reasonable.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added it anyway, learning on the way the importance of using when redefining functions on derived classes, and still wanting to call the base class version on the default function signature.

tools/netlist/netlist.cpp Outdated Show resolved Hide resolved
tools/netlist/netlist.cpp Outdated Show resolved Hide resolved
tools/netlist/netlist.cpp Outdated Show resolved Hide resolved
@udif
Copy link
Contributor Author

udif commented May 5, 2024

At the moment I'm also debugging another issue. I use ultraembedded's JPEG core for tests. I can run it against the whole core, or subsets of it. One subset with <700 nodes passes almost immediately. When I moved to a different subtree with ~2000 nodes, the program dies after a few minutes with a segmentation fault. I don't know if this is a complexity issue, or a deadlock triggered by the specific netlist, because it jumps from a sub-second run, to dying after a few minutes.

@jameshanlon
Copy link
Contributor

jameshanlon commented May 5, 2024

I noticed while looking at #985 that it takes a long time to process the netlist, so this may be related.

$ time ./build/bin/slang-netlist ../core_jpeg/src_v/jpeg_dht*
Top level design units:
    jpeg_dht

Build succeeded: 0 errors, 0 warnings

real    0m6.035s
user    0m6.025s
sys     0m0.020s

If you run with --debug you can see it spends a lot of time looking up variables that appear in the LUTs, eg:

NetlistVisitor.h:70: Edge decl lookup_input_i to ref lookup_input_i
NetlistVisitor.h:83: Edge ref lookup_input_i to ref y_ac_width_r
NetlistVisitor.h:70: Edge decl lookup_input_i to ref lookup_input_i
NetlistVisitor.h:83: Edge ref lookup_input_i to ref y_ac_width_r
...
netlist.cpp:227: Netlist has 1131 nodes and 54233 edges

It needs some more investigation as to why there are so many edges being created.

udif added 5 commits May 6, 2024 07:26
…ath instead.

At the moment we dump the 1st cycle element again at the end of the loop just to close it,
but we will probably drop it.
2. Decided not to print 1st element again at the end
   (it was supposed to close the cycle, but looks redunant).
@udif udif force-pushed the pr_detect_comb_loops branch from f7f4fe1 to eab5af3 Compare May 6, 2024 06:33
@udif
Copy link
Contributor Author

udif commented May 6, 2024

I just added a test (a single one, at least for the moment) to cover CombLoops.
From my side I think the code is ready.

@likeamahoney
Copy link
Contributor

As proof of concept is good but i suppose that analysis is need to be more conservative. It seems to me that checking the graph on presence of cycles (without outermost verilog context) is not quite enough for a good combinational(combinatorial) logic loops analysis. Because not every assignment belongs to combinational logic. There is also sequential and latch kinds of procedural logic.

For example in this simple code sample there is no combinational logic at all. It follows that there are also no combinational cycles (but PR solution reports that there is at least one combinational loop):

module top (input clk);
        reg a;
        reg b;

        test t1(.x(a), .y(b), .clk(clk));
        test t2(.x(b), .y(a), .clk(clk));

endmodule

module test(input reg x, output reg y, input clk);
        always_latch begin
                if (clk)
                        y <= x;
        end
endmodule
Build succeeded: 0 errors, 0 warnings
Nodes: 30
Actual active Nodes: 30
Detected 1 combinatorial loop:
Path length: 8
1.sv:5:13: note: variable a assigned to
        test t1(.x(a), .y(b), .clk(clk));
                   ^

1.sv:13:9: note: variable x read from
                        y <= x;
                             ^

1.sv:13:4: note: variable y assigned to
                        y <= x;
                        ^

1.sv:5:20: note: variable b read from
        test t1(.x(a), .y(b), .clk(clk));
                          ^

1.sv:6:13: note: variable b assigned to
        test t2(.x(b), .y(a), .clk(clk));
                   ^

1.sv:13:9: note: variable x read from
                        y <= x;
                             ^

1.sv:13:4: note: variable y assigned to
                        y <= x;
                        ^

1.sv:6:20: note: variable a read from
        test t2(.x(b), .y(a), .clk(clk));

I think it needs to be determined first which assignments are related to combinational logic but which aren't to make analysis more context sensitivity.

@udif
Copy link
Contributor Author

udif commented May 6, 2024

  1. From my experience, 99% of the designs do not use latches, except for clock gating cells, therefore if you encounter a latch it is more often than not a design mistake rather than intention (although with always_comb and always @* we see less and less the plain old always @(a or b or c..) style that leads to unintentional latches).
  2. Even when you do use latches, I don't think you can easily analyze the latch control (by a simple algorithm) to determine whether it is a combinatorial loop or not.

From a practical point of view, since I cannot safely determine when a latch is safe, I would rather have a false positive I can waive (we'll have to add a mechanism for that) rather than skip a warning.

Copy link
Owner

@MikePopoloski MikePopoloski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is pretty cool, thanks for the contribution. I'd like the slang library to properly have support for comb loop detection at some point in the future but it first needs to gain another layer of lowering support (possibly to something MLIR based) before it will be feasible to implement. In the meantime it seems good to experiment with doing it in slang-netlist.

#ifndef COMBLOOPS_H
#define COMBLOOPS_H

/*
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IANAL but: I think it's good to leave the original license here but since you translated the source it's now also copyrighted by you and so it's fine to also put the standard license header on this file (and we don't need to put it in external/).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I gave this a lot of thought, and it's like the Ship of Theseus. The original code is not here, but the class names, internal functions and their internal structures (loops, branches, etc.) is mostly the same. Given the similarity between Java and C++ this is even more obvious.
Since the original code was governed by a very permissive BSD-2 clause license, I didn't find this an issue.
Apparently, the original code is really from here: http://www.normalisiert.de/ .

In any case, the code is recursive, and I found that for some cases it dies even for a small number of nodes.
I wish to investigate this for a few more days. From a quick peek at the offending function, I think it might be not too complex to remove the recursion.

@MikePopoloski
Copy link
Owner

Let me know when you think this is ready to land.

@likeamahoney
Copy link
Contributor

  1. From my experience, 99% of the designs do not use latches, except for clock gating cells, therefore if you encounter a latch it is more often than not a design mistake rather than intention (although with always_comb and always @* we see less and less the plain old always @(a or b or c..) style that leads to unintentional latches).
  2. Even when you do use latches, I don't think you can easily analyze the latch control (by a simple algorithm) to determine whether it is a combinatorial loop or not.

From a practical point of view, since I cannot safely determine when a latch is safe, I would rather have a false positive I can waive (we'll have to add a mechanism for that) rather than skip a warning.

Modern designs are't use latches it's true but what about D-triggers/flip-flops (which widely used in modern designs to separate combinational logic) and other sequential logics? Provided solution also warns on it.

For example there is simple flip-flop without comb loops:

module top (input rst, input clk);
        wire D;
        wire Q;
        wire Qn;
        dff d1(D, clk, rst, Q, Qn);
        dff d2(Q, clk, rst, D, Qn);
endmodule

module dff (
  input  logic D, clk, rst,
  output logic Q, Qn
);
  always_ff @(posedge clk, posedge rst) begin
    if (rst) begin
      Q  <= 0;
      Qn <= 1;
    end else begin
      Q  <= D;
      Qn <= ~D;
    end
  end
endmodule

@udif
Copy link
Contributor Author

udif commented May 7, 2024

Modern designs are't use latches it's true but what about D-triggers/flip-flops (which widely used in modern designs to separate combinational logic) and other sequential logics? Provided solution also warns on it.

This is simple - your example simply triggered a bug. This was not supposed to happen...
The idea was to disconnect any node below always or always_ff statements with posedge or negedge in their sensitivity list.
Apparently, as your example shows, there are bugs. I will check this.

However, I have found more serious false positive issues:

module t;
wire x, y, z;
assign {z, y} = {~y, ~x};
endmodule

This will also trigger a comb loop warning. While this specific one might be solved (at the cost of complicating the netlist logic) It looks like the general problem will not be solved without a full synthesis engine. For example:

module t;
wire x, y;
assign z = ~y & y;
assign y = z;
endmodule

It remains to be seen if the scope of this feature would be useful given the current limitations (known possible false positives).

@jameshanlon
Copy link
Contributor

jameshanlon commented May 7, 2024

I would expect the netlist graph for assign {z, y} = {~y, ~x}; to be acyclic, so there may be a bug if that is not true.

I see your point though and agree that without full synthesis it's not possible to determine a true combinatorial loop.

The class of loops that slang-netlist can report are comparable to those reported by Verilator as UNOPTFLAT`, which are still useful to know about.

@udif
Copy link
Contributor Author

udif commented May 7, 2024

I would expect the netlist graph for assign {z, y} = {~y, ~x}; to be acyclic, so there may be a bug if that is not true.

A combinatorial loop is reported because each of the x and y nodes have edges to z and y, as {z, y} is lumped into one assigment depending on y and x and is then split. slang-netlist is not smart enough to detect the bitwise assignment and separate the z = ~y and `y = ~x' assignments.

Maybe we should rename the option name to --possible-comb-loop so that people set the correct expectations, and accept a small number of false positives.

@jameshanlon
Copy link
Contributor

slang-netlist is not smart enough to detect the bitwise assignment and separate the z = ~y and `y = ~x' assignments.

Apologies - I thought I had implemented that. My aim for slang-netlist is to provide a datastructure that resolves all source-level connectivity to the bit level. Handling packed vectors is important for that, so I'll try and look into it.

With regards to combinatorial loops, how about just calling the option --report-cycles and some description that these can potentially be combinatorial loops?

@udif
Copy link
Contributor Author

udif commented May 7, 2024

With regards to combinatorial loops, how about just calling the option --report-cycles and some description that these can potentially be combinatorial loops?

But I'm not merely reporting cycles - I'm actively ignoring netlist edges that terminate on a posedge/negedge nodes.

BTW, another feature I would like to implement at a later stage is an improvement to the path reporting mode, that will also report the number of clock edges between the source and destination paths. It would require defining the clock signals, and keeping a table of clock aliases, either those directly connected, or those passing through a clock gating module (you would define the clock gating module name, and the input/output ports).
Today I'm running into situations where I have a memory access bus running through several sample stages across my SoC, and having something that counts the total latency would be useful for me.

@jameshanlon
Copy link
Contributor

But I'm not merely reporting cycles - I'm actively ignoring netlist edges that terminate on a posedge/negedge nodes.

Marking sequential edges is a generally useful feature of the tool, since it allows the netlist to be constrained to find combinatorial paths. This is how I implemented a previous similar tool.

BTW, another feature I would like to implement at a later stage is an improvement to the path reporting mode, that will also report the number of clock edges between the source and destination paths.

Nice, that would be useful.

(Btw. I've got no objections to this landing! We can always iron details out later.)

udif and others added 5 commits May 8, 2024 09:42
We still detect "posedge x or negedge x" as edge even though this
is effecetively combinatorial, because we are lazy.
And renamed previous templates to be more consistent
@udif
Copy link
Contributor Author

udif commented May 10, 2024

This is simple - your example simply triggered a bug.

I've fixed the bug, but I still got 2 issues:

  1. For some tests, the code seems to be stuck in a loop. I need to dig deeper into the algorithm to understand what is the issue. I may dump the test case into a DOT file and run against the original Java code to see if it's a conversion issue or a bug in the original code.
  2. I also have a memory corruption bug that causes my unit tests to fail when I add more than one.

It will take me a few more days to resolve these.

udif added 2 commits May 12, 2024 23:45
When creating more than one Netlist consecutively, the starting Node ID is not reset back to 1!
(Because it is initialized statically).
So When creating an Adjacency list we need to subtract the ID of node 0, not just "1".
@udif
Copy link
Contributor Author

udif commented May 12, 2024

It seems the bugfix above (2c57b7d) solved all stability and corruption issues, but when running against the full core_jpeg example I've used before, I ran into #993 .

udif and others added 5 commits May 13, 2024 09:09
1. Removed dynamic allocation (new/delete) and replaced it by 1 static buffer
2. Removed tail recursion and replaced it by a loop
3. Removed default namespace std and added explicit std:: prefixes
@udif
Copy link
Contributor Author

udif commented May 15, 2024

At the moment, My code works great under clang-sanitize, where I've debugged it, but for some reasons fails under gcc-11, where there seems to be a memory corruption causing a segment violation as well as locking up the watch windows on vscode.
I'm looking into this.

@udif
Copy link
Contributor Author

udif commented May 16, 2024

While I've fixed the crashes, it seems there are still other issues (lockups on some examples).

@udif
Copy link
Contributor Author

udif commented May 16, 2024

While I've fixed the crashes, it seems there are still other issues (lockups on some examples).

False alarm, I was just being impatient and stopped the test after a few seconds. It ended up taking a few more seconds than what I estimated.

Let me know when you think this is ready to land.

@MikePopoloski @jameshanlon I think it is ready from my side.

@MikePopoloski MikePopoloski merged commit 75f760b into MikePopoloski:master May 17, 2024
15 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants