Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Imported C code doesn't recognize increment operator and issues false positive warning for endless loop #1081

Open
csrabak opened this issue Jun 26, 2023 · 4 comments
Assignees

Comments

@csrabak
Copy link

csrabak commented Jun 26, 2023

image
In both sides of the comparison the var y is postincremented, so the supposed condition is not there.

@codemanyak
Copy link
Collaborator

codemanyak commented Jun 26, 2023

Well, the problem is indeed that Structorizer does not support auto-increment and auto-decrement operators as executable syntax. So the Analyser component won't recognize a working modification of variable y here. On the other hand, the conversion of an embedded auto-increment or auto-decrement expression into an equivalent sequence of executable instructions like e.g.

fH[y++] = 0.0

into

fH[y] <- 0.0
y <- y + 1

isn't that trivial in the general case, particularly not in case of several occurrences of auto-increment/-decrement or combined operators +=, -= etc. for the same variable within a single complex expression. Consider e.g. the expression a = y * (17 + b[++y]) - how would you (automatically!) decompose this into an equivalent instruction sequence?
(By the way, a single instruction consisting of a simple auto-in-/decrementation will be converted into an equivalent assignment on import.)
In order to avoid inconsistency in the imported diagram it was decided that this kind of expressions be imported as is and to leave it to the user to "repair" them in case an execution within Structorizer is intended (which often makes little sense since the code will usually have been executable more efficiently in the source language itself and the code import primarily aims at structured presentation, not at functional test).
So if the error hint just annoys you in an aesthetical sense, you might simply switch off Analyser.

On the longer sight, of course, an effort might be considered to

  • either add an import option for C, Java etc. that tries to convert as many by now unsupported complex operators as possible into Structorizer instruction sequences;
  • have Analyser complaining about all operators ++, --, +=, -=, /=, etc. in expressions (actually, it already qualifies all combined operators like += etc. as "possible assignment mistake" but ignores ++ and --);
  • or to have Executor support (and thus Analyser recognise and tolerate) this kind of operators within expressions (this option is already listed for evaluation within the Permanent syntax tree representation in background #800 project).

@codemanyak codemanyak self-assigned this Jun 26, 2023
@codemanyak
Copy link
Collaborator

Just a related remark:
If your code happens to contain assignment expressions within other expressions, e. g. a[3 * (b = 15 * c)] = 1.7, then you will also obtain warnings from Analyser after the import.
Structorizer syntax (or, more precisely, Executor capability) was deliberately restricted in comparison to C, Java, C#, and many of those C-inspired languages.

@csrabak
Copy link
Author

csrabak commented Jun 29, 2023

I've read and understood your points.
I don't consider the issue aesthetical but rather of consequence: once the sight "gets used to" warnings that could be just "misunderstandings" of the code, we tend to loose some which the critique could be "correct".

I do understand, having been there done that, how complicated it can be to mimmick a programming language to make any kind of assertion on the code, specially for a language with the historic baggage and the ensuing incoherences in syntax, asymmetries in API, etc.

Tools like Structorizer (and for NSD even other implementations) have to put a stop in some place and I get that for the present state of Structorizer those compound statements was the decision taken.

I'll address this question of Kay's, which I hope wasn't as a rethoric one[!]:

Consider e.g. the expression a = y * (17 + b[++y]) - how would you (automatically!) decompose this into an equivalent instruction sequence?

I think the most sensible and correct way would be to take the code generated by a [conformant] compiler! Perhaps using libclang to get an "intermediate level" structure and elude having to parse assembly code.

The ROI in such effort is, of course, theme for another debate to which I've no stance a priori.

One interesting point in the NSD above is that translating this:

fH[y++] = 0.0

into

fH[y] <- 0.0
y <- y + 1

Would take out the "atomicity" of the operator and create the illusion that in the NS diagram the y <- y + 1 could be factored out of the decision (if) in the code1. . .

So the first part of the decision on keeping these operators as they are seems to me appropriate whereas the recognizing of their semantic in the code seems the most appropriate route in a future enhancement, which, I think, parallels the second bullet in the options above.

PS.: Some of these "rearrangements" in code are already done in Structorizer, for example for (x=0; x < 90; x++) gets rendered in the diagram as for x = 0 to 90 - 1; with an optional step if the operation in the induction var is to decrement.

Footnotes

  1. Which, I believe, is one of the strongest arguments to use this kind of diagram: discover these opportunities for improving the code.

@codemanyak
Copy link
Collaborator

codemanyak commented Jun 29, 2023

@csrabak
Thanks for your elucidations and suggestions. I'd like to comment on some of them.

I think the most sensible and correct way would be to take the code generated by a [conformant] compiler! Perhaps using libclang to get an "intermediate level" structure and elude having to parse assembly code.

This might indicate a way for certain language families were such an intermediate level exists (libclang might be fine for C, C++, C# etc. but then we would have to use a quite different package for languages like Pascal, Java and, well, you name it). So the more specific our tool would be dedicated to certain language familiy the easier and more complacent it might get. On the other hand, the more general its application is intended the less satisfying the result will be for certain kinds of language. Anyway we require the imported code to adhere to structured algorithmic constructs (e.g. there won't be any efforts to try to make sense of spaghetti jumps).
The code example I gave focused on the semantic difficulties operators with side-effects (e.g. postfix/prefix auto-increment and auto-decrement operators) impose to an expression. They are mighty, often convenient and look elegant but ... they can also be obfuscating and error-prone.

Would take out the "atomicity" of the operator and create the illusion that in the NS diagram the y <- y + 1 could be factored out of the decision (if) in the code. . .

Why is this an illusion? In this algorithm the incrementation of y can of course be factored out of the alternative (if) to the surrounding loop. The loop could (and perhaps should) even be converted into a for loop - if it's about better readability and consistence:
grafik
But this is nothing we would do automatically.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants