Declaration semantics in Structorizer has to be reassessed #980

codemanyak · 2021-06-14T06:13:49Z

Issue #977 revealed that the semantics and processing of declarations (particularly array declarations) in Structorizer rest on a weak foundation and have to be revisited. The user example in the issue contained an element with the text:
var v1[10], i: int

By the way: Your declaration does not exactly do what you intend. It is rather surprising to me that Structorizer tolerated it. The correct syntax to declare an array of ten elements would follow the Pascal style: var v1: array [0..9] of int. But that declaration has no effect, actually. var v1[10]: int indeed creates an array of 11 elements, all initialized to 0, because Structorizer produces an array of sufficient size as soon as it encounters a write attempt to an array element.

Originally posted by @codemanyak in #977 (comment)
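
The behaviour described in the quoted comment can be modelled in a few lines of Java (Structorizer's implementation language). This is only a minimal sketch, not the Executor's actual code; the class GrowingArray and its methods are hypothetical names introduced for illustration.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of the described behaviour; not taken from Structorizer's sources.
class GrowingArray {
    private final List<Object> elements = new ArrayList<>();

    // A write to index i extends the array to i + 1 elements,
    // padding all newly created places with 0.
    void set(int index, Object value) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("negative index: " + index);
        }
        while (elements.size() <= index) {
            elements.add(0);
        }
        elements.set(index, value);
    }

    Object get(int index) {
        return elements.get(index);
    }

    int size() {
        return elements.size();
    }
}
```

If var v1[10]: int is treated as a write to element 10, calling set(10, 0) on an empty GrowingArray leaves it with 11 elements, which matches the observation in the quote.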

The following diagram shows several possible declaration types suggested by different programming languages. Much as Structorizer's syntactic tolerance is appreciated, we need to specify which of them are to be accepted and what effect they are meant to have w.r.t. Executor, Analyser, and code generation:

This result seems hardly acceptable:

Should we accept Pascal's array [0..9] of int at all, as it suggests a flexible start index (i.e., an index base transformation) that we do not actually support? Should the "execution" of an array declaration rather result in an array of the specified size but filled with null values? But what should we do then with array declarations of unspecified size (e.g. v5)?


codemanyak · 2021-06-14T07:49:10Z

The discussion will have to keep in mind at least these issues: #61, #113, #335, #368, #408, #423, #739, #800.
As Structorizer does not require declarations except in the case of a record component access, and as assignments override declarations or former type associations, it remains difficult to enforce the type of declared variables, in particular if the declaration is incomplete (as in array of string, where no size is given).
Nevertheless, a reliable way of interpreting declarations like var v[10]: float is needed, or they should be rejected.
Remark: Java accepts an array declaration in both of these forms: double[] a; and double a[]; and even double[] b[]; (where the latter is a two-dimensional array b).
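
The remark about Java can be checked with a short sketch. The class ArrayDeclarationForms and the helper dimensions are hypothetical names used only for this illustration; the helper counts array dimensions via reflection.

```java
// Illustration only: all three declarator forms compile in Java.
class ArrayDeclarationForms {
    // Counts how many array dimensions the runtime type of the object has.
    static int dimensions(Object array) {
        int dims = 0;
        Class<?> type = array.getClass();
        while (type.isArray()) {
            dims++;
            type = type.getComponentType();
        }
        return dims;
    }

    static int[] demo() {
        double[] a = new double[3];      // brackets on the type
        double a2[] = new double[3];     // brackets on the variable name
        double[] b[] = new double[2][3]; // mixed form: two-dimensional
        return new int[] { dimensions(a), dimensions(a2), dimensions(b) };
    }
}
```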

codemanyak · 2021-06-21T22:41:13Z

By analogy with some Pascal distributions and Oberon, we might indeed think of modified declaration semantics in future Structorizer versions, particularly for arrays:

An undeclared variable will continue to tolerate assignments of values of any type at any time (as it always has in Structorizer and as in e.g. Python). Mere initialisation would not induce further type adherence checks, though it might be used for weak type inference on code export.
An undeclared variable will be made an array either by being assigned an array initializer (e.g. a ← {0, 8, 15}) or by assigning to an element with an arbitrary non-negative index (e.g. a[i + 3] ← "some fun" with i being an integer variable or constant), as is done now. The initial size of the array would be set by the initializer or by the used index + 1, respectively, and the array will remain extensible. At present, a single element assignment has the effect of filling all places from index 0 to the used index - 1 with 0, which may be convenient and gracious in many cases, but the user should not rely on it. Perhaps it is better to fill these array places with null, thus provoking an error on read access? The variable can be overridden with any value of any type at any time.
If a variable is declared with a certain type, however, we might try to respect (and enforce, see executor ignores variable types #408) this type association as far as Structorizer can interpret the type name or description, which is the big question mark here (we would have to specify a set of acknowledged type names and specifications).
If a variable is declared with an array type, then the following cases might be distinguished:
- a "static" array type if the index range is given at declaration time (e.g. array [15] of elementtype or elementtype[15]). In this case, later expansion of the array or redefinition attempts for the variable should not be accepted at execution time, and the index range would have to be checked. (But this tends to make assignments of array initializer expressions complicated.) We might even try to enforce the element type at runtime. The question is whether such a declaration should initialise the variable with an already dimensioned array filled with null elements (which would cause execution errors on read access). At present, the variable remains uninitialised in Structorizer.
- an "open", "dynamic", or extensible array type if the index range is not specified (e.g. array of elementtype or elementtype[]). The array could always be extended, but not overridden by data of a different structure; the element type could be checked on assignments. Again the question arises whether and how to initialise the variable. As an empty array (no elements)?
- an array unspecified with respect to both size and element type (only the structure principle is specified, e.g. array). Do we accept this or not? It should be extensible without element type checking, but we might want to reject attempts to redefine or reassign it with something else.
In analogy, a record variable would be type-enforced if explicitly declared. (If not explicitly initialized then to be established as a record with null components or to be left null as a whole?)
In further analogy we might try to enforce the type of explicitly declared enumeration variables.

As a consequence, Structorizer would have to maintain a declaration table apart from an inferred type association table (or a declaration flag in the type association map).
This approach might seem more sound or sophisticated, but it is of course also more complicated (both to understand and to implement, and also for inferring subroutine parameter lists etc.). It would not be fully compatible with earlier versions, and it will always remain somewhat incomplete.
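
To make the idea of a separate declaration table more concrete, here is a minimal Java sketch. It is not Structorizer code: the classes ArrayDeclaration and DeclarationTable and all their members are hypothetical. It also picks one possible answer to the open questions above (static arrays start dimensioned with null, open arrays start empty, gaps are filled with null), purely as an assumption for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical declaration record: element type and optional fixed size.
class ArrayDeclaration {
    final Class<?> elementType; // null means: element type unspecified
    final int fixedSize;        // negative means: open (extensible) array

    ArrayDeclaration(Class<?> elementType, int fixedSize) {
        this.elementType = elementType;
        this.fixedSize = fixedSize;
    }

    boolean isStatic() {
        return fixedSize >= 0;
    }
}

// Hypothetical table keeping declarations apart from the actual values.
class DeclarationTable {
    private final Map<String, ArrayDeclaration> declared = new HashMap<>();
    private final Map<String, List<Object>> values = new HashMap<>();

    void declareArray(String name, Class<?> elementType, int fixedSize) {
        declared.put(name, new ArrayDeclaration(elementType, fixedSize));
        List<Object> cells = new ArrayList<>();
        for (int i = 0; i < fixedSize; i++) {
            cells.add(null); // assumption: static arrays start filled with null
        }
        values.put(name, cells);
    }

    void assignElement(String name, int index, Object value) {
        if (index < 0) {
            throw new IndexOutOfBoundsException("negative index: " + index);
        }
        ArrayDeclaration decl = declared.get(name); // null for undeclared variables
        if (decl != null && decl.isStatic() && index >= decl.fixedSize) {
            throw new IndexOutOfBoundsException(name + "[" + index + "]");
        }
        if (decl != null && decl.elementType != null
                && !decl.elementType.isInstance(value)) {
            throw new ClassCastException("wrong element type for " + name);
        }
        List<Object> cells = values.computeIfAbsent(name, k -> new ArrayList<>());
        while (cells.size() <= index) {
            cells.add(null); // assumption: gaps are filled with null, not 0
        }
        cells.set(index, value);
    }

    int size(String name) {
        return values.get(name).size();
    }
}
```

In this sketch an undeclared variable keeps the tolerant behaviour, while a declared one is checked against its declaration on every element assignment.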

So is it worth the effort? Should it even be optional?

In any case, something is to be done about the misinterpretation of texts like var a[10]: integer—either we reject it or we interpret it like var a: array [10] of integer.
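
The second option, reinterpreting the semi-C form, could be sketched as a simple text rewrite. The class DeclarationNormalizer and its regular expression are assumptions made for this illustration; they only cover a single variable with a literal integer size and do not reflect how Structorizer actually parses element texts.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: rewrites "var a[10]: integer" to Pascal-like array syntax.
class DeclarationNormalizer {
    // Matches: var <name>[<literal size>]: <type>
    private static final Pattern SEMI_C_ARRAY = Pattern.compile(
            "^\\s*var\\s+([A-Za-z_]\\w*)\\s*\\[\\s*(\\d+)\\s*\\]\\s*:\\s*(\\S.*?)\\s*$");

    // Returns the rewritten declaration, or the text unchanged if it does not match.
    static String normalize(String text) {
        Matcher m = SEMI_C_ARRAY.matcher(text);
        if (!m.matches()) {
            return text;
        }
        return "var " + m.group(1) + ": array [" + m.group(2) + "] of " + m.group(3);
    }
}
```

Texts that do not fit the pattern, such as the multi-variable declaration var v1[10], i: int from #977, are returned unchanged, so rejecting them would remain a separate decision.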

codemanyak · 2021-10-05T22:42:01Z

In addition, we will have to decide whether it is desirable and feasible to also accept declarations in C and/or Java style (and then with or without a var prefix?), particularly as this declaration style is already accepted in many assignment (initialisation) contexts, where it causes trouble with types.
Examples:

int v6[10] (or even var int v7[10]?) and int v6[] <- {123, 3735, 21, 832, -152, 98, 36};
int[] v8 or var int[] v9?
int[] v9[10] (without the size, i.e. as int[] v9[], this mixed form is a legal, though hardly recommendable, way to declare a two-dimensional array in Java; C# does not allow brackets after the variable name)

The multitude of possible syntactic approaches adds to the already intrinsic ambiguity of an underlying grammar, in particular if type names may consist of more than one word (like unsigned int, long long int, long double or something like that): in contrast to C or C++, "unsigned" or "long" aren't reserved words in Structorizer. Index ranges might even be specified by expressions, thus potentially implying recursion.

codemanyak · 2021-11-04T10:45:19Z

Not to be forgotten: the C++, C#, and Java exports also mishandle C-like array declarations.

codemanyak · 2023-10-11T14:24:42Z

The combination of a multiple declaration with an initialization like the following should be detected as illegal:

At present, Structorizer interprets it inconsistently (usually assigning the value to the last of the variables). It actually causes harm in the case of a semi-C array declaration as in:

Code export interprets this in very different (wrong) ways, e.g.:
C:

Pascal:

BASIC:

codemanyak added code revision question labels Jun 14, 2021

codemanyak self-assigned this Jun 14, 2021

codemanyak mentioned this issue Nov 4, 2021

executor ignores variable types #408

Open

codemanyak added a commit that referenced this issue Oct 13, 2023

Partial fixing of multi-variable declaration support (issue #980)

af1c8a8

codemanyak added a commit that referenced this issue Oct 13, 2023

Analyser check for declarations according to #980.

92842ca

codemanyak added a commit that referenced this issue Oct 17, 2023

Further modifications for #980 (multi-var initialisations)

3eb1355

codemanyak mentioned this issue Oct 30, 2023

Version 3.32-13 candidate #1103

Merged
