Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update ADMS parser to handle Accellera standard disciplines header file #103

Closed
wants to merge 4 commits into from

Conversation

ngwood
Copy link
Contributor

@ngwood ngwood commented Jan 3, 2022

This pull request addresses two outstanding issues preventing ADMS from being able to parse the Accellera standard disciplines header file:

  1. Allow an optional semicolon after a discipline or a nature identifier in a nature and discipline declaration, respectively. This enhanced notation was introduced in v2.3.0 of the Accellera Verilog-AMS Language Reference Manual. This fixes issue Fail to parse nature terminated by ; #39.
  2. Process escaped identifiers that can be treated as simple identifiers.

ngwood added 2 commits January 3, 2022 21:12
Version 2.3.0 of the Verilog-AMS Language Reference Manual introduced
an optional semicolon after a nature and discipline identifier in a
nature and discipline declaration, respectively; e.g.,

    nature identifier;
        ...
    endnature

and

    discipline identifier;
        ...
    enddiscipline

are now also legal syntax.

ADMS behaves the same as when the semicolon is absent.

This brings ADMS one step closer to being able to parse the most
recent Accellera standard disciplines header files.

This commit fixes GitHub issue Qucs#39.
Escaped identifier are denoted by a bachslash followed by any printable
ASCII character and ending in whitespace. Currently ADMS cannot handle
escaped identier. This prevents it from being able to parse the official
Accellera standard disciplines header file.

Fortunately, the offending escaped identifier "\logic " is equivalent to
the simple identifier "logic", therefore we can simply tell the lexer
to handle this cleanly.

Note that the lexer doesn't technically provide full support for simple
identifiers yet, as in Verilog-AMS they may contain embedded dollar
signs.

It's not great, but it's much better than not doing anything at all.
That is to say, this adds functionality without breaking anything in the
process.
@felix-salfelder
Copy link
Member

felix-salfelder commented Jan 4, 2022 via email

@ngwood
Copy link
Contributor Author

ngwood commented Jan 4, 2022

There may well be a better way, but the change is innocuous and more importantly works.

Regarding escaped identifiers, Verilog-AMS allows for variables with characters other than letters and underscores. Simple identifiers have letters, underscores and dollar signs (but not leading); no one should be using dollar signs though, so I wouldn't suggest we change this. Escaped identifiers all users to create variable names containing all of the other printable ASCII characters! There are four cases to point out; currently ADMS only handles one of them; my pull request handles a second one; the other two are not important:

Simple Identifier without dollar sign:
e.g. "my_value"
ADMS can currently parse this.

Simple identifier, with dollar sign:
e.g. "my$value"
ADMS cannot currently parse this.

Escaped identifier, equivalent to simple identifier without dollar sign:
e.g. "\my_value " == "my_value"
ADMS cannot currently parse this.
My pull request allows ADMS to parse this correctly and unambiguously
by recognising that these are equivalent identifiers. It will then allow use
of the Accellera standard disciplines file.

Escaped identifier with exotic ASCII characters:
e.g. "\my#value "
ADMS cannot currently parse this.

Does this make sense?

@felix-salfelder
Copy link
Member

felix-salfelder commented Jan 4, 2022 via email

@ngwood
Copy link
Contributor Author

ngwood commented Jan 4, 2022

If I change

discipline \logic ;

to

discipline \log!c ;

I get the same error as before.

[fatal..] disciplines.vams: during lexical analysis syntax error at line 19 -- see '\'

This is because the bit between the opening \ and the closing , namely log!c, is invalid inside ADMS still, because it contains something other than letters or underscores. When it is just logic, then it is a simple identifier and ADMS can use it as such.

Just to be safe, I also checked that I could still use escaped sequences inside of strings and everything is still working.

This fix essentially says that if ADMS see something that looks like an escaped identifier, e.g., \my_value , then use the my_value as the identifier. If the my_value bit were my#value then it'd still fail as before as it has something other than letters or underscores. I don't see any value in providing any more support for escaped identifiers than this.

@ngwood
Copy link
Contributor Author

ngwood commented Jan 4, 2022

Note that the Verilog-AMS LRM say that we need to treat \my_value and my_value the same. This is exactly what this pull request does.
Thus

real my_value;
\my_value =1.0;

is valid syntax.

@felix-salfelder
Copy link
Member

felix-salfelder commented Jan 4, 2022 via email

@ngwood
Copy link
Contributor Author

ngwood commented Jan 4, 2022

I take that back! The following example still doesn't work with ADMS. It still treats \my_value and my_value as different things. It is therefore not compliant with the standard in that regards. It does however let you use \my_value as an identifier. In that sense it isn't entirely innocuous, as someone may expect \my_value and my_value to be the same. I will fix this at some point though.

@ngwood
Copy link
Contributor Author

ngwood commented Jan 4, 2022

I should have used TKSTRIPPEDRETURN, sorry! With this change everything I mentioned before is correct.

diff --git a/admsXml/verilogaLex.l b/admsXml/verilogaLex.l
index 86393c4..443658e 100644
--- a/admsXml/verilogaLex.l
+++ b/admsXml/verilogaLex.l
@@ -274,7 +274,7 @@ INF {TKRETURN(yytext,yyleng); return tk_inf;}
 \|\| {TKRETURN(yytext,yyleng); return tk_or;}
 \^\~ {TKRETURN(yytext,yyleng); return tk_bitwise_equr;}
 
-\\{ident}" " {TKRETURN(yytext,yyleng); return tk_ident;}
+\\{ident}" " {TKSTRIPPEDRETURN(yytext,yyleng); return tk_ident;}
 \${ident} {TKRETURN(yytext,yyleng); return tk_dollar_ident;}
 {char} {TKSTRIPPEDRETURN(yytext,yyleng); return tk_char;}
 {b8_int} {TKRETURN(yytext,yyleng); return tk_integer;}

Verilog-A:

real \qwerty ; // use of escaped identifier
qwerty = 1.0; // use of simple identifier

Xyce C++:

double qwerty=0.0;
qwerty = 1.0;

ngwood added 2 commits January 4, 2022 15:16
Rather than just treating an escaped identifier as a unique simple
identifier, we should convert the escaped identifier into the
equivalent simple identifier; e.g., th Verilog-A snippet

    real \var ;
    var = 1.0;

is supposed to be equivalent to

    real var;
    var = 1.0;

Both will now produce C-style code similar to

    double var = 0.0;
    var = 1.0;
My previous attempt to allow for an optional semicolon after a nature
and discipline identifier in a nature and discipline declaration,
respectively, resulted in the semicolon now being mandatory. This
commit actually makes it optional.

I have tested this with Accellera's v2.2.0 and v2.3.1 standard headers as
well as ADMS's v2.3.7 standard header.
@ngwood
Copy link
Contributor Author

ngwood commented Jan 4, 2022

Commit cbb960e seems to solve the issue of the optional semicolon after a nature and discipline identifier in a nature and discipline declaration, respectively. Maybe worth someone else testing?

@ngwood ngwood closed this Jan 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants