-
Notifications
You must be signed in to change notification settings - Fork 139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JEP-445: allow parsing unnamed classes #1487
Conversation
@jarthana Here is a first step towards unnamed class support. The most important thing IMO if we want to distribute work is to agree on how this unnamed class would surface in the compiler AST API. So far it's a subtype of TypeDeclaration which is set as a field in CompilationUnitDeclaration. |
d56b9af
to
d129002
Compare
@mpalat Manoj, can you help with this one? |
The parsing still breaks some cases (I don't know what's the best approach here, is it OK to change the tests?), but it would be great if we could already discuss the AST proposal, so if it seems good enough, we can merge it and parallelize further work. |
A review from your side (even if not conclusive) is more than welcome as it would point us if we went in wrong direction somewhere. |
@@ -75,6 +75,7 @@ public int compare(Object o1, Object o2) { | |||
public TypeDeclaration[] types; | |||
public ModuleDeclaration moduleDeclaration; | |||
public int[][] comments; | |||
public UnnamedClass unnamedClass; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did you describe somewhere why the unnamedClass is not stored in the types array as the single element?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious as per spec and grammar and javac results that the unnamedClass is always the single type element in the compilation AST. There are counter examples such as
file A.java
void main() {}
class A{}
which would even showcase that javac here sees 2 root types in the compilation unit: the unnamed one (resulting in A.class) and the named A
, and this case results in an error because A
is already defined.
Also, we may need to directly check the existence of an unnamedClass
from time to time as it affects further compilation steps, so keeping a particular ref has been convenient.
But I think I see your point and we can instead of a field make it a method
UnnamedClass unnamedClass() {
return types == null ? null : Stream.of(types).filter(UnnamedClass.class::isInstance).map(UnnamedClass.class::cast).findAny().orElse(null);
}
Is this what you would recommend here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this what you would recommend here?
Firstly I wanted to understand the reasoning behind it.
Generally speaking, it would be nice if the approach wasn't diverging too much from CompilationUnitDeclaration.isPackageInfo() or CompilationUnitDeclaration.isModuleInfo()
https://docs.oracle.com/en/java/javase/21/docs/specs/unnamed-classes-instance-main-methods-jls.html to me reads as if it's always the single top-level class in the CU. Which parts do you find ambiguous? Maybe that should be discussed before making a decision about the AST?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The ambiguity is in the grammar, when parsing. The example I gave earlier
// A.java
void main() {}
class A{}
Is ambiguous at parsing: is A a top-level A or a nested A$A ? This is something we cannot easily sort when parsing and we need further steps later to decide that (according to the spec and javac, it's clearly A$A here, but it's not trivial to parse it).
CompilationUnit.isPackageInfo() or CompilationUnit.isModuleInfo() are relatively simple compared because they do use different parsing rules very early, so they leave less room to parsing ambiguity.
Maybe that should be discussed before making a decision about the AST?
Guess what I'm trying with this PR ;) I'm really willing to discuss it and get a sufficient AST in place ASAP so we can then build on top of it.
If I read correctly, you're expecting CompilationUnitDeclaration.isUnnamedClass()
. If so, what do you think such method would bring over my current proposal (assuming I remove the explicit unnamedClass field to replace with a method resolving against the resolved types) ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I’d expect the compilation unit to contain exactly one top-Level type declaration if it’s an unnamed class. What you suggested with the stream based boolean method indicted that there may be more than one top-level. I’d certainly expect that this is never the case so types!=null && types.length ==1 && types[0] instanceof UnnamedClass
would be the condition that should return the correct result. Is that how you’d model it? (Its not clear to me yet from this WIP PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What you suggested with the stream based boolean method indicted that there may be more than one top-level
Indeed, the current AST model may allow extra top-level types together with unnamed class, but previous steps (Parser.consumeInternalCompilationUnitWithPotentialUnnamedClass) and next steps (Scope) makes that such AST cannot be fully compiled with usual ways. I can add some checks in CompilationUnitDeclaration.analyzeCode to check things before trying to resolve. Would that be enough?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are the implications of what you just said? As far as I understand it, all code that processes the types would need to be aware of the wrong nesting of the type declarations and all code that works with the unnamed class declaration would need to be aware of the fact that some of its siblings are in fact contained? This doesn’t sound great as a model to start with.
Would be nice to get the opinion of people that are more familiar with the compilers code base than I am.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As far as I understand it, all code that processes the types would need to be aware of the wrong nesting of the type declarations and all code that works with the unnamed class declaration would need to be aware of the fact that some of its siblings are in fact contained?
There is no wrong nesting if the AST was built by the Parser. It's just that someone using the AST can put another top-level type in the types
array, and will get reported problems later if they do so.
Feel free to suggest concrete additional test-cases we can add to cover your concerns, it will help in finding the right problem and solutions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed, the current AST model may allow extra top-level types together with unnamed class,
maybe I misunderstood. You’re saying the parser will never create a wrongly nested AST? Or would it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
when an unnamed class is detected (ie a method or a field is present at the root of the unit), then all the other types defined at the root of the compilation unit then become member types nested in the unnamed class.
6b6a785
to
0b2891f
Compare
I've simplified things a bit so the UnnamedClass does not surface explicitly, but instead is a specialized type in the compilation unit |
af196fa
to
ae6b847
Compare
* @since 3.36 | ||
* @noreference preview feature | ||
*/ | ||
int unnamedRequiresJava21 = PreviewRelated + 1943; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a dedicated error message that exists for all features that require preview to be enabled, and if you use that feature, there is already a quick fix set up to enable preview. I think it might be easier to use the existing error message.
See
Line 72 in 3d92057
UNNAMMED_CLASSES_AND_INSTANCE_MAIN_METHODS(ClassFileConstants.JDK21, |
ProblemReporter.validateJavaFeatureSupport
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, I've fixed it in latest version of the PR.
e4ac32a
to
15c0ff0
Compare
Back to work from EclipseCon [and the following long weekend here]. First of all, great to see the contributions in the compiler part- Thanks @mickaelistria. Let me start with a general comment- To make this whole feature development easier, and more systematic, we generally divide this activity into smaller chunks - This also helps people who are well versed in specific areas to provide more helpful feedback. Please see Bug Tree as an example
Once you have these issues, please put it as a sub-item under the top level issue, which is [1106] We can start with Grammar culminating in AST. Could you please extract just that part, create a PR and refer it under that issue? In parallel, I will take a look at the grammar part and put my comments there. Thanks |
Thanks @mpalat . Your comments reminds me of the discussion I started a couple of week ago as an attempt to capture the workflow to deal with compiler features, and use it on a "master" issue to track subtasks. Your input at #1493 would then be very welcome to improve this workflow. I will start with a PR proposing the grammar and compiler AST changes. Or would it be fine if we first review the AST change; and then the grammar? Having the AST first will facilitate other progress to happen in parallel. |
@mickaelistria - please have a look at my comments in - #1511 (comment) |
jikespg doesn't succeed to properly read multiple options per line locally (maybe because of different default line ending, or encoding). 1 option per line works fine.
15c0ff0
to
8e5e4ac
Compare
8e5e4ac
to
d4d7860
Compare
Adds a compiler AST UnnamedClass type and grammar rules to parse unnamed classes.
Adds some tests.
Still missing:
What it does
How to test
Author checklist