Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JEP-445: allow parsing unnamed classes #1487

Closed
wants to merge 3 commits into from

Conversation

mickaelistria
Copy link
Contributor

Adds a compiler AST UnnamedClass type and grammar rules to parse unnamed classes.
Adds some tests.

Still missing:

  • codegen for UnnamedClass
  • checks for Java version compatibility
  • More analyseCode to ensure we don't allow more than specified (after parsing)
  • Proper check of existence of a main() method
  • ...

What it does

How to test

Author checklist

@mickaelistria
Copy link
Contributor Author

@jarthana Here is a first step towards unnamed class support. The most important thing IMO if we want to distribute work is to agree on how this unnamed class would surface in the compiler AST API. So far it's a subtype of TypeDeclaration which is set as a field in CompilationUnitDeclaration.
If you think it's OK that way, I will try to extract that bit as a simpler commit so we can merge it ASAP and start working in parallel between parsing/validation/codegen.
I also made a proposal for the grammar/parser with some basic tests checking unnamed class is parsed decently. If you prefer, we can discuss that in a next step.

@akurtakov
Copy link
Contributor

@mpalat Manoj, can you help with this one?

@mickaelistria
Copy link
Contributor Author

The parsing still breaks some cases (I don't know what's the best approach here, is it OK to change the tests?), but it would be great if we could already discuss the AST proposal, so if it seems good enough, we can merge it and parallelize further work.

@jarthana
Copy link
Member

@mpalat Manoj, can you help with this one?

I think @manoj is in EclipseCon and I doubt he can get to this by this week. Unfortunately, I haven't studied this yet. So, it's better to wait for him to take a look.

@akurtakov
Copy link
Contributor

@mpalat Manoj, can you help with this one?

I think @manoj is in EclipseCon and I doubt he can get to this by this week. Unfortunately, I haven't studied this yet. So, it's better to wait for him to take a look.

A review from your side (even if not conclusive) is more than welcome as it would point us if we went in wrong direction somewhere.

@@ -75,6 +75,7 @@ public int compare(Object o1, Object o2) {
public TypeDeclaration[] types;
public ModuleDeclaration moduleDeclaration;
public int[][] comments;
public UnnamedClass unnamedClass;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you describe somewhere why the unnamedClass is not stored in the types array as the single element?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not obvious as per spec and grammar and javac results that the unnamedClass is always the single type element in the compilation AST. There are counter examples such as

file A.java

void main() {}
class A{}

which would even showcase that javac here sees 2 root types in the compilation unit: the unnamed one (resulting in A.class) and the named A, and this case results in an error because A is already defined.

Also, we may need to directly check the existence of an unnamedClass from time to time as it affects further compilation steps, so keeping a particular ref has been convenient.
But I think I see your point and we can instead of a field make it a method

UnnamedClass unnamedClass() {
  return types == null ? null : Stream.of(types).filter(UnnamedClass.class::isInstance).map(UnnamedClass.class::cast).findAny().orElse(null);
}

Is this what you would recommend here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this what you would recommend here?

Firstly I wanted to understand the reasoning behind it.

Generally speaking, it would be nice if the approach wasn't diverging too much from CompilationUnitDeclaration.isPackageInfo() or CompilationUnitDeclaration.isModuleInfo()

https://docs.oracle.com/en/java/javase/21/docs/specs/unnamed-classes-instance-main-methods-jls.html to me reads as if it's always the single top-level class in the CU. Which parts do you find ambiguous? Maybe that should be discussed before making a decision about the AST?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The ambiguity is in the grammar, when parsing. The example I gave earlier

// A.java
void main() {}
class A{}

Is ambiguous at parsing: is A a top-level A or a nested A$A ? This is something we cannot easily sort when parsing and we need further steps later to decide that (according to the spec and javac, it's clearly A$A here, but it's not trivial to parse it).

CompilationUnit.isPackageInfo() or CompilationUnit.isModuleInfo() are relatively simple compared because they do use different parsing rules very early, so they leave less room to parsing ambiguity.

Maybe that should be discussed before making a decision about the AST?

Guess what I'm trying with this PR ;) I'm really willing to discuss it and get a sufficient AST in place ASAP so we can then build on top of it.
If I read correctly, you're expecting CompilationUnitDeclaration.isUnnamedClass(). If so, what do you think such method would bring over my current proposal (assuming I remove the explicit unnamedClass field to replace with a method resolving against the resolved types) ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I’d expect the compilation unit to contain exactly one top-Level type declaration if it’s an unnamed class. What you suggested with the stream based boolean method indicted that there may be more than one top-level. I’d certainly expect that this is never the case so types!=null && types.length ==1 && types[0] instanceof UnnamedClass would be the condition that should return the correct result. Is that how you’d model it? (Its not clear to me yet from this WIP PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What you suggested with the stream based boolean method indicted that there may be more than one top-level

Indeed, the current AST model may allow extra top-level types together with unnamed class, but previous steps (Parser.consumeInternalCompilationUnitWithPotentialUnnamedClass) and next steps (Scope) makes that such AST cannot be fully compiled with usual ways. I can add some checks in CompilationUnitDeclaration.analyzeCode to check things before trying to resolve. Would that be enough?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What are the implications of what you just said? As far as I understand it, all code that processes the types would need to be aware of the wrong nesting of the type declarations and all code that works with the unnamed class declaration would need to be aware of the fact that some of its siblings are in fact contained? This doesn’t sound great as a model to start with.

Would be nice to get the opinion of people that are more familiar with the compilers code base than I am.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I understand it, all code that processes the types would need to be aware of the wrong nesting of the type declarations and all code that works with the unnamed class declaration would need to be aware of the fact that some of its siblings are in fact contained?

There is no wrong nesting if the AST was built by the Parser. It's just that someone using the AST can put another top-level type in the types array, and will get reported problems later if they do so.
Feel free to suggest concrete additional test-cases we can add to cover your concerns, it will help in finding the right problem and solutions.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, the current AST model may allow extra top-level types together with unnamed class,

maybe I misunderstood. You’re saying the parser will never create a wrongly nested AST? Or would it?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

when an unnamed class is detected (ie a method or a field is present at the root of the unit), then all the other types defined at the root of the compilation unit then become member types nested in the unnamed class.

@mickaelistria mickaelistria force-pushed the jep-445 branch 2 times, most recently from 6b6a785 to 0b2891f Compare October 20, 2023 08:15
@mickaelistria
Copy link
Contributor Author

I've simplified things a bit so the UnnamedClass does not surface explicitly, but instead is a specialized type in the compilation unit types list. This seems enough for validation and codegen to perform well.

@mickaelistria mickaelistria force-pushed the jep-445 branch 3 times, most recently from af196fa to ae6b847 Compare October 20, 2023 12:21
* @since 3.36
* @noreference preview feature
*/
int unnamedRequiresJava21 = PreviewRelated + 1943;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a dedicated error message that exists for all features that require preview to be enabled, and if you use that feature, there is already a quick fix set up to enable preview. I think it might be easier to use the existing error message.

See

UNNAMMED_CLASSES_AND_INSTANCE_MAIN_METHODS(ClassFileConstants.JDK21,
and ProblemReporter.validateJavaFeatureSupport

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I've fixed it in latest version of the PR.

@mickaelistria mickaelistria force-pushed the jep-445 branch 2 times, most recently from e4ac32a to 15c0ff0 Compare October 21, 2023 08:15
@mpalat
Copy link
Contributor

mpalat commented Oct 26, 2023

@mpalat Manoj, can you help with this one?

I think @manoj is in EclipseCon and I doubt he can get to this by this week. Unfortunately, I haven't studied this yet. So, it's better to wait for him to take a look.

A review from your side (even if not conclusive) is more than welcome as it would point us if we went in wrong direction somewhere.

Back to work from EclipseCon [and the following long weekend here]. First of all, great to see the contributions in the compiler part- Thanks @mickaelistria.

Let me start with a general comment- To make this whole feature development easier, and more systematic, we generally divide this activity into smaller chunks - This also helps people who are well versed in specific areas to provide more helpful feedback.

Please see Bug Tree as an example
ie it will be great if you can divide this into atleast the following indivual issues:

Once you have these issues, please put it as a sub-item under the top level issue, which is [1106]
(#1106) - like you can see in Issue 890 - First Comment - I have added first two issues for a start.

We can start with Grammar culminating in AST. Could you please extract just that part, create a PR and refer it under that issue? In parallel, I will take a look at the grammar part and put my comments there. Thanks

@mickaelistria
Copy link
Contributor Author

Thanks @mpalat . Your comments reminds me of the discussion I started a couple of week ago as an attempt to capture the workflow to deal with compiler features, and use it on a "master" issue to track subtasks. Your input at #1493 would then be very welcome to improve this workflow.

I will start with a PR proposing the grammar and compiler AST changes. Or would it be fine if we first review the AST change; and then the grammar? Having the AST first will facilitate other progress to happen in parallel.

@mpalat
Copy link
Contributor

mpalat commented Oct 26, 2023

I will start with a PR proposing the grammar and compiler AST changes. Or would it be fine if we first review the AST change; and then the grammar? Having the AST first will facilitate other progress to happen in parallel.

@mickaelistria - please have a look at my comments in - #1511 (comment)

jikespg doesn't succeed to properly read multiple options per line
locally (maybe because of different default line ending, or encoding). 1
option per line works fine.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants