Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Generate shorter names for properties #2290

Closed
wants to merge 2 commits into from

Conversation

corymhall
Copy link
Contributor

  • Most of the schemaNestedTypes code was just moved from generate_schema.go
  • The new code is the code related to nestedTypeGraph

I picked an arbitrary max character limit of 120, but those still look pretty long. I checked in pulumi-aws and there are still a lot of types that are up to and over 120.

I currently just pick the shorter name if the normal name would be +120, but I feel like it would be better to introduce a new schema property to enable this at the resource level.

TODO:

  • I need to add some more tests to ensure this is deterministic, but I am not yet sure it is possible to make it completely deterministic without tracking state (maybe by reading in the existing schema.json)

re #1118

- Most of the `schemaNestedTypes` code was just moved from
  `generate_schema.go`
- The new code is the code related to `nestedTypeGraph`

I picked an arbitrary max character limit of 120, but those still look
pretty long. I checked in `pulumi-aws` and there are still a lot of
types that are up to and over 120.

I currently just pick the shorter name if the normal name would be +120,
but I feel like it would be better to introduce a new schema property to
enable this at the resource level.

TODO:
 - I need to add some more tests to ensure this is deterministic, but I
   am not yet sure it is possible to make it completely deterministic
   without tracking state (maybe by reading in the existing schema.json)

re #1118
}
}

func resourceQuicksightTemplate() map[string]*schema.Schema {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I pulled this from the actual aws_quicksight_template resource for completeness sake. It's not the entire schema (i've commented out some parts so we could add them later)

Copy link

codecov bot commented Aug 8, 2024

Codecov Report

Attention: Patch coverage is 98.68173% with 14 lines in your changes missing coverage. Please review.

Project coverage is 57.61%. Comparing base (f18ff2f) to head (c7bb504).
Report is 47 commits behind head on master.

Files Patch % Lines
pkg/tfgen/generate_nested_schema_types.go 95.55% 6 Missing and 2 partials ⚠️
internal/testprovider/schema_large_tokens.go 99.30% 5 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2290      +/-   ##
==========================================
+ Coverage   56.72%   57.61%   +0.89%     
==========================================
  Files         361      365       +4     
  Lines       49547    50822    +1275     
==========================================
+ Hits        28104    29281    +1177     
- Misses      19903    19981      +78     
- Partials     1540     1560      +20     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

corymhall added a commit to pulumi/pulumi-aws that referenced this pull request Aug 9, 2024
PR to test out the changes in pulumi/pulumi-terraform-bridge#2290

Only checking in the java sdk to keep the file changes to a minimum
@@ -58,151 +55,6 @@ type schemaGenerator struct {
language Language
}

type schemaNestedType struct {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's some code motion here over some tricky code, could we please separate the commits that move the code from the commits that meaningfully edit it for easier review, to be able to see the new code. Thanks a lot!

@t0yv0
Copy link
Member

t0yv0 commented Aug 9, 2024

I'd like to whiteboard a bit here on the approach, this is interesting but a bit hard for me to grok, perhaps we can explore at high level and either commit to this or something slightly differently formulated.

My high-level intuition is that somewhere around gatherSchemaNestedTypesForMember we're doing a pass that identifies all the nested types, and that naming these types is tightly coupled with this pass.

TypePath feels like could be of help here as it's already tracking where we find each necessary type, and TypePathSet admits that the same type can be found under multiple location - a further complication is the merging of structurally identical types here.

Although it's not written this way, the current naming is kind of decided based on the TypePath - it's stacking the properties of how we get toward a type to pick a name for it.

My intuition also is that if we're doing improved naming with conflict resolution that wants to be resilient to re-orderings and additions/removals, its' really hard to do it in gatherSchemaNestedTypesForMember in a single pass.

E.g. if we're targeting something like this (simplifying a bit glossing over inflector.Singularize for lists etc):

type namer interface {
   PickTypeName(TypePath) string 
}

Then the namer needs to perform "live" decisions and commit to picking a name before some other potentially relevant information came in. Then also traversal order needs to be guaranteed to be stable. It's hard.

If we were allowed multiple passes so the namer gets to name multiple types at once, instead on of piecemeal, the namer could solve for conflicts in a more obviously resilient way. So it'd get multiple TypePaths and maybe sort them or solve for conflicts in some other nice way. It could also eventually incorporate prior state as hints.

I was not sure glancing here if your code if your design is single-pass or multi-pass, would love to discuss. The graph approach sounded like it should have some resilience to reordering but in a single-pass traversal it still has to commit to a name before it sees all the other names.

@t0yv0
Copy link
Member

t0yv0 commented Aug 9, 2024

Dreaming we could somehow visually separate the code that's trying to pick names from the code that does the rest of the work, that'd be very nice for maintainability.. But it's somewhat tightly coupled perhaps necessarily so.

Another, orthogonal approach, is to leave code largely as is but do a renaming pass. That is we commit to the original naming, but then analyze the set of names S1 and produce a renaming table R that takes it to the desired better set of names S2 and then just rewrite PackageSpec so all type names are replaced. It's a bit inelegant but completely sidesteps wrangling with invariants in this code.

@t0yv0
Copy link
Member

t0yv0 commented Aug 9, 2024

Riffing off on this last idea, we could also introduce an explicit file that renames 'naturally allocated' names to the desired final names, and the new algorithm would simply be there to help suggesting these at scale.

pulumi-terraform-bridge.yml

typeRenames:
  aws:lex/V2modelsIntentInitialResponseSettingCodeHookPostCodeHookSpecificationSuccessConditionalConditionalBranchResponseMessageGroupMessageSsmlMessage:V2modelsIntentInitialResponseSettingCodeHookPostCodeHookSpecificationSuccessConditionalConditionalBranchResponseMessageGroupMessageSsmlMessage: aws:lex/SsmlMessage

Running make tfgen would be allowed to extend the typeRenames mappings but not modify or remove them.

Git diff could help visualize what is changing in the naming table, as well as let the maintainer interfere and suggest even better names easily.

The maintainer could remove mappings and have make tfgen re-allocate them.

Storing these in a file also would naturally introduce the necessary state in the computation as once chosen the names remain locked.

@t0yv0
Copy link
Member

t0yv0 commented Aug 9, 2024

Need to return to this another day, perhaps not a great idea. resources.go is already a source of Go-based configuration where you can in theory reconfigure the name by sticking the override in the right location. It's currently tedious but the idea is right, you want to specify the locations of the new names and not the "naturally allocated names" nonsense like "aws:lex/V2modelsIntentInitialResponseSettingCodeHookPostCodeHookSpecificationSuccessConditionalConditionalBranchResponseMessageGroupMessageSsmlMessage" . The trouble with specifying things in Go is that it's less amenable to building editors for, but perhaps we can work something out still. Sort of like hexops/autogold can edit files attached to Go-based configuration, we could have ShortenNames(providerInfo *ProviderInfo, nameFile string) that has RW access to nameFile.

@mjeffryes mjeffryes added this to the 0.108 milestone Aug 16, 2024
@corymhall corymhall closed this Aug 22, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants