Generate shorter names for properties #2290

corymhall · 2024-08-08T18:00:21Z

Most of the schemaNestedTypes code was just moved from generate_schema.go
The new code is the code related to nestedTypeGraph

I picked an arbitrary max character limit of 120, but those still look pretty long. I checked in pulumi-aws and there are still a lot of types that are up to and over 120.

I currently just pick the shorter name if the normal name would be +120, but I feel like it would be better to introduce a new schema property to enable this at the resource level.

TODO:

I need to add some more tests to ensure this is deterministic, but I am not yet sure it is possible to make it completely deterministic without tracking state (maybe by reading in the existing schema.json)

re #1118

- Most of the `schemaNestedTypes` code was just moved from `generate_schema.go` - The new code is the code related to `nestedTypeGraph` I picked an arbitrary max character limit of 120, but those still look pretty long. I checked in `pulumi-aws` and there are still a lot of types that are up to and over 120. I currently just pick the shorter name if the normal name would be +120, but I feel like it would be better to introduce a new schema property to enable this at the resource level. TODO: - I need to add some more tests to ensure this is deterministic, but I am not yet sure it is possible to make it completely deterministic without tracking state (maybe by reading in the existing schema.json) re #1118

corymhall · 2024-08-08T18:03:40Z

internal/testprovider/schema_large_tokens.go

+	}
+}
+
+func resourceQuicksightTemplate() map[string]*schema.Schema {


I pulled this from the actual aws_quicksight_template resource for completeness sake. It's not the entire schema (i've commented out some parts so we could add them later)

codecov · 2024-08-08T18:10:27Z

Codecov Report

Attention: Patch coverage is 98.68173% with 14 lines in your changes missing coverage. Please review.

Project coverage is 57.61%. Comparing base (f18ff2f) to head (c7bb504).
Report is 47 commits behind head on master.

Files	Patch %	Lines
pkg/tfgen/generate_nested_schema_types.go	95.55%	6 Missing and 2 partials ⚠️
internal/testprovider/schema_large_tokens.go	99.30%	5 Missing and 1 partial ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master    #2290      +/-   ##
==========================================
+ Coverage   56.72%   57.61%   +0.89%     
==========================================
  Files         361      365       +4     
  Lines       49547    50822    +1275     
==========================================
+ Hits        28104    29281    +1177     
- Misses      19903    19981      +78     
- Partials     1540     1560      +20

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

PR to test out the changes in pulumi/pulumi-terraform-bridge#2290 Only checking in the java sdk to keep the file changes to a minimum

t0yv0 · 2024-08-09T21:35:01Z

pkg/tfgen/generate_schema.go

@@ -58,151 +55,6 @@ type schemaGenerator struct {
 	language Language
 }

-type schemaNestedType struct {


There's some code motion here over some tricky code, could we please separate the commits that move the code from the commits that meaningfully edit it for easier review, to be able to see the new code. Thanks a lot!

t0yv0 · 2024-08-09T22:09:01Z

I'd like to whiteboard a bit here on the approach, this is interesting but a bit hard for me to grok, perhaps we can explore at high level and either commit to this or something slightly differently formulated.

My high-level intuition is that somewhere around gatherSchemaNestedTypesForMember we're doing a pass that identifies all the nested types, and that naming these types is tightly coupled with this pass.

TypePath feels like could be of help here as it's already tracking where we find each necessary type, and TypePathSet admits that the same type can be found under multiple location - a further complication is the merging of structurally identical types here.

Although it's not written this way, the current naming is kind of decided based on the TypePath - it's stacking the properties of how we get toward a type to pick a name for it.

My intuition also is that if we're doing improved naming with conflict resolution that wants to be resilient to re-orderings and additions/removals, its' really hard to do it in gatherSchemaNestedTypesForMember in a single pass.

E.g. if we're targeting something like this (simplifying a bit glossing over inflector.Singularize for lists etc):

type namer interface {
   PickTypeName(TypePath) string 
}

Then the namer needs to perform "live" decisions and commit to picking a name before some other potentially relevant information came in. Then also traversal order needs to be guaranteed to be stable. It's hard.

If we were allowed multiple passes so the namer gets to name multiple types at once, instead on of piecemeal, the namer could solve for conflicts in a more obviously resilient way. So it'd get multiple TypePaths and maybe sort them or solve for conflicts in some other nice way. It could also eventually incorporate prior state as hints.

I was not sure glancing here if your code if your design is single-pass or multi-pass, would love to discuss. The graph approach sounded like it should have some resilience to reordering but in a single-pass traversal it still has to commit to a name before it sees all the other names.

t0yv0 · 2024-08-09T22:12:54Z

Dreaming we could somehow visually separate the code that's trying to pick names from the code that does the rest of the work, that'd be very nice for maintainability.. But it's somewhat tightly coupled perhaps necessarily so.

Another, orthogonal approach, is to leave code largely as is but do a renaming pass. That is we commit to the original naming, but then analyze the set of names S1 and produce a renaming table R that takes it to the desired better set of names S2 and then just rewrite PackageSpec so all type names are replaced. It's a bit inelegant but completely sidesteps wrangling with invariants in this code.

t0yv0 · 2024-08-09T22:41:04Z

Riffing off on this last idea, we could also introduce an explicit file that renames 'naturally allocated' names to the desired final names, and the new algorithm would simply be there to help suggesting these at scale.

pulumi-terraform-bridge.yml

typeRenames:
  aws:lex/V2modelsIntentInitialResponseSettingCodeHookPostCodeHookSpecificationSuccessConditionalConditionalBranchResponseMessageGroupMessageSsmlMessage:V2modelsIntentInitialResponseSettingCodeHookPostCodeHookSpecificationSuccessConditionalConditionalBranchResponseMessageGroupMessageSsmlMessage: aws:lex/SsmlMessage

Running make tfgen would be allowed to extend the typeRenames mappings but not modify or remove them.

Git diff could help visualize what is changing in the naming table, as well as let the maintainer interfere and suggest even better names easily.

The maintainer could remove mappings and have make tfgen re-allocate them.

Storing these in a file also would naturally introduce the necessary state in the computation as once chosen the names remain locked.

t0yv0 · 2024-08-09T22:47:22Z

Need to return to this another day, perhaps not a great idea. resources.go is already a source of Go-based configuration where you can in theory reconfigure the name by sticking the override in the right location. It's currently tedious but the idea is right, you want to specify the locations of the new names and not the "naturally allocated names" nonsense like "aws:lex/V2modelsIntentInitialResponseSettingCodeHookPostCodeHookSpecificationSuccessConditionalConditionalBranchResponseMessageGroupMessageSsmlMessage" . The trouble with specifying things in Go is that it's less amenable to building editors for, but perhaps we can work something out still. Sort of like hexops/autogold can edit files attached to Go-based configuration, we could have ShortenNames(providerInfo *ProviderInfo, nameFile string) that has RW access to nameFile.

corymhall commented Aug 8, 2024

View reviewed changes

fixing linting

c7bb504

corymhall added a commit to pulumi/pulumi-aws that referenced this pull request Aug 9, 2024

[TESTING]: bridge generate shorter type names

b1f6a40

PR to test out the changes in pulumi/pulumi-terraform-bridge#2290 Only checking in the java sdk to keep the file changes to a minimum

corymhall mentioned this pull request Aug 9, 2024

[TESTING]: bridge generate shorter type names pulumi/pulumi-aws#4356

Closed

t0yv0 reviewed Aug 9, 2024

View reviewed changes

mjeffryes assigned corymhall Aug 9, 2024

mjeffryes added this to the 0.108 milestone Aug 16, 2024

corymhall closed this Aug 22, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Generate shorter names for properties #2290

Generate shorter names for properties #2290

corymhall commented Aug 8, 2024

corymhall Aug 8, 2024

codecov bot commented Aug 8, 2024 •

edited

Loading

t0yv0 Aug 9, 2024

t0yv0 commented Aug 9, 2024

t0yv0 commented Aug 9, 2024

t0yv0 commented Aug 9, 2024

t0yv0 commented Aug 9, 2024

Generate shorter names for properties #2290

Generate shorter names for properties #2290

Conversation

corymhall commented Aug 8, 2024

corymhall Aug 8, 2024

Choose a reason for hiding this comment

codecov bot commented Aug 8, 2024 • edited Loading

Codecov Report

t0yv0 Aug 9, 2024

Choose a reason for hiding this comment

t0yv0 commented Aug 9, 2024

t0yv0 commented Aug 9, 2024

t0yv0 commented Aug 9, 2024

t0yv0 commented Aug 9, 2024

codecov bot commented Aug 8, 2024 •

edited

Loading