Ensure shorter tokens for object types #1118
Comments
Noting some comments per @danielrbradley: scoping per resource may be beneficial for a gradual rollout instead of switching the default wholesale. Something that abstracts the ObjectTypeNamingStrategy parameter could keep the current strategy as the default but allow opting in to shorter names per resource/datasource. For the ecosystem team, we can also provide a way to opt in to a different global strategy if needed, say on larger providers like OCI.
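To make the opt-in idea concrete, here is a minimal sketch of a per-resource strategy lookup with a global default. Everything in it (`NamingStrategy`, `NamingConfig`, `StrategyFor`, the resource name used below) is hypothetical and only illustrates the shape of the configuration; it is not the bridge's actual API.

```go
package main

// NamingStrategy is a hypothetical enum describing how object type tokens are picked.
type NamingStrategy int

const (
	// LongNames stands in for the current behavior and is the zero value,
	// so an empty configuration changes nothing.
	LongNames NamingStrategy = iota
	// ShortNames stands in for the proposed shorter tokens.
	ShortNames
)

// NamingConfig is a hypothetical configuration: a global default plus
// per-resource (or per-datasource) overrides keyed by the TF name.
type NamingConfig struct {
	Default     NamingStrategy
	PerResource map[string]NamingStrategy
}

// StrategyFor returns the override for the given resource if one exists,
// and the global default otherwise.
func (c NamingConfig) StrategyFor(resource string) NamingStrategy {
	if s, ok := c.PerResource[resource]; ok {
		return s
	}
	return c.Default
}
```

With this shape, a provider could opt a single resource in with `NamingConfig{PerResource: map[string]NamingStrategy{"aws_quicksight_template": ShortNames}}`, while a large provider could instead set `Default: ShortNames` globally.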
Note that it looks like .Type/.NestedType cannot currently be relied on to influence the picked tokens, so that is definitely something we need to make sure is available here.
@mjeffryes this is the exciting ticket I was thinking of.
`tfbridge.SchemaInfo` has an `Omit bool` field, which will omit a property from the schema. Unfortunately, omitted fields don't omit their associated types from the schema, leading to unnecessary bloat. This leads to code such as this (from aws):

```go
// We removed the `definition` property from quicksights.Template, see
// #1118
// But the types are still present in the schema, which pollutes the Go SDK
// specifically. This function removes those types from the schema.
func removeUnusedQuicksightTypes(pulumiPackageSpec *schema.PackageSpec) {
	var elidedTypes []string
	for tok := range pulumiPackageSpec.Types {
		if strings.Contains(tok, ":quicksight/AnalysisDefinition") ||
			strings.Contains(tok, ":quicksight/DashboardDefinition") ||
			strings.Contains(tok, ":quicksight/TemplateDefinition") {
			elidedTypes = append(elidedTypes, tok)
		}
	}
	for _, tok := range elidedTypes {
		delete(pulumiPackageSpec.Types, tok)
	}
}
```

`Omit: true` is only used in 2 providers: `pulumi-aws` and `pulumi-alicloud` ([source](https://github.com/search?q=org%3Apulumi%20Omit%3A%20true&type=code)). Datadog has recently added some hefty (+900k lines) types, and I am considering using `Omit: true` to trim unwanted types. This is a necessary precursor for that usage. It also has the advantage of making more intuitive sense to future users of the bridge.
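A provider-independent alternative to matching token substrings is to drop every type that is no longer reachable from a resource, datasource, or config property. The sketch below shows that idea on a simplified model: `typeRefs` and `pruneUnreachable` are hypothetical names, and the map of token to referenced tokens stands in for the real schema types, so this is an illustration rather than the bridge's implementation.

```go
package main

// typeRefs is a simplified, hypothetical stand-in for the schema's type
// section: it maps a type token to the tokens its properties reference.
type typeRefs map[string][]string

// pruneUnreachable returns a new map holding only the types reachable from
// roots (the tokens still referenced by non-omitted properties).
// Roots or references that name unknown tokens are ignored.
func pruneUnreachable(types typeRefs, roots []string) typeRefs {
	kept := typeRefs{}
	// Copy roots so the caller's slice is never modified.
	stack := append([]string(nil), roots...)
	for len(stack) > 0 {
		tok := stack[len(stack)-1]
		stack = stack[:len(stack)-1]
		refs, known := types[tok]
		if !known {
			continue
		}
		if _, seen := kept[tok]; seen {
			continue // already visited; also guards against reference cycles
		}
		kept[tok] = refs
		stack = append(stack, refs...)
	}
	return kept
}
```

Once a property is omitted, its type simply stops appearing among the roots, so the type and everything only it referenced fall out of the result without per-provider cleanup code.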
Merging #1550 will allow users to override type tokens by setting |
- Most of the `schemaNestedTypes` code was just moved from `generate_schema.go`
- The new code is the code related to `nestedTypeGraph`

I picked an arbitrary max character limit of 120, but those names still look pretty long. I checked in `pulumi-aws` and there are still a lot of types that are up to and over 120 characters. I currently just pick the shorter name if the normal name would be over 120 characters, but I feel like it would be better to introduce a new schema property to enable this at the resource level.

TODO:
- I need to add some more tests to ensure this is deterministic, but I am not yet sure it is possible to make it completely deterministic without tracking state (maybe by reading in the existing schema.json)

re #1118
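The determinism concern can be illustrated with a small sketch. It assumes each nested type is identified by its property path as PascalCase segments, and it hands out the shortest unclaimed suffix in sorted order. `assignShortNames` is a hypothetical helper and is not the `nestedTypeGraph` code from the PR.

```go
package main

import (
	"sort"
	"strings"
)

// assignShortNames maps each property path (joined with ".") to the shortest
// suffix of its segments that no earlier path has claimed. Paths are handled
// in sorted order so the result does not depend on the input order.
// Assumption: the fully joined names are unique; the fallback does not
// re-check them for collisions.
func assignShortNames(paths [][]string) map[string]string {
	sorted := make([][]string, len(paths))
	copy(sorted, paths)
	sort.Slice(sorted, func(i, j int) bool {
		return strings.Join(sorted[i], ".") < strings.Join(sorted[j], ".")
	})

	taken := map[string]bool{}
	names := map[string]string{}
	for _, p := range sorted {
		key := strings.Join(p, ".")
		if _, done := names[key]; done || len(p) == 0 {
			continue // skip duplicate or empty paths
		}
		name := strings.Join(p, "") // fallback: the full long name
		for n := 1; n < len(p); n++ {
			if c := strings.Join(p[len(p)-n:], ""); !taken[c] {
				name = c
				break
			}
		}
		taken[name] = true
		names[key] = name
	}
	return names
}
```

This is deterministic for a fixed set of paths, but adding one new path that sorts earlier can still take a short name away from an existing type. That is the kind of instability that tracking state, for example by reading the previous schema.json, would be needed to avoid.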
We have some progress related to this issue.
Hello!
Issue details
When mapping deeply nested TF properties to the Pulumi Package Schema `.types` section, tfgen needs to pick "tokens" to identify every record/object type it encounters, and sometimes it picks very long tokens. Unfortunately these tokens are not simply abstract identifiers, but are used by every language codegen (C#, Py, TS, Java, Go) to pick names for classes representing these structures. This can lead to very long class names, because the current naming heuristic makes the name length O(N) in the depth of the property.

Example of an unusable name:
TemplateDefinitionSheetVisualGeospatialMapVisualChartConfigurationFieldWellsGeospatialMapAggregatedFieldWellsValueNumericalMeasureFieldFormatConfigurationNumericFormatConfigurationPercentageDisplayFormatConfigurationSeparatorConfigurationThousandsSeparator
In languages that require file-per-class this additionally creates problems with very long filenames.
The feature request is to make a change to generate shorter tokens for object types. This is scoped to only types of resource/datasource/provider properties, and will not change anything for types representing resources, datasources or provider config itself. The change will make codegen class names shorter, easier to work with, more pleasant, and avoid hard limits on filenames in languages such as C#.
The last segment of the token must be shorter than 256 characters, as that is where codegen breaks. Arguably we could also use a smaller cutoff, such as 120, for aesthetic reasons. We can make this configurable, defaulting to some reasonable value.
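As a rough illustration of a configurable cutoff, the sketch below builds the name from the property path and drops leading segments until it fits. `shortenTypeName` and the choice to drop prefixes are assumptions made for illustration, not the heuristic the bridge uses.

```go
package main

import "strings"

// shortenTypeName joins PascalCase path segments into a type name. If the
// name is longer than maxLen bytes, leading segments are dropped one at a
// time until it fits. The last segment is always kept, so the result can
// still exceed maxLen when that segment alone is too long.
func shortenTypeName(segments []string, maxLen int) string {
	for start := 0; start < len(segments); start++ {
		name := strings.Join(segments[start:], "")
		if len(name) <= maxLen || start == len(segments)-1 {
			return name
		}
	}
	return "" // no segments were given
}
```

For example, `shortenTypeName([]string{"Template", "Definition", "Sheet", "Visual"}, 12)` yields `SheetVisual`, while a limit of 120 leaves the full name untouched. Dropping prefixes can make two different paths collide on the same short name, so a real implementation would also need a disambiguation step.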
Note that this change will be technically breaking, because it will translate to auxiliary types being moved, and some programs depending on the old location of these types will be broken with the update. To mitigate the break, we need to ensure that an escape hatch is available for the bridged provider maintainer to hand-write a token of a given auxiliary type. It looks like the `.Type` or `.NestedType` fields on SchemaInfo may help with this purpose, but we need to confirm, as it's not obvious. The expectation is that the maintainers will choose to mostly accept shorter names as a "welcome breaking change", but make exceptions for auxiliary types that happen to be widely used, based on user feedback.

Affected area/feature