Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix!: improve case utils for more optimal member names #975

Merged
merged 9 commits into from
Oct 23, 2023
Merged

Conversation

aajtodd
Copy link
Contributor

@aajtodd aajtodd commented Oct 10, 2023

Issue #

fixes awslabs/aws-sdk-kotlin#1064

Description of changes

Pull in splitOnWordBoundaries implementation from Rust to fix awslabs/aws-sdk-kotlin#1064.
Adds a current snapshot of all the member names from every AWS model and the current expected value to allow us to see the scope of these changes and in the future ensure we don't break them if we need to handle new edge cases. Same for sdk IDs.

This is a breaking change to how names for pretty much everything are formed.

See this commit for the differences in member and client names.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@aajtodd aajtodd marked this pull request as ready for review October 13, 2023 16:55
@aajtodd aajtodd requested a review from a team as a code owner October 13, 2023 16:55
Copy link
Contributor

@lauzadis lauzadis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All the new service names seem good to me!

// all non-alphanumeric characters: "acm-success"-> "acm success"
result = result.replace(Regex("[^A-Za-z0-9+]"), " ")
// emit the current word and update from the next character
val emit = { next: Char ->
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this lambda function definitely seems Rusty

@Test
fun testAllNames() {
// Set this to true to write a new test expectation file
val publishUpdate = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we just always have this enabled?

val publishUpdate = true
val allNames = this::class.java.getResource("/all-names-test-output.txt")?.readText()!!
val errors = mutableListOf<String>()
val output = StringBuilder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we add a header to the file input, actual?

fun testClientNames() {
// jq '.. | select(.sdkId?).sdkId' codegen/sdk/aws-models/*.json > /tmp/sdk-ids.txt
// Set this to true to write a new test expectation file
val publishUpdate = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

same comment about whether this should be configurable or not

val publishUpdate = true
val allNames = this::class.java.getResource("/sdk-ids-test-output.txt")?.readText()!!
val errors = mutableListOf<String>()
val output = StringBuilder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also same comment about file header, it makes it easier to know what's what

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a way to make this auto-generated also?

// with minor changes (s3 and iot as whole words). Previously we used the Java v2 implementation
// https://github.com/aws/aws-sdk-java-v2/blob/2.20.162/utils/src/main/java/software/amazon/awssdk/utils/internal/CodegenNamingUtils.java#L36
// but this has some edge cases it doesn't handle well
val out = mutableListOf<String>()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Style: Could be replaced with return buildList { ... }.

}

// Skip cases like `AR[N]s`, `AC[L]s` but not `IAM[U]ser`
if (peek == 's' && (doublePeek == null || !doublePeek.isLowerCase())) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: (doublePeek == null || !doublePeek.isLowerCase())doublePeek?.isLowerCase() != true

Comment on lines +93 to +94
// Skip cases like `DynamoD[B]v2`
return !(peek == 'v' && doublePeek?.isDigit() == true)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness/Question: Why do we want to skip cases like this? It seems to me that "DynamoDBv2" in this case represents the words ["Dynamo", "DB", "v2"] not ["Dynamo", "DBv2"]. I think "IPv4" and "IPv6" are exceptions because they're actually whole initialisms but I think generally "vn" is the start of a new word. Several test expectations below changed for the worse IMHO (e.g., "TlsV1_2""Tlsv1_2", "MsChapV1""MsChapv1", etc.).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changing this results in:

Member Names

DynamoDBv2Action,dynamoDbv2Action => dynamoDBv2Action (expected dynamoDbv2Action)
dynamoDBv2,dynamoDbv2 => dynamoDBv2 (expected dynamoDbv2)

Enum Names

expected 'MsChapv1' != actual 'MsChaPv1' for input: MS-CHAPv1
expected 'Tlsv1' != actual 'TlSv1' for input: TLSv1
expected 'Tlsv1_2' != actual 'TlSv1_2' for input: TLSv1.2
org.opentest4j.AssertionFailedError: expected 'MsChapv1' != actual 'MsChaPv1' for input: MS-CHAPv1
expected 'Tlsv1' != actual 'TlSv1' for input: TLSv1
expected 'Tlsv1_2' != actual 'TlSv1_2' for input: TLSv1.2

Client Names

SESv2,Sesv2 => SeSv2 (expected Sesv2)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Right, turns out I didn't fully understand the endOfAcronym function. Since it's working on determining whether current is the end, I think we need to add a new special handling case between // Not an acronym in progress and // We aren't at the next word yet:

if (nextChar == 'v' && peek?.isDigit() == true) {
    // Handle cases like `DynamoDB[v]2`
    return true
}

This yields dynamoDbV2, MsChapV1, TlsV1_2, etc.

Full function code
private fun endOfAcronym(current: String, nextChar: Char, peek: Char?, doublePeek: Char?): Boolean {
    if (!current.last().isUpperCase()) {
        // Not an acronym in progress
        return false
    }

    if (nextChar == 'v' && peek?.isDigit() == true) {
        // Handle cases like `DynamoDB[v]2`
        return true
    }

    if (!nextChar.isUpperCase()) {
        // We aren't at the next word yet
        return false
    }

    if (peek?.isLowerCase() != true) {
        return false
    }

    // Skip cases like `AR[N]s`, `AC[L]s` but not `IAM[U]ser`
    if (peek == 's' && doublePeek?.isLowerCase() != true) {
        return false
    }

    // Skip cases like `DynamoD[B]v2`
    return !(peek == 'v' && doublePeek?.isDigit() == true)
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: This file appears to specifically be a CSV, not just a TXT. Applies to all-names.txt and sdk-ids-test-output.txt as well (and arguably sdk-ids.txt).

Comment on lines 269 to 274
if (publishUpdate) {
File("sdk-ids-test-output.txt").writeText(output.toString())
}
if (errors.isNotEmpty()) {
fail(errors.joinToString("\n"))
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correctness/Question: Should we really publish the update if there are errors? Or should the error check (and potential fail) happen first?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the update doesn't overwrite the file, it ends up generated at the root of the project so you can diff them

Comment on lines 40 to 41
// println("completeWordInProgress: $completeWordInProgress, curr: $currentWord")

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: Remove

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
0.0% 0.0% Duplication

@aajtodd aajtodd merged commit cb23b6b into main Oct 23, 2023
13 of 14 checks passed
@aajtodd aajtodd deleted the fix-case-utils branch October 23, 2023 18:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

cloudformation UpdateStackRequest member is named notificationArNs
3 participants