Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: refactor URL types and supporting classes #989

Merged
merged 19 commits into from
Nov 22, 2023
Merged

Conversation

ianbotsf
Copy link
Contributor

@ianbotsf ianbotsf commented Oct 27, 2023

Issue #

Addresses #659

Description of changes

  • Breaking: Completely rewrite URL APIs to clarify content encoding, when data is in which state, and to reduce the number of times data is encoded/decoded.
    • Remove several now-unused functions and classes mainly relating to URL encoding/decoding
  • Breaking: Move several classes out of aws.smithy.kotlin.runtime.util into better packages like text and collections
  • Add new MultiMap/MutableMultiMap collection types for supporting URL query parameters
  • Add new collection views for providing live mapping of underlying collections. These facilitate providing, for example, a view of data in both encoded and decoded forms while still having a single source of truth.

⚠️ Callouts

  • Tests are in a bit of a bad state right now. I ripped out a bunch of tests against the old APIs but don't yet have the same level of coverage for the new ones...at least at the unit test level. I'll add more but feel free to highlight particular areas you feel could use better support.
  • Still looking for feedback on the ergonomics and usability of the new APIs. The updated tests/codegen should provide some examples.

Companion PR: awslabs/aws-sdk-kotlin#1119

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

override fun toString(): String = buildString {
append("Encodable(decoded=")
append(decoded)
append(",encoded=")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: should we have a space after comma

public val Fragment: Encoding = PercentEncoding("fragment", VALID_FCHAR)

internal val None = object : Encoding {
override val name = "(no encoding)"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not empty string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This value is mainly used for Encodable.toString. Any hypothetical to-string of a non-encoded value would look weird with an empty string name: "Encodable(decoded=foo, encoded=foo, encoding=)".

i++
}
} else {
append(decodeMap[c] ?: throw IllegalArgumentException("unknown encoding"))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: add c to the exception message

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correctness: missing tests

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this PR omits basically all tests except for some I wrote to help me verify some basics of implementation. After we settle on the API design, I'll write the rest of the tests for this class and others.

processAndSkip(literal, startIndex, handler)
}

override fun toString(): String = "Scanner(remainingText='${text.substring(currentIndex)}')"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would it be useful to also print the text as it was given?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I personally don't think so for two reasons:

  1. It's likely to be verbose and you could wind up with to-string representations like "Scanner(text='The quick brown fox jumped over the lazy dogs',remainingText='quick brown fox jumped over the lazy dogs')". Seems noisy.
  2. The scanner is stateful and forward-only. Any text that was previously processed is effectively out of scope and seems irrelevant in a toString output. The only reason this class doesn't discard already-processed text (e.g., by making text a var and replacing it via substring) is for performance.

Copy link
Contributor

@aajtodd aajtodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks ok, no huge objections just some cleanup, polish, and of course tests.

It's hard to tell how big of a change this will be. I still have some concerns on usability but those may be addressed easily IDK.

* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
* SPDX-License-Identifier: Apache-2.0
*/
package aws.smithy.kotlin.runtime.util.text.encoding
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering if we shouldn't rethink util a bit. In particular we have quite a few "text" things not in text like base64 and hex that could be in an "encoding" package.

e.g.

runtime-core/aws/smithy/kotlin/runtime/
    text/
    collections/
    util/
    ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of rethinking the util package but I'm not quite clear on the suggestion. More verbosely, are you suggesting this:

  • aws.smithy.kotlin.runtime
    • collections
      • MutableListView.kt
      • ValuesMap.kt
      • ...
    • text
      • Scanner.kt
      • ...
      • encoding
        • Base64.kt
        • Encoding.kt
        • Hex.kt
        • ...
    • util
      • ...(basically whatever's left)

If so, I like it and I can incorporate it into the wider Url refactor.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes basically something like that. We clearly have text and collection related things that are just sort of random and could be in their own packages at this point I think.


import aws.smithy.kotlin.runtime.InternalApi

private val ALPHA = (('A'..'Z') + ('a'..'z')).toSet()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like we maybe use this and others elsewhere 🤔 , wonder if we shouldn't just make these public in the text package. Not super important just thinking out loud. Ignore if you want.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only other place I found similar things was in the existing URL parsing logic. Once that's replaced, I don't know that these will be shared with any other code.

That said, I don't have a problem making them public but I should probably move them to be namespaced under the PercentEncoding companion object.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I could be mistaken, if not we can leave them private.

private val VALID_QCHAR = VALID_FCHAR - setOf('&', '=')

@InternalApi
public interface Encoding {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels too generic of a name given the encodings seem to be related to url percent encoding/specific specs 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Encoding itself can theoretically represent any encoding, including Base64, hex bytes, etc. I should move out PercentEncoding to its own file. The companion vals like UserInfo, Path, etc., should also move.

* @param encoded An encoded URL string
* @return A new [Url] instance
*/
public fun parseEncoded(encoded: String): Url = Url {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there is only one parse method then I still strongly suggest this just be Url.parse().

/**
* Get or set the URL path as a **decoded** string
*/
public var pathDecoded: String
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: decodedPath/encodedPath. (same for other members). You already use encodedSegments/decodedSegments on UrlPath.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good point.

* Update the [QueryParameters] of this URL via a DSL builder block
* @param block The code to apply to the [QueryParameters] builder
*/
public fun queryParameters(block: QueryParameters.Builder.() -> Unit) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be an extension function as it is now. Also no reason to rename it from parameters if you want to reduce churn on the codegen and the runtime.


// Query parameters

private var queryParameters = url?.queryParameters?.toBuilder()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should have direct access and be non-nullable as it is now.

public val parameters = url?.parameters?.toBuilder() ?: QueryParameters.Empty

public val host: Host,
public val port: Int,
public val path: UrlPath,
public val queryParameters: QueryParameters?,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion: Leave as non-nullable and just default it to QueryParameters.Empty.

*/
public fun toBuilder(): Builder = Builder(this)

override fun toString(): String = asEncoded(segments, trailingSlash)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Probably document that this is the encoded version returned.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point.

/**
* Remove all query parameters from this URL
*/
public fun clearQueryParameters() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems unnecessary, if we just give access to the builder directly you can do urlBuilder.parameters.clear().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've left several comments around query parameters and nullability vs emptiness. Normally I'm a strong proponent of non-nullable collections since emptiness and null generally mean the same thing. In this case, however, there's a subtle difference that the API clearly doesn't communicate:

RFC 3986 § 6.2.3 states that these two URIs are not equivalent and cannot be normalized to each other:

  • http://example.com/
  • http://example.com/?

That's because the second URI has a query string (and that string is empty) whereas the first URI does not have a query string.

That's why Url.queryParameters is nullable. If it's present, a ? gets added to the encoded representation followed by the encoded elements in the map. Similarly, clearQueryParameters is not equivalent to queryParameters.clear(): the former ensures no query string is present while the latter means that a query string is present and empty.

If we want to deviate from the RFC here that's fine, I think we can. If we want to stick to the RFC, I'd welcome suggestions on how to make the distinction between no query string and empty query string more clear from an API perspective.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously we handled this with the forceQuery parameter on the builder. I'm not suggesting to ignore this but it's not something we've really needed or used much so can probably be moved to a boolean flag like that (either under Url.Builder directly or as part of QueryParameters[.Builder])).

Mostly I think it's an ergonomics thing and IMHO the non-nullably empty version of this makes it easier to consume and composes better (whereas making it nullable hoists a bunch of concerns of it's parts into the Url.Builder directly rather than letting the individual part deal with it).

/**
* Get or set the URL path as a **decoded** string
*/
public var pathDecoded: String
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep, good point.

/**
* Remove all query parameters from this URL
*/
public fun clearQueryParameters() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You've left several comments around query parameters and nullability vs emptiness. Normally I'm a strong proponent of non-nullable collections since emptiness and null generally mean the same thing. In this case, however, there's a subtle difference that the API clearly doesn't communicate:

RFC 3986 § 6.2.3 states that these two URIs are not equivalent and cannot be normalized to each other:

  • http://example.com/
  • http://example.com/?

That's because the second URI has a query string (and that string is empty) whereas the first URI does not have a query string.

That's why Url.queryParameters is nullable. If it's present, a ? gets added to the encoded representation followed by the encoded elements in the map. Similarly, clearQueryParameters is not equivalent to queryParameters.clear(): the former ensures no query string is present while the latter means that a query string is present and empty.

If we want to deviate from the RFC here that's fine, I think we can. If we want to stick to the RFC, I'd welcome suggestions on how to make the distinction between no query string and empty query string more clear from an API perspective.

/**
* Get or set the query parameters as a **decoded** string
*/
public var queryParametersDecoded: String?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We've got existing code that takes a query string, splits it into query parameters, and then stuffs those into a parameters builder:

if (uri.query != null && uri.query.isNotBlank()) {
    val parsedParameters = uri.query.splitAsQueryParameters()
    parameters.appendAll(parsedParameters)
}

Seems like it'd make sense to move that capability inside of the builder itself so that calling code can just:

encodedQueryParameters = uri.query // handles the null case too!

*/
public fun toBuilder(): Builder = Builder(this)

override fun toString(): String = asEncoded(segments, trailingSlash)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point.

* Copyright Amazon.com, Inc. or its affiliates. All Rights Reserved.
* SPDX-License-Identifier: Apache-2.0
*/
package aws.smithy.kotlin.runtime.util.text.encoding
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like the idea of rethinking the util package but I'm not quite clear on the suggestion. More verbosely, are you suggesting this:

  • aws.smithy.kotlin.runtime
    • collections
      • MutableListView.kt
      • ValuesMap.kt
      • ...
    • text
      • Scanner.kt
      • ...
      • encoding
        • Base64.kt
        • Encoding.kt
        • Hex.kt
        • ...
    • util
      • ...(basically whatever's left)

If so, I like it and I can incorporate it into the wider Url refactor.


import aws.smithy.kotlin.runtime.InternalApi

private val ALPHA = (('A'..'Z') + ('a'..'z')).toSet()
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The only other place I found similar things was in the existing URL parsing logic. Once that's replaced, I don't know that these will be shared with any other code.

That said, I don't have a problem making them public but I should probably move them to be namespaced under the PercentEncoding companion object.

private val VALID_QCHAR = VALID_FCHAR - setOf('&', '=')

@InternalApi
public interface Encoding {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a good point. Encoding itself can theoretically represent any encoding, including Base64, hex bytes, etc. I should move out PercentEncoding to its own file. The companion vals like UserInfo, Path, etc., should also move.

Comment on lines 53 to 62
@Test
fun testEquivalenceOfQueryStrings() {
testEquivalence(
// "https://amazon.com?", // Old parser strips '?' with empty query params...is bug?
"https://amazon.com?foo=bar",
// "https://amazon.com?foo=bar&baz=qux", // Old parser sorts query by key...is bug?
"https://amazon.com?foo=f%F0%9F%98%81o",
// "https://amazon.com?foo=f+o%20o", // Old parser encodes qparam ' ' to %20 instead of '+'...is bug?
)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: No one remarked on these commented-out test cases. Our existing URL parser has some non-standard behavior that was surprising to me, such as stripping empty query strings and empty fragments. The new parser (correctly, I believe) retains them. This matches the RFC but is a different from our current behavior and may cause unexpected problems.

Do we have any thoughts about RFC adherence vis-à-vis current behavior?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Stripping empty query strings and fragments 🤷 . I think a parser could set these as empty and if we go with my suggestion of a flag you could set the flag but in practice it's not really going to matter for us I don't think.

The %20 vs + depends on whether the content type is application/x-www-form-urlencoded.

https://en.wikipedia.org/wiki/Percent-encoding#The_application.2Fx-www-form-urlencoded_type

Our existing implementation branches on this but from what I can tell we have generally not been using form-url encoding. Your decode function will have to handle this for query params but for encode I'd probably default to %20 and maybe provide a flag or something to enable encoding as + like we do now.

@ianbotsf ianbotsf marked this pull request as ready for review November 17, 2023 18:25
@ianbotsf ianbotsf requested a review from a team as a code owner November 17, 2023 18:25
Copy link
Contributor

@aajtodd aajtodd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good

val encodeFn = if (segment.isGreedyLabel) {
writer.format("#T(greedy = true)", encodeSymbol)
if (segment.isGreedyLabel) {
write("addAll(#S.split(#S))", "\${$identifier}", '"')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question/correctness: Why is this splitting on "?

Also previously we used a dedicated function encodeLabel which adheres to slightly different encoding semantics than HTTP RFC does. I'm not sure this is correct to replace appending (or splitting greedy labels and appending) as it does now.

Copy link
Contributor Author

@ianbotsf ianbotsf Nov 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right on both accounts. I've fixed the " splitting to be / as intended. I've fixed part of the label encoding but I still have more changes to make to support query parameters.

*/
package aws.smithy.kotlin.runtime.collections

public interface MultiMap<K, V> : Map<K, List<V>> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reminder to go back and sweep for docs

fragment = url.fragment
}

public val requestRelativePath: String
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs

get() = asEncoded(segments, trailingSlash)
set(value) { parseEncoded(value) }

public val segments: MutableList<Encodable> = path?.segments?.toMutableList() ?: mutableListOf()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Is it worth exposing this? When would we or a user need to interact with this vs encodedSegments/decodedSegements? Same question for other places where we follow this pattern where we have the trio (Encodable, Encoded, Decoded) exposed as fields.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could make it @InternalApi. It seemed slightly weird that the built UrlPath would have a segments: List<Encodable> member but the builder wouldn't have a mutable equivalent. But practically speaking, most times users will interact with the encoded/decoded variants.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No it's fine. It was more of a sanity check "do we need this" but if we make use of it somewhere then that's ok. I was imagining that most code was setting/getting it through encodedSegments/decodedSegments.

* * Segments of `..` are used to discard ancestor paths (e.g., `/a/b/../c` → `/a/c`)
* * All other segments are unmodified
*/
public fun normalize() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: We don't call this by default anywhere on the path right? Just for signing?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The SDK does not call this anywhere by default except during signing (on a copy), Still, I added it here because it seems useful for users to be able to normalize paths which may be constructed dynamically, coming from external sources, etc.

}

private val encoded: String
public val requestRelativePath: String
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs?

@@ -113,7 +102,7 @@ public suspend fun dumpRequest(request: HttpRequestBuilder, dumpBody: Boolean):
val buffer = SdkBuffer()

// TODO - we have no way to know the http version at this level to set HTTP/x.x
buffer.writeUtf8("${request.method} ${request.url.encodedPath}\r\n")
buffer.writeUtf8("${request.method} ${request.url.requestRelativePath}\r\n")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

question: Why would this (and elsewhere where we've replaced encodedPath with requestRelativePath) not be path.encoded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The previous encodedPath field was a misnomer. Beyond just the path, it also included the query parameters and fragment. Further, it also prepended a / if none was present. Replacing it with path.encoded (as I did initially based on the name) causes breaks in signing, protocol tests, and several other places.

I chose to name it requestRelativePath because the places this is used generally deal with forming an HTTP request line (e.g., GET /requestRelativePath?foo=bar#baz HTTP/1.1). It's still kinda wordy though...open to alternative naming suggestions.

public val port: Int,
public val path: UrlPath,
public val parameters: QueryParameters,
public val userInfo: UserInfo,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

suggestion this could remain nullable, we don't use it much and unlike QueryParameters and Headers it's "empty" state is less useful.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that I've switched this to non-nullable I find I prefer it since it involves less null-coalescing operations for users. I could switch it back if you feel strongly but I have a slight preference for leaving it non-null.


import aws.smithy.kotlin.runtime.InternalApi
import aws.smithy.kotlin.runtime.util.CaseInsensitiveMap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fix: Should this not be in collections?

Also (and this can be a follow on PR) we should look at what's left in util and see if it still belongs there. On surface Stack and ReadThroughCache utils probably belongs in collections, possibly Attributes?. Also some of the other APIs probably belong in io or elsewhere.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should. I can move it and the others you mentioned to collections as well.

import aws.smithy.kotlin.runtime.InternalApi

@InternalApi
public class PercentEncoding(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs

internal fun Url.Builder.canonicalQueryParams(): String {
val canonicalized = parameters
.entries
.associate { (key, values) -> key.reencode().encoded to values.map { it.reencode().encoded } } // FIXME 🤮
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: could use a more descriptive FIXME message 😄

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh whoops, I should've fixed that before committing/pushing it.

@@ -137,6 +143,8 @@ internal class OkHttpProxyAuthenticator(
}
}

private fun UserInfo.toBasicCredentials() = Credentials.basic(userName.decoded, password.decoded)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: this seems like an unnecessary utility since it's only used once

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, it was a bit more useful when UserInfo was nullable. Will remove.

/**
* Represents the parameters in a URL query string.
*/
public class QueryParameters private constructor(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's okay, because the query string keys must be case-sensitively unique according to Smithy. https://smithy.io/2.0/spec/http-bindings.html#httpquery-trait

* A flag indicating whether to force inclusion of the `?` query separator even when there are no parameters (e.g.,
* `http://foo.com?` vs `http://foo.com`).
*/
public val forceQuery: Boolean,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming: forceQuerySeparator?

public fun encodableFromDecoded(decoded: String): Encodable = Encodable(decoded, encode(decoded), this)
public fun encodableFromEncoded(encoded: String): Encodable {
val decoded = decode(encoded)
// val reencoded = encode(decoded) // TODO is this right?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: leftover TODO

@ianbotsf ianbotsf changed the title wip: add new URL types and supporting classes chore: refactor URL types and supporting classes Nov 21, 2023
} else {
write("add(#L(#S))", encodeFn, "\${$identifier}")
// literal
val encodeFn = format("#T.Path.encode", RuntimeTypes.Core.Text.Encoding.PercentEncoding)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question/correctness: Why wouldn't this use the same (smithy label) encode function?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Confusingly, path literals are encoded differently from path labels. In particular, literals have a wider variety of characters they accept without encoding while labels are more conservative. I couldn't find explicit documentation supporting this so I fall back to the RFC 3982 specification of how paths should be encoded.

Note that at least one protocol test enforces that path literal segments are not over-encoded. In this test, encoding literals with SmithyLabel would result in the path /ReDosLiteral/abc/%28a%2B%29%2B which fails to match the expected value of /ReDosLiteral/abc/(a+)+.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You might add some of this context as a comment in the code

add(PercentEncoding.Path.encode("foo"))
}
parameters.encodedParameters {
val labels = mutableMultiMapOf<String, String>()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Question: Why do we need an auxiliary data structure here?

Wonder if we can have an extension function or something that allows you to set the encoding to use.

        public fun encodedParameters(
            encoding: Encoding = PercentEncoding.Query,
            block: MutableMultiMap<String, String>.() -> Unit
        ) {
            asView(
                Encodable::encoded,
                encoding::encodableFromEncoded,
                Encodable::encoded,
                encoding::encodableFromEncoded,
            ).apply(block)
        }
parameters.encodedParameters(PercentEncoding.SmithyLabel) {
     if (input.query1 != null) add("Query1", input.query1)
} 

@@ -215,7 +215,7 @@ class HttpStringValuesMapSerializer(

private val HttpBinding.Location.addFnName: String
get() = when (this) {
HttpBinding.Location.QUERY, HttpBinding.Location.QUERY_PARAMS -> "add" // uses MutableMultiMap
HttpBinding.Location.QUERY, HttpBinding.Location.QUERY_PARAMS -> "labels.add" // uses MutableMultiMap
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this feels too coupled to where it's used

@@ -39,13 +39,14 @@ public class Encodable internal constructor(

if (decoded != other.decoded) return false
if (encoded != other.encoded) return false
return encoding == other.encoding
// return encoding == other.encoding
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't tell whether I consider this correct or not...

@@ -229,6 +231,8 @@ class UrlParsingTest {
assertEquals(0, parsed.size)
assertTrue(parsed.forceQuerySeparator, "Expected forceQuery=true for $url")
}

println(full)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit remove


class MutableMultiMapViewTest {
@Test
fun testCompetingViews() {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice

Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot E 29 Security Hotspots
Code Smell A 22 Code Smells

No Coverage information No Coverage information
1.2% 1.2% Duplication

@lauzadis lauzadis merged commit a9e46e5 into main Nov 22, 2023
13 of 14 checks passed
@lauzadis lauzadis deleted the chore-refactor-urls branch November 22, 2023 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants