-
Notifications
You must be signed in to change notification settings - Fork 40
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Have the file system signer read the cert files when it needs to in case they have changed #65
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, this looks good - thanks for the PR!
Not to discredit your work, but I'm wondering how useful making this change will be for customers. If the credential helper is used as a long-running process (update
or serve
), the shortest amount of time between credentials updates is about 10 minutes (shortest temporary credential validity is 15 minutes, and the credential helper will perform rotations five minutes before credential expiry). Since CreateSession
calls don't need to be made very frequently, would it make more sense just to read the certificate and private key information right before making the CreateSession
call as opposed to tracking changes in file data?
} | ||
|
||
// GetCertChain Read a certificate bundle and return a chain of all the certificates is contains | ||
func GetCertChain(certificateBundleId string) ([]*x509.Certificate, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also minor, but maybe it would be good to be consistent with the way in which we handle the parsing a single certificate vs. multiple certificates from a file? For a single certificate, I noticed that you added another return value to the existing ReadCertificateData
function, but for multiple certificates, you created a new function. Maybe there's a good reason to do it this way though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReadCertificateData
already had the information that I needed and was discarding it so it seemed easiest to just return the info so that everywhere that called it didn't have to didn't have to have an extra 8 lines to redecode and reparse the cret data and the caller could chose to use or ignore the new return value.
I added GetCertChain
to encapsulate the process of appending all the certs from the bundle so that looping didn't need to be repeated when it is called.
If you want I can revert ReadCertificateData
and create a GetCertificate
that calls ReadCertificateData
and then does the decoding and parsing. Up to you, just let me know.
Sorry for the late response - I've added some comments on the PR now. You can feel free to ignore the minor comments - I don't consider them to be blocking. |
A little background for what led me here. I plan on using this as a sidecar container in some pods that are running outside of AWS but need to interact with some AWS services. We plan on using very short lived certificates, so I was testing with certs that were only valid for an hour and were reissued at 30 minutes to exaggerate what we intend and do some testing. That is what helped me notice what was happening. It certainly could be made to read certs before calling the That being said if you want to shift this to modifying the interface to have a read/refresh sort of function I can certainly go down that path. I would either need someone to fix the implementations for windows/darwin cert stores or at very least test and verify for me if I take a stab at changing those. I will take a look at and address the other comments. Let me know how you want to proceed with the larger replaced certificates issue discussed above. |
aa68ad6
to
a078c4a
Compare
Is there any consideration of having an official image available for this helper? |
Thanks for explaining what led you to make this change. Regarding the issue of file watching vs. reading certificate and private key data right before calling
I'll have to discuss this with my team as well. But in the meantime, to help with prioritization, could you +1 #51? |
So it could be that there's something I'm missing, but why does there have to be a modification to the This client application also integrates with PKCS#11 and OS-specific certificate stores, and in each of those cases, the private key data is never cached by the credential helper process (it can't directly access the private key data). Instead, the application will search the secure store for the right private key each time the In the certificate rotation scenario, though, you would want the long-running application to call What are your thoughts? |
I obviously don't know much about the OS cert store implementations. They looked to me like they were getting a reference to a cert once, but I will take your word for those references returning updated certs when necessary. So, if the OS cert stores don't have this issue when used in a long running process then we can just focus on the Let me know what you think. |
Sorry, I may not have been clear in my earlier message, but I believe the certificate data is cached for the lifetime of the process in some cases, even when using the OS certificate store implementations. But the private key data necessarily can't be cached (since that data isn't directly accessible by the application). I have to refresh my memory on how it works for the OS certificate store implementations, but I know that at least for the PKCS#11 signer, the private key is searched for again, each time it needs to be used to sign. If, for other signers, the behavior is, "search for the private key data right before signing," I was thinking that the
I think that's acceptable. You could have cases in which the file changes multiple times between
One of my concerns includes error handling (what happens if the certificate file is deleted at some point but then reintroduced before the next |
@13ajay I have pushed a change to have the file system signer read certs on demand. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for implementing this!
Would you also be able to run go fmt
on the files that you changed as a part of this PR? And if you have the time, could you also add unit testing?
aws_signing_helper/serve.go
Outdated
} | ||
credentialProcessOutput, gcErr := GenerateCredentials(opts, signer, signatureAlgorithm) | ||
if gcErr != nil { | ||
log.Printf("Error generating credentials: %s", gcErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: missing a \n
at the end of this log line?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
aws_signing_helper/serve.go
Outdated
@@ -240,6 +247,9 @@ func AllIssuesHandlers(cred *RefreshableCred, roleName string, opts *Credentials | |||
return | |||
} | |||
} else { | |||
if Debug { | |||
log.Println("Using previous obtained credentials") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: should "previous" be "previously" instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
aws_signing_helper/signer.go
Outdated
@@ -205,35 +197,22 @@ func GetSigner(opts *CredentialsOpts) (signer Signer, signatureAlgorithm string, | |||
" within the PKCS#12 file") | |||
} | |||
// Not a PEM certificate? Try PKCS#12 | |||
certificateChain, privateKey, err = ReadPKCS12Data(opts.CertificateId) | |||
certificateChain, _, err = ReadPKCS12Data(opts.CertificateId) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: I don't think certificateChain
is used here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
aws_signing_helper/signer.go
Outdated
for _, certificate := range certificateChainPointers { | ||
certificateChain = append(certificateChain, certificate) | ||
} | ||
certificateChain = chain |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why assign to certificateChain
after checking the error if we will return from the function anyways in the case that there's an error? certificateChain
is local anyways, so I don't think there's any risk of assigning nil
to it right before returning from the function (in the case of error).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
chain
is local to the if block and therefore not available when it needs to be returned. certificateChain
is scoped to the function and is available on the line when this return is called:
return GetPKCS11Signer(opts.LibPkcs11, certificate, certificateChain, opts.PrivateKeyId, opts.CertificateId, opts.ReusePin)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may not have been clear in my comment, but I guess I'm asking why you can't do something like certificateChain, err := GetCertChain(opts.CertificateBundleId)
instead of first assigning to chain
and then assigning to certificateChain
. The only place in which chain
isn't assigned to certificateChain
is in the case of an error (in which case, we would return from the function right away).
} | ||
} | ||
return GetFileSystemSigner(privateKey, certificateChain[0], certificateChain) | ||
return GetFileSystemSigner(opts.PrivateKeyId, opts.CertificateId, opts.CertificateBundleId, true) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we're using PKCS#12 here, I think previous assumptions have been that the certificate bundle would be stored in the PKCS#12 container. Not sure if making this change hurts though... What happens in the case that there's a PKCS#12 bundle file referenced with a certificate bundle in it as well as a "normal" certificate bundle file? Will the contents of the two bundles add to each other?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not really sure I follow. The previous implementation was calling ReadPKCS12Data
and passing what was read to a file system signer to be cached for the duration of the process. The new implementation is still calling ReadPKCS12Data
and if it doesn't get an error back then it passes the same path information that was available here to the new file system signer so that it can call ReadPKCS12Data
when it needs to read the certs if isPkcs12
is true. They seem functionally equivalent. Am I not understanding something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous implementation ignores opts.CertificateBundleId
when opts.CertificateId
contains a path to a PKCS#12 file. In the PKCS#12 case, we expect the certificate bundle to be provided in the PKCS#12 container specified by opts.CertificateId
, so technically, opts.CertificateBundleId
isn't used at all I believe. In your changes, you're passing in opts.CertificateBundleId
into GetFileSystemSigner
in order to create the FileSystemSigner
. It's possible that they're functionally equivalent, but I wanted to make sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When isPkcs12
is true and the FileSystemSigner
reads the cert data it ignores opts.PrivateKeyId
and opts.CertificateBundleId
https://github.com/aws/rolesanywhere-credential-helper/pull/65/files#diff-9760200f6d4074d48601f0b9219340d8f7706d99fc9066d45527e99eabcbecc7R112-R119
Is there something specific in the formatting that you think is off? My IDE runs What test case are you wanting. The only test that I see that is using a |
I thought that maybe you hadn't since I saw a diff that looked like this:
But it's possible that your IDE formats in a slightly different way. Either way, I don't think it really matters - no need to worry about it.
There isn't anything specific that I had in mind, but I did want to make sure that this code gets tested in some way prior to us merging it. Would you be able to describe how you tested it? I suppose my team will need some way of doing integration testing prior to merging this change, which I can discuss with my colleagues. But I think it would also help to learn how you did testing for these changes yourself. |
go fmt doesn't seem to care about the extra linefeed(s). It was just there because I added code to that method in the previous iteration and then removed it. I removed the linefeed.
The testing I did is the same as my initial testing that allowed me to notice the issue. I built an image and added a sidecar to an external-dns pod to allow external-dns to interact with Route53. The pod running the code currently in the PR minus the removed linefeed and unneeded variable has been running for 3 days without crashing and continuing to allow its paired external-dns to function properly while the certificate is replaced every 30 minutes. I also modified the test we discussed to verify the certificate returned by the signer matches the original read from the file and then again after the file has changed. |
FWIW, I had the same need and quickly put together michaelsauter@5bf1b53 before finding this PR. It differs slightly in that I did not make any change to any signer, instead just added a flag to refresh the signer before use. In general I think I prefer the change done by @krmichelos, I just didn't want to make so many changes :) Looking forward to have this PR merged! |
Sorry for the late reply. I think this looks good, but we will have to do an internal review before this gets merged. I'll let you know when I have an update. |
@13ajay Any update here? |
Sorry, I have received some more comments on this. I will post them on this PR later today. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added comments that I got from a review with security.
.pre-commit-config.yaml
Outdated
@@ -0,0 +1,6 @@ | |||
repos: | |||
- repo: https://github.com/dnephin/pre-commit-golang | |||
rev: v0.5.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was flagged as potentially problematic in the review, as something that could change without us noticing. Maintaining a fork or scripts that implement the same functionality might be fine, but depending on how much utility there is, it might be easier to remove this and not think about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's locked to a specific version. It also only does something if you configure your system to care about it. Do you want me to remove the .pre-commit-config.yaml?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would prefer we remove it in the case there are sharp edges with respect to maintenance. That was also the recommendation from our brief internal security review.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove pre-commit file
log.Printf("Failed to read certificate: %s\n", err) | ||
os.Exit(1) | ||
} | ||
} else if len(chain) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens when chain
has length 0
and there also isn't a certPath
? Do we fail gracefully?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe cert will be returned as nil. What did the previous implementation do when there was no bundle and no cert? From looking at the code I would guess that you got a file system signer with nil for the certificate and nil for the certificate chain. Would you like an else that logs and os exits?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the previous behavior was. I think there's no way to trigger this case if you're using the credential helper application traditionally, but you may be able to trigger it if you take a dependency on the repository and use the FileSystemSigner
directly. To be safe, let's log and exit as you suggested.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added else with exit
var chain []*x509.Certificate | ||
if fileSystemSigner.bundlePath != "" { | ||
chain, err = GetCertChain(fileSystemSigner.bundlePath) | ||
if err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we've already loaded the private key and we fail somewhere else, can there be a leak of private key data? Can we zero the data?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I set privateKey to nil in that paths of code that os.Exit
for _, certificate := range certificateChainPointers { | ||
chain = append(chain, certificate) | ||
} | ||
return chain, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we fail in the case that chain
has length 0
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ReadCertificateBundleData
appears to return an error if the file isn't a valid cert chain. If you would like I can add a check for a 0 length chain and return an "empty cert chain" error. Do you want me to do that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's fine - I think we can leave it as is.
@13ajay I have responded to the comments, please let me know how to proceed. |
@13ajay I have made changes and comments. Let me know what you think |
@13ajay how do you want me to proceed on the couple of open conversation items? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I apologize for the delay. I've responded to your comments now.
.pre-commit-config.yaml
Outdated
@@ -0,0 +1,6 @@ | |||
repos: | |||
- repo: https://github.com/dnephin/pre-commit-golang | |||
rev: v0.5.1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I would prefer we remove it in the case there are sharp edges with respect to maintenance. That was also the recommendation from our brief internal security review.
log.Printf("Failed to read certificate: %s\n", err) | ||
os.Exit(1) | ||
} | ||
} else if len(chain) > 0 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure what the previous behavior was. I think there's no way to trigger this case if you're using the credential helper application traditionally, but you may be able to trigger it if you take a dependency on the repository and use the FileSystemSigner
directly. To be safe, let's log and exit as you suggested.
for _, certificate := range certificateChainPointers { | ||
chain = append(chain, certificate) | ||
} | ||
return chain, nil |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No, it's fine - I think we can leave it as is.
…ase they have changed
@13ajay I have made the requested changes. Let me know what you think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry that this process has taken so long, and thanks for the contribution. Approved.
@13ajay No worries, thanks for staying with it and getting it merged. Do you have timeline on when these changes might be available in a released version? |
As of now, our next release is scheduled for about two months from now, and these changes should be made available as a part of that release. It's possible (but probably unlikely) that we have another release before then, in which case, these changes would be made available as a part of that release. |
Issue #64
Fixes #64
Added file watching to update certificate contents when they change on disk so that subsequent requests after the original certificate has expired will still succeed.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.