Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

VNet daemon: Set code signing requirements #44639

Merged
merged 9 commits into from
Jul 29, 2024
Merged

Conversation

ravicious
Copy link
Member

This PR makes sure that only tsh signed by us can contact the VNet daemon and that the VNet daemon is executed from a tsh binary signed by us.

It achieves this using setCodeSigningRequirement and setConnectionCodeSigningRequirement which ensure that the designated requirement of the app on the other side of an XPC connection satisfies a certain requirement.

A detailed breakdown of what constitutes a matching requirement is documented in the code. But in short, the requirement boils down to "an app with the same code signing identifier and the team identifier as me, signed by a Developer ID Application cert issued by Apple".

To extract the code signing identifier and the team identifier out of the currently running code, we use SecCodeCopySelf and SecCodeCopySigningInformation. Those are APIs from the Code Signing framework written in C and use Core Foundation framework. This means that there's a lot of auxiliary code related to memory management. I documented every step to make this easier to understand, but it does require an understanding of a couple of key Obj-C characteristics (I do link to docs explaining them).

Verifying the client

Video
unsigned-client-calling-daemon.mov

This is seems to be the more important half of this PR. The VNet daemon exposes privileged APIs to unprivileged processes, so we don't want just any program on the device to be able to talk to the daemon.

In the video, a client that has not been signed at all attempts to talk to the daemon. Since we tell the listener to place a requirement on each new connection, the connection gets interrupted before it even gets to the daemon. So launchd spawns the daemon, but the daemon never receives the message nor knows about the connection attempt.

This poses a problem from the perspective of being able to easily debug this. Since the XPC service never learns about this connection, we cannot log this attempt on the daemon side. The client side does not get informed that it does not match the code signing requirement. It merely gets informed that the connection was interrupted. As such, when the client sees that the connection was interrupted, I log a debug message pointing towards Console.app to check the reason behind the interruption.

I made a thread on the Developer Forums asking if there's a way to detect this, but so far it doesn't seem like it's possible.

Verifying the daemon

Video
daemon-with-wrong-code-sign-req.mov

Here's a demo of the client verifying that the daemon can be trusted. To simulate a daemon with wrong code signing requirement, I modified the requirement passed to setCodeSigningRequirement so that it's impossible to match.

It's harder to understand what exactly macOS does in this scenario and it isn't well documented. In the video you can see that the connection between the client and the daemon gets established, but it's interrupted soon after. The daemon gets the message from the client and executes its own handler for that message. This passes the socket back to Go code on the daemon side and creates a TUN device, which is then passed to the client. The client proceeds to use the TUN device.

This all happens in a separate goroutine while the startVnet method of VNEDaemonClient waits for the completion block to be called by the daemon. Instead, it gets back an error. This makes the client return early and shut down. The daemon sees that the client was shut down and stops execution as well.

The hard to understand part is that I don't know if it's guaranteed that the interruption happens within the block where the client calls the daemon or not:

- (void)startVnet:(VnetConfig *)vnetConfig completion:(void (^)(NSError *))completion {
// This way of calling the XPC proxy ensures either the error handler or
// the reply block gets called.
// https://forums.developer.apple.com/forums/thread/713429
id proxy = [self.connection remoteObjectProxyWithErrorHandler:^(NSError *error) {
completion(error);
}];
[(id<VNEDaemonProtocol>)proxy startVnet:vnetConfig
completion:^(NSError *error) {
completion(error);
}];
}

According to my tests, it seems that the connection gets invalidated after the daemon calls the completion block passed from the client, but before the result of it gets back to the client. As in, it's always [self.connection remoteObjectProxyWithErrorHandler] that calls completion with an error first.

What was easier to implement in this scenario is that when the daemon does not satisfy the requirement, the client gets a clear information about it through the NSXPCConnectionCodeSigningRequirementFailure error code.

The client could be refactored to wait for the full dance between it and the daemon to be complete before attempting to use the TUN device. However, this adds further complexity to an already complicated system dealing with async stuff.

Both client_darwin.go and serivce_darwin.go are going to use them,
so it doesn't make sense to keep them in either of those files or scattered
across them.
There's little need for those two files to be kept separate.
@ravicious ravicious added no-changelog Indicates that a PR does not require a changelog entry backport/branch/v16 labels Jul 25, 2024
@ravicious ravicious force-pushed the r7s/code-sign-req branch from 1b91d79 to 208943c Compare July 26, 2024 08:31
@ravicious ravicious force-pushed the r7s/code-sign-req branch from 208943c to 22d7ae2 Compare July 26, 2024 08:50
@@ -41,3 +48,73 @@
}
return strdup("");
}

bool getCodeSigningRequirement(NSString **outRequirement, NSError **outError) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a C function working on Obj-C objects, which I think is fine.

From what I understand, the pattern with passing stuff by reference is common when you want to "return" more than one thing. This is especially common when it comes to NSError. https://developer.apple.com/library/archive/documentation/Cocoa/Conceptual/ProgrammingWithObjectiveC/ErrorHandling/ErrorHandling.html#//apple_ref/doc/uid/TP40011210-CH9-SW5

@ravicious ravicious marked this pull request as ready for review July 26, 2024 08:54
@ravicious ravicious requested review from codingllama and gzdunek July 26, 2024 08:54
@github-actions github-actions bot requested review from lxea and r0mant July 26, 2024 08:54
@ravicious ravicious removed request for r0mant and lxea July 26, 2024 08:55
Copy link
Contributor

@codingllama codingllama left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the detailed explanation. It all makes a lot more sense because of it.

lib/vnet/daemon/client_darwin.go Show resolved Hide resolved

package daemon

// #cgo CFLAGS: -Wall -xobjective-c -fblocks -fobjc-arc -mmacosx-version-min=10.15
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've run into problems before with multiple #cgo declarations in the same package. You might want to have all import "C" related stuff in a single file. Just a word of caution.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed #cgo declarations from all files but common_darwin.go. bc3e9f5

lib/vnet/daemon/common_darwin.h Outdated Show resolved Hide resolved
lib/vnet/daemon/common_darwin.m Outdated Show resolved Hide resolved
@ravicious ravicious enabled auto-merge July 29, 2024 13:52
@ravicious ravicious added this pull request to the merge queue Jul 29, 2024
Merged via the queue into master with commit d9c645d Jul 29, 2024
40 checks passed
@ravicious ravicious deleted the r7s/code-sign-req branch July 29, 2024 14:26
@public-teleport-github-review-bot

@ravicious See the table below for backport results.

Branch Result
branch/v16 Create PR

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/branch/v16 no-changelog Indicates that a PR does not require a changelog entry size/md
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants