-
Notifications
You must be signed in to change notification settings - Fork 1.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
VNet: Make sure daemon client has adequate permissions for passed TELEPORT_HOME
#44751
Conversation
// egid of the user starting VNet. Unsafe for production use, as the egid comes from an unstrusted | ||
// source. | ||
egid int | ||
// euid of the user starting VNet. Unsafe for production use, as the euid comes from an unstrusted | ||
// source. | ||
euid int |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unlike, say, sudo
, osascript -e "do shell script \"…\" with administrator privileges"
does not seem to offer SUDO_UID
or any other way to identify the user who spawned osascript in the first place.
osascript -e "do shell script \"id -ru\" with administrator privileges"
returns 0
.
However, after releasing the launch daemon we don't plan to use osascript in production (since we compile tsh with VNETDAEMON=yes
). It's there only to make development easier. As such, I pass egid and euid through argv.
lib/vnet/daemon/service_darwin.go
Outdated
@@ -77,6 +78,7 @@ func Start(ctx context.Context, workFn func(context.Context, Config) error) erro | |||
"ipv6_prefix", config.IPv6Prefix, | |||
"dns_addr", config.DNSAddr, | |||
"home_path", config.HomePath, | |||
"cred", fmt.Sprintf("%#v", config.ClientCred), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I had to look up the %#v
format string, I think here %+v
is probably slightly better because its prints field names without the type names
"cred", fmt.Sprintf("%#v", config.ClientCred), | |
"cred", fmt.Sprintf("%+v", config.ClientCred), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be cleaner to implement String
on the ClientCred
type (and then remember to wrap it in logutils.StringerAttr
) or maybe even just implement LogValue
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LogValue
is pretty cool, TIL. It ends up being logged like this. Admittedly, slog.GroupValue
seems to look better with JSON formatters but I'll take this.
2024-07-30T16:58:15+02:00 INFO [VNET:DAEM] Received VNet config socket_path:/var/folders/1x/xhqmbxsd3_v9zdv78ybr14zr0000gn/T/vnetffa74891-73de-43c8-9df1-23d80ca0873c.sock ipv6_prefix:fde3:5ebb:fece:: dns_addr:fde3:5ebb:fece::2 home_path:/Users/fav/.tsh creds_valid:true egid:20 euid:501 trace_id:8a25ef3f4bcc9bf975b59e63996dbe41 span_id:ff3cdafcd7d941c7 daemon/service_darwin.go:75
lib/vnet/osconfig_darwin.go
Outdated
|
||
log.InfoContext(ctx, "Temporarily dropping root privileges.", "egid", c.daemonClientCred.Egid, "euid", c.daemonClientCred.Euid) | ||
defer func() { | ||
if err == nil { | ||
log.InfoContext(ctx, "Restored root privileges.", "egid", rootEgid, "euid", rootEuid) | ||
} | ||
}() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recommend moving this to the top of the function, before the Setegid/Seteuid calls/defers. Then we'll actually print "Temporarily dropping root privileges." before the attempt, and "Restored root privileges." after successfully doing that
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have to be concerned about the size of struct VnetConfigResult
changing and two processes potentially disagreeing about it (an older service running and a freshly upgraded client connecting to it) or is it fine as long as the new fields are at the end of the struct?
lib/vnet/osconfig_darwin.go
Outdated
} | ||
defer func() { | ||
syscallErr := trace.Wrap(syscall.Setegid(rootEgid), "reverting egid") | ||
err = trace.NewAggregate(err, syscallErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
trace.aggregate
composes really poorly with itself, I'd find some other way to keep track of the multierror before returning if we really care about that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Solved by removing trace.NewAggregate
altogether in favor of using panic
.
lib/vnet/osconfig_darwin.go
Outdated
return trace.Wrap(err, "setting egid") | ||
} | ||
defer func() { | ||
syscallErr := trace.Wrap(syscall.Setegid(rootEgid), "reverting egid") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it make sense to do anything but panic if the OS prevents us from restoring the egid/euid that we had before? The whole process is basically unusable at that point right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, true. I've never had a legitimate reason to use panic
in any program that I've written, so I was hesitant to use it. But an error is technically recoverable, while here we just shouldn't continue execution at all.
|
||
// doWithDroppedRootPrivileges drops the privileges of the current process to those of the client | ||
// process that called the VNet daemon. | ||
func (c *osConfigurator) doWithDroppedRootPrivileges(ctx context.Context, fn func() error) (err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing prevents this from being called in parallel. Should we have a package-level mutex (or maybe just a package level atomic bool) that prevents more than one goroutine from running while we do these privilege shenanigans?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add an atomic bool and make sure that this method returns an error if compare and swap fails. If someone runs into that error during development, they'll have to consciously think about what the behavior of this function should be, instead of simply waiting for a mutex (which could be a valid option, but I don't think we should make this decision now).
|
||
// doWithDroppedRootPrivileges drops the privileges of the current process to those of the client | ||
// process that called the VNet daemon. | ||
func (c *osConfigurator) doWithDroppedRootPrivileges(ctx context.Context, fn func() error) (err error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll add an atomic bool and make sure that this method returns an error if compare and swap fails. If someone runs into that error during development, they'll have to consciously think about what the behavior of this function should be, instead of simply waiting for a mutex (which could be a valid option, but I don't think we should make this decision now).
lib/vnet/osconfig_darwin.go
Outdated
return trace.Wrap(err, "setting egid") | ||
} | ||
defer func() { | ||
syscallErr := trace.Wrap(syscall.Setegid(rootEgid), "reverting egid") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, true. I've never had a legitimate reason to use panic
in any program that I've written, so I was hesitant to use it. But an error is technically recoverable, while here we just shouldn't continue execution at all.
@ravicious See the table below for backport results.
|
This is the last PR we need before we're able to ship the launch daemon version of VNet.
When starting VNet, the unprivileged process sends a couple of arguments to the privileged launch daemon, among them a path to
TELEPORT_HOME
of the user that wants to start VNet.The daemon then lists profiles based on files from that dir, creates Teleport clients and for each profile it fetches the list of leaf clusters and a VNet config resource for each cluster.
This potentially allows a user to make the daemon read data belonging to any other user. To prevent this, this PR makes it so that before reading files from
TELEPORT_HOME
, the daemon sets its egid and euid to that of the daemon client. This way the daemon won't read data that the user doesn't have access to.egid and euid of the daemon client is returned by the kernel based on an audit token included with each XPC message. I don't know if there's an official docs on this fact, but there's a couple of blog posts from security researchers on this topic (example 1, example 2). I used Obj-C APIs for that which I assume use C APIs underneath.
vnet-euid.mov
In the demo video above, I first spawn a launch daemon as usual with my regular
TELEPORT_HOME
. Then I spawn it again but withTELEPORT_HOME
set to a dir I don't have access to. Most of the time the execution won't even get to that point, becausetsh
will attempt to read itsconfig.yaml
fromTELEPORT_HOME
. The dir I used in the demo has permissions set to 777, unlike something like/var/root
with 750.I don't have much experience with syscalls that change (e)uids and guids, I know about them mostly from caveats in blogposts and manpages. I based my implementation on Temporarily dropping root privileges in Go and a couple of other articles I read about
setuid
.Edoardo has already brought up on Slack the problem of not spawning a child process and setting the euid from there. For now, I think it should be safe, as the VNet admin process itself is quite simple and beyond the loop where it configures the DNS, it doesn't do any extra async work. As such, setting a process-wide euid and egid does not conflict with any other code that's executed at the same time.
If we wanted to reexec, we'd have to stop caching the VNet configs (
lib/vnet/clusterconfigcache.go
) or reimplement the cache so that it works across processes.