-
Notifications
You must be signed in to change notification settings - Fork 94
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: save host run file output #1288
Conversation
Thanks for adding this. To pass tests, you'll want to run Is there any testing that can be added? Also, it's likely this will need a docs change, particularly if this is going to change paths to output which we would usually consider a breaking change. |
hi @xavpaice!
I will make sure to run that before flipping to ready for review.
I plan to add some unit test after we agree on the approach and implementation.
Agree. Definitely need to add more docs for this feature. |
This change captures the feature well. My minor comments are
config:
dummy.yaml: |-
username: postgres
password: <my-pass>
path/to/data: "some sample data"
config.toml: |-
title = "TOML Example"
[owner]
name = "Tom Preston-Werner" |
Like this idea, although maybe we should just not allow |
da8fb57
to
6ff150a
Compare
My intention with this is to reduce friction when landing files for programs/scripts to consume. This way, programs/scripts would not need to arrange files to directories before running. But I think we can leave this out since its a bit of an edge case. If need arises, we can introduce it With regards to Additionally, for the security focused folks, I think we need to be explicit on whether to inherit the entire environment of the parent process using a key such as We will have some always present environment variables such as Here is a complete example env:
- AWS_REGION=us-east-1
- AWS_ACCESS_KEY_ID={{ aws.key-id.template }}
- AWS_SECRET_ACCESS_KEY={{ aws.secret.template }}
inheritParentEnvs: false # Whether to inherit all the variable or not. If true, inheritEnvs would be made redundant
inheritEnvs: # list of env variables to inherit from the parent process
- SHELL
- LANG
- PYTHON For backward compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good now. Could you please add some tests as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are almost there. I managed to run the collector locally but could not collect anything since TS_WORKSPACE_DIR
does not exist. I added a few more comments in the PR (see below).
Here are the files I used to test
- custom script
#!/usr/bin/env bash
pwd # Should we ensure the command's CWD is the $TS_WORKSPACE_DIR? Its currently the dir where support bundle is launched
env | grep AWS
ls -al $TS_WORKSPACE_DIR # This fails. Dir does not exist
ls -al $TS_INPUT_DIR # This fails. Dir does not exist
cat $TS_INPUT_DIR/dummy.yaml
echo "Drop my data: $($TS_INPUT_DIR/dummy.yaml)" >> $TS_WORKSPACE_DIR/mydata.txt # I'll have a redactor redact my secret later on
- Spec
apiVersion: troubleshoot.sh/v1beta2
kind: SupportBundle
metadata:
name: run
spec:
hostCollectors:
- run:
collectorName: "ping-google"
command: "custom-host-run.sh"
env:
- AWS_REGION=us-east-1
- AWS_ACCESS_KEY_ID={{ aws.key-id.template }}
- AWS_SECRET_ACCESS_KEY={{ aws.secret.template }}
inheritParentEnvs: false
inheritEnvs:
- SHELL
- LANG
- TERM
outputDir: "magic-outputs"
input:
dummy.yaml: My sample data REDACTED SECRET
# config: # I cannot load this config
# dummy.yaml: |-
# username: postgres
# password: <my-pass>
# config.toml: |-
# title = "TOML Example"
# [owner]
# name = "Tom Preston-Werner"
---
apiVersion: troubleshoot.sh/v1beta2
kind: Redactor
metadata:
name: redactor-spec
spec:
redactors:
- name: redact-text
removals:
values:
- REDACTED SECRET
pkg/collect/host_run.go
Outdated
if err != nil { | ||
return "", errors.New(fmt.Sprintf("failed to created temp dir for: %s", runHostCollector.OutputDir)) | ||
} | ||
cmd.Env = append(cmd.Env, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we change the CWD
of the command to TS_WORKSPACE_DIR
? The working dir is currently where support-bundle
is launched from
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great question. TS_WORKSPACE_DIR
is the temporary output directory so it only exists when user specify .outputDir
.
The working dir is currently where support-bundle is launched from
I think this is the existing behavior. I thought it is confusing if the CWD
changes when the .outputDir
is specified vs not specified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nvm I think the spec i was using was with absolute path so masked the issue. let me fix that
pkg/collect/host_run.go
Outdated
ExitCode string `json:"exitCode"` | ||
Error string `json:"error"` | ||
OutputDir string `json:"outputDir"` | ||
Input string `json:"input"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a map[string]string
or map[string]any
? I could not load this yaml. How is the input
parameter meant to behave? Isn't the key
meant to be the filename and the the value
be the contents of the file, all stored as flat files in TS_INPUT_DIR
directory?
input:
dummy.yaml: My sample data REDACTED SECRET
config: # I cannot load this config
dummy.yaml: |-
username: postgres
password: <my-pass>
config.toml: |-
title = "TOML Example"
[owner]
name = "Tom Preston-Werner"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the input files will be written into the TS_INPUT_DIR
, which is a temp directory. It will always be deleted. I previously also choose to output the input file to somewhere but then removed that part -- talked with Martin and we thought input might be sensitive and might be good to just skip storing it.
The user triggers this bundle will have input file on their hand anyways but anything output to that bundle might do harm when the bundle is given to someone else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or btw it should be
input:
dummy.yaml: My sample data REDACTED SECRET
dummy2.yaml: |-
username: postgres
password: <my-pass>
config.toml: |-
title = "TOML Example"
[owner]
name = "Tom Preston-Werner"
that's why its map[string]string
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With regards to tests, please add
- An example in examples directory similar to what I posted in my previous comment. The command part can be
bash -c <some inline code e.g env, cat $TS_INPUT_DIR/dummy.yaml, ls -al...>
so as not to be forced to add an external script - Unit tests around processEnvVars
- Ensure directories that need to be created are created, and exist.
- Ensure cmd has the correct parameters set i.e env vars, CWD etc
- Add at least one level 2 (
klog.V(2)
) log, more if possible, that should show what the collector is doing like in this example.
after fixing the issue this should work although you need to change command to: "./custom-host-run.sh" |
closed this PR accidentally 🙈 and opened a new one #1376 |
Description, Motivation and Context
Address the collector part of #1078
.env
to allow user to specify which env vars to import into this host run.input
to allow user to create input files(e.g., config file, sample data for this host run). User needs to utilizeTS_INPUT_DIR
env var in their script, with the filename they specify in.input
, e.g.,$TS_INPUT_DIR/dummy.yaml
.outputDir
to allow user to specify a name of a directory(within the directory of the collector in bundle) to host their host run command file output. Note that command run has no access to this directoryExample SupportBundle:
Result bundle:
Checklist
Does this PR introduce a breaking change?