Releases: vmware-tanzu/crash-diagnostics

v0.3.0

11 Sep 22:57
f5667c5

v0.3.0

This version introduces a new direction for the project. Instead of continuing with a Dockerfile-like configuration language, the project has been completely redesigned to adopt Starlark, a Python-like language, for writing Crashd scripts. This version also introduces several other important features:

  • Use Starlark to create simple or complex scripts to automate interaction with a Kubernetes cluster
  • Use Python-like constructs and functions to create scripts to query or capture infrastructure information
  • Support for provider model that allows interaction with a growing list of infrastructure providers including local KinD clusters, plain Kubernetes clusters, and Cluster-API based clusters
  • Ability to automatically enumerate nodes and execute commands on those nodes to capture their output
  • Easily query and capture object and other cluster information from Kubernetes API server
  • Adoption of the shortened name crashd to refer to the project and the built binary.
  • Update go directive in the go.mod file to version 1.15.

The Crashd Script

This release introduces Starlark as the language for writing scripts that interact with a Kubernetes cluster. For instance, the following script shows how to use the kube_nodes_provider, which uses the Kubernetes Node objects to enumerate and discover the compute resources that are part of the cluster. Then (assuming the ability to securely SSH to each node) the script executes a simple command on each node, then retrieves and prints the result:

# setup and configuration
ssh=ssh_config(
    username=os.username,
    private_key_path=args.ssh_pk_path,
    port=args.ssh_port,
    max_retries=50,
)

hosts=resources(
    provider=kube_nodes_provider(
        kube_config=kube_config(path=args.kubecfg),
        ssh_config=ssh,
    ),
)

# commands to run on each host
uptimes = run(cmd="uptime", resources=hosts)

# result for resource 0 (localhost)
print(uptimes.result)

Explore more example scripts here.

Script Elements Available

Release 0.3.0 comes with many built-in functions and other types to help you create functioning and useful scripts. Each built-in function falls into one of the following categories:

Configuration functions

  • crashd_config()
  • kube_config()
  • ssh_config()

Provider functions

  • capa_provider()
  • capv_provider()
  • host_list_provider()
  • kube_nodes_provider()
  • resources()

Command functions

  • archive()
  • capture()
  • capture_local()
  • copy_from()
  • run()
  • run_local()

Kubernetes functions

  • kube_capture()
  • kube_get()

See the complete list of script elements here

Known Issues

N/A

Changelog

6a8c896 Adds docs for --args-file and use-ssh-agent
689a4fe Adds .crashd directory when running the program
ec0d453 Adds support for passphrase protected ssh key
ecafae3 Adds args-file flag to run command
bf00df2 updates the args flag example
da64801 Refactor e2e test framework
985a657 Changes ssh command string for proxy args
c342713 Updates extensions for the example scripts
7384add Updates the CLI name to crashd
2501def Includes a new provider for CAPA managed objects
cf42fe3 Replaces named flag with positional argument
6118ba0 Reference documentation update
9ec5022 Fixes the function name for kube functions
dc5b5cc Adds the set_as_default directive
1013294 Adds meaningful constructor name to starlark structs
7bfa1a0 New provider for CAPV resource enumeration
a2d6299 Multiple Example Starlark Scripts
4afbef5 Implementation of the archive function
427eece Adds kube_nodes_provider starlark built-in
f108963 Implementation of the capture_local() Starlark function
a46c358 Implementation of the Starlark run_local() function
d17fad8 Implementation of starlark copy_from() function
0f489be Adds kube_get starlark built-in
6257c24 Implementation of the capture() starlark function.
12be3fc Implementation of kube_capture starlark function
ac2a419 Adds kube_config built-in
1dda5e6 Implementation of run starlark function
daf30cb Implemenation of host_list_provider function.
90209bf Starlark - Base implementation to support configuration

Known Issues

  • Running crashd without the --args-file flag and the default args file fails to run [#176]

v0.3.0-beta

03 Sep 19:04
bc18d64
Pre-release

This version includes a few incremental changes to address the following issues:

  • (#128) add support for passphrase protected ssh keys
  • (#163) improve passing multiple arguments to the run command

Changes

  • crashd_config directive
    A new Boolean parameter use_ssh_agent was added to the crashd_config() directive. When this parameter is set, crashd starts a new instance of the ssh-agent and adds any ssh keys used in the script to that instance. All subsequent ssh/scp operations then use this ssh-agent for remote connections.

  • run command
    A new flag --args-file was added to the crashd run command. This flag takes as input a path to a file containing newline-separated key=value pairs, which are passed to the diagnostics script at runtime.

  • Go directive
    The go directive in the go.mod file was updated to version 1.15.
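
The --args-file format described above (newline-separated key=value pairs) can be illustrated with a short Python sketch. This is not crashd's own parser: the function name parse_args_file, the handling of blank lines, and the sample keys are assumptions made for illustration only.

```python
def parse_args_file(text):
    """Parse newline-separated key=value pairs into a dict.

    Illustrative only: blank lines are skipped and the value is everything
    after the first "=" (both are assumptions, not documented crashd rules).
    """
    args = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError("expected key=value, got: %r" % line)
        args[key.strip()] = value.strip()
    return args


# Hypothetical file content; the key names mirror the args used in the
# example script (args.kubecfg, args.ssh_port).
sample = "kubecfg=/home/user/.kube/config\nssh_port=22\n"
print(parse_args_file(sample))
```

Splitting on the first "=" only keeps values that themselves contain "=" intact.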

Changelog

ec0d453 Adds support for passphrase protected ssh key
ecafae3 Adds args-file flag to run command

v0.3.0-alpha

20 Aug 03:32
32ab1d7
Pre-release

v0.3.0-alpha

This version introduces a new direction for the project. Instead of continuing with a Dockerfile-like configuration language, the project has been completely redesigned to adopt Starlark, a Python-like language, for writing Crashd scripts. This version also introduces several other important features:

  • Use Starlark to create simple or complex scripts to automate interaction with a Kubernetes cluster
  • Use Python-like constructs and functions to create scripts to query or capture infrastructure information
  • Support for provider model that allows interaction with a growing list of infrastructure providers including local KinD clusters, plain Kubernetes clusters, and Cluster-API based clusters
  • Ability to automatically enumerate nodes and execute commands on those nodes to capture their output
  • Easily query and capture object and other cluster information from Kubernetes API server
  • Adoption of the shortened name crashd to refer to the project and the built binary.

The Crashd Script

This release introduces Starlark as the language for writing scripts that interact with a Kubernetes cluster. For instance, the following script shows how to use the kube_nodes_provider, which uses the Kubernetes Node objects to enumerate and discover the compute resources that are part of the cluster. Then (assuming the ability to securely SSH to each node) the script executes a simple command on each node, then retrieves and prints the result:

# setup and configuration
ssh=ssh_config(
    username=os.username,
    private_key_path=args.ssh_pk_path,
    port=args.ssh_port,
    max_retries=50,
)

hosts=resources(
    provider=kube_nodes_provider(
        kube_config=kube_config(path=args.kubecfg),
        ssh_config=ssh,
    ),
)

# commands to run on each host
uptimes = run(cmd="uptime", resources=hosts)

# result for resource 0 (localhost)
print(uptimes.result)

Explore more example scripts here.

Script Elements Available

Release 0.3.0 introduces many new language elements and functions.

Configuration functions

  • crashd_config()
  • kube_config()
  • ssh_config()

Provider functions

  • capa_provider()
  • capv_provider()
  • host_list_provider()
  • kube_nodes_provider()
  • resources()

Command functions

  • archive()
  • capture()
  • capture_local()
  • copy_from()
  • run()
  • run_local()

Kubernetes functions

  • kube_capture()
  • kube_get()

See the complete list of script elements here

Known Issues

  • Support for passphrase-protected SSH keys may not work when executing commands on compute nodes

Changelog

bf00df2 updates the args flag example
da64801 Refactor e2e test framework
985a657 Changes ssh command string for proxy args
c342713 Updates extensions for the example scripts
7384add Updates the CLI name to crashd
2501def Includes a new provider for CAPA managed objects
cf42fe3 Replaces named flag with positional argument
6118ba0 Reference documentation update
9ec5022 Fixes the function name for kube functions
dc5b5cc Adds the set_as_default directive
1013294 Adds meaningful constructor name to starlark structs
7bfa1a0 New provider for CAPV resource enumeration
a2d6299 Multiple Example Starlark Scripts
4afbef5 Implementation of the archive function
427eece Adds kube_nodes_provider starlark built-in
f108963 Implementation of the capture_local() Starlark function
a46c358 Implementation of the Starlark run_local() function
d17fad8 Implementation of starlark copy_from() function
0f489be Adds kube_get starlark built-in
6257c24 Implementation of the capture() starlark function.
12be3fc Implementation of kube_capture starlark function
ac2a419 Adds kube_config built-in
1dda5e6 Implementation of run starlark function
daf30cb Implemenation of host_list_provider function.
90209bf Starlark - Base implementation to support configuration

v0.2.3-alpha.0

27 Mar 21:10
2d878bf
Pre-release

Changelog

2d878bf Merge pull request #57 from vladimirvivien/parsing-cmd-with-colons

v0.2.2

04 Mar 00:50
c558caa

v0.2.2

This is a bug fix release. As outlined in #48, the previous version of Crash Diagnostics did not reliably pull all objects (especially logs) from the server. This fix attempts to bring KUBEGET to parity with kubectl cluster-info --dump.

KUBEGET objects

KUBEGET does the following when retrieving objects:

  • All retrieved objects are saved in their respective files, placed in the kubeget directory
  • Each object retrieved is saved into a JSON file
  • Namespaced objects are saved in a corresponding namespace sub-directory
  • Non-namespaced objects are saved in the root kubeget directory

The following shows an example directory layout of KUBEGET objects:

crashd/stage
└── kubeget
    ├── apiservices.json
    ├── clusterinformations.json
    ├── clusterrolebindings.json
    ├── clusterroles.json
    ├── default
    │   ├── configmaps.json
    │   ├── controllerrevisions.json
    │   ├── cronjobs.json
    │   ├── daemonsets.json
    │   ├── deployments.json

KUBEGET logs

Fetching container logs with KUBEGET stores logs in a directory structure as follows:

<namespace>/<pod-name>/<container-name>/<container-name>.log

Each container log is saved individually in its associated file as shown in the following example directory layout:

    ├── kube-system
    │   ├── calico-kube-controllers-ff95847f5-tjccn
    │   │   └── calico-kube-controllers
    │   │       └── calico-kube-controllers.log
    │   ├── calico-node-87b7l
    │   │   ├── calico-node
    │   │   │   └── calico-node.log
    │   │   ├── install-cni
    │   │   │   └── install-cni.log
    │   │   └── upgrade-ipam
    │   │       └── upgrade-ipam.log
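
As a rough illustration of the log layout in the listing above, the following Python sketch builds the path for one container log. It assumes, based on that listing, that each log file is named after its container; the function name container_log_path is hypothetical and this is not crashd's implementation.

```python
import posixpath


def container_log_path(namespace, pod, container):
    """Build <namespace>/<pod>/<container>/<container>.log for one container."""
    return posixpath.join(namespace, pod, container, container + ".log")


# One pod with several containers yields one log file per container.
for name in ["calico-node", "install-cni", "upgrade-ipam"]:
    print(container_log_path("kube-system", "calico-node-87b7l", name))
```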

Changelog

c558caa Merge pull request #52 from vladimirvivien/kubeget-logs-fix-take-2
fe86bb0 GitHub Actions CI update with [email protected]
a832e06 Documentation update for KUBEGET
9dcb1c8 Update KUBEGET to better organize object search results
c479bac Refactor k8s client code for object search

v0.2.1

13 Feb 00:29
33aa529

v0.2.1

This release of Crash-Diagnostics includes several fundamental changes that were first introduced in the preceding alpha releases, as outlined below:

Command result redirected to Stdout

The RUN and CAPTURE commands can direct their output to the console using the echo parameter:

RUN cmd:"/bin/journalctl -l -u kube-apiserver" echo:"true"

Unified Executor Backend

New Package for End-to-End Tests

  • New Go testing package to help with true end-to-end testing.
  • The new package is used to launch an OpenSSH server docker image during tests as a way to test all commands that need SSH/SCP.
  • The package is also capable of automating the creation of K8s cluster using kind to test commands and other directives that rely on a Kubernetes cluster.

Enhancement to FROM

The FROM directive has been enhanced with several features including the ability to discover remote machines (from which to source diagnostics information) from an available K8s API-server.

  • FROM supports the nodes: param to hint at K8s machine discovery using a K8s API-server. For instance, FROM nodes:"all" will source from all machines represented by a node object in the API-server.
  • The nodes: param can also be used to list specific node names (e.g. FROM nodes:"node.name.1 node.name.2").
  • FROM also supports a labels: param to filter sourced nodes (e.g. FROM nodes:"all" labels:"foo=bar").
  • FROM now supports the port: param to specify the default port to use in cases where it is not specified (e.g. FROM hosts:"10.10.20.100 10.10.20.200" port:"2222").
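
The node selection described in the list above can be pictured with a small Python sketch. It models only the idea of combining a nodes: value with a single key=value label; the function name select_nodes, the in-memory cluster dictionary, and the single-label restriction are assumptions for illustration, not crashd behavior.

```python
def select_nodes(cluster, names="all", label=None):
    """Pick node names by an "all"/space-separated list and an optional label.

    cluster maps node name -> dict of labels. label is a single "key=value"
    string (multi-label selectors are out of scope for this sketch).
    """
    if names == "all":
        candidates = list(cluster)
    else:
        candidates = [n for n in names.split() if n in cluster]
    if label is None:
        return candidates
    key, _, value = label.partition("=")
    return [n for n in candidates if cluster[n].get(key) == value]


cluster = {
    "node.name.1": {"foo": "bar"},
    "node.name.2": {"foo": "baz"},
    "node.name.3": {"foo": "bar"},
}
print(select_nodes(cluster, "all", "foo=bar"))
print(select_nodes(cluster, "node.name.1 node.name.2"))
```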

Changelog

33aa529 Merge pull request #46 from vladimirvivien/from-node-enhancements
a5b71b3 Doc update for FROM directive changes
ce9a8f5 Test updates for all end-to-end tests
f983bde Executor support for parameterized connection retries
8a11b46 Automate create/destroy kind clusters for tests
958cf84 FROM command/test refactor for new params

v0.2.1-alpha.1

23 Jan 01:58
9348f8a
Pre-release

v0.2.1-alpha.1

This release introduces a non-functional change that updates Crash Diag to use a single executor backend based on SSH/SCP. Prior to this change, the code supported two executor backends: one for local execution and one for remote execution. This release uses only the remote execution model via SSH/SCP for both local and remote machines.

Using a single executor backend that relies on SSH/SCP means that testing requires standing up an SSH/SCP server. The following changes were made to support this:

  • Refactor all executor code to only run using SSH backend
  • Update to test code to start/stop a full SSH server via Docker containers
  • Enhancement to the SSH connection code to retry upon failed connection
  • Update to CI/CD code for end-to-end testing of diagnostics scripts using SSH/SCP
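
The retry-on-failed-connection enhancement listed above follows a common pattern, sketched here in Python. This is a generic illustration rather than the project's Go code: connect_with_retries, its parameters, and the flaky_connect stand-in are all hypothetical.

```python
import time


def connect_with_retries(connect, max_retries=3, delay_seconds=0.0):
    """Call connect() until it succeeds or max_retries attempts have failed."""
    if max_retries < 1:
        raise ValueError("max_retries must be at least 1")
    last_error = None
    for attempt in range(1, max_retries + 1):
        try:
            return connect()
        except ConnectionError as err:
            last_error = err
            if attempt < max_retries:
                time.sleep(delay_seconds)  # wait before the next attempt
    raise last_error


# Stand-in for an SSH dial that fails twice (e.g. while the test SSH server
# container is still starting) and then succeeds.
attempts = {"count": 0}


def flaky_connect():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise ConnectionError("connection refused")
    return "connected"


print(connect_with_retries(flaky_connect, max_retries=5))
```

Retrying is what makes tests against a freshly started SSH server practical, since the server may not accept connections immediately.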

Changelog

9348f8a Merge pull request #43 from vladimirvivien/single-exec-backend
c416e21 Documentation update for ssh/scp backend
8cd3a84 Update to GitHub Actions for end-to-end SSH/SCP tests
b1ec56e Add test cert/key, update GitActions for testing
2421cab Refactor tests to support ssh/scp exec backend only
31cedc5 Remove local exec backend, refactor, ssh-server for testing
8d105f9 Refactor to switch to scp/ssh for command exec
b7c0854 Command and script changes

v0.2.1-alpha.0

09 Jan 14:59
a95de45
Pre-release

v0.2.1-alpha.0

This release implements the capability to direct CAPTURE and RUN command output to the standard output using the echo parameter as shown below:

RUN cmd:"/bin/journalctl -l -u kube-apiserver" echo:"true"

See docs for detail.

Changelog

a95de45 Merge pull request #42 from vladimirvivien/exec-echo
91df766 Document update for RUN and CAPTURE commands
d2433cc RUN and CAPTURE and tests to output to sdout
6775f28 Command update to support echo param

v0.2.0

21 Dec 15:15
5e455c9

Release v0.2.0

KUBEGET

This release introduces a new directive, KUBEGET, to retrieve API objects and pod logs from the API server, as shown in the following example:

KUBEGET objects groups:"core" kinds:"pods" namespaces:"kube-system default" containers:"kindnet-cni etcd"

Read more about KUBEGET in the README.

COPY File Pattern

In this release, the COPY command supports file patterns (globbing) when specifying one or more files to copy from the cluster node, as shown below:

COPY /var/logs/kube*.log
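
To show what a pattern such as kube*.log selects, here is a small Python sketch using shell-style matching from the standard fnmatch module. It only illustrates glob semantics; how crashd itself expands the pattern on the node is not shown here, and the file names are made up.

```python
import fnmatch


def match_files(names, pattern):
    """Return the names that match a shell-style pattern (case-sensitive)."""
    return [n for n in names if fnmatch.fnmatchcase(n, pattern)]


# Hypothetical contents of a log directory on a cluster node.
names = ["kube-apiserver.log", "kubelet.log", "kube-proxy.log.1", "syslog"]
print(match_files(names, "kube*.log"))
```

The pattern must match the whole file name, so a rotated file such as kube-proxy.log.1 is not selected.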

GitHub Actions

Other changes in this release include switching the build system from Travis to GitHub Actions.

Changelog

5e455c9 Merge pull request #39 from vladimirvivien/ghactions-fix
53b2eb5 Fixes for GitHub Action workflows
1fe7326 Remove travis.yaml file
855a6cb Changelog update
71cbe4a Documentation update for file globbing

v0.2.0-alpha.0

14 Dec 02:39
9e776c0

This release introduces a new directive, KUBEGET, to retrieve API objects and pod logs from the API server. When an API connection is properly configured using KUBECONFIG, KUBEGET can retrieve any accessible API objects or the logs of running pods, as shown in the following example:

KUBEGET objects groups:"core" kinds:"pods" namespaces:"kube-system default" containers:"kindnet-cni etcd"

The previous command would retrieve all pods from namespace kube-system or default having containers named kindnet-cni or etcd.
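
The selection rule just described (namespace is one of the listed namespaces, and the pod has a container with one of the listed names) can be sketched in Python over in-memory data. The function name select_pods and the sample pods are hypothetical; the real directive queries the API server.

```python
def select_pods(pods, namespaces, containers):
    """Return names of pods in the given namespaces that run a listed container.

    namespaces and containers are space-separated strings, mirroring the
    namespaces: and containers: parameter values in the example above.
    """
    wanted_namespaces = set(namespaces.split())
    wanted_containers = set(containers.split())
    selected = []
    for pod in pods:
        if pod["namespace"] not in wanted_namespaces:
            continue
        if wanted_containers.intersection(pod["containers"]):
            selected.append(pod["name"])
    return selected


# Made-up pods for illustration.
pods = [
    {"name": "etcd-control-plane", "namespace": "kube-system", "containers": ["etcd"]},
    {"name": "kindnet-abcde", "namespace": "kube-system", "containers": ["kindnet-cni"]},
    {"name": "coredns-12345", "namespace": "kube-system", "containers": ["coredns"]},
    {"name": "web", "namespace": "default", "containers": ["nginx"]},
    {"name": "etcd-backup", "namespace": "tools", "containers": ["etcd"]},
]
print(select_pods(pods, "kube-system default", "kindnet-cni etcd"))
```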

See README for detail.