Add a job TTL such that they don't hang around forever (#75)
* Fix comment line location - it somehow got into the wrong spot.

* Add a job TTL such that they don't hang around forever

After a job completes, its pod does not need to stay on the cluster
as a "completed" pod forever; it only takes up Kubernetes state
space long after it has finished. Keeping it around for a while
after completion allows for inspection, but once that time has
passed it should be removed so that it no longer takes up state
space. (Completed jobs do not restart, so keeping them has little
value, and in larger clusters the state space becomes a bottleneck.)
Michael-Sinz authored Jun 10, 2021
1 parent cec463a commit c83db48
Showing 4 changed files with 9 additions and 2 deletions.
2 changes: 1 addition & 1 deletion helm/vmss-prototype/Chart.yaml
@@ -1,7 +1,7 @@
apiVersion: v1
description: A Helm chart for the Kamino vmss-prototype pattern image generator
name: vmss-prototype
-version: 0.0.12
+version: 0.0.13
maintainers:
- name: Michael Sinz
email: [email protected]
1 change: 1 addition & 0 deletions helm/vmss-prototype/templates/vmss-prototype.yaml
@@ -48,6 +48,7 @@ spec:
{{- end }}

# This is indented like it is under either the Job.spec or CronJob.spec.jobTemplate.spec
+ttlSecondsAfterFinished: {{ .Values.kamino.jobTtl }}
template:
metadata:
labels:
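
The comment in this hunk says the new line is indented so that it lands under either Job.spec or CronJob.spec.jobTemplate.spec. As a minimal sketch of what that means for the rendered objects, the Python below builds both shapes as plain dictionaries. The helper names, the container image, and the schedule are hypothetical placeholders and are not taken from the chart.

```python
# Sketch only: helper names, image, and schedule are hypothetical placeholders.

def job_spec(ttl_seconds):
    """Return the body shared by a Job and by a CronJob's jobTemplate."""
    return {
        # A finished Job becomes eligible for automatic deletion
        # this many seconds after it completes or fails.
        "ttlSecondsAfterFinished": ttl_seconds,
        "template": {
            "spec": {
                "restartPolicy": "Never",
                "containers": [{"name": "example", "image": "example:latest"}],
            }
        },
    }


def job_manifest(ttl_seconds):
    """A one-shot Job: the TTL sits directly under Job.spec."""
    return {
        "apiVersion": "batch/v1",
        "kind": "Job",
        "metadata": {"name": "example"},
        "spec": job_spec(ttl_seconds),
    }


def cronjob_manifest(ttl_seconds, schedule="0 3 * * *"):
    """A CronJob: the same body is nested under spec.jobTemplate.spec."""
    return {
        # batch/v1 on current Kubernetes; older clusters used batch/v1beta1.
        "apiVersion": "batch/v1",
        "kind": "CronJob",
        "metadata": {"name": "example"},
        "spec": {
            "schedule": schedule,
            "jobTemplate": {"spec": job_spec(ttl_seconds)},
        },
    }
```

Because both kinds share the same inner body, a single template line can serve both, which appears to be the design choice the comment describes.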
6 changes: 6 additions & 0 deletions helm/vmss-prototype/values.yaml
@@ -21,6 +21,12 @@ kamino:
# Minimum is 2.
imageHistory: 3

+# Number of seconds after the job completes before it is cleaned up
+# see https://kubernetes.io/docs/concepts/workloads/controllers/job/#ttl-mechanism-for-finished-jobs
+# This has it clean up the pod from the cluster within an hour, just to
+# help reduce left over state in the cluster.
+jobTtl: 3600
+
drain:
# Drain grace period is the maximum time to allow pods to drain load
# and leave the node. The default of 300 seconds is relatively long
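
To make the effect of the `jobTtl: 3600` default concrete, here is a small sketch of the arithmetic involved: a finished job becomes eligible for cleanup one hour after it finishes. The function names and the example timestamp are illustrative assumptions. The actual deletion is carried out by the Kubernetes TTL mechanism linked in the comment above, and it may happen some time after the job becomes eligible.

```python
from datetime import datetime, timedelta, timezone

JOB_TTL_SECONDS = 3600  # default value of kamino.jobTtl in this commit


def cleanup_eligible_at(finished_at, ttl_seconds=JOB_TTL_SECONDS):
    """Earliest time at which a finished job may be cleaned up."""
    return finished_at + timedelta(seconds=ttl_seconds)


def is_expired(finished_at, now, ttl_seconds=JOB_TTL_SECONDS):
    """True once the TTL has fully elapsed since the job finished."""
    return now >= cleanup_eligible_at(finished_at, ttl_seconds)


# Hypothetical finish time, chosen only for illustration.
finished = datetime(2021, 6, 10, 12, 0, 0, tzinfo=timezone.utc)
print(cleanup_eligible_at(finished).isoformat())  # 2021-06-10T13:00:00+00:00
```

Until that time the completed pod remains available for inspection, which is the trade-off the commit message describes.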
2 changes: 1 addition & 1 deletion vmss-prototype/vmss-prototype
@@ -401,6 +401,7 @@ def image_tweaks(node_name):
# Give kubernetes a few moments to notice we are running
# as the rest of this stuff really just happens quickly
'/bin/sleep 4',
+# Update an ancestry.log
'/bin/echo "$(/bin/date) VMSS-Prototype Donor: $(/bin/hostname)" >> /var/log/ancestry.log',
# Multiple lines so it is easier to read all of the different
# items we are cleaning up (removing)
@@ -420,7 +421,6 @@ def image_tweaks(node_name):
' /var/lib/waagent/GoalState.*.xml'
' /var/lib/waagent/*.manifest.xml'
'',
-# Update an ancestry.log
# This forces the machine-id to be re-issued
'/bin/cp /dev/null /etc/machine-id',
# Finally, we need to power off now
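
The quoted lines in this hunk that have no trailing comma, ending in `''` followed by a comma, suggest that the script relies on Python's implicit concatenation of adjacent string literals to spread one long shell command over several lines. The sketch below illustrates that behaviour under the assumption that the commands are elements of a Python list. The leading `/bin/ls` is a hypothetical stand-in for whatever command the real script runs on those paths.

```python
# Sketch only: '/bin/ls' is a hypothetical stand-in for the real command.
commands = [
    '/bin/sleep 4',
    # No commas between these literals, so Python joins them into ONE string.
    '/bin/ls'
    ' /var/lib/waagent/GoalState.*.xml'
    ' /var/lib/waagent/*.manifest.xml'
    '',
    # The comma after the empty literal ends that element;
    # the next string is a separate command.
    '/bin/cp /dev/null /etc/machine-id',
]

for command in commands:
    print(command)
```

The trailing empty literal lets every path line keep the same form, so paths can be added or removed without touching the comma that ends the element.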