PROBLEM:
Currently, helm chart installation fails with the following values enabled:
tls.enabled = true
tls.certs.selfSigner.enabled = false
tls.certs.certManager = true
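In values.yaml form, these flags correspond roughly to the following override (a sketch; the nesting is inferred from the value paths above):
tls:
  enabled: true
  certs:
    selfSigner:
      enabled: false
    certManager: true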
With these values, when we try to create an instance after a successful operator installation, the StatefulSet is created but no pods are scheduled. Describing the StatefulSet gives the following output.
kubectl describe sts cockroachdb-sample -n openshift-operators
Name: cockroachdb-sample
Namespace: openshift-operators
CreationTimestamp: Wed, 04 Oct 2023 01:42:03 +0530
Selector: app.kubernetes.io/component=cockroachdb,app.kubernetes.io/instance=cockroachdb-sample,app.kubernetes.io/name=cockroachdb
Labels: app.kubernetes.io/component=cockroachdb
app.kubernetes.io/instance=cockroachdb-sample
app.kubernetes.io/managed-by=Helm
app.kubernetes.io/name=cockroachdb
helm.sh/chart=cockroachdb-11.2.1
Annotations: meta.helm.sh/release-name: cockroachdb-sample
meta.helm.sh/release-namespace: openshift-operators
Replicas: 3 desired | 0 total
Update Strategy: RollingUpdate
Pods Status: 0 Running / 0 Waiting / 0 Succeeded / 0 Failed
Pod Template:
Labels: app.kubernetes.io/component=cockroachdb
app.kubernetes.io/instance=cockroachdb-sample
app.kubernetes.io/name=cockroachdb
Service Account: cockroachdb-sample
Init Containers:
copy-certs:
Image: busybox
Port: <none>
Host Port: <none>
Command:
/bin/sh
-c
cp -f /certs/* /cockroach-certs/; chmod 0400 /cockroach-certs/*.key
Environment:
POD_NAMESPACE: (v1:metadata.namespace)
Mounts:
/certs/ from certs-secret (rw)
/cockroach-certs/ from certs (rw)
Containers:
db:
Image: cockroachdb/cockroach:v23.1.11
Ports: 26257/TCP, 8080/TCP
Host Ports: 0/TCP, 0/TCP
Args:
shell
-ecx
exec /cockroach/cockroach start --join=${STATEFULSET_NAME}-0.${STATEFULSET_FQDN}:26257,${STATEFULSET_NAME}-1.${STATEFULSET_FQDN}:26257,${STATEFULSET_NAME}-2.${STATEFULSET_FQDN}:26257 --advertise-host=$(hostname).${STATEFULSET_FQDN} --certs-dir=/cockroach/cockroach-certs/ --http-port=8080 --port=26257 --cache=25% --max-sql-memory=25% --logtostderr=INFO
Liveness: http-get https://:http/health delay=30s timeout=1s period=5s #success=1 #failure=3
Readiness: http-get https://:http/health%3Fready=1 delay=10s timeout=1s period=5s #success=1 #failure=2
Environment:
STATEFULSET_NAME: cockroachdb-sample
STATEFULSET_FQDN: cockroachdb-sample.openshift-operators.svc.cluster.local
COCKROACH_CHANNEL: kubernetes-helm
Mounts:
/cockroach/cockroach-certs/ from certs (rw)
/cockroach/cockroach-data/ from datadir (rw)
Volumes:
datadir:
Type: PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
ClaimName: datadir
ReadOnly: false
certs:
Type: EmptyDir (a temporary directory that shares a pod's lifetime)
Medium:
SizeLimit: <unset>
certs-secret:
Type: Projected (a volume that contains injected data from multiple sources)
SecretName: cockroachdb-node
SecretOptionalName: <nil>
Topology Spread Constraints: topology.kubernetes.io/zone:ScheduleAnyway when max skew 1 is exceeded for selector app.kubernetes.io/component=cockroachdb,app.kubernetes.io/instance=cockroachdb-sample,app.kubernetes.io/name=cockroachdb
Volume Claims:
Name: datadir
StorageClass: standard-csi
Labels: app.kubernetes.io/instance=cockroachdb-sample
app.kubernetes.io/name=cockroachdb
Annotations: <none>
Capacity: 2Gi
Access Modes: [ReadWriteOnce]
Events:
Type Reason Age From Message
---- ------ ---- ---- -------
Warning FailedCreate 1s (x13 over 22s) statefulset-controller create Pod cockroachdb-sample-0 in StatefulSet cockroachdb-sample failed error: pods "cockroachdb-sample-0" is forbidden: unable to validate against any security context constraint: [provider "anyuid": Forbidden: not usable by user or serviceaccount, provider restricted-v2: .spec.securityContext.fsGroup: Invalid value: []int64{1000}: 1000 is not an allowed group, provider restricted-v2: .initContainers[0].runAsUser: Invalid value: 1000: must be in the ranges: [1000400000, 1000409999], provider restricted-v2: .containers[0].runAsUser: Invalid value: 1000: must be in the ranges: [1000400000, 1000409999], provider "restricted": Forbidden: not usable by user or serviceaccount, provider "nonroot-v2": Forbidden: not usable by user or serviceaccount, pod.metadata.annotations[seccomp.security.alpha.kubernetes.io/pod]: Forbidden: seccomp may not be set, pod.metadata.annotations[container.seccomp.security.alpha.kubernetes.io/copy-certs]: Forbidden: seccomp may not be set, pod.metadata.annotations[container.seccomp.security.alpha.kubernetes.io/db]: Forbidden: seccomp may not be set, provider "hostmount-anyuid": Forbidden: not usable by user or serviceaccount, provider "machine-api-termination-handler": Forbidden: not usable by user or serviceaccount, provider "hostnetwork-v2": Forbidden: not usable by user or serviceaccount, provider "hostnetwork": Forbidden: not usable by user or serviceaccount, provider "hostaccess": Forbidden: not usable by user or serviceaccount, provider "node-exporter": Forbidden: not usable by user or serviceaccount, provider "privileged": Forbidden: not usable by user or serviceaccount]
This is an issue because the user defined in the securityContext of the StatefulSet's pod template in the helm chart does not have the permissions required for the workloads to be scheduled.
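Based on the SCC validation errors above, the conflicting parts of the rendered security contexts presumably look like this (a sketch reconstructed from the error message, not the exact chart output):
# pod-level security context (spec.securityContext)
securityContext:
  fsGroup: 1000        # restricted-v2: 1000 is not an allowed group
# container-level security context on the copy-certs and db containers
securityContext:
  runAsUser: 1000      # restricted-v2 requires a UID in [1000400000, 1000409999]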
We cannot define any SCCs in the CSV for the helm chart operator, since an SCC references a serviceAccount name that the user can provide separately in helm values, and the operator is created beforehand.
POSSIBLE SOLUTIONS:
As a workaround, we can apply a SecurityContextConstraint. OCP already provides a number of SCCs with predefined permissions, but none of them except the privileged one works for this service account; using it is not recommended, as it is the most relaxed SCC and should be reserved for cluster administration. (NOT RECOMMENDED)
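For reference, granting the privileged SCC to the instance's service account from the describe output above would look roughly like this (again, not recommended):
oc adm policy add-scc-to-user privileged -z cockroachdb-sample -n openshift-operators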
We can define a custom SCC as part of the chart templates, containing only the minimum permissions required to run all the cockroach-related workloads, and apply it conditionally when the installation is being done on an OCP cluster.
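A minimal sketch of what such a conditional template could look like, assuming detection via Helm's Capabilities object; the field values and the service-account reference are illustrative and would need to match what the chart actually renders:
{{- if .Capabilities.APIVersions.Has "security.openshift.io/v1" }}
# Hypothetical templates/scc.yaml: rendered only when the cluster exposes the
# OpenShift SCC API, i.e. when installing on OCP.
apiVersion: security.openshift.io/v1
kind: SecurityContextConstraints
metadata:
  name: {{ .Release.Name }}-scc
allowPrivilegedContainer: false
allowHostNetwork: false
allowHostPorts: false
allowHostPID: false
allowHostIPC: false
allowPrivilegeEscalation: false
requiredDropCapabilities:
- ALL
runAsUser:
  type: MustRunAsRange   # allow only the UID the chart sets (1000)
  uidRangeMin: 1000
  uidRangeMax: 1000
fsGroup:
  type: MustRunAs        # allow fsGroup 1000 from the pod securityContext
  ranges:
  - min: 1000
    max: 1000
supplementalGroups:
  type: RunAsAny
seLinuxContext:
  type: MustRunAs
seccompProfiles:
- runtime/default
volumes:
- configMap
- emptyDir
- projected
- secret
- persistentVolumeClaim
users:
# assumption: the service account name matches the release name; in the real
# template this should reference the same value the chart uses for the
# StatefulSet's serviceAccountName
- system:serviceaccount:{{ .Release.Namespace }}:{{ .Release.Name }}
{{- end }}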
@himanshu-cockroach Which OpenShift version did you first observe this issue on? And has it been observed in any OpenShift versions other than the one tested?
@harshn08 The last time I tested this was on 4.13. However, I don't think it has much to do with the OpenShift cluster version.
One thing I'm fairly certain about, though, is that this issue was most likely introduced after this PR went in.