Debugging
Start with Kubernetes status, then inspect operator logs, then compare with Omni.
Check resources
kubectl get omniconnections,omniclusters,omnicontrolplanes,omniworkers,omnimachines,omnikubeconfigexports,omnihelmreleases \
--namespace omni-cluster-operator-system
Use describe for condition messages and recent events:
kubectl describe omniconnection omni \
--namespace omni-cluster-operator-system
kubectl describe omnicluster edge \
--namespace omni-cluster-operator-system
kubectl describe omnikubeconfigexport cluster-01-automation-kubeconfig \
--namespace omni-cluster-operator-system
kubectl describe omnihelmrelease cluster-01-metrics-server \
--namespace omni-cluster-operator-system
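When `describe` output is long, a jsonpath query can reduce it to one line per condition. The following is a minimal sketch, assuming the operator publishes conditions under `.status.conditions` with the usual `type`, `status`, `reason`, and `message` fields (the common Kubernetes convention, not confirmed on this page). The function name `omni_conditions` is hypothetical.

```shell
# Hypothetical helper: print one tab-separated line per condition.
# Assumes conditions live under .status.conditions; adjust the path if
# the operator's CRDs use a different layout.
omni_conditions() {
  kind="$1"   # resource kind, e.g. omnicluster
  name="$2"   # resource name, e.g. edge
  kubectl get "$kind" "$name" \
    --namespace omni-cluster-operator-system \
    --output jsonpath='{range .status.conditions[*]}{.type}{"\t"}{.status}{"\t"}{.reason}{"\t"}{.message}{"\n"}{end}'
}
```

For example, `omni_conditions omnicluster edge` would list the conditions of the `edge` cluster, if the assumption about the status layout holds.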
Check logs
kubectl logs deployment/omni-cluster-operator-controller-manager \
--namespace omni-cluster-operator-system \
--all-containers
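The full log can be noisy on a busy operator. As a rough sketch, the same command can be limited to a recent time window and filtered for a resource name. `--since` and `--timestamps` are standard `kubectl logs` flags; the helper name `omni_operator_logs` and the one-hour default are illustrative choices, not part of the operator.

```shell
# Hypothetical helper: show recent operator log lines that mention a name.
omni_operator_logs() {
  pattern="$1"        # text to search for, e.g. a cluster name such as "edge"
  since="${2:-1h}"    # how far back to look; 1h is an arbitrary default
  kubectl logs deployment/omni-cluster-operator-controller-manager \
    --namespace omni-cluster-operator-system \
    --all-containers \
    --since="$since" \
    --timestamps \
    | grep -i -- "$pattern"
}
```

For example, `omni_operator_logs edge 30m` would show lines from the last 30 minutes that mention `edge`.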
Common conditions and reasons
| Condition | What it means |
|---|---|
| `Reachable` | The operator can reach the configured Omni endpoint with the service account key. |
| `Stalled` | Reconciliation has hit an error that GitOps health checks should treat as failed rather than still progressing. |
| `Accepted` | A child document references an existing OmniCluster. |
| `Validated` | The assembled Omni template passed upstream Omni validation. |
| `Synced` | The desired template was synced to Omni. |
| `Exported` | A requested workload-cluster kubeconfig Secret was written. |
| `Ready` | The resource is ready for normal use. |

| Reason | Likely cause |
|---|---|
| `MissingSecret` | The referenced Secret or key does not exist in the operator namespace. |
| `ConnectionFailed` | Omni endpoint, TLS, network, or credential failure. If the message mentions Omni environment assignments, store only the `OMNI_SERVICE_ACCOUNT_KEY` value in the referenced Secret key, not the whole copied env block. |
| `MissingConnection` | `OmniCluster.spec.connectionRef.name` points at a missing `OmniConnection`. |
| `MissingCluster` | A child resource points at a missing `OmniCluster`. |
| `ValidationFailed` | The rendered Omni cluster template is not accepted by Omni validation. |
| `SyncFailed` | Omni rejected or failed the create/update operation. |
| `ExportFailed` | Omni could not issue a kubeconfig, or the returned kubeconfig could not be parsed. |
| `Suspended` | `OmniCluster.spec.suspend` is `true`. |
| `Deleting` | The resource is waiting for remote cleanup before finalizer removal. |
| `DeleteFailed` | Omni deletion failed. |
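
For the `ConnectionFailed` case involving a copied env block, a quick heuristic can tell whether the stored value contains environment assignments rather than the bare key. This is a sketch only: the pattern assumes the env block contains assignments of the form `OMNI_...=`, and the helper name `omni_key_shape` is hypothetical.

```shell
# Hypothetical helper: read a decoded Secret value on stdin and report
# whether it looks like a copied env block or a bare key value.
# Heuristic: an env block contains assignments such as
# OMNI_SERVICE_ACCOUNT_KEY=...
omni_key_shape() {
  if grep -Eq 'OMNI_[A-Z_]+='; then
    echo "env-block"
  else
    echo "bare-value"
  fi
}

# Usage (the Secret name and key are placeholders, not names from this page):
#   kubectl get secret <secret-name> \
#     --namespace omni-cluster-operator-system \
#     --output jsonpath='{.data.<key>}' | base64 -d | omni_key_shape
```

A result of `env-block` suggests the Secret should be recreated with only the `OMNI_SERVICE_ACCOUNT_KEY` value.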
Admission failures
The chart installs validating webhooks. If kubectl apply fails before an object is stored, read the webhook error message first. Common invalid shapes include:
- Both `machines` and `machineClass` set on a control plane or worker set.
- Neither `machines` nor `machineClass` set.
- Reserved worker set name `control-planes`.
- Invalid version strings.
- Ambiguous inline and file-backed patch or manifest sources.
- Duplicate names in `OmniCluster.spec.kubernetes.manifests[]`.
- Invalid `OmniKubeconfigExport` fields, such as blank service-account groups, `renewBefore` greater than or equal to `ttl`, or `system:masters` without `serviceAccount.allowClusterAdmin: true`.
- Invalid `OmniHelmRelease` fields, such as malformed chart values, missing kubeconfig Secret references, or direct Helm credentials that do not have workload-cluster RBAC.
- Invalid `OmniSecretSync` fields, such as missing source or target Secret references, blank kubeconfig Secret keys, or unsupported deletion policies.
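
To see a webhook rejection without storing anything, a server-side dry run may help. `--dry-run=server` is a standard `kubectl apply` flag. Whether the operator's webhooks are invoked during dry-run depends on how they are registered, so treat this as a sketch under that assumption. The helper name `omni_validate` and the file name in the example are hypothetical.

```shell
# Hypothetical helper: submit a manifest to the API server for validation
# without persisting it. Assumes the validating webhooks take part in
# dry-run requests.
omni_validate() {
  manifest="$1"   # path to a YAML file, e.g. cluster.yaml (placeholder)
  kubectl apply --dry-run=server --filename "$manifest"
}
```

For example, `omni_validate cluster.yaml` would print either the dry-run result or the admission error message.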
Kubeconfig export issues
OmniKubeconfigExport creates and rotates a target Secret only after the parent cluster and connection are available.
Check the export status:
kubectl get omnikubeconfigexport cluster-01-automation-kubeconfig \
--namespace omni-cluster-operator-system \
--output yaml
Check the target Secret metadata and key:
kubectl get secret cluster-01-automation-kubeconfig \
--namespace omni-cluster-operator-system \
--output jsonpath='{.metadata.annotations}{"\n"}{.data.kubeconfig}' | head
Common causes:
- `MissingCluster`: `spec.clusterRef.name` does not match an `OmniCluster` in the same namespace.
- `MissingConnection`: the referenced cluster points at an unavailable `OmniConnection`.
- `ExportFailed`: Omni rejected the service-account kubeconfig request, credentials are invalid, or Omni returned data that is not a kubeconfig.
- Secret consumers are reading the wrong key. The default key is `kubeconfig`; custom keys come from `spec.targetSecretRef.key`.
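
To confirm that the Secret holds usable data under the expected key, the value can be decoded to a local file. This is a minimal sketch: the helper name `omni_fetch_kubeconfig` and the default output path are hypothetical, and keys containing dots would need escaping in the jsonpath expression.

```shell
# Hypothetical helper: decode the exported kubeconfig into a local file.
# Uses the default key "kubeconfig" unless a second argument is given.
omni_fetch_kubeconfig() {
  secret="$1"                         # e.g. cluster-01-automation-kubeconfig
  key="${2:-kubeconfig}"              # Secret key holding the kubeconfig
  out="${3:-./workload.kubeconfig}"   # output path (arbitrary default)
  data="$(kubectl get secret "$secret" \
    --namespace omni-cluster-operator-system \
    --output "jsonpath={.data.$key}")" || return 1
  if [ -z "$data" ]; then
    echo "key '$key' is missing or empty in Secret '$secret'" >&2
    return 1
  fi
  # Secret data is base64-encoded; decode it and restrict file permissions.
  printf '%s' "$data" | base64 -d > "$out" || return 1
  chmod 600 "$out"
  echo "$out"
}
```

The resulting file can then be checked locally, for example with `kubectl config view --kubeconfig ./workload.kubeconfig --minify`, which fails if the content does not parse as a kubeconfig.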
Direct Helm issues
OmniHelmRelease reads a workload-cluster kubeconfig Secret and runs Helm actions directly in that cluster.
If a release is not ready, check the release and the referenced Secret:
kubectl get omnihelmreleases,secrets \
--namespace omni-cluster-operator-system
kubectl describe omnihelmrelease cluster-01-metrics-server \
--namespace omni-cluster-operator-system
Common causes are missing kubeconfig Secret data, insufficient workload-cluster RBAC for the exported user or group, unreachable chart repositories, invalid chart versions, invalid values, or Helm wait timeouts.
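For the RBAC cause, the workload cluster can be asked directly whether the exported identity may perform an action. `kubectl auth can-i` is a standard subcommand. The sketch below assumes a kubeconfig file already decoded from the referenced Secret; the helper name `omni_can_i` is hypothetical, and the verbs and resources worth checking depend on the chart.

```shell
# Hypothetical helper: check one permission of the exported identity in
# the workload cluster.
omni_can_i() {
  kubeconfig="$1"   # path to a decoded workload-cluster kubeconfig
  verb="$2"         # e.g. create
  resource="$3"     # e.g. deployments
  namespace="$4"    # e.g. kube-system
  kubectl --kubeconfig "$kubeconfig" auth can-i "$verb" "$resource" \
    --namespace "$namespace"
}
```

For example, `omni_can_i ./workload.kubeconfig create deployments kube-system` prints `yes` or `no`.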
Secret sync issues
OmniSecretSync reads a management-cluster source Secret and writes a target Secret directly into the workload cluster using the referenced kubeconfig Secret.
If a sync is not ready, check the sync, source Secret, and kubeconfig Secret:
kubectl get omnisecretsyncs,secrets \
--namespace omni-cluster-operator-system
kubectl describe omnisecretsync cluster-01-ghcr \
--namespace omni-cluster-operator-system
Common causes are missing source Secret data, a missing or invalid kubeconfig Secret key, insufficient workload-cluster RBAC for Secret writes, a missing target namespace when createNamespace is false, or an immutable target Secret that cannot be updated.
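Two of these causes, a missing target namespace and an immutable target Secret, can be checked from the workload-cluster side. The following is a rough sketch, assuming a kubeconfig file already decoded from the referenced Secret. The helper name `omni_check_target_secret` and its output strings are hypothetical. A failed namespace lookup can also mean the identity lacks permission to read namespaces.

```shell
# Hypothetical helper: check for a missing target namespace and an
# immutable target Secret in the workload cluster.
omni_check_target_secret() {
  kubeconfig="$1"   # path to a decoded workload-cluster kubeconfig
  namespace="$2"    # target namespace
  secret="$3"       # target Secret name
  if ! kubectl --kubeconfig "$kubeconfig" get namespace "$namespace" >/dev/null 2>&1; then
    echo "namespace-missing-or-unreadable"
    return 1
  fi
  # .immutable is a standard Secret field; output is empty when the field
  # is unset or the Secret does not exist yet.
  immutable="$(kubectl --kubeconfig "$kubeconfig" get secret "$secret" \
    --namespace "$namespace" \
    --ignore-not-found \
    --output jsonpath='{.immutable}')"
  if [ "$immutable" = "true" ]; then
    echo "secret-immutable"
    return 1
  fi
  echo "ok"
}
```

An immutable Secret cannot be updated in place; it has to be deleted and recreated before new data can be written.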
Stuck deletion
OmniCluster uses a finalizer because it owns remote Omni lifecycle. If deletion is stuck:
- Check the operator pod is running.
- Describe the `OmniCluster` for `DeleteFailed` details.
- Check operator logs for Omni delete errors.
- Decide whether the remote Omni cluster should be deleted or orphaned.
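
Before working through these checks, it can help to confirm that deletion was actually requested and to see which finalizers remain. The sketch below reads two standard metadata fields. The helper name `omni_deletion_state` is hypothetical.

```shell
# Hypothetical helper: show whether deletion was requested and which
# finalizers are still present on an OmniCluster.
omni_deletion_state() {
  name="$1"   # OmniCluster name, e.g. edge
  kubectl get omnicluster "$name" \
    --namespace omni-cluster-operator-system \
    --output jsonpath='deletionTimestamp={.metadata.deletionTimestamp}{"\n"}finalizers={.metadata.finalizers}{"\n"}'
}
```

A non-empty `deletionTimestamp` together with a remaining finalizer indicates that the object is waiting for the operator to finish remote cleanup.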
To keep the remote Omni cluster and let Kubernetes deletion proceed, set orphan mode before deleting: