Platform services and cleanup state
This page summarizes the platform services that matter for OCP operations and separates active platform capabilities from retired or source-only app material.
Summary
Service matrix
| Area | Current state | Notes |
|---|---|---|
| Storage | Hubs are storage-light with LVMS. Spokes retain ODF. | Hub ODF/NooBaa and related stacks were removed; RHACS PVCs remain bound on LVMS. |
| Identity | WSO2 IS removed. Google and htpasswd remain by cluster role. | spoke-dc is Google-only; other clusters retain htpasswd plus Google where recorded. |
| AI | No user AI workloads found. | spoke-dr has RHOAI operator only; hub-dc has stale-looking RHOAI artifacts. |
| User workload metrics | Enabled only on spokes. | hub-dc and hub-dr are explicitly disabled; spoke-dc and spoke-dr run the user workload monitoring stack. |
| ACM Observability | Enabled on hubs for fleet metrics. | Both hubs are Ready. PVC requests were increased, but the lab still uses non-resilient lvms-vg1. |
| Tracing | spoke-dc OpenTelemetry collector is healthy. | The inactive bridge exporter was removed from the collector pipelines; the active trace path keeps both Tempo exporters. |
| Image pre-pull | Enabled on hubs as a short-term DR bridge. | openshift-image-prepull warms selected ACM, MCE, GitOps, RHACS, ACM Observability, and OADP images. A durable mirror/IDMS remains open. |
| Demo apps | No removable demo apps on spokes. | demo-orders was wiped from desired state. |
| OADP | General daily backups completed in the latest recorded run. | OADP backup health ServiceMonitors and alert rules now exist on all clusters. |
| Vault / external secrets | ESO installed and smoke-wired to external Vault. | All four clusters have a namespace-scoped SecretStore/rke2-vault and synced smoke ExternalSecret. Real app secrets still need namespace-specific Vault policies and roles. |
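
The user workload metrics row corresponds to one OpenShift setting: the enableUserWorkload flag in the cluster-monitoring-config ConfigMap in the openshift-monitoring namespace. The following is a minimal sketch, in Python for illustration only, that builds that ConfigMap per cluster from the states recorded in the table. The EXPECTED_UWM mapping and the function name are hypothetical; the actual repository may express this through overlays rather than generated manifests.

```python
import json

# Expected user workload monitoring state per cluster, taken from the
# service matrix. The mapping name is hypothetical.
EXPECTED_UWM = {
    "hub-dc": False,
    "hub-dr": False,
    "spoke-dc": True,
    "spoke-dr": True,
}


def cluster_monitoring_config(cluster: str) -> dict:
    """Build the cluster-monitoring-config ConfigMap for one cluster."""
    enabled = EXPECTED_UWM[cluster]
    # config.yaml is an embedded YAML string; with a single key it can be
    # written by hand without a YAML library.
    config_yaml = f"enableUserWorkload: {str(enabled).lower()}\n"
    return {
        "apiVersion": "v1",
        "kind": "ConfigMap",
        "metadata": {
            "name": "cluster-monitoring-config",
            "namespace": "openshift-monitoring",
        },
        "data": {"config.yaml": config_yaml},
    }


if __name__ == "__main__":
    for name in EXPECTED_UWM:
        print(json.dumps(cluster_monitoring_config(name), indent=2))
```

Writing the flag explicitly on the hubs, rather than omitting it, matches the table's note that hub-dc and hub-dr are explicitly disabled.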
Hub observability
ACM Observability placement
- ACM Observability is a hub management service and is separate from OpenShift user workload monitoring.
- The operator creates the open-cluster-management-observability namespace, Thanos components, Alertmanager, collectors, and ACM Grafana dashboards.
- Dedicated external MinIO buckets are used: acm-observability-hub-dc and acm-observability-hub-dr.
- The object-store secret thanos-object-storage is seeded out-of-band per hub and is not committed to Git.
- The lab uses lvms-vg1 for ACM Observability PVCs, now sized as Alertmanager/rule 2Gi and compact/receive/store 20Gi. This is acceptable for the lab, but production should use resilient storage for Thanos stateful components.
- Grafana on the hubs is part of ACM Observability, not a standalone app dashboard install.
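
The storage settings above can be sketched as a MultiClusterObservability resource. This is an illustrative Python rendering, not the manifest actually committed for the hubs. The field names follow the v1beta2 API as commonly documented and should be verified against the installed CRD. The resource name observability and the secret key thanos.yaml are assumptions; only the secret name, storage class, and sizes come from this page.

```python
import json


def multicluster_observability(storage_class: str = "lvms-vg1") -> dict:
    """Sketch of a MultiClusterObservability CR with the lab PVC sizes."""
    return {
        "apiVersion": "observability.open-cluster-management.io/v1beta2",
        "kind": "MultiClusterObservability",
        "metadata": {"name": "observability"},  # assumed resource name
        "spec": {
            "observabilityAddonSpec": {},
            "storageConfig": {
                # Secret seeded out-of-band per hub; the key name is assumed.
                "metricObjectStorage": {
                    "name": "thanos-object-storage",
                    "key": "thanos.yaml",
                },
                # Non-resilient in the lab; production should swap this.
                "storageClass": storage_class,
                "alertmanagerStorageSize": "2Gi",
                "ruleStorageSize": "2Gi",
                "compactStorageSize": "20Gi",
                "receiveStorageSize": "20Gi",
                "storeStorageSize": "20Gi",
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(multicluster_observability(), indent=2))
```

Passing a different storage_class argument is the only change this sketch needs to model the production recommendation of resilient storage for the Thanos stateful components.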
Secrets
Vault direction
For this fleet, the selected pattern is to run Vault outside the OpenShift clusters as an HA secrets service, then connect each cluster through its own Kubernetes auth mount, policies, and short-lived roles. The current OpenShift-facing Vault endpoint is vault-rke2.apps.sub.comptech-lab.com.
External Secrets Operator is now installed on all four clusters and smoke-wired through components/platform/external-secrets-vault. The smoke path validates SecretStore/rke2-vault, ExternalSecret/eso-vault-smoke, and target Secret key creation on hub-dc, hub-dr, spoke-dc, and spoke-dr. No Vault token, TokenReview JWT, or application secret value is committed to Git.
Use per-cluster Vaults only when clusters must survive long isolation windows, have separate trust boundaries, or cannot depend on a shared secrets control plane. For real application onboarding, create namespace-specific Vault policies and roles before committing non-secret SecretStore or ExternalSecret references.
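As a rough illustration of what the non-secret references look like, the sketch below builds a namespace-scoped SecretStore and an ExternalSecret as Python dictionaries. Only the store name rke2-vault, the smoke ExternalSecret name eso-vault-smoke, and the Vault hostname come from this page. The https scheme, the KV mount path, the per-cluster auth mount naming, the service account name, the Vault key, and the apiVersion (which depends on the installed ESO release) are all assumptions made for the example.

```python
import json

# Hostname from this page; the https scheme is an assumption.
VAULT_SERVER = "https://vault-rke2.apps.sub.comptech-lab.com"


def secret_store(namespace: str, cluster: str, role: str) -> dict:
    """Namespace-scoped SecretStore using Vault Kubernetes auth."""
    return {
        "apiVersion": "external-secrets.io/v1beta1",  # may differ by ESO version
        "kind": "SecretStore",
        "metadata": {"name": "rke2-vault", "namespace": namespace},
        "spec": {"provider": {"vault": {
            "server": VAULT_SERVER,
            "path": "secret",  # hypothetical KV mount
            "version": "v2",
            "auth": {"kubernetes": {
                # Hypothetical naming for the per-cluster auth mount.
                "mountPath": f"kubernetes-{cluster}",
                "role": role,
                "serviceAccountRef": {"name": "eso-vault"},  # hypothetical
            }},
        }}},
    }


def external_secret(namespace: str, name: str, vault_key: str, prop: str) -> dict:
    """ExternalSecret that syncs one Vault property into a target Secret."""
    return {
        "apiVersion": "external-secrets.io/v1beta1",
        "kind": "ExternalSecret",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "refreshInterval": "1h",
            "secretStoreRef": {"kind": "SecretStore", "name": "rke2-vault"},
            "target": {"name": name},
            "data": [{
                "secretKey": prop,
                "remoteRef": {"key": vault_key, "property": prop},
            }],
        },
    }


if __name__ == "__main__":
    ns = "example-app"  # hypothetical namespace
    print(json.dumps(secret_store(ns, "spoke-dc", f"{ns}-reader"), indent=2))
    print(json.dumps(external_secret(ns, "eso-vault-smoke", "smoke/test", "value"), indent=2))
```

Neither object carries a token or secret value, which is why both are safe to commit once the matching Vault policy and role exist.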
Retired or absent
What should not reappear accidentally
- WSO2 IS OAuth integration and its client secret on spoke-dc.
- demo-orders namespace and workload resources.
- Non-core application middleware previously tracked with demo workload material on the OCP spokes.
- Inactive tracing bridge exporters in openshift-tracing/otel unless the destination service is deployed and ready.
- Hub ODF/NooBaa, hub app logging/tracing test stacks, and hub MinIO/Loki/Tempo stacks removed by storage-light cleanup.
- User workload Prometheus and Thanos ruler pods on the management hubs.
- Any assumption that pre-pull replaces image mirroring; the durable recovery path still needs a DR-reachable registry mirror.
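
One way to catch accidental reappearance is a check over rendered manifests before they are synced. The sketch below is a hypothetical guard and is not part of the current tooling. It covers only two items from the list: the demo-orders namespace on any cluster, and ODF/NooBaa resources on the hubs. The kinds StorageCluster and NooBaa are used as illustrative markers for those stacks, and the function and constant names are invented for the example.

```python
RETIRED_NAMESPACES = {"demo-orders"}
HUB_CLUSTERS = {"hub-dc", "hub-dr"}
# Illustrative markers for hub ODF/NooBaa; spokes retain ODF, so these
# kinds are flagged on hubs only.
HUB_RETIRED_KINDS = {"StorageCluster", "NooBaa"}


def find_retired(cluster: str, resources: list[dict]) -> list[str]:
    """Return one finding per rendered resource that matches a retired item."""
    findings = []
    for res in resources:
        kind = res.get("kind", "")
        meta = res.get("metadata", {})
        name = meta.get("name", "")
        namespace = meta.get("namespace", "")
        in_retired_ns = namespace in RETIRED_NAMESPACES
        is_retired_ns = kind == "Namespace" and name in RETIRED_NAMESPACES
        if in_retired_ns or is_retired_ns:
            findings.append(f"{cluster}: {kind}/{name} uses a retired namespace")
        elif cluster in HUB_CLUSTERS and kind in HUB_RETIRED_KINDS:
            findings.append(f"{cluster}: {kind}/{name} is retired on hubs")
    return findings


if __name__ == "__main__":
    rendered = [
        {"kind": "Namespace", "metadata": {"name": "demo-orders"}},
        {"kind": "StorageCluster",
         "metadata": {"name": "example", "namespace": "openshift-storage"}},
    ]
    for line in find_retired("hub-dc", rendered):
        print(line)
```

The same input produces a single finding for spoke-dc, because spokes retain ODF and only the demo-orders namespace is retired there.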
Operational note
Source-only app bases
Some app bases may remain in source repositories as templates or retained examples. They are not live workloads unless a cluster overlay includes them. Current rendered spoke overlays do not deploy candidate demo apps.