# SCS Hackathon 2023 - Container Layer Monitoring

Idea/Proposal:

* Central management Kubernetes cluster for monitoring the managed Kubernetes clusters
* May be the same mgmt cluster used for the Identity Provider (Zitadel/Keycloak)
* The cluster runs a time-series database, Loki, Blackbox Exporter (e.g. to check CAPI endpoints; see the probe sketch at the end of these notes), etc.
* Which TSDB to use? Prometheus with Thanos? Cortex, Mimir, something else?
* The existing (IaaS) Grafana can be used to access these datasources (see the provisioning sketch below)
* Could utilize the dNation dashboard design
* When a customer requests a managed cluster, it is provisioned with several default exporters that provide the metrics cloud operators need to check cluster health (see the scrape/remote-write sketch below)
* Are the exporters directly accessed by users in the cluster? How should they be handled when the cluster leaves the managed state?
* Exporters can include node-exporter, kubelet/apiserver/... metrics, and cAdvisor (which needs to be configured properly, i.e. with regard to resource consumption)
* It needs to be specified which data should be scraped so the exporters can be set up accordingly
* The exporter endpoints should also be accessible to the customer so they can set up their own monitoring solution for their applications
* Monitoring data should be as much as needed and as little as possible
* The customer should not be charged much for the resources needed for the monitoring itself

Open Points:

* Which metrics are really useful for cloud providers to fulfill SLAs?
* Which components to use?
* How and what should run on the mgmt Kubernetes cluster, and how does tenant separation work?
* Are custom node images needed? How are the tools installed on the customer clusters?
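
As a rough illustration of the Blackbox Exporter idea above, the following Prometheus scrape job on the central mgmt cluster probes the API endpoints of managed clusters. This is a minimal sketch: the target URLs, the `http_2xx` module name, and the `blackbox-exporter.monitoring.svc:9115` address are placeholders/assumptions.

```yaml
scrape_configs:
  - job_name: capi-endpoint-probes
    metrics_path: /probe
    params:
      module: [http_2xx]                # Blackbox module (assumed to be defined in blackbox.yml)
    static_configs:
      - targets:                        # placeholder CAPI endpoints of managed clusters
          - https://capi-cluster-a.example.com:6443/healthz
          - https://capi-cluster-b.example.com:6443/healthz
    relabel_configs:
      - source_labels: [__address__]    # pass the real target as ?target=... to the exporter
        target_label: __param_target
      - source_labels: [__param_target] # keep the probed endpoint as the instance label
        target_label: instance
      - target_label: __address__       # scrape the Blackbox Exporter itself
        replacement: blackbox-exporter.monitoring.svc:9115
```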
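
To reuse the existing (IaaS) Grafana, the central TSDB could be added as a provisioned datasource. This sketch assumes Thanos Query running in a `monitoring` namespace on the mgmt cluster; the name, URL, and namespace are placeholders, and a Mimir/Cortex query frontend would be configured the same way.

```yaml
# Grafana datasource provisioning file, e.g. under /etc/grafana/provisioning/datasources/
apiVersion: 1
datasources:
  - name: SCS managed clusters
    type: prometheus               # Thanos Query / Mimir / Cortex speak the Prometheus query API
    access: proxy
    url: http://thanos-query.monitoring.svc:9090   # assumed in-cluster service address
    isDefault: false
    jsonData:
      timeInterval: 30s            # align with the scrape interval used on the managed clusters
```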
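
For the "default exporters per managed cluster" and "as much as needed, as little as possible" points, a per-cluster Prometheus (or Prometheus agent) configuration could look roughly like the sketch below. The metric allow-list, the remote-write URL, and the `X-Scope-OrgID` tenant value are assumptions; the tenant header shown matches Mimir/Cortex, while Thanos Receive would use its own tenant header.

```yaml
scrape_configs:
  - job_name: node-exporter
    kubernetes_sd_configs:
      - role: endpoints
    relabel_configs:
      - source_labels: [__meta_kubernetes_endpoints_name]
        regex: node-exporter            # keep only node-exporter endpoints
        action: keep
    metric_relabel_configs:
      # keep only a small allow-list of series relevant for cluster health / SLAs
      - source_labels: [__name__]
        regex: node_(cpu_seconds_total|memory_MemAvailable_bytes|filesystem_avail_bytes|load1)
        action: keep

remote_write:
  # ship the reduced metric set to the central TSDB on the mgmt cluster
  - url: https://metrics.mgmt.example.com/api/v1/push   # assumed Mimir/Cortex push endpoint
    headers:
      X-Scope-OrgID: customer-cluster-01                # one tenant per managed cluster
```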