Ensure that all products correctly expose metrics

## Description 

> [!NOTE]
>  In SDP 25.7 we [finished the initial rollout](https://github.com/stackabletech/issues/issues/692) of our Listener operator.

Ensure all Stackable operators correctly expose Prometheus metrics. During the listener rollout (hdfs → kafka → all other operators), we established a pattern for how metrics are to be exposed (`metrics` service, labels, etc., see below), but we're not sure whether we followed these practices consistently.

Context:
 * https://github.com/stackabletech/decisions/issues/51
   * This is what every operator is supposed to implement
 * https://github.com/stackabletech/issues/issues/692

This issue is to make sure all products correctly expose metrics according to the decision mentioned above.

## Tasks

For every product, check that:
1. A metrics service exists
   - It has the label `prometheus.io/scrape=true`
   - It has the corresponding annotations `prometheus.io/scheme`, `prometheus.io/port` and `prometheus.io/path`
   - It only exposes the metrics port, no data
   - The port is called `metrics`
1. No other service exposes a port `metrics`
1. No other service has a `prometheus.io/scrape=true` label
1. All *metric* services have a correct `app.kubernetes.io/name` value that is appropriate for the service in question (`listener` is not appropriate, `kafka` would be for example).
   - This is important, as this label is carried over into the Prometheus metrics.
   - Services created by listener-operator have "wrong" labels, e.g. `app.kubernetes.io/name=listener`. This is not good, but out of scope for this issue.
1. The Pod has the `metrics` port (if possible - it could be the case that the port number clashes with e.g. HTTP - which k8s doesn't like for some reason)
1. JMX Exporter: Check `<role>.yaml` in docker images
   - Do they still work properly? E.g. for Kafka we use [2.0.0](https://github.com/prometheus/jmx_exporter/blob/main/examples/kafka-2_0_0.yml)
   - Are any updates / improvements available?
   - Are all metrics there or do we lose any due to filtering etc.?
1. The monitoring stack still collects the metrics out of the box and uses native metrics wherever possible. (This was originally done in https://github.com/stackabletech/demos/pull/284; for this issue we only need to make sure we don't break it.)
1. Ideally all of the products work with a single ServiceMonitor similar to https://github.com/stackabletech/demos/blob/b72cee51ef8231c583bde26dde0bd5ab60d2381e/stacks/monitoring/prometheus-service-monitors.yaml#L171-L220
1. Documentation is updated with new names and any changes done
1. Release notes for breaking changes including migration paths have been written
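
The Service-level checks in the list above could be scripted roughly as follows. This is only a sketch, not an agreed-upon tool: it assumes the Service manifests are already available as plain dictionaries (e.g. parsed from `kubectl get svc -o json`), and the function names, the example Service name, and the port numbers are hypothetical.

```python
# Sketch of the Service-level checks from the task list.
# All names below (functions, example Service, port numbers) are illustrative.

REQUIRED_ANNOTATIONS = (
    "prometheus.io/scheme",
    "prometheus.io/port",
    "prometheus.io/path",
)


def check_metrics_service(service: dict) -> list[str]:
    """Return the checklist violations for one metrics Service manifest."""
    problems = []
    metadata = service.get("metadata", {})
    labels = metadata.get("labels") or {}
    annotations = metadata.get("annotations") or {}
    ports = service.get("spec", {}).get("ports") or []

    if labels.get("prometheus.io/scrape") != "true":
        problems.append("missing label prometheus.io/scrape=true")
    for key in REQUIRED_ANNOTATIONS:
        if key not in annotations:
            problems.append(f"missing annotation {key}")
    # The metrics Service must expose exactly one port, named `metrics`.
    if [p.get("name") for p in ports] != ["metrics"]:
        problems.append("must expose only a port named metrics")
    # `listener` (or no value at all) is not an appropriate product name.
    if labels.get("app.kubernetes.io/name") in (None, "listener"):
        problems.append("inappropriate app.kubernetes.io/name")
    return problems


def check_other_service(service: dict) -> list[str]:
    """Return the violations for a Service that is NOT the metrics Service."""
    problems = []
    labels = service.get("metadata", {}).get("labels") or {}
    ports = service.get("spec", {}).get("ports") or []

    if any(p.get("name") == "metrics" for p in ports):
        problems.append("non-metrics service exposes a metrics port")
    if labels.get("prometheus.io/scrape") == "true":
        problems.append("non-metrics service has prometheus.io/scrape=true")
    return problems


# Hypothetical manifest that satisfies the Service-level checks.
good_service = {
    "metadata": {
        "name": "simple-kafka-broker-default-metrics",
        "labels": {
            "prometheus.io/scrape": "true",
            "app.kubernetes.io/name": "kafka",
        },
        "annotations": {
            "prometheus.io/scheme": "http",
            "prometheus.io/port": "9606",
            "prometheus.io/path": "/metrics",
        },
    },
    "spec": {"ports": [{"name": "metrics", "port": 9606}]},
}

if __name__ == "__main__":
    print(check_metrics_service(good_service))
```

The Pod-level port check, the JMX Exporter configuration review, and the ServiceMonitor behaviour are not covered by this sketch and still need to be verified per product.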

### Products to check/fix
- [ ] https://github.com/stackabletech/airflow-operator/pull/698
- [ ] https://github.com/stackabletech/druid-operator/pull/761
- [ ] https://github.com/stackabletech/hbase-operator/pull/701
- [ ] https://github.com/stackabletech/demos/pull/316
- [ ] https://github.com/stackabletech/hdfs-operator/pull/721
- [ ] https://github.com/stackabletech/hdfs-operator/pull/726
- [ ] https://github.com/stackabletech/hive-operator/pull/641
- [ ] https://github.com/stackabletech/kafka-operator/pull/897
- [ ] https://github.com/stackabletech/nifi-operator/pull/855
- [x] OpenSearch (@siegfriedweber)
      No dedicated metrics service exists, because the metrics are provided via the default API endpoint. However, the default service is annotated with the mentioned Prometheus annotations.
- [ ] https://github.com/stackabletech/spark-k8s-operator/pull/619
- [ ] https://github.com/stackabletech/superset-operator/pull/671
- [ ] https://github.com/stackabletech/trino-operator/pull/807
- [ ] https://github.com/stackabletech/zookeeper-operator/pull/978
- [ ] https://github.com/stackabletech/opa-operator/pull/767

### Consolidate Monitoring Stack
- [ ] https://github.com/stackabletech/demos/pull/324
