flyteorg/flyte

[flyte2] Executor: expose promutils (default-registry) metrics on the controller-runtime metrics endpoint

Open

#7453 opened on May 29, 2026

Labels: flyte2, good first issue

Description

Part of #7445.

Summary

The executor exposes a Prometheus /metrics endpoint via the controller-runtime metrics server, but that server serves only controller-runtime's own registry. Metrics registered through promutils live on the default Prometheus registry, so they are registered and updated but never exposed to a scraper. Bridge the two registries so the executor:* scoped metrics actually show up.

Background

executor/setup.go:75-91 configures the controller-runtime metrics server:

metricsServerOptions := metricsserver.Options{ BindAddress: cfg.MetricsBindAddress, ... } // default ":10254" (config.go:15)
mgr, err := ctrl.NewManager(sc.K8sConfig, ctrl.Options{ Metrics: metricsServerOptions, ... })

The controller-runtime metrics server serves the registry defined in sigs.k8s.io/controller-runtime/pkg/metrics, which holds the controller-runtime built-ins plus the Go/process collectors.

Meanwhile promutils registers every metric on the default Prometheus registry — see flytestdlib/promutils/scope.go:269-419, all using prometheus.Register(...) (i.e. prometheus.DefaultRegisterer). The executor wires several such scopes:

  • promutils.NewScope("executor") for the webhook (setup.go:112)
  • promutils.NewScope("executor:storage") for the data store (setup.go:116)
  • promutils.NewScope("executor") for plugins (setup.go:124)
  • promutils.NewScope("executor:catalog") for the catalog client (setup.go:142)

Nothing bridges the default registry into the controller-runtime registry, so none of these executor:* metrics are exposed on :10254. They are registered and updated, but no scraper can see them today.

What to do

Make the default-registry metrics scrapeable on the metrics endpoint. Options (pick the cleanest; discuss in the issue):

  1. Bridge the default registry into controller-runtime's registry: e.g. have controller-runtime's metrics server also gather from prometheus.DefaultGatherer, or register the relevant collectors into ctrlmetrics.Registry.
  2. Point promutils at controller-runtime's registry — less ideal, would require a promutils change.
  3. Serve a merged handler — expose a handler that gathers from both registries.

Acceptance criteria

  • Scraping the executor metrics endpoint (:10254 by default) returns both controller-runtime built-ins and the executor:* / executor_* metrics registered via promutils (storage, catalog, webhook).
  • No duplicate-registration panics at startup.
  • Verification: the PR description shows a curl <metricsBindAddress>/metrics excerpt (or a test) demonstrating an executor-scoped metric appears.
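If the verification is done as a test rather than a curl excerpt, the check could be as small as the sketch below, which uses only the standard library. It scans Prometheus text exposition output for sample names with a given prefix. The helper names metricNames and hasMetricWithPrefix are hypothetical, and the parsing is intentionally simplistic (it assumes one sample per line and ignores comment lines), so treat it as an illustration rather than a full parser.

```go
package main

import (
	"bufio"
	"strings"
)

// metricNames returns the distinct sample names found in Prometheus text
// exposition output. Comment lines (# HELP, # TYPE) and blank lines are skipped.
func metricNames(body string) map[string]bool {
	names := make(map[string]bool)
	sc := bufio.NewScanner(strings.NewReader(body))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		// The name ends at the label block or at the first whitespace.
		end := strings.IndexAny(line, "{ \t")
		if end < 0 {
			end = len(line)
		}
		names[line[:end]] = true
	}
	return names
}

// hasMetricWithPrefix reports whether any sample name starts with prefix.
func hasMetricWithPrefix(body, prefix string) bool {
	for name := range metricNames(body) {
		if strings.HasPrefix(name, prefix) {
			return true
		}
	}
	return false
}
```

A test could fetch the body from the metrics endpoint and assert that hasMetricWithPrefix(body, "executor:") or hasMetricWithPrefix(body, "executor_") is true, alongside a check for a controller-runtime built-in.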

Pointers

  • executor/setup.go:75-142 — metrics server options, manager creation, and the promutils scopes.
  • flytestdlib/promutils/scope.go:269-419 — confirms promutils uses prometheus.Register (default registry).
  • controller-runtime metrics docs: https://book.kubebuilder.io/reference/metrics

Notes for contributors

  • This is the highest-impact executor metrics fix: the metrics already exist, they're just not exposed.
  • Coordinate with #7446 (which adds /metrics to the non-controller services) for naming consistency, but this issue is executor-specific and can proceed independently.
