This document is for an older version of Crossplane.

This document applies to Crossplane version v1.17 and not to the latest release v1.18.

Crossplane produces Prometheus-style metrics for monitoring and alerting in your environment. These metrics help you identify and resolve potential issues. This page explains the metrics that Crossplane exposes. Understanding them helps you maintain the health and performance of your resources. This document focuses on Crossplane-specific metrics and doesn’t cover standard Go metrics.

To enable metrics export, set the metrics.enabled value to true in the Helm chart, either by passing --set metrics.enabled=true on the command line or by adding the following to your values file:

metrics:
  enabled: true

These Prometheus annotations expose the metrics:

prometheus.io/path: /metrics
prometheus.io/port: "8080"
prometheus.io/scrape: "true"
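
As an illustration of how a Prometheus server could consume these annotations, the following is a minimal sketch of an annotation-driven scrape job. It follows the commonly used Kubernetes pod discovery pattern and isn't taken from the Crossplane documentation. The job name and the crossplane-system namespace are assumptions; adjust them to match your installation.

```yaml
# Sketch only: job name and namespace are assumptions, not Crossplane defaults.
scrape_configs:
  - job_name: crossplane
    kubernetes_sd_configs:
      - role: pod
        namespaces:
          names:
            - crossplane-system
    relabel_configs:
      # Keep only pods annotated with prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use the prometheus.io/path annotation as the metrics path.
      - source_labels: [__meta_kubernetes_pod_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Rewrite the scrape address to use the prometheus.io/port annotation.
      - source_labels: [__address__, __meta_kubernetes_pod_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
```

If you run the Prometheus Operator instead of a hand-written configuration, the same selection is usually expressed with a PodMonitor or ServiceMonitor rather than relabeling rules.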

| Metric Name | Description | Further Explanation |
| --- | --- | --- |
| certwatcher_read_certificate_errors_total | Total number of certificate read errors | |
| certwatcher_read_certificate_total | Total number of certificate reads | |
| composition_run_function_seconds_bucket | Histogram of RunFunctionResponse latency (seconds) | |
| controller_runtime_active_workers | Number of used workers per controller | The number of threads processing jobs from the work queue. |
| controller_runtime_max_concurrent_reconciles | Maximum number of concurrent reconciles per controller | Describes how many reconciles can happen in parallel. |
| controller_runtime_reconcile_errors_total | Total number of reconciliation errors per controller | A counter of reconcile errors. A sharp or sustained rise in this metric may indicate a problem. |
| controller_runtime_reconcile_time_seconds_bucket | Length of time per reconciliation per controller | |
| controller_runtime_reconcile_total | Total number of reconciliations per controller | |
| controller_runtime_webhook_latency_seconds_bucket | Histogram of the latency of processing admission requests | |
| controller_runtime_webhook_requests_in_flight | Current number of admission requests being served | |
| controller_runtime_webhook_requests_total | Total number of admission requests by HTTP status code | |
| rest_client_requests_total | Number of HTTP requests, partitioned by status code, method, and host | |
| workqueue_adds_total | Total number of adds handled by workqueue | |
| workqueue_depth | Current depth of workqueue | |
| workqueue_longest_running_processor_seconds | The number of seconds the longest running processor for workqueue has been running | |
| workqueue_queue_duration_seconds_bucket | How long in seconds an item stays in workqueue before being requested | The time from the moment a job enters the workqueue until the processing of this job starts. |
| workqueue_retries_total | Total number of retries handled by workqueue | |
| workqueue_unfinished_work_seconds | The number of seconds of work done that’s in progress and hasn’t been observed by work_duration. Large values mean stuck threads. | |
| workqueue_work_duration_seconds_bucket | How long in seconds processing an item from workqueue takes | The time from the moment the job starts until it finishes (either successfully or with an error). |
| crossplane_managed_resource_exists | The number of managed resources that exist | |
| crossplane_managed_resource_ready | The number of managed resources in Ready=True state | |
| crossplane_managed_resource_synced | The number of managed resources in Synced=True state | |
| upjet_resource_ext_api_duration_bucket | Measures in seconds how long it takes a Cloud SDK call to complete | |
| upjet_resource_external_api_calls_total | The number of external API calls | The number of calls to cloud providers, with labels describing the endpoints resources. |
| upjet_resource_reconcile_delay_seconds_bucket | Measures in seconds how long the reconciles for a resource are delayed relative to the configured poll period | |
| crossplane_managed_resource_deletion_seconds_bucket | The time it took to delete a managed resource | |
| crossplane_managed_resource_first_time_to_readiness_seconds_bucket | The time it took for a managed resource to become ready for the first time after creation | |
| crossplane_managed_resource_first_time_to_reconcile_seconds_bucket | The time it took for the controller to detect a managed resource | |
| upjet_resource_ttr_bucket | Measures in seconds the time-to-readiness (TTR) for managed resources | |
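
To show how some of these metrics might be used for alerting, the following is a hedged sketch of a Prometheus rule file. The alert names, thresholds, and durations are hypothetical and need tuning for your environment. The sketch also assumes the usual label names: controller on the controller_runtime metrics, name on the workqueue metrics, and gvk on the crossplane_managed_resource metrics. Verify these against your own /metrics output before relying on them.

```yaml
# Sketch only: alert names, thresholds, and label names are assumptions.
groups:
  - name: crossplane-example
    rules:
      # Sustained reconcile errors for a controller.
      - alert: CrossplaneReconcileErrorsRising
        expr: sum by (controller) (rate(controller_runtime_reconcile_errors_total[5m])) > 0.1
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Controller {{ $labels.controller }} reports sustained reconcile errors"
      # 99th percentile time an item waits in the workqueue before processing starts.
      - alert: CrossplaneWorkqueueWaitHigh
        expr: histogram_quantile(0.99, sum by (le, name) (rate(workqueue_queue_duration_seconds_bucket[5m]))) > 10
        for: 15m
        labels:
          severity: warning
        annotations:
          summary: "Workqueue {{ $labels.name }} items wait a long time before processing"
      # Managed resources that exist but aren't in Ready=True state.
      - alert: CrossplaneManagedResourcesNotReady
        expr: sum by (gvk) (crossplane_managed_resource_exists) - sum by (gvk) (crossplane_managed_resource_ready) > 0
        for: 30m
        labels:
          severity: info
        annotations:
          summary: "Some {{ $labels.gvk }} managed resources aren't ready"
```

The rate() and histogram_quantile() expressions apply to the counter and _bucket metrics respectively. Gauges such as workqueue_depth or crossplane_managed_resource_exists can be compared directly.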