Troubleshoot Crossplane

This document is for an unreleased version of Crossplane.

This document applies to the Crossplane master branch and not to the latest release v1.18.

Requested Resource Not Found

If you use the Crossplane CLI to install a Provider or Configuration (for example, crossplane install provider xpkg.upbound.io/crossplane-contrib/provider-aws:v0.33.0) and get the error "the server could not find the requested resource", the Crossplane CLI you’re using is most likely outdated. In other words, a Crossplane API has graduated from alpha to beta or stable, and the old CLI isn’t aware of the change.

Resource Status and Conditions

Most Crossplane resources have a status section that represents the current state of that resource. Running kubectl describe against a Crossplane resource often gives useful information about its condition. For example, to determine the status of a GCP CloudSQLInstance managed resource, run kubectl describe against it.

kubectl describe cloudsqlinstance my-db
Status:
  Conditions:
    Last Transition Time:  2019-09-16T13:46:42Z
    Reason:                Creating
    Status:                False
    Type:                  Ready

Most Crossplane resources set the Ready condition. Ready represents the availability of the resource - whether it’s creating, deleting, available, unavailable, binding, etc.
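
To read only the Ready condition instead of scanning the full kubectl describe output, a small helper can be sketched as follows. The function names ready_condition and wait_ready are hypothetical, and the 120s default timeout is an arbitrary assumption; only standard kubectl get and kubectl wait options are used.

```shell
# Print "<status> <reason>" of a resource's Ready condition.
# Usage: ready_condition <kind> <name>
# (hypothetical helper name, not part of kubectl or Crossplane)
ready_condition() {
  kubectl get "$1" "$2" \
    -o jsonpath='{range .status.conditions[?(@.type=="Ready")]}{.status}{" "}{.reason}{"\n"}{end}'
}

# Block until the resource reports Ready=True, or fail after the timeout.
# Usage: wait_ready <kind> <name> [timeout]   (timeout default is an assumption)
wait_ready() {
  kubectl wait --for=condition=Ready "$1/$2" --timeout="${3:-120s}"
}
```

For the CloudSQLInstance shown above, ready_condition cloudsqlinstance my-db would print the Status and Reason fields of the Ready condition on one line.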

Resource Events

Most Crossplane resources emit events when something interesting happens. You can see the events associated with a resource by running kubectl describe - for example, kubectl describe cloudsqlinstance my-db. You can also see all events in a particular namespace by running kubectl get events.

Events:
  Type     Reason                   Age                From                                                   Message
  ----     ------                   ----               ----                                                   -------
  Warning  CannotConnectToProvider  16s (x4 over 46s)  managed/postgresqlserver.database.azure.crossplane.io  cannot get referenced ProviderConfig: ProviderConfig.azure.crossplane.io "default" not found

Note that events are namespaced, while many Crossplane resources (XRs, etc) are cluster scoped. Crossplane emits events for cluster scoped resources to the ‘default’ namespace.
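
Because events for cluster scoped resources land in the default namespace, it can help to filter events by the object they refer to. The following is a sketch using kubectl field selectors; the helper name events_for is hypothetical, and the kind must be spelled exactly as it appears in the object (for example XExampleApp).

```shell
# List events for a single object, oldest first.
# Usage: events_for <Kind> <name> [namespace]
# The namespace defaults to "default", where Crossplane emits events
# for cluster scoped resources.
events_for() {
  kubectl get events -n "${3:-default}" \
    --field-selector "involvedObject.kind=$1,involvedObject.name=$2" \
    --sort-by=.lastTimestamp
}
```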

Crossplane Logs

The next place to look for more information or to investigate a failure is the logs of the Crossplane pod, which runs in the crossplane-system namespace. To get the current Crossplane logs, run the following:

kubectl -n crossplane-system logs -lapp=crossplane

Note that Crossplane emits few logs by default - events are typically the best place to look for information about what Crossplane is doing. You may need to restart Crossplane with the --debug flag if you can’t find what you’re looking for.
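
When a label selector is used, kubectl logs shows only the last 10 lines per pod unless --tail is set, which can hide the message you are looking for. The sketch below sets --tail explicitly and adds a way to check whether --debug is already among the container arguments. The helper names and the default values (200 lines, 1h) are assumptions for illustration.

```shell
# Show recent logs from all Crossplane pods and containers.
# Usage: crossplane_logs [lines] [since]   (defaults are assumptions)
crossplane_logs() {
  kubectl -n crossplane-system logs -l app=crossplane \
    --all-containers --tail="${1:-200}" --since="${2:-1h}"
}

# Print the arguments of the Crossplane containers, to check for --debug.
crossplane_args() {
  kubectl -n crossplane-system get pods -l app=crossplane \
    -o jsonpath='{.items[*].spec.containers[*].args}'
}
```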

Provider Logs

Remember that much of Crossplane’s functionality is provided by providers. You can use kubectl logs to view provider logs too. By convention, they also emit few logs by default.

kubectl -n crossplane-system logs <name-of-provider-pod>

All providers maintained by the Crossplane community mirror Crossplane’s support of the --debug flag. The easiest way to set flags on a provider is to create a DeploymentRuntimeConfig and reference it from the Provider:

apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: debug-config
spec:
  deploymentTemplate:
    spec:
      selector: {}
      template:
        spec:
          containers:
          - name: package-runtime
            args:
            - --debug
---
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws
spec:
  package: xpkg.upbound.io/crossplane-contrib/provider-aws:v0.33.0
  runtimeConfigRef:
    apiVersion: pkg.crossplane.io/v1beta1
    kind: DeploymentRuntimeConfig
    name: debug-config

Note that you can add a reference to a DeploymentRuntimeConfig to an already installed Provider; Crossplane then updates the Provider’s Deployment accordingly.
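
To confirm that the flag reached the running provider, you can inspect the arguments of the package-runtime container and then read the provider logs. This is a sketch under one notable assumption: that provider pods carry the label pkg.crossplane.io/provider=<provider-name>. If your pods are labeled differently, use the pod name from kubectl get pods instead. The helper names are hypothetical.

```shell
# Assumption: provider pods are labeled pkg.crossplane.io/provider=<provider-name>.

# Print the arguments of the package-runtime container of a provider.
# Usage: provider_args <provider-name>
provider_args() {
  kubectl -n crossplane-system get pods -l "pkg.crossplane.io/provider=$1" \
    -o jsonpath='{.items[*].spec.containers[?(@.name=="package-runtime")].args}'
}

# Show recent logs of a provider without looking up the pod name first.
# Usage: provider_logs <provider-name> [lines]
provider_logs() {
  kubectl -n crossplane-system logs -l "pkg.crossplane.io/provider=$1" \
    --tail="${2:-200}"
}
```

For the Provider above, provider_args provider-aws should list --debug once the new Deployment has rolled out.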

Compositions and composite resource definition

General troubleshooting steps

Crossplane and its providers log most error messages to resources’ event fields. Whenever your Composite Resources aren’t getting provisioned, follow these steps:

  1. Get the events for the root resource using kubectl describe or kubectl get events.

  2. If there are errors in the events, address them.

  3. If there are no errors, follow the references to its subresources.

    kubectl get <KIND> <NAME> -o=jsonpath='{.spec.resourceRef}{" "}{.spec.resourceRefs}' | jq

  4. Repeat this process for each resource returned.
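
The steps above can be sketched as a small loop over the references of a composite resource. This is an illustration, not an official tool: the helper name describe_composed is hypothetical, it looks resources up by kind only (which can be ambiguous if two API groups define the same kind), and it assumes the composed resources are cluster scoped.

```shell
# Describe every resource referenced by a composite resource and flag
# references that have no name.
# Usage: describe_composed <xr-kind> <xr-name>
describe_composed() {
  kubectl get "$1" "$2" \
    -o jsonpath='{range .spec.resourceRefs[*]}{.kind}{" "}{.name}{"\n"}{end}' |
  while read -r kind name; do
    # Skip empty lines in the jsonpath output.
    [ -n "$kind" ] || continue
    if [ -z "$name" ]; then
      echo "WARNING: $kind has no name, so it was probably never created"
      continue
    fi
    echo "--- $kind/$name"
    kubectl describe "$kind" "$name"
  done
}
```

A reference without a name is worth a closer look, because it usually means the composed resource could not be created.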

Note
The rest of this section shows you how to debug issues related to compositions without using external tooling. If you are using ArgoCD or FluxCD with a UI, you can visualize object relationships in the UI. You can also use the kube-lineage plugin to visualize object relationships in your terminal.

Examples

Composition

You deployed an example application using a claim. Kind = ExampleApp. Name = example-application.

The example application never reaches the available state, as shown below.

  1. View the claim.

    kubectl describe exampleapp example-application

    Status:
    Conditions:
        Last Transition Time:  2022-03-01T22:57:38Z
        Reason:                Composite resource claim is waiting for composite resource to become Ready
        Status:                False
        Type:                  Ready
    Events:                    <none>
    
  2. If the claim doesn’t have errors, inspect the .spec.resourceRef field of the claim.

    kubectl get exampleapp example-application -o=jsonpath='{.spec.resourceRef}{" "}{.spec.resourceRefs}' | jq

    {
      "apiVersion": "awsblueprints.io/v1alpha1",
      "kind": "XExampleApp",
      "name": "example-application-xqlsz"
    }
    
  3. In the preceding output, you see the cluster scoped resource for this claim. Kind = XExampleApp. Name = example-application-xqlsz.

  4. View the cluster scoped resource’s events.

    kubectl describe xexampleapp example-application-xqlsz

    Events:
    Type     Reason                   Age               From                                                             Message
    ----     ------                   ----              ----                                                             -------
    Normal   PublishConnectionSecret  9s (x2 over 10s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  Successfully published connection details
    Normal   SelectComposition        6s (x6 over 11s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  Successfully selected composition
    Warning  ComposeResources         6s (x6 over 10s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  can't render composed resource from resource template at index 3: can't use dry-run create to name composed resource: an empty namespace may not be set during creation
    Normal   ComposeResources         6s (x6 over 10s)  defined/compositeresourcedefinition.apiextensions.crossplane.io  Successfully composed resources
    
  5. You see errors in the events. Crossplane is complaining that a namespace isn’t specified for a resource in the composition. For this particular kind of error, you can get the subresources and check which one isn’t created.

    kubectl get xexampleapp example-application-xqlsz -o=jsonpath='{.spec.resourceRef}{" "}{.spec.resourceRefs}' | jq

    [
        {
            "apiVersion": "awsblueprints.io/v1alpha1",
            "kind": "XDynamoDBTable",
            "name": "example-application-xqlsz-6j9nm"
        },
        {
            "apiVersion": "awsblueprints.io/v1alpha1",
            "kind": "XIAMPolicy",
            "name": "example-application-xqlsz-lp9wt"
        },
        {
            "apiVersion": "awsblueprints.io/v1alpha1",
            "kind": "XIAMPolicy",
            "name": "example-application-xqlsz-btwkn"
        },
        {
            "apiVersion": "awsblueprints.io/v1alpha1",
            "kind": "IRSA"
        }
    ]
    
  6. Notice that the last element in the array doesn’t have a name. When a resource in a composition fails validation, the resource object isn’t created and doesn’t have a name. For this particular issue, you must specify the namespace for the IRSA resource.

Composite resource definition

Debugging a Composite Resource Definition (XRD) is similar to debugging a Composition.

  1. Get the XRD

    kubectl get xrd testing.awsblueprints.io

    NAME                       ESTABLISHED   OFFERED   AGE
    testing.awsblueprints.io                           66s
    
  2. Notice that its status is not established. Describe the XRD to get its events.

    kubectl describe xrd testing.awsblueprints.io

    Events:
    Type     Reason              Age                    From                                                             Message
    ----     ------              ----                   ----                                                             -------
    Normal   ApplyClusterRoles   3m19s (x3 over 3m19s)  rbac/compositeresourcedefinition.apiextensions.crossplane.io     Applied RBAC ClusterRoles
    Normal   RenderCRD           18s (x9 over 3m19s)    defined/compositeresourcedefinition.apiextensions.crossplane.io  Rendered composite resource CustomResourceDefinition
    Warning  EstablishComposite  18s (x9 over 3m19s)    defined/compositeresourcedefinition.apiextensions.crossplane.io  can't apply rendered composite resource CustomResourceDefinition: can't create object: CustomResourceDefinition.apiextensions.k8s.io "testing.awsblueprints.io" is invalid: metadata.name: Invalid value: "testing.awsblueprints.io": must be spec.names.plural+"."+spec.group
    
  3. You see in the events that Crossplane can’t generate corresponding CRDs for this XRD. In this case, ensure that the XRD’s metadata.name is spec.names.plural+"."+spec.group.
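
The naming rule from the error message can be checked directly against the XRD object. The following is a sketch; the helper name check_xrd_name and its messages are hypothetical, and it only compares the fields named in the event above.

```shell
# Compare an XRD's metadata.name with spec.names.plural + "." + spec.group.
# Usage: check_xrd_name <xrd-name>
# Returns 0 on match, 1 on mismatch, 2 if the XRD cannot be read.
check_xrd_name() {
  xrd_actual="$(kubectl get xrd "$1" -o jsonpath='{.metadata.name}')" || return 2
  xrd_expected="$(kubectl get xrd "$1" -o jsonpath='{.spec.names.plural}.{.spec.group}')" || return 2
  if [ "$xrd_actual" = "$xrd_expected" ]; then
    echo "ok: $xrd_actual"
  else
    echo "mismatch: metadata.name is $xrd_actual, expected $xrd_expected"
    return 1
  fi
}
```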

Providers

You can install providers in two ways: with a configuration.pkg.crossplane.io object or with a provider.pkg.crossplane.io object. You can use either one to install providers, with no functional differences to the providers themselves. If you define a configuration.pkg.crossplane.io object, Crossplane creates a provider.pkg.crossplane.io object and manages it. Refer to the Packages documentation for more information about Crossplane Packages.

If you are experiencing provider issues, the steps below are a good starting point.

  1. Check the status of the provider object.

    kubectl describe provider.pkg.crossplane.io provider-aws

    Status:
        Conditions:
            Last Transition Time:  2022-08-04T16:19:44Z
            Reason:                HealthyPackageRevision
            Status:                True
            Type:                  Healthy
            Last Transition Time:  2022-08-04T16:14:29Z
            Reason:                ActivePackageRevision
            Status:                True
            Type:                  Installed
        Current Identifier:      crossplane/provider-aws:v0.29.0
        Current Revision:        provider-aws-a2e16ca2fc1a
    Events:
        Type    Reason                  Age                      From                                 Message
        ----    ------                  ----                     ----                                 -------
        Normal  InstallPackageRevision  9m49s (x237 over 4d17h)  packages/provider.pkg.crossplane.io  Successfully installed package revision
    

    In the output above you see that this provider is healthy. To get more information about this provider, you can dig deeper. The Current Revision field tells you which object to look at next.

  2. When you create a provider object, Crossplane creates a ProviderRevision object based on the contents of the OCI image. In this example, you’re specifying the OCI image to be crossplane/provider-aws:v0.29.0. This image contains a YAML file which defines Kubernetes objects such as Deployment, ServiceAccount, and CRDs. The ProviderRevision object creates resources necessary for a provider to function based on the contents of the YAML file. To inspect what’s deployed as part of the provider package, you inspect the ProviderRevision object. The Current Revision field above indicates which ProviderRevision object this provider uses.

    kubectl get providerrevision provider-aws-a2e16ca2fc1a

    NAME                        HEALTHY   REVISION   IMAGE                             STATE    DEP-FOUND   DEP-INSTALLED   AGE
    provider-aws-a2e16ca2fc1a   True      1          crossplane/provider-aws:v0.29.0   Active                               19d
    

    When you describe the object, you find all CRDs managed by this object.

    kubectl describe providerrevision provider-aws-a2e16ca2fc1a

    Status:
        Controller Ref:
            Name:  provider-aws-a2e16ca2fc1a
        Object Refs:
            API Version:  apiextensions.k8s.io/v1
            Kind:         CustomResourceDefinition
            Name:         natgateways.ec2.aws.crossplane.io
            UID:          5c36d1bc-61b8-44f8-bca0-47e368af87a9
            ....
    Events:
        Type    Reason             Age                    From                                         Message
        ----    ------             ----                   ----                                         -------
        Normal  SyncPackage        22m (x369 over 4d18h)  packages/providerrevision.pkg.crossplane.io  Successfully configured package revision
        Normal  BindClusterRole    15m (x348 over 4d18h)  rbac/providerrevision.pkg.crossplane.io      Bound system ClusterRole to provider ServiceAccount
        Normal  ApplyClusterRoles  15m (x364 over 4d18h)  rbac/providerrevision.pkg.crossplane.io      Applied RBAC ClusterRoles
    

    The event field also indicates any issues that may have occurred during this process.

  3. If you don’t see any errors in the event field above, check whether Crossplane provisioned the deployments and what their status is.

    kubectl get deployment -n crossplane-system

    NAME                        READY   UP-TO-DATE   AVAILABLE   AGE
    crossplane                  1/1     1            1           105d
    crossplane-rbac-manager     1/1     1            1           105d
    provider-aws-a2e16ca2fc1a   1/1     1            1           19d

    kubectl get pods -n crossplane-system

    NAME                                         READY   STATUS    RESTARTS   AGE
    crossplane-54db688c8d-qng6b                  2/2     Running   0          4d19h
    crossplane-rbac-manager-5776c9fbf4-wn5rj     1/1     Running   0          4d19h
    provider-aws-a2e16ca2fc1a-776769ccbd-4dqml   1/1     Running   0          4d23h
    

    If any pods are failing, check their logs and remedy the problem.

Pausing Crossplane

Sometimes, for example when you encounter a bug, it can be useful to pause Crossplane if you want to stop it from actively attempting to manage your resources. To pause Crossplane without deleting all of its resources, run the following command to scale down its deployment:

kubectl -n crossplane-system scale --replicas=0 deployment/crossplane

Once you have been able to rectify the problem or smooth things out, you can unpause Crossplane by scaling its deployment back up:

kubectl -n crossplane-system scale --replicas=1 deployment/crossplane
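
After scaling back up, it can be useful to wait until the Deployment is actually available before you continue. The sketch below combines the scale command above with kubectl rollout status; the helper names and the 120s default timeout are assumptions.

```shell
# Print the desired replica count of the Crossplane Deployment
# (0 means Crossplane is paused).
crossplane_replicas() {
  kubectl -n crossplane-system get deployment crossplane \
    -o jsonpath='{.spec.replicas}'
}

# Scale Crossplane back up and wait until the rollout completes.
# Usage: unpause_crossplane [timeout]   (timeout default is an assumption)
unpause_crossplane() {
  kubectl -n crossplane-system scale --replicas=1 deployment/crossplane &&
    kubectl -n crossplane-system rollout status deployment/crossplane \
      --timeout="${1:-120s}"
}
```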

Pausing Providers

Providers can also be paused when troubleshooting an issue or orchestrating a complex migration of resources. Creating and referencing a DeploymentRuntimeConfig is the easiest way to scale down a provider, and the DeploymentRuntimeConfig can be modified or the reference can be removed to scale it back up:

apiVersion: pkg.crossplane.io/v1beta1
kind: DeploymentRuntimeConfig
metadata:
  name: scale-config
spec:
  deploymentTemplate:
    spec:
      selector: {}
      replicas: 0
      template: {}
---
apiVersion: pkg.crossplane.io/v1
kind: Provider
metadata:
  name: provider-aws
spec:
  package: xpkg.upbound.io/crossplane-contrib/provider-aws:v0.33.0
  runtimeConfigRef:
    apiVersion: pkg.crossplane.io/v1beta1
    kind: DeploymentRuntimeConfig
    name: scale-config

Note that you can add a reference to a DeploymentRuntimeConfig to an already installed Provider; Crossplane then updates the Provider’s Deployment accordingly.

Deleting When a Resource Hangs

The resources that Crossplane manages will automatically be cleaned up so as not to leave anything running behind. This is accomplished by using finalizers, but in certain scenarios the finalizer can prevent the Kubernetes object from getting deleted.

To deal with this, patch the object to remove its finalizer, which then allows Kubernetes to delete it completely. Note that this won’t necessarily delete the external resource that Crossplane was managing, so check your cloud provider’s console for any lingering resources to clean up.

In general, a finalizer can be removed from an object with this command:

kubectl patch <resource-type> <resource-name> -p '{"metadata":{"finalizers": []}}' --type=merge

For example, for a CloudSQLInstance managed resource (database.gcp.crossplane.io) named my-db, you can remove its finalizer with:

kubectl patch cloudsqlinstance my-db -p '{"metadata":{"finalizers": []}}' --type=merge
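
Before removing finalizers, it is worth confirming that the object is really stuck in deletion and seeing which finalizers are holding it. The following sketch prints both fields; the helper name show_finalizers is hypothetical.

```shell
# Print the deletion timestamp and the finalizers of an object.
# An empty timestamp means nobody has requested deletion yet.
# Usage: show_finalizers <resource-type> <resource-name>
show_finalizers() {
  kubectl get "$1" "$2" \
    -o jsonpath='{.metadata.deletionTimestamp}{" "}{.metadata.finalizers}{"\n"}'
}
```

If the output starts with a timestamp and still lists finalizers, the patch command above applies.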

Tips, Tricks, and Troubleshooting

In this section we’ll cover some common tips, tricks, and troubleshooting steps for working with Composite Resources. If you’re trying to track down why your Composite Resources aren’t working, the Troubleshooting page also has some useful information.

Troubleshooting Claims and XRs

Crossplane relies heavily on status conditions and events for troubleshooting. You can see both using kubectl describe - for example:

# Describe the PostgreSQLInstance claim named my-db
kubectl describe postgresqlinstance.database.example.org my-db

Per Kubernetes convention, Crossplane keeps errors close to the place they happen. This means that if your claim isn’t becoming ready due to an issue with your Composition or with a composed resource you’ll need to “follow the references” to find out why. Your claim will only tell you that the XR isn’t yet ready.

To follow the references:

  1. Find your XR by running kubectl describe on your claim and looking for its “Resource Ref” (aka spec.resourceRef).
  2. Run kubectl describe on your XR. This is where you’ll find out about issues with the Composition you’re using, if any.
  3. If there are no issues but your XR doesn’t seem to be becoming ready, take a look for the “Resource Refs” (or spec.resourceRefs) to find your composed resources.
  4. Run kubectl describe on each referenced composed resource to determine whether it’s ready and what issues, if any, it’s encountering.
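
The first step of following the references, going from a claim to its XR, can be sketched with a jsonpath query on spec.resourceRef. The helper name xr_for_claim is hypothetical, and the namespace defaults to default as an assumption.

```shell
# Print "<kind> <name>" of the XR that backs a claim.
# Usage: xr_for_claim <claim-kind> <claim-name> [namespace]
xr_for_claim() {
  kubectl get "$1" "$2" -n "${3:-default}" \
    -o jsonpath='{.spec.resourceRef.kind}{" "}{.spec.resourceRef.name}{"\n"}'
}

# Example (left unquoted on purpose so the output splits into kind and name):
#   kubectl describe $(xr_for_claim postgresqlinstance.database.example.org my-db)
```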