Using Velero to Protect Cluster API

By Contributor | January 11, 2021 | January 23, 2021

Cluster API (also known as CAPI) is, as you may already know, an effort within the upstream Kubernetes community to apply Kubernetes-style APIs to cluster lifecycle management; in short, to use Kubernetes to manage the lifecycle of Kubernetes clusters. If you’re unfamiliar with CAPI, I’d encourage you to check out my introduction to Cluster API before proceeding. In this post, I’m going to show you how to use Velero (formerly Heptio Ark) to back up and restore Cluster API objects so as to protect your organization against an unrecoverable issue on your Cluster API management cluster.

To be honest, this process is so straightforward it almost doesn’t need to be explained. In general, the process for backing up the CAPI management cluster looks like this:

1. Pause CAPI reconciliation on the management cluster.
2. Back up the CAPI resources.
3. Resume CAPI reconciliation.
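
The ordering matters more than the individual commands: reconciliation should be resumed even if the backup fails partway through. Here is a minimal Python sketch of that ordering. The function name and the three callables are hypothetical placeholders for whatever you use to pause, back up, and resume; nothing here comes from Velero or Cluster API itself.

```python
from typing import Callable


def backup_with_pause(
    pause: Callable[[], None],
    backup: Callable[[], None],
    resume: Callable[[], None],
) -> None:
    """Run pause, then backup, then resume, in that order.

    The resume step sits in a finally block, so it still runs when the
    pause or backup step raises. Resuming a cluster that was never paused
    is harmless, which is why pause() is inside the try as well.
    """
    try:
        pause()
        backup()
    finally:
        resume()
```

Wiring this up is a matter of passing in three functions that shell out to kubectl and velero (or call their APIs), so the workflow logic stays separate from the tooling.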

In the event of catastrophic failure, the recovery process looks like this:

1. Restore from backup onto another management cluster.
2. Resume CAPI reconciliation.

Let’s look at these steps in a bit more detail.

Pausing and Resuming Reconciliation

The process for pausing and resuming reconciliation of CAPI resources is outlined in this separate blog post. To summarize that post here for convenience, the spec of the Cluster object includes a paused field that causes the Cluster API controllers to stop reconciliation when the field is set to true (and resume reconciliation when the field is false or absent). Setting this field allows you, the cluster operator, to pause or resume reconciliation.
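
As a rough illustration, the following Python sketch builds a kubectl patch command that sets the paused field on a single Cluster object. The function name, cluster name, and namespace are hypothetical; the sketch assumes kubectl is on your PATH and pointed at the management cluster, and it uses a JSON merge patch, which is only one of several ways to set the field.

```python
import json
import subprocess
from typing import List


def paused_patch_command(cluster_name: str, namespace: str, paused: bool) -> List[str]:
    """Build a kubectl command that sets spec.paused on one Cluster object."""
    patch = json.dumps({"spec": {"paused": paused}})
    return [
        "kubectl", "patch",
        # Fully qualified resource name, to avoid clashing with other
        # custom resources that are also called "cluster".
        "clusters.cluster.x-k8s.io", cluster_name,
        "--namespace", namespace,
        "--type", "merge",
        "--patch", patch,
    ]


def set_paused(cluster_name: str, namespace: str, paused: bool) -> None:
    """Run the patch; raises CalledProcessError if kubectl exits non-zero."""
    subprocess.run(paused_patch_command(cluster_name, namespace, paused), check=True)
```

Calling set_paused("workload-1", "default", True) would pause a (hypothetical) cluster named workload-1, and the same call with False would resume it. If the management cluster holds several Cluster objects, each one would need to be patched.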

Backing up CAPI Resources

Once you’ve paused reconciliation for Cluster API, you can then run a backup using Velero. Based on my testing, I didn’t see anything unusual or odd about running a backup; generally speaking, it looks to be as simple as velero backup create (with appropriate flags). Given the large number of custom resources used by Cluster API (Clusters, Machines, MachineDeployments, KubeadmConfigs, etc.), it may be challenging to include only Cluster API resources using Velero’s --include-resources flag. It’s probably easier to either a) not use any of Velero’s filtering functionality and catch everything, or b) make sure you are using namespaces or labels comprehensively for CAPI objects and then use Velero’s --include-namespaces and/or --selector filtering options to select what is included in the backup. Refer to Velero’s resource filtering documentation for more details.
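
To make option (b) a little more concrete, here is a minimal Python sketch that assembles a velero backup create command line. The function name, backup name, namespace names, and label are all hypothetical; --include-namespaces, --selector, and --wait are standard Velero flags, but check the flags against the Velero version you are running.

```python
from typing import List, Optional, Sequence


def backup_command(
    backup_name: str,
    namespaces: Sequence[str] = (),
    selector: Optional[str] = None,
) -> List[str]:
    """Build a 'velero backup create' command with optional filters.

    With no namespaces and no selector, nothing is filtered and Velero
    backs up everything it can see (option a in the text above).
    """
    cmd = ["velero", "backup", "create", backup_name]
    if namespaces:
        # Velero takes a comma-separated list of namespaces.
        cmd += ["--include-namespaces", ",".join(namespaces)]
    if selector:
        # Standard Kubernetes label selector syntax, e.g. "key=value".
        cmd += ["--selector", selector]
    # Block until the backup finishes so the caller knows when it is
    # safe to resume reconciliation.
    cmd.append("--wait")
    return cmd
```

For example, backup_command("capi-backup", namespaces=["capi-clusters"]) yields a command restricted to one (hypothetical) namespace, and the resulting list can be handed to subprocess.run.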

Restoring from Backup

As with creating the backup using Velero, restoring from the Velero backup follows the standard Velero procedures (i.e., run velero restore create with appropriate flags/options). Naturally, the cluster to which you are restoring should be a properly configured Cluster API management cluster with the necessary Cluster API components already installed.
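
The restore side can be sketched the same way. This minimal Python example builds a velero restore create command from a named backup; the function name and the backup and restore names are hypothetical, and the sketch assumes Velero on the target cluster is already configured to read from the storage location that holds the backup.

```python
from typing import List, Optional


def restore_command(backup_name: str, restore_name: Optional[str] = None) -> List[str]:
    """Build a 'velero restore create' command for an existing backup."""
    cmd = ["velero", "restore", "create"]
    if restore_name:
        # The restore name is optional; Velero generates one if omitted.
        cmd.append(restore_name)
    # --from-backup names the backup to restore; --wait blocks until done.
    cmd += ["--from-backup", backup_name, "--wait"]
    return cmd
```

As with the backup, the returned list can be passed to subprocess.run with check=True so that a failed restore raises an error instead of passing silently.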

Since this article is focused on the “Oh no my management cluster is dead” scenario, all of the information on disaster recovery in the Velero docs applies.

After the restore is complete, you’ll then want to resume reconciliation on the target/destination cluster, as outlined above.

Backup and Restore Versus Moving

The clusterctl utility used by CAPI for initializing management clusters (among other things) also has a move subcommand that can be used to move CAPI resources from one cluster to another cluster. Some readers may be wondering why we should bother with Velero, and whether they could use clusterctl move instead.

clusterctl move is a viable option for moving CAPI objects between two clusters as long as both the source and target clusters are up and running. Using Velero, on the other hand, only requires that the source cluster is up and running when a backup needs to be taken; users can then restore this backup to another cluster even if the source cluster has completely failed. I’m also of the opinion that Velero will provide more fine-grained control over what can be backed up and restored, although I have yet to test that directly.
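
For comparison, here is a minimal Python sketch of what a clusterctl move invocation looks like. The function name, kubeconfig path, and namespace are hypothetical. Note that the command needs a kubeconfig for the target cluster while clusterctl itself talks to the source cluster, which is why both clusters have to be reachable at the same time.

```python
from typing import List, Optional


def move_command(target_kubeconfig: str, namespace: Optional[str] = None) -> List[str]:
    """Build a 'clusterctl move' command.

    clusterctl reads the source cluster from the current kubeconfig
    context and writes to the cluster named by --to-kubeconfig.
    """
    cmd = ["clusterctl", "move", "--to-kubeconfig", target_kubeconfig]
    if namespace:
        # Limit the move to CAPI objects in a single namespace.
        cmd += ["--namespace", namespace]
    return cmd
```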

Additional Resources

Readers may find the following resources useful as well:

Disaster recovery use case with Velero

Cluster migration use case with Velero

I hope that readers find this article helpful. If there’s anything I’ve discussed here that you’d like to see examined/explained in greater detail, feel free to let me know. You can find me on the Kubernetes Slack instance, or find me on Twitter. I’d love to hear from you!
