A Reconciler is a logical pattern which takes the state of a set of objects in the API server, and propagates changes to other components based on that state. This logic is asynchronous, and works by having a function which is called when a change occurs in the set of resources the reconciler is watching, which provides the current state of the object, and the type of change that occurred (Create, Update, Delete). For more information, see the Asynchronous Business Logic section in Platform Concepts.
Both reconcilers and watchers are used for the reconciliation process.
Whether you use one or the other comes down to preference and use case.
Both reconcilers and watchers are powered by the same informer design within an InformerController, with just slightly different handling logic.
They both have an Opinionated variant that can wrap the interface as well.
The major difference between a Reconciler and a Watcher is that a Reconciler has a single function, Reconcile, which is called for every event for a kind, while a Watcher has a function for each event type (Add, Update, Delete). There are also more minor differences in how these events are handled:
- A Reconciler can return a response with an explicit "retry after this time period" message, while a Watcher will only return success or failure (nil or error).
- A Watcher will give you the previous state of the resource on an update event, while a Reconcile event will not.
- A Reconciler can pass state between retries in-memory if storing state in the API server is failing. However, this state only persists until the operator is restarted, and should not be relied upon except in situations where the API server cannot be reached.
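To make the difference in shape concrete, here is a minimal sketch using only the standard library. Every type and function in it (Action, Object, Request, Result, reconcile, onUpdate, errBusy, busyRequeueDelay) is a hypothetical stand-in written for illustration, not the SDK's own definition.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// All types below are hypothetical stand-ins, not the SDK's definitions.

// Action identifies the kind of event a reconciler-style handler receives.
type Action int

const (
	ActionCreated Action = iota
	ActionUpdated
	ActionDeleted
)

// Object is a heavily simplified resource.
type Object struct {
	Name       string
	Generation int64
}

// Request carries the current state of the object and the action that occurred.
type Request struct {
	Action Action
	Object Object
}

// Result lets the handler ask for a retry after a specific delay (zero means no requeue).
type Result struct {
	RequeueAfter time.Duration
}

// errBusy marks a failure that is worth retrying after a fixed delay.
var errBusy = errors.New("downstream busy")

const busyRequeueDelay = 30 * time.Second

// reconcile is the reconciler shape: one function handles every event type.
func reconcile(req Request, apply func(Object) error) (Result, error) {
	if req.Action == ActionDeleted {
		return Result{}, nil // nothing to apply for a deleted object in this sketch
	}
	if err := apply(req.Object); err != nil {
		if errors.Is(err, errBusy) {
			// Explicit "retry after this time period" response.
			return Result{RequeueAfter: busyRequeueDelay}, nil
		}
		return Result{}, err
	}
	return Result{}, nil
}

// onUpdate is the watcher shape for one event type: it sees the previous state,
// but can only report success or failure.
func onUpdate(oldObj, newObj Object, apply func(Object) error) error {
	if oldObj.Generation == newObj.Generation {
		return nil // spec unchanged between the two states
	}
	return apply(newObj)
}

func main() {
	busy := func(Object) error { return errBusy }

	res, err := reconcile(Request{Action: ActionUpdated, Object: Object{Name: "a", Generation: 2}}, busy)
	fmt.Println(res.RequeueAfter, err)

	err = onUpdate(Object{Name: "a", Generation: 1}, Object{Name: "a", Generation: 2}, busy)
	fmt.Println(err)
}
```

The reconciler-style function can turn a transient failure into a requeue request without reporting an error, while the watcher-style function can only hand the error back and leave the timing to whatever retry policy is configured.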
When writing a reconciler, it's important to take a few things into consideration:
- If you make an update to the object you're doing the reconcile (or watch) event for, this will trigger another reconcile (or watch) event. Generally, favor only updating subresources (specifically status) and some metadata in your reconcile (or watch) events, as a status update should not cause the metadata.generation value to increase (only metadata.resourceVersion), which allows you to filter these events out. Using the operator.OpinionatedWatcher will filter these events for you, but you will need to track this yourself in a Reconciler; if you prefer not to use OpinionatedWatcher or want to do your own event filtering, keep in mind how updates within your reconcile loop will be received.
- The reconciler takes action on every consumed event. Finding ways to return early from a reconcile or watcher event will help your overall program logic.
- All objects for the kind(s) you are watching are cached in memory by default. This can be customized by using a different informer implementation, such as operator.CustomCacheInformer. Custom informers can be used in simple.App with AppConfig.InformerConfig.InformerSupplier, or by using your own custom app.App implementation.
- Don't rely on retries to track operator state; use the status subresource to track operator success/failure, so that your operator can work out state from a fresh start (a restart will remove all pending retries, which are stored purely in-memory). This also allows a user to track operator status by viewing the status subresource.
- If your reconcile process makes requests for other resources, consider caching, as high-traffic objects may cause your application to make these requests extremely frequently.
- If your operator has a watcher or reconciler that updates the resource in a deterministic way (such as adding a label based on the spec), consider adding mutation for the kind on your App instead, as it makes that process synchronous and never leaves the object in an intermediate state (and reduces calls to the API server from your operator). Mutation can be added for a kind in simple.App with AppConfig.ManagedKinds[].Mutator, or by implementing the behavior in Mutate if you're using a custom app.App implementation (don't forget to add mutation in your manifest as well).
- When you have multiple versions of a kind, your reconciliation should only deal with one of them (typically the latest), as events are always delivered in the version requested by the operator's watch, regardless of the version the object was written as (so a user creating a v1 version of a resource will still produce a v2 version of that resource in a watch request for the v2 of the kind).
- CRDs have a built-in conversion mechanism that is roughly equivalent to running json.Marshal on the stored version and then json.Unmarshal into the requested version. If this is not good enough for your purposes, add conversion to your app (for simple.App, use AppConfig.Converters, or implement Convert if you're implementing app.App yourself). Don't forget to add conversion in your manifest as well.
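The last point is easier to judge with a concrete case. The sketch below approximates the marshal-then-unmarshal behavior described above and shows what happens when a field is renamed between versions. SpecV1, SpecV2, the extraInfo field, roundTrip, and convert are all hypothetical names invented for this illustration; this is not how the SDK or the API server implements conversion.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// SpecV1 and SpecV2 are hypothetical versions of the same kind's spec.
type SpecV1 struct {
	SomeInfo  string `json:"someInfo"`
	OtherInfo string `json:"otherInfo"`
}

// SpecV2 renames otherInfo to extraInfo, so a plain JSON round trip cannot carry it over.
type SpecV2 struct {
	SomeInfo  string `json:"someInfo"`
	ExtraInfo string `json:"extraInfo"`
}

// roundTrip approximates the default behavior: marshal the stored version,
// then unmarshal the bytes into the requested version.
func roundTrip(in SpecV1) (SpecV2, error) {
	var out SpecV2
	raw, err := json.Marshal(in)
	if err != nil {
		return out, err
	}
	err = json.Unmarshal(raw, &out)
	return out, err
}

// convert is what explicit conversion logic might do instead: map the renamed field by hand.
func convert(in SpecV1) SpecV2 {
	return SpecV2{SomeInfo: in.SomeInfo, ExtraInfo: in.OtherInfo}
}

func main() {
	v1 := SpecV1{SomeInfo: "a", OtherInfo: "b"}

	viaJSON, err := roundTrip(v1)
	if err != nil {
		panic(err)
	}
	fmt.Printf("round trip: %+v\n", viaJSON)
	fmt.Printf("explicit:   %+v\n", convert(v1))
}
```

Fields that keep the same JSON name survive the round trip, while the renamed field comes back empty. A schema change of that sort is the kind of case where explicit conversion is worth adding.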
Let's consider an example reconciler for a kind defined by the CUE:
    {
        kind: "MyKind"
        current: "v1"
        versions: {
            "v1": {
                schema: {
                    spec: {
                        someInfo: string
                        otherInfo: string
                    }
                    status: {
                        lastAppliedGeneration: int
                    }
                }
            }
        }
    }
We want to build a reconciler that will send someInfo and otherInfo to some other system, but we only need to do this if someInfo or otherInfo change.
Since there are ways a resource can change without the contents of spec being altered, we need a way to track whether the current spec has been applied.
To do this here, we track lastAppliedGeneration in the status. generation is a Kubernetes API object metadata property which increments when the spec changes, so we can use it to check whether a given request has a different version of the spec than the one we've already processed.
    func NewMyKindReconciler(infoClient InfoClient, store *resource.TypedStore[*v1.MyKind]) operator.Reconciler {
        // operator.TypedReconciler implements operator.Reconciler but calls ReconcileFunc with an operator.TypedReconcileRequest
        // rather than an operator.ReconcileRequest, avoiding the need to cast resource.Object into our Go type.
        // We could also have a struct which implements operator.Reconciler, but this is easier for simple things.
        return &operator.TypedReconciler[*v1.MyKind]{
            ReconcileFunc: func(ctx context.Context, req operator.TypedReconcileRequest[*v1.MyKind]) (operator.ReconcileResult, error) {
                logging.FromContext(ctx).Info("Reconcile request", "name", req.Object.GetName(), "action", operator.ResourceActionFromReconcile(req.Action), "generation", req.Object.GetGeneration())
                // If we're deleting the object, tell InfoClient
                if req.Action == operator.ReconcileActionDeleted {
                    err := infoClient.Delete(req.Object.GetNamespace(), req.Object.GetName())
                    return operator.ReconcileResult{}, err
                }
                // If the last applied generation matches the current generation of the resource, we can ignore this reconcile request
                if req.Object.GetGeneration() == req.Object.Status.LastAppliedGeneration {
                    return operator.ReconcileResult{}, nil
                }
                // Attempt to apply the state to the third-party service
                err := infoClient.ApplyInfo(req.Object.GetNamespace(), req.Object.GetName(), req.Object.Spec.SomeInfo, req.Object.Spec.OtherInfo)
                if err != nil {
                    // Check the error; if it's retryable, tell the controller to try again in a bit
                    if IsRetryable(err) {
                        return operator.ReconcileResult{
                            RequeueAfter: time.Minute,
                        }, nil
                    }
                    // Otherwise, return the error. The controller's RetryPolicy will dictate if it should be retried, and after how long
                    return operator.ReconcileResult{}, err
                }
                // Set status.lastAppliedGeneration
                req.Object.Status.LastAppliedGeneration = req.Object.GetGeneration()
                _, err = store.UpdateSubresource(ctx, req.Object.GetStaticMetadata().Identifier(), resource.SubresourceStatus, req.Object)
                if err != nil {
                    return operator.ReconcileResult{}, err
                }
                return operator.ReconcileResult{}, nil
            },
        }
    }
We could write a similar Watcher, with the caveat that this logic becomes split and duplicated among the watcher's methods for each action, and we can't control requeue behavior from the watcher response (we have to leave it up to the controller's RetryPolicy).
A watcher version of this would look like:
    func NewMyKindWatcher(infoClient InfoClient, store *resource.TypedStore[*v1.MyKind]) *simple.Watcher {
        // simple.Watcher implements the operator's watcher interface and calls the defined functions for each event.
        // We could also have a struct which implements that interface, but this is easier for simple things.
        return &simple.Watcher{
            AddFunc: func(ctx context.Context, obj resource.Object) error {
                logging.FromContext(ctx).Info("Add event", "name", obj.GetName(), "action", "add", "generation", obj.GetGeneration())
                // Cast the object
                mykind, ok := obj.(*v1.MyKind)
                if !ok {
                    return fmt.Errorf("unable to cast object into *v1.MyKind")
                }
                // We still need to check the lastAppliedGeneration, as an add event can be called on operator startup,
                // as the application doesn't have the state to know if the resource already existed.
                if mykind.GetGeneration() == mykind.Status.LastAppliedGeneration {
                    return nil
                }
                // Attempt to apply the state to the third-party service
                err := infoClient.ApplyInfo(mykind.GetNamespace(), mykind.GetName(), mykind.Spec.SomeInfo, mykind.Spec.OtherInfo)
                if err != nil {
                    // Return the error. The controller's RetryPolicy will dictate if it should be retried, and after how long
                    return err
                }
                // Set status.lastAppliedGeneration
                mykind.Status.LastAppliedGeneration = mykind.GetGeneration()
                _, err = store.UpdateSubresource(ctx, mykind.GetStaticMetadata().Identifier(), resource.SubresourceStatus, mykind)
                if err != nil {
                    return err
                }
                return nil
            },
            UpdateFunc: func(ctx context.Context, oldObj resource.Object, newObj resource.Object) error {
                logging.FromContext(ctx).Info("Update event", "name", newObj.GetName(), "action", "update", "generation", newObj.GetGeneration())
                // Cast the object
                mykind, ok := newObj.(*v1.MyKind)
                if !ok {
                    return fmt.Errorf("unable to cast object into *v1.MyKind")
                }
                // If the last applied generation matches the current generation of the resource, we can ignore this update
                if mykind.GetGeneration() == mykind.Status.LastAppliedGeneration {
                    return nil
                }
                // Attempt to apply the state to the third-party service
                err := infoClient.ApplyInfo(mykind.GetNamespace(), mykind.GetName(), mykind.Spec.SomeInfo, mykind.Spec.OtherInfo)
                if err != nil {
                    // Return the error. The controller's RetryPolicy will dictate if it should be retried, and after how long
                    return err
                }
                // Set status.lastAppliedGeneration
                mykind.Status.LastAppliedGeneration = mykind.GetGeneration()
                _, err = store.UpdateSubresource(ctx, mykind.GetStaticMetadata().Identifier(), resource.SubresourceStatus, mykind)
                if err != nil {
                    return err
                }
                return nil
            },
            DeleteFunc: func(ctx context.Context, obj resource.Object) error {
                logging.FromContext(ctx).Info("Delete event", "name", obj.GetName(), "action", "delete", "generation", obj.GetGeneration())
                // Tell InfoClient the object was deleted. A watcher can only return an error, not a ReconcileResult.
                return infoClient.Delete(obj.GetNamespace(), obj.GetName())
            },
        }
    }
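The duplication between the add and update handlers can be reduced by moving the shared steps into one helper that both handlers call. The sketch below shows the idea with standard-library Go only. MyKind, applyFunc, statusFunc, applyIfChanged, handlers, and newHandlers are hypothetical, simplified stand-ins for the generated type, the InfoClient call, the status update on the store, and the watcher's function fields.

```go
package main

import "fmt"

// MyKind is a hypothetical, simplified stand-in for the generated *v1.MyKind type.
type MyKind struct {
	Namespace, Name       string
	Generation            int64
	SomeInfo, OtherInfo   string
	LastAppliedGeneration int64
}

// applyFunc and statusFunc stand in for the InfoClient call and the status update on the store.
type applyFunc func(obj *MyKind) error
type statusFunc func(obj *MyKind) error

// applyIfChanged holds the logic shared by the add and update handlers:
// skip if this generation was already applied, otherwise apply and record it.
func applyIfChanged(obj *MyKind, apply applyFunc, saveStatus statusFunc) error {
	if obj.Generation == obj.LastAppliedGeneration {
		return nil
	}
	if err := apply(obj); err != nil {
		return err
	}
	obj.LastAppliedGeneration = obj.Generation
	return saveStatus(obj)
}

// handlers mirrors the per-event function fields of a watcher.
type handlers struct {
	Add    func(obj *MyKind) error
	Update func(oldObj, newObj *MyKind) error
}

// newHandlers wires both event handlers to the same helper.
func newHandlers(apply applyFunc, saveStatus statusFunc) handlers {
	return handlers{
		Add:    func(obj *MyKind) error { return applyIfChanged(obj, apply, saveStatus) },
		Update: func(_, newObj *MyKind) error { return applyIfChanged(newObj, apply, saveStatus) },
	}
}

func main() {
	applies := 0
	h := newHandlers(
		func(*MyKind) error { applies++; return nil },
		func(*MyKind) error { return nil },
	)

	obj := &MyKind{Namespace: "default", Name: "foo", Generation: 1}
	_ = h.Add(obj)         // applies, then records generation 1
	_ = h.Update(obj, obj) // skipped: generation 1 is already applied
	fmt.Println("apply calls:", applies)
}
```

With this layout each handler only does the work specific to its event (casting, logging) and a fix to the apply logic happens in one place.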