Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

AutoHEP Kube-Controller enable custom hostendpoint templates #9901

Open
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

MichalFupso
Copy link
Contributor

Description

Updates the AutoHEP Kube-Controller to allow creation of multiple HostEndpoints per Node. This can be controller with the new KubeControllersConfiguration Template

Related issues/PRs

Todos

  • Tests
  • Documentation
  • Release note

Release Note

Updates the AutoHEP Kube-Controller to allow creation of multiple custom HostEndpoints per Node 

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

@MichalFupso MichalFupso requested a review from a team as a code owner February 26, 2025 19:12
@marvin-tigera marvin-tigera added this to the Calico v3.30.0 milestone Feb 26, 2025
@marvin-tigera marvin-tigera added release-note-required Change has user-facing impact (no matter how small) docs-pr-required Change is not yet documented labels Feb 26, 2025
Copy link
Member

@caseydavenport caseydavenport left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking good! Most of my comments are pretty straightforward I hope!

@@ -110,6 +110,25 @@ type NodeControllerConfig struct {
type AutoHostEndpointConfig struct {
// AutoCreate enables automatic creation of host endpoints for every node. [Default: Disabled]
AutoCreate string `json:"autoCreate,omitempty" validate:"omitempty,oneof=Enabled Disabled"`

CreateDefaultHostEndpoint string `json:"createDefaultHostEndpoint,omitempty" validate:"omitempty,oneof=Enabled Disabled"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be a new enum type

type DefaultHostEndpointMode string

const (
  DeafultHostEndpointsEnabled DefaultHostEndpointMode = "Enabled"
  DeafultHostEndpointsDisabled DefaultHostEndpointMode = "Disabled"
)

Name string `json:"name,omitempty" validate:"omitempty,name"`

// InterfaceSelectorCIDR contains a list of CIRDs used for matching nodeIPs to the AutoHostEndpoint
InterfaceSelectorCIDR []string `json:"interfaceSelectorCIDR,omitempty" validate:"cidrs"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
InterfaceSelectorCIDR []string `json:"interfaceSelectorCIDR,omitempty" validate:"cidrs"`
InterfaceCIDRs []string `json:"interfaceCIDRs,omitempty" validate:"cidrs"`

idk, this feels more right to me.

// InterfaceSelectorCIDR contains a list of CIRDs used for matching nodeIPs to the AutoHostEndpoint
InterfaceSelectorCIDR []string `json:"interfaceSelectorCIDR,omitempty" validate:"cidrs"`

// Labels adds the specified labels to the generated AutoHostEndpoint
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should explain how conflicts with our own labels are handled.


type Template struct {
// Name is appended to the end of the generated AutoHostEndpoint name
Name string `json:"name,omitempty" validate:"omitempty,name"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Name string `json:"name,omitempty" validate:"omitempty,name"`
GenerateName string `json:"generateName,omitempty" validate:"omitempty,name"`

Similar to k8s concept, since this isn't actually a name it's used to generate a name.


if isAutoHostEndpoint(hostEndpoint) {
c.hostEndpointTracker.addHostEndpointForNode(hostEndpoint)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see anything here that triggers a sync for this HEP / node when it gets an update - do we need to?

e.g., if a users modifies one of the auto HEPs accidentally, presumably we want to quickly reconcile it back to the desired state?

}

// getAutoHostEndpointExpectedIPs returns all the known IPs on the node resource
func (c *autoHostEndpointController) getAutoHostEndpointExpectedIPs(node *libapi.Node) []string {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (c *autoHostEndpointController) getAutoHostEndpointExpectedIPs(node *libapi.Node) []string {
func (c *autoHostEndpointController) getExpectedIPs(node *libapi.Node) []string {

for k, v := range node.Labels {
hepLabels[k] = v
}
for k, v := range templateLabels {
if _, ok := hepLabels[k]; ok {
logrus.Warnf("overwriting duplicate label %s", k)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
logrus.Warnf("overwriting duplicate label %s", k)
f := logrus.Fields{"key": k, "nodeVal": hepLabels[k], "userVal": v}
logrus.WithFields(f).Warn("overwriting label from underlying Node resource")

We should almost never be putting variables into the log text - using Fields makes the logs more consistent and adds structure, which is valuable when ingesting logs into a log processor.

},
}

if interfaceName != "" {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this check - Spec.InterfaceName is a string field, and so if we don't set it to anything, it will default to "" anyway.


ctx, cancel := context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
latestHostEndpoint, err := c.client.HostEndpoints().Get(ctx, expected.Name, options.GetOptions{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to perform a Get() here? Don't we have a cache of host endpoint objects that we can use?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In ETCD mode the resource received via the syncer does not contain the resourceVersion that we need for the update. We first need to call get() which returns the resource with resourceVersion and then we can update it


ctx, cancel = context.WithTimeout(context.Background(), 10*time.Second)
defer cancel()
_, err = c.client.HostEndpoints().Update(ctx, expected, options.SetOptions{})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One thing we could do it proactively update our cache here - that would do two things:

  • Avoid tricky / awkward bugs where we have to wait for an Update from the API server for our in-memory representation to be correct. e.g., in case we want to do any further processing on the HEP after updating it.
  • Allow us to filter out updates that we already know about when triggering syncs based on HEP events. i.e., if we get an event but the new HEP is identical to the one in our cache, we don't need to trigger a sync.

}
}

if rc.Node.AutoHostEndpointConfig != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Was there a purpose for it not being considered the else statement of the if above?

@@ -3439,6 +3445,41 @@ func init() {
Entry("should not accept invalid assignIPs value for LoadBalancer config",
api.LoadBalancerControllerConfig{AssignIPs: "incorrect-value"}, false,
),
Entry("should accept template with incorrect name",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: "should accept" or should not accept?

_, outError = c.KubeControllersConfiguration().Create(ctx, kcc, options.SetOptions{})
Expect(outError).To(HaveOccurred())
Expect(outError.Error()).To(ContainSubstring("Template name must be unique"))
})
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Would it be worth creating and validating a valid struct, in addition to validating the invalid ones?

@@ -59,14 +60,64 @@ func (r kubeControllersConfiguration) fillDefaults(res *apiv3.KubeControllersCon
res.Spec.Controllers.LoadBalancer.AssignIPs = apiv3.AllServices
}
}

// Validate and default HostEndpoint Template
Copy link
Contributor

@lucastigera lucastigera Mar 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just for my own understanding, why weren't those validations implemented in /validator/v3/validator.go?

@stevegaossou
Copy link
Contributor

Preparing PR description...

@stevegaossou
Copy link
Contributor

Preparing review...

@stevegaossou
Copy link
Contributor

Preparing PR description...

@stevegaossou
Copy link
Contributor

Preparing review...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs-pr-required Change is not yet documented release-note-required Change has user-facing impact (no matter how small)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants