Support nested BGP peering with calico-nodes running in local kubevirt VM pods #9875

Open
wants to merge 17 commits into
base: master

Member

@song-jiang song-jiang commented Feb 20, 2025

Description

This PR adds support for allowing calico-node to peer with calico-node instances running inside KubeVirt VM pods locally, based on the labels of the VM pods.

API changes:

  • New field LocalWorkloadSelector on the BGPPeer resource.
  • New fields localWorkloadPeeringIPV4 and localWorkloadPeeringIPV6 on BGPConfigurations.

Felix changes:

  • It watches BGPPeer and calculates local workloads selected by the BGPPeer.
  • It populates endpoint status files with peering information.
  • It adds localWorkloadPeeringIP to the network interface of the workload selected by the BGPPeer.

Confd changes:

  • It watches endpoint status files updated by Felix.
  • It reconfigures bird.cfg/bird6.cfg based on the peering information read from endpoint status files.

libcalico-go changes:

  • Added status-file-writer and status-file-watcher.

Related issues/PRs

Todos

  • Tests
  • Documentation
  • Release note

Release Note

Support nested BGP peering with calico-nodes running in local kubevirt VM pods.

Reminder for the reviewer

Make sure that this PR has the correct labels and milestone set.

Every PR needs one docs-* label.

  • docs-pr-required: This change requires a change to the documentation that has not been completed yet.
  • docs-completed: This change has all necessary documentation completed.
  • docs-not-required: This change has no user-facing impact and requires no docs.

Every PR needs one release-note-* label.

  • release-note-required: This PR has user-facing changes. Most PRs should have this label.
  • release-note-not-required: This PR has no user-facing changes.

Other optional labels:

  • cherry-pick-candidate: This PR should be cherry-picked to an earlier release. For bug fixes only.
  • needs-operator-pr: This PR is related to install and requires a corresponding change to the operator.

@song-jiang song-jiang requested a review from a team as a code owner February 20, 2025 11:35
@marvin-tigera marvin-tigera added this to the Calico v3.30.0 milestone Feb 20, 2025
@marvin-tigera marvin-tigera added release-note-required Change has user-facing impact (no matter how small) docs-pr-required Change is not yet documented labels Feb 20, 2025
Contributor

@aaaaaaaalex aaaaaaaalex left a comment

Couldn't spot any glaring issues (though I understand you know of one!)

{{- end}}
# For peer {{.Key}}
{{- if eq $data.ip ($node_ip) }}
# Skipping ourselves ({{$node_ip}})
Member

Why would the node itself show up in the local WEP peers?

logCxt.Debug("Workload endpoint status file created")
epStatus, err := epstatus.GetWorkloadEndpointStatusFromFile(fileName)
if err != nil {
logCxt.WithError(err).Error("Failed to read endpoint status from file, it may just be created.")
Member

Feels like this might be spammy since we'll always race with Felix writing. Can you defer the error (i.e. only log an error if the file is still bad after >5s)?
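
One way the deferred logging could look is sketched below. It is only an illustration: `readFailureTracker`, its methods, and the file name are hypothetical and not part of this PR. The idea is to remember when a file first failed to parse and to escalate to an error only once the failure has outlasted a grace period.

```go
package main

import (
	"fmt"
	"time"
)

// readFailureTracker remembers when each status file first failed to parse.
// All names in this sketch are hypothetical.
type readFailureTracker struct {
	grace     time.Duration
	firstFail map[string]time.Time
}

func newReadFailureTracker(grace time.Duration) *readFailureTracker {
	return &readFailureTracker{grace: grace, firstFail: map[string]time.Time{}}
}

// shouldLogError records a failed read of fileName at time now and reports
// whether the file has been failing for longer than the grace period.
func (t *readFailureTracker) shouldLogError(fileName string, now time.Time) bool {
	first, seen := t.firstFail[fileName]
	if !seen {
		t.firstFail[fileName] = now
		return false
	}
	return now.Sub(first) > t.grace
}

// clear forgets a file after a successful read or a delete.
func (t *readFailureTracker) clear(fileName string) {
	delete(t.firstFail, fileName)
}

func main() {
	tracker := newReadFailureTracker(5 * time.Second)
	start := time.Now()
	// First failure: stay quiet (or log at debug level).
	fmt.Println(tracker.shouldLogError("ep1.json", start)) // false
	// Still failing six seconds later: now it is worth an error.
	fmt.Println(tracker.shouldLogError("ep1.json", start.Add(6*time.Second))) // true
}
```

The caller would invoke `clear` after a successful parse, so a later failure on the same file starts a fresh grace period.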

if len(epStatus.Ipv4Nets) != 0 {
ip, _, err := net.ParseCIDR(epStatus.Ipv4Nets[0])
if err != nil {
log.WithError(err).Error("Workload endpoint status does not have a valid Ipv4Nets, ignore it for now")
Member

I'd probably use Warn for this since you're handling the problem (by ignoring it)

"github.com/projectcalico/calico/libcalico-go/lib/backend/model"
)

var _ = Describe("ActiveBGPPeerCalculator", func() {
Member

This component should be tested through the calculation graph FV suite so that we get the benefits of its "fuzzing" approach.

@@ -234,6 +243,8 @@ func newEndpointManager(
floatingIPsEnabled bool,
nft bool,
) *endpointManager {
nlHandle, _ := netlink.NewHandle()
Member

This doesn't look right (ignoring the error, not shimmable). Use a netlinkshim.HandleManager, which has a mock alternative.

}

// Peer information that we track for each active local endpoint.
type EpPeerData struct {
Member

Suggested change:
- type EpPeerData struct {
+ type EndpointBGPPeer struct {

Think spelling it out would help in the other files where this name is seen.

var err error
// If LocalBGPPeerIP has been updated, we need to remove old peer IP from all workload interfaces.
for ifaceName := range m.activeWlIfaceNameToID {
err = m.removeBGPPeerIPOnInterface(ifaceName, m.localBGPPeerIP)
Member

Suspicious that we need to remove the old IP specifically; what if the desired IP changes while Felix is restarting? Seems we'd get stuck, since Felix would no longer know which old IP to remove.
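
A reconcile-style alternative is sketched below: instead of remembering the previous peer IP, compare whatever is on the interface with the desired IP and derive the adds and removes from that. `planAddrChanges` is a hypothetical helper, and the rule "any address other than an IPv6 link-local one was put there by us" is an assumption that would need checking against how workload interfaces are really configured.

```go
package main

import (
	"fmt"
	"net/netip"
)

// planAddrChanges compares the addresses currently on an interface with the
// single desired peering address and returns what to add and what to remove.
// Assumption: apart from IPv6 link-local addresses, everything on the
// interface is managed by this code, so any other address is stale.
func planAddrChanges(current []string, desired string) (toAdd, toRemove []string, err error) {
	want, err := netip.ParseAddr(desired)
	if err != nil {
		return nil, nil, fmt.Errorf("parsing desired address %q: %w", desired, err)
	}
	found := false
	for _, s := range current {
		have, err := netip.ParseAddr(s)
		if err != nil {
			return nil, nil, fmt.Errorf("parsing current address %q: %w", s, err)
		}
		switch {
		case have == want:
			found = true
		case have.Is6() && have.IsLinkLocalUnicast():
			// Kernel-assigned address; leave it alone.
		default:
			toRemove = append(toRemove, s)
		}
	}
	if !found {
		toAdd = append(toAdd, desired)
	}
	return toAdd, toRemove, nil
}

func main() {
	// The interface still carries an address from before a restart.
	add, remove, err := planAddrChanges([]string{"10.0.0.1", "fe80::1"}, "10.0.0.2")
	fmt.Println(add, remove, err)
}
```

Because the plan is derived from the live interface state, it gives the same answer whether or not Felix remembers the previous value.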


addrs, err := m.nlHandle.AddrList(link, family)
if err != nil {
// Not sure why this would happen, but pass it up.
Member

The link might be deleted from under you by the CNI plugin.

return nil
}

func lookupLink(nlHandle netlinkHandle, name string) (link netlink.Link, err error, notFound bool) {
Member

I think you should use errors.Is(err, netlink.LinkNotFoundError) in the caller; that's more common to see
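
A small sketch of the caller-side check follows. It uses a local stand-in error type rather than the real netlink package, so `linkNotFoundError` and `lookupLink` here are hypothetical. One caveat worth checking against the netlink version in use: `errors.Is` matches a sentinel value, whereas an error exposed as a type is matched with `errors.As`, which is what the sketch shows.

```go
package main

import (
	"errors"
	"fmt"
)

// linkNotFoundError is a local stand-in for a library's typed "not found" error.
type linkNotFoundError struct{ name string }

func (e linkNotFoundError) Error() string { return "link not found: " + e.name }

// lookupLink is a stand-in for a netlink lookup; only "cali123" exists.
func lookupLink(name string) error {
	if name == "" {
		return errors.New("empty interface name")
	}
	if name != "cali123" {
		// Wrapping with %w keeps the typed error reachable for errors.As.
		return fmt.Errorf("lookup failed: %w", linkNotFoundError{name: name})
	}
	return nil
}

// isNotFound reports whether err is, or wraps, a linkNotFoundError.
func isNotFound(err error) bool {
	var nf linkNotFoundError
	return errors.As(err, &nf)
}

func main() {
	err := lookupLink("eth9")
	if isNotFound(err) {
		fmt.Println("interface is gone; nothing to clean up")
		return
	}
	fmt.Println("unexpected result:", err)
}
```

With this shape the lookup function keeps the conventional `(value, error)` signature and the caller decides what "not found" means for it.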

if !errors.Is(err, fs.ErrExist) {
lastError = err
logrus.Error("IterActionNoOp")
Member

Dev error left in?

@song-jiang song-jiang changed the title [WIP]Support nested BGP peering with calico-nodes running in local kubevirt VM pods Support nested BGP peering with calico-nodes running in local kubevirt VM pods Mar 11, 2025
@@ -94,6 +94,7 @@ protocol bgp Global_10_192_0_3 from bgp_template {
calico_export_to_bgp_peers(true);
reject;
}; # Only want to export routes for workloads.
next hop self;
Member

Change to a normal non-local peer; was that intended?

Comment on lines +146 to +157



# Skipping global bgp peer (2001::102)


# Skipping global bgp peer (2001::103)


# Skipping global bgp peer (2001::104)


Member

Wonder if we should be less noisy about skipping non-local peers? Just seems like we're adding a bit of cruft to every file instead of just saying "# No local peers configured."
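
To make the quieter output concrete, here is a sketch using Go's standard text/template rather than the actual confd template, so the data shape (`localPeer`) and the template text are assumptions. It emits one comment line per local peer and a single line when there are none.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// localPeer is a hypothetical, simplified view of one local workload peer.
type localPeer struct {
	Name string
	IP   string
}

// One comment line per local peer, or a single line when there are none.
const localPeersTmpl = "{{range .}}# For local peer {{.Name}} ({{.IP}})\n{{else}}# No local peers configured.\n{{end}}"

// renderLocalPeers renders the comment header for the given local peers.
func renderLocalPeers(peers []localPeer) (string, error) {
	tmpl, err := template.New("localPeers").Parse(localPeersTmpl)
	if err != nil {
		return "", fmt.Errorf("parsing template: %w", err)
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, peers); err != nil {
		return "", fmt.Errorf("rendering template: %w", err)
	}
	return out.String(), nil
}

func main() {
	out, err := renderLocalPeers(nil)
	if err != nil {
		fmt.Println("render failed:", err)
		return
	}
	fmt.Print(out) // # No local peers configured.
}
```

The `{{range}} ... {{else}} ... {{end}}` form keeps the "nothing to do" case to one line regardless of how many global peers exist.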

}

// Given a new peer data, check and update the cache if needed.
func (abp *ActiveBGPPeerCalculator) checkAndUpdatePeerData(id model.WorkloadEndpointKey, newPeerData EndpointBGPPeer) {
Member

Nit: please can you move these "leaf" functions below onEndpointUpdate()? I much prefer reading top-down to bottom-up.

logCxt := logrus.WithField("update", update)
switch id := update.Key.(type) {
case model.WorkloadEndpointKey:
if update.Value != nil {
Member

We should use an InheritIndex for matching the WEPs. Some of their labels get inherited from the namespace/service account.

// It does not support assigning multiple IPs to the interface.

// ipNetStr is string format of net.IPNet.
type ipNetStr string
Member

Think this type re-invents something we already have. Why not use Felix's ip.Addr type, which is comparable (so can be used in map keys) and already has methods for converting to CIDR (and could easily be extended with IsIPv6Bootstrap(), for example)? I think Go's stdlib net.IP has some methods for checking the type of the address that you might be able to leverage.
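
As an illustration of the "comparable address type" point, the sketch below uses the standard library's net/netip.Addr as a stand-in; it is not Felix's ip.Addr, and the helper name and interface names are hypothetical. It shows a parsed address used directly as a map key and queried for its family, which is what a string-based type has to re-implement by hand.

```go
package main

import (
	"fmt"
	"net/netip"
)

// peerAddrFromCIDR returns the address part of a CIDR string such as "10.0.0.1/32".
func peerAddrFromCIDR(cidr string) (netip.Addr, error) {
	prefix, err := netip.ParsePrefix(cidr)
	if err != nil {
		return netip.Addr{}, fmt.Errorf("parsing %q: %w", cidr, err)
	}
	return prefix.Addr(), nil
}

func main() {
	// netip.Addr is comparable, so it can key a map without a string round trip.
	ifaceByPeerIP := map[netip.Addr]string{}
	inputs := map[string]string{"10.0.0.1/32": "cali1", "fd00::1/128": "cali2"}
	for cidr, iface := range inputs {
		addr, err := peerAddrFromCIDR(cidr)
		if err != nil {
			fmt.Println("skipping:", err)
			continue
		}
		ifaceByPeerIP[addr] = iface
		fmt.Println(addr, "is IPv4:", addr.Is4(), "is IPv6:", addr.Is6())
	}
	fmt.Println(len(ifaceByPeerIP), "peer addresses tracked")
}
```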

// reset w.fsWatcher
w.fsWatcher = nil

if w.newFsnotifyWatcherErr {
Member

Nit: prefer to avoid polluting prod code with test machinery. I'd add a field newFsnotifyWatcher func() error that can be shimmed, with the default impl being the real one.
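
A minimal sketch of the function-field shim is below. The struct is a cut-down, hypothetical version of the watcher, and `realNewFsnotifyWatcher` is a placeholder for the production constructor. Production code never mentions the test flag; a test simply swaps the field.

```go
package main

import (
	"errors"
	"fmt"
)

// errNoInotify is an example failure that a test might inject.
var errNoInotify = errors.New("inotify unavailable")

// fileWatcher is a cut-down, hypothetical version of the watcher under review.
type fileWatcher struct {
	// newFsnotifyWatcher creates the notification backend. Tests replace it.
	newFsnotifyWatcher func() error
	polling            bool
}

// realNewFsnotifyWatcher is a placeholder for the production constructor.
func realNewFsnotifyWatcher() error { return nil }

func newFileWatcher() *fileWatcher {
	return &fileWatcher{newFsnotifyWatcher: realNewFsnotifyWatcher}
}

// start falls back to polling when the notification backend cannot be created.
func (w *fileWatcher) start() {
	if err := w.newFsnotifyWatcher(); err != nil {
		w.polling = true
		return
	}
	w.polling = false
}

func main() {
	w := newFileWatcher()
	// A test would inject the failure like this.
	w.newFsnotifyWatcher = func() error { return errNoInotify }
	w.start()
	fmt.Println(w.polling) // true
}
```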

// Start begins watching the directory.
func (w *FileWatcher) runWatcher() {
// Get current state of the directory and emit initial events.
w.scanDirectory()
Member

I think your initial scan needs to be after you've started watching. Otherwise you might miss a file being created between here and starting the watch.
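
The ordering can be sketched independently of fsnotify. In the hypothetical helper below, `subscribe` and `list` are injected functions standing in for "start the watch" and "scan the directory". Subscribing first means a file created in between appears as an event, in the listing, or in both, so the consumer has to tolerate duplicates but cannot miss a file.

```go
package main

import "fmt"

// initialSync subscribes to change events before listing the directory, so a
// file created between the two steps is seen at least once.
func initialSync(
	subscribe func() (<-chan string, error),
	list func() ([]string, error),
) (known map[string]bool, events <-chan string, err error) {
	events, err = subscribe()
	if err != nil {
		return nil, nil, fmt.Errorf("subscribing to directory events: %w", err)
	}
	names, err := list()
	if err != nil {
		return nil, nil, fmt.Errorf("listing directory: %w", err)
	}
	known = make(map[string]bool, len(names))
	for _, n := range names {
		known[n] = true
	}
	return known, events, nil
}

func main() {
	var order []string
	subscribe := func() (<-chan string, error) {
		order = append(order, "subscribe")
		return make(chan string), nil
	}
	list := func() ([]string, error) {
		order = append(order, "list")
		return []string{"ep1.json"}, nil
	}
	known, _, err := initialSync(subscribe, list)
	fmt.Println(order, known, err)
}
```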

if err != nil {
log.WithError(err).Info("Error initializing fsnotify. Falling back to polling.")
} else {
defer w.fsWatcher.Close()
Member

defer in a loop is usually a bug: defer won't run until the function returns, so you'll stack up defers with each loop iteration.

Member

My go-to answer is to move the loop body to a method.
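
A self-contained illustration of that fix is below, with a fake resource in place of the real watcher (all names are hypothetical). Moving the loop body into its own function makes each `defer` fire at the end of its iteration.

```go
package main

import "fmt"

// fakeWatcher stands in for a resource that must be closed, such as an
// fsnotify watcher. It records what happens to it in a shared log.
type fakeWatcher struct {
	id  int
	log *[]string
}

func openWatcher(id int, log *[]string) *fakeWatcher {
	*log = append(*log, fmt.Sprintf("open %d", id))
	return &fakeWatcher{id: id, log: log}
}

func (w *fakeWatcher) Close() {
	*w.log = append(*w.log, fmt.Sprintf("close %d", w.id))
}

// runOnce is the former loop body. Because it is its own function, the defer
// fires at the end of every iteration rather than when the outer loop exits.
func runOnce(id int, log *[]string) {
	w := openWatcher(id, log)
	defer w.Close()
	*log = append(*log, fmt.Sprintf("use %d", id))
}

// runLoop runs the body a fixed number of times and returns the event log.
func runLoop(iterations int) []string {
	var log []string
	for i := 0; i < iterations; i++ {
		runOnce(i, &log)
	}
	return log
}

func main() {
	fmt.Println(runLoop(2)) // [open 0 use 0 close 0 open 1 use 1 close 1]
}
```

Had the `defer` stayed inside the `for` loop, both closes would have been recorded only after the loop finished.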


currentState := make(map[string]os.FileInfo)

err := filepath.Walk(w.dir, func(path string, info os.FileInfo, err error) error {
Member

Do you need to walk subdirs or is our directory flat? If flat, I think you could just do os.ReadDir()
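
If the directory is indeed flat, the scan could look roughly like the sketch below. `listFlatDir` is a hypothetical name; it uses only standard-library calls and skips a file that disappears between the directory read and the stat, which can happen when files are being deleted concurrently.

```go
package main

import (
	"errors"
	"fmt"
	"io/fs"
	"os"
	"path/filepath"
)

// listFlatDir returns file info for the regular entries directly inside dir.
// It does not descend into subdirectories.
func listFlatDir(dir string) (map[string]os.FileInfo, error) {
	entries, err := os.ReadDir(dir)
	if err != nil {
		return nil, fmt.Errorf("reading %s: %w", dir, err)
	}
	state := make(map[string]os.FileInfo, len(entries))
	for _, e := range entries {
		if e.IsDir() {
			continue
		}
		info, err := e.Info()
		if errors.Is(err, fs.ErrNotExist) {
			// The file was removed after ReadDir returned; treat it as absent.
			continue
		}
		if err != nil {
			return nil, fmt.Errorf("stat %s: %w", e.Name(), err)
		}
		state[e.Name()] = info
	}
	return state, nil
}

func main() {
	dir, err := os.MkdirTemp("", "epstatus")
	if err != nil {
		fmt.Println("creating temp dir:", err)
		return
	}
	defer os.RemoveAll(dir)
	if err := os.WriteFile(filepath.Join(dir, "ep1.json"), []byte("{}"), 0o600); err != nil {
		fmt.Println("writing file:", err)
		return
	}
	state, err := listFlatDir(dir)
	fmt.Println(len(state), err) // 1 <nil>
}
```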

func (w *FileWatcher) Stop() {
close(w.stopChan)
if w.fsWatcher != nil {
w.fsWatcher.Close()
Member

This is also deferred in the main loop, so this could close it twice (and possibly race if it's not concurrency safe?)
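
One possible shape for a single, idempotent shutdown path is sketched below with `sync.Once`. The struct and the counting closer are hypothetical stand-ins. Routing every close through `Stop` means the channel and the watcher are each closed exactly once, however many callers or goroutines reach it.

```go
package main

import (
	"fmt"
	"sync"
)

// countingCloser stands in for the fsnotify watcher and counts Close calls.
type countingCloser struct{ closes int }

func (c *countingCloser) Close() error {
	c.closes++
	return nil
}

// fileWatcher is a cut-down, hypothetical version of the watcher under review.
type fileWatcher struct {
	stopOnce  sync.Once
	stopChan  chan struct{}
	fsWatcher *countingCloser
}

func newFileWatcher(fsWatcher *countingCloser) *fileWatcher {
	return &fileWatcher{stopChan: make(chan struct{}), fsWatcher: fsWatcher}
}

// Stop is safe to call more than once and from more than one goroutine.
func (w *fileWatcher) Stop() {
	w.stopOnce.Do(func() {
		close(w.stopChan)
		if w.fsWatcher != nil {
			_ = w.fsWatcher.Close()
		}
	})
}

func main() {
	fsWatcher := &countingCloser{}
	w := newFileWatcher(fsWatcher)
	w.Stop()
	w.Stop()                      // no panic from closing stopChan twice
	fmt.Println(fsWatcher.closes) // 1
}
```

In this arrangement the main loop would call `Stop` on exit instead of deferring its own `Close`.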

Labels
docs-pr-required Change is not yet documented release-note-required Change has user-facing impact (no matter how small)
4 participants