cfp/37315 bgp session default gateway #68

170 changes: 170 additions & 0 deletions cilium/CFP-37315-bgp-session-with-default-gateway.md
# CFP-37315

**SIG: SIG-BGP**

**Begin Design Discussion:** 2024-02-24

**Cilium Release:** X.XX

**Authors:** Naveen Achyuta <[email protected]>

## Summary

Cilium’s current BGP implementation requires specifying the peer IP address in the BGP cluster configuration. In large-scale environments with thousands of Kubernetes nodes, managing distinct BGP configuration files (one per node) becomes impractical.

This proposal offers a simpler alternative: when no peer IP is provided in the BGP cluster configuration, Cilium uses the node's default gateway (typically the ToR switch) as the peer and creates the BGP session automatically.

## Motivation

Large networks with thousands of Kubernetes nodes cannot feasibly manage the peer IP addresses of all ToR switches within numerous BGP cluster configurations. Allowing Cilium to automatically discover and use the default gateway as the peer simplifies BGP session creation, reducing operational overhead and configuration complexity.

## Goals

### Auto-Discovery of Peer IPs
Allow Cilium to establish BGP sessions by discovering the default gateway on the node, when no peer IP is explicitly provided in the BGP cluster configuration.
### Single Config File for All Nodes
Enable large networks to use a generic BGP configuration file without needing to specify unique peer IP addresses on a per-node basis.

## Non-Goals

* _List aspects which are specifically out of context for this CFP._

## Proposal

### Overview

At a high level, this proposal modifies how Cilium handles BGP peer IP configuration. If a user omits the peer IP for a BGP process, the Cilium agent attempts to retrieve the node’s default gateway from statedb (which stores local routing information). Cilium can then use this default gateway as the peer IP to establish the BGP session. If the default gateway is not found, the agent skips the neighbor and logs an error for it.
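
The fallback described above can be sketched as follows. This is a minimal illustration in Python, not Cilium's implementation: `Peer`, `resolve_peer_address`, and the `gateways` mapping (which stands in for the statedb lookup) are hypothetical names, and the addresses are made up.

```python
import logging
from dataclasses import dataclass
from typing import Optional

log = logging.getLogger("bgp-sketch")


@dataclass
class Peer:
    """Hypothetical stand-in for one configured BGP peer."""
    name: str
    peer_asn: int
    address_family: str = "ipv4"        # "ipv4" or "ipv6"
    peer_address: Optional[str] = None  # None means "not configured"


def resolve_peer_address(peer: Peer, gateways: dict) -> Optional[str]:
    """Return the address to peer with, or None if the neighbor is skipped.

    `gateways` maps an address family to the node's default gateway and
    stands in for the statedb lookup described in the text (assumption).
    """
    if peer.peer_address is not None:
        return peer.peer_address        # an explicit address always wins
    gateway = gateways.get(peer.address_family)
    if gateway is None:
        log.error("peer %s: no %s default gateway found, skipping neighbor",
                  peer.name, peer.address_family)
    return gateway


gateways = {"ipv4": "10.2.0.1"}         # this node has no IPv6 default route
peers = [
    Peer("tor-v4", 65000),
    Peer("tor-v6", 65000, address_family="ipv6"),
    Peer("route-reflector", 65000, peer_address="192.0.2.10"),
]
sessions = {p.name: resolve_peer_address(p, gateways) for p in peers}
```

Here `tor-v4` resolves to the gateway, `route-reflector` keeps its static address, and `tor-v6` is skipped with a logged error because the node has no IPv6 default route.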

### BGP Cluster Config

If the user omits the peer IP for a BGP peer and enables auto-discovery, the Cilium agent should not return an error when it creates new BGP sessions. It should first consult statedb for the default gateway. If no default gateway is found, it should skip the neighbor and log an error.
Every peer must specify its address family so that the agent fetches the default gateway for that family.

Config proposal:
```
bgpInstances:
- name: "65001"
  localASN: 65001
  autoDiscovery: true
  peers:
  - name: "peer1 - 65000"
    peerASN: 65000
    afi: ipv4
    peerConfigRef:
      name: "cilium-peer"
  - name: "peer2 - 65000"
    peerASN: 65000
    afi: ipv6
    peerConfigRef:
      name: "cilium-peer"
```

Review thread on the `autoDiscovery: true` line:

**Member:** I think we can make this a stanza under peers? We can put address family information under the new stanza as well.

    peers:
    - name: peer0
      peerASN: 65000
      peerConfigRef:
        name: peer-config0
      autoDiscovery:
        mode: default-gateway
        addressFamily: ipv4 # or ipv6

**Author:** ya i agree. it will provide flexibility per peer. but i'm curious about the use-case. What is the use-case to use different discovery mechanisms for each peer OR auto-discovery for one peer and static ip for another?

**Member** (@YutaroHayakawa, Mar 12, 2025): For example, using default gateway to peer with ToR and use static ip to peer with route reflector. Whether or not this is realistic, I'd prefer the option that gives the maximum leverage. The instance-level knob forces users to use default gateway for all peers.

**Author:** okay i was wondering if there is an existing use-case. but i agree with you. it provides flexibility for sure.

In multi-homed environments (i.e., multiple default gateways leading to different ASNs), this proposal does not address which gateway belongs to which peer ASN.

Example multi-homed config:
```
bgpInstances:
- name: "65001"
  localASN: 65001
  peers:
  - name: "65000"
    peerASN: 65000
    peerConfigRef:
      name: "cilium-peer"
  - name: "65011"
    peerASN: 65011
    peerConfigRef:
      name: "cilium-peer"
```
If multiple default gateways exist, we cannot reliably match them to different peer ASNs. One workaround is to configure the same local ASN on both ToRs for their BGP sessions with the Kubernetes node, so that the BGP cluster config needs only one peer.

Proposed multi-homed config:

```
bgpInstances:
- name: "65001"
  localASN: 65001
  peers:
  - name: "65000"
    peerASN: 65000
    afi: ipv4
    peerConfigRef:
      name: "cilium-peer"
```

Proposed config on the device (Arista):
```
router bgp 65100
   router-id 10.1.1.1
   timers bgp 1 3
   bgp listen range 10.2.0.0/24 peer-group SERVERS remote-as 65001
   neighbor SERVERS local-as 65000 no-prepend replace-as
```

### Routes from Statedb

The Cilium agent should rely on statedb to obtain the default gateway when creating the BGP session. We can inject statedb into the neighbor reconciliation struct. Then, while looping through the neighbors, we populate the peer IP from the gateway of the default route (`0.0.0.0/0` or `::/0`) in statedb's routes table.
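
The lookup can be illustrated with a small Python sketch. It models the routes table as a plain list. The `Route` fields, the `default_gateway` helper, and the lowest-metric tie-break are assumptions made for illustration; the proposal does not specify how to choose among several default routes, and the real statedb table is not modeled here.

```python
import ipaddress
from dataclasses import dataclass
from typing import Iterable, Optional

DEFAULT_PREFIXES = {
    "ipv4": ipaddress.ip_network("0.0.0.0/0"),
    "ipv6": ipaddress.ip_network("::/0"),
}


@dataclass(frozen=True)
class Route:
    """Hypothetical row of the routes table; field names are assumptions."""
    prefix: str
    gateway: Optional[str]  # None for directly connected routes
    metric: int = 0


def default_gateway(routes: Iterable[Route], family: str) -> Optional[str]:
    """Return the gateway of the default route for `family`, if any.

    Assumption: when several default routes exist, the one with the
    lowest metric is chosen.
    """
    want = DEFAULT_PREFIXES[family]
    candidates = [
        r for r in routes
        if r.gateway is not None and ipaddress.ip_network(r.prefix) == want
    ]
    if not candidates:
        return None
    return min(candidates, key=lambda r: r.metric).gateway


routes = [
    Route("10.2.0.0/24", None),
    Route("0.0.0.0/0", "10.2.0.1", metric=100),
    Route("0.0.0.0/0", "10.2.0.2", metric=200),
    Route("::/0", "2001:db8::1", metric=1024),
]
```

Comparing parsed networks rather than strings means that equivalent spellings such as `::0/0` and `::/0` match the same default route.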


## Impacts / Key Questions

### Impact 1: Multi-Homing Scenario

In multi-homed environments, this breaks the current design of one BGP session per configured peer, because the agent would create two BGP sessions from a single peer defined in the config. One workaround is to define two peers with the same config and unique names.
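
To make the impact concrete, here is a hedged Python sketch of what "one configured peer, two sessions" would look like. The `sessions_for_peer` helper and the `<peer>-<index>` session naming are hypothetical and not part of the proposal.

```python
from typing import List, Tuple


def sessions_for_peer(peer_name: str, peer_asn: int,
                      gateways: List[str]) -> List[Tuple[str, str, int]]:
    """Fan one configured peer out into one session per default gateway.

    Returns (session_name, neighbor_address, peer_asn) tuples. The
    "<peer>-<index>" naming is an assumption made for this sketch.
    """
    return [
        (f"{peer_name}-{i}", gw, peer_asn)
        for i, gw in enumerate(sorted(gateways))
    ]


# Two ToRs that both present ASN 65000 to the node (the local-as workaround).
sessions = sessions_for_peer("65000", 65000, ["10.2.0.2", "10.2.0.1"])
```

A single entry in the config yields two sessions that differ only in neighbor address, which is exactly the deviation from the one-session-per-peer design discussed above.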

### Key Question

Is there a different way to accomplish this?


### Impact 2: Multiple BGP Instances

The proposal does not take multiple BGP instances into account, so auto-discovery will work in that case.

### Key Question

What are the use cases for having multiple BGP instances? (`bgpInstances` is a list in the config.)

### Impact 3: Config Knob for Auto-Discovery

A config knob to enable auto-discovery is crucial because it lets users understand the dependencies before making the change. We can document auto-discovery and list the prerequisites for enabling it.
Without the knob, a user may end up using auto-discovery without understanding its requirements.

### Key Question

Are we okay with adding a knob for auto-discovery, or should auto-discovery be enabled by default when `peerAddress` is not specified?


### Option 1:

If the user wants to create BGP sessions without specifying `peerAddress`, they remove the `peerAddress` field, enable auto-discovery in the config, and add IPv4 and/or IPv6 peers explicitly using the `afi` field.

#### Pros

Adds flexibility to create IPv4 and/or IPv6 sessions.
The auto-discovery field prompts users to read the documented requirements before using it. (A user can still enable it without knowing the requirements, but we can fail with clear error messages.)
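
As one possible shape for those error messages, the following Python sketch validates a peer before any session is attempted. `validate_peer`, `PeerConfigError`, and the message wording are hypothetical. The parameters mirror the `peerAddress`, auto-discovery, and `afi` settings discussed in this option.

```python
from typing import Optional

VALID_AFIS = ("ipv4", "ipv6")


class PeerConfigError(ValueError):
    """Raised when a peer has neither an address nor a usable discovery setup."""


def validate_peer(name: str, peer_address: Optional[str],
                  auto_discovery: bool, afi: Optional[str]) -> None:
    """Reject peer configs that cannot result in a BGP session."""
    if peer_address:
        return  # a static address needs no discovery prerequisites
    if not auto_discovery:
        raise PeerConfigError(
            f"peer {name!r}: peerAddress is not set and auto-discovery is disabled"
        )
    if afi not in VALID_AFIS:
        raise PeerConfigError(
            f"peer {name!r}: auto-discovery requires afi to be one of {VALID_AFIS}"
        )
```

With an explicit knob, a missing `peerAddress` is reported as a configuration error instead of silently triggering discovery.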

#### Cons

For multi-homed environments, it deviates from the current "one BGP session per configured peer" design if we decide to use only one BGP peer for both sessions.

### Option 2:

Do nothing.

#### Pros



#### Cons

Users will have to manage BGP cluster config files for all the Kubernetes nodes, and updating these files can be cumbersome.

## Future Milestones

_List things that this CFP will enable but that are out of scope for now. This can help understand the greater impact of a proposal without requiring to extend the scope of a CFP unnecessarily._

### Deferred Milestone 1

_Description of deferred milestone_

### Deferred Milestone 2