EVPN - Incorrect L3VNI routermac address installed #18190

Open
2 tasks done
packet-time opened this issue Feb 17, 2025 · 5 comments
Labels
triage Needs further investigation

Comments

packet-time commented Feb 17, 2025

Description

When multiple VRFs each have an L3VNI, leaking routes between the VRFs in the IPv4 address family causes zebra to confuse the router MACs (RMACs) of a remote VTEP's L3VNIs: zebra installs every L3VNI of the remote VTEP with the MAC address of just one of them.

When routing a packet, the kernel picks the correct L3VNI device based on the destination address, but the packet arrives at the remote VTEP with a destination MAC that belongs to a different L3VNI, so the remote kernel silently drops it.

Version

FRRouting 10.2.1 (pod2) on Linux(6.8.12-8-pve).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--build=x86_64-linux-gnu' '--prefix=/usr' '--includedir=${prefix}/include' '--mandir=${prefix}/share/man' '--infodir=${prefix}/share/info' '--sysconfdir=/etc' '--localstatedir=/var' '--disable-option-checking' '--disable-silent-rules' '--libdir=${prefix}/lib/x86_64-linux-gnu' '--libexecdir=${prefix}/lib/x86_64-linux-gnu' '--disable-maintainer-mode' '--sbindir=/usr/lib/frr' '--with-vtysh-pager=/usr/bin/pager' '--libdir=/usr/lib/x86_64-linux-gnu/frr' '--with-moduledir=/usr/lib/x86_64-linux-gnu/frr/modules' '--disable-dependency-tracking' '--enable-rpki' '--disable-scripting' '--enable-pim6d' '--disable-grpc' '--with-libpam' '--enable-doc' '--enable-doc-html' '--enable-snmp' '--enable-fpm' '--disable-protobuf' '--disable-zeromq' '--enable-ospfapi' '--enable-bgp-vnc' '--enable-multipath=256' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frrvty' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' 'build_alias=x86_64-linux-gnu' 'PYTHON=python3'

How to reproduce

VTEP1

!
vrf vrf_dev
 vni 100
exit-vrf
!
vrf vrf_prod
 vni 200
exit-vrf
!
router bgp 64550
 bgp router-id 10.1.254.10
 neighbor VTEP peer-group
 neighbor VTEP remote-as 64550
 neighbor VTEP bfd
 neighbor 10.1.254.2 peer-group VTEP
 !
 address-family ipv4 unicast
  redistribute connected
  rd vpn export 64550:0
  rt vpn export 64550:0
  rt vpn import 64550:1 64550:2
  export vpn
  import vpn
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor VTEP activate
  advertise-all-vni
 exit-address-family
exit
!
router bgp 64550 vrf vrf_dev
 bgp router-id 10.1.254.10
 !
 address-family ipv4 unicast
  redistribute connected
  rd vpn export 64550:1
  rt vpn export 64550:1
  rt vpn import 64550:0
  export vpn
  import vpn
 exit-address-family
exit
!
router bgp 64550 vrf vrf_prod
 bgp router-id 10.1.254.10
 !
 address-family ipv4 unicast
  redistribute connected
  rd vpn export 64550:2
  rt vpn export 64550:2
  rt vpn import 64550:0
  export vpn
  import vpn
 exit-address-family
exit

The VTEP2 config is exactly the same, except that it uses VTEP1 as its neighbor.

Expected behavior

Once EVPN type 2 routes are available from the remote VTEP, the local VTEP should install routes to both L3VNIs, each with the router MAC carried by its own type 2 route. That router MAC should be unique across VRFs.

Actual behavior

Remote VTEP VNIs:
sh vrf vni

VRF                                   VNI        VxLAN IF             L3-SVI               State Rmac
vrf_dev                               100        vrfvx_dev            vrfbr_dev            Up    6a:c5:7f:d8:64:26
vrf_prod                              200        vrfvx_prod           vrfbr_prod           Up    5e:e9:e9:55:be:99

Running ip nei on the local VTEP at first shows the correct RMAC for the L3VNI (while only one type 2 route is installed, for vrf_dev):

10.1.254.2 dev vrfbr_dev lladdr 6a:c5:7f:d8:64:26 extern_learn NOARP proto zebra

As soon as another type 2 route is added in the other VRF, the RMAC changes:

10.1.254.2 dev vrfbr_dev lladdr 5e:e9:e9:55:be:99 extern_learn NOARP proto zebra
10.1.254.2 dev vrfbr_prod lladdr 5e:e9:e9:55:be:99 extern_learn NOARP proto zebra

Here is the log line from zebra:

L3VNI 100 RMAC change(6a:c5:7f:d8:64:26 --> 5e:e9:e9:55:be:99) for nexthop 10.1.254.2

This is incorrect: any packet sent to either L3VNI now has a destination MAC of 5e:e9:e9:55:be:99, which belongs only to vrf_prod. This blackholes all traffic destined for vrfbr_dev.
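The mismatch can also be checked mechanically. The following is a minimal sketch, not an FRR tool: it parses the `sh vrf vni` output from the remote VTEP and the `ip nei` output from the local VTEP (both copied from this report) and flags neighbor entries whose lladdr differs from the remote RMAC of the same L3-SVI. It assumes the L3-SVI names are identical on both VTEPs. The function names are made up for illustration.

```python
import re

# Outputs copied from this report: `sh vrf vni` on the remote VTEP and
# `ip nei` on the local VTEP after the second type 2 route was installed.
SH_VRF_VNI = """\
VRF                                   VNI        VxLAN IF             L3-SVI               State Rmac
vrf_dev                               100        vrfvx_dev            vrfbr_dev            Up    6a:c5:7f:d8:64:26
vrf_prod                              200        vrfvx_prod           vrfbr_prod           Up    5e:e9:e9:55:be:99
"""

IP_NEI = """\
10.1.254.2 dev vrfbr_dev lladdr 5e:e9:e9:55:be:99 extern_learn NOARP proto zebra
10.1.254.2 dev vrfbr_prod lladdr 5e:e9:e9:55:be:99 extern_learn NOARP proto zebra
"""

MAC_RE = re.compile(r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}", re.IGNORECASE)
NEI_RE = re.compile(r"^(\S+) dev (\S+) lladdr (\S+)")


def parse_vrf_vni(text):
    """Map L3-SVI name -> RMAC. Lines that are not data rows are skipped."""
    rmacs = {}
    for line in text.splitlines():
        fields = line.split()
        # A data row has 6 columns ending in a MAC; the header splits into 7.
        if len(fields) == 6 and MAC_RE.fullmatch(fields[5]):
            rmacs[fields[3]] = fields[5].lower()
    return rmacs


def parse_ip_nei(text):
    """Return (ip, dev, lladdr) tuples for every neighbor line with an lladdr."""
    entries = []
    for line in text.splitlines():
        match = NEI_RE.match(line.strip())
        if match:
            entries.append((match.group(1), match.group(2), match.group(3).lower()))
    return entries


def find_mismatches(remote_rmacs, neighbors):
    """Return (ip, dev, installed, expected) where the installed MAC is wrong."""
    bad = []
    for ip, dev, lladdr in neighbors:
        expected = remote_rmacs.get(dev)
        if expected is not None and lladdr != expected:
            bad.append((ip, dev, lladdr, expected))
    return bad


for ip, dev, installed, expected in find_mismatches(
    parse_vrf_vni(SH_VRF_VNI), parse_ip_nei(IP_NEI)
):
    print(f"{ip} dev {dev}: installed {installed}, expected {expected}")
```

With the data from this report, only the vrfbr_dev entry is flagged; the vrfbr_prod entry carries the RMAC that actually belongs to it.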

The question is: why does zebra think the RMAC has changed? The VPN table clearly shows the correct RMAC for each type 2 route.

If I remove all vpn import/export statements from the IPv4 address family, the issue goes away.

Additional context

No response

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.

packet-time added the triage (Needs further investigation) label on Feb 17, 2025

chdxD1 commented Feb 18, 2025

After looking at the EVPN code over the last couple of days:
zebra currently tracks nexthops against the L3VNI it derives from the VRF the route exists in (for zebra, that is the VRF leaked to), not from the L3VNI bridge interface or the nexthop VRF.

I also remember a discussion with some Broadcom SONiC folks a while back: while not prohibited by RFC 9135, running different RMACs on a single VTEP isn't recommended and breaks things on other systems.
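To illustrate the keying problem described above, here is a toy model. It is not zebra's actual data structure or code, and the event order and all names are hypothetical. It assumes a nexthop-to-RMAC table where the last write wins, and compares keying by the VRF the route ends up in against keying by the VRF of the L3VNI the route was learned on.

```python
DEV_RMAC = "6a:c5:7f:d8:64:26"   # remote vrf_dev RMAC from the report
PROD_RMAC = "5e:e9:e9:55:be:99"  # remote vrf_prod RMAC from the report
NEXTHOP = "10.1.254.2"           # remote VTEP address from the report

# Each event: (VRF the route ends up in, VRF of the L3VNI it was learned on,
# RMAC carried by the route). Hypothetical sequence: the third event stands
# for a route learned on the vrf_prod L3VNI that is leaked into vrf_dev.
EVENTS = [
    ("vrf_dev", "vrf_dev", DEV_RMAC),
    ("vrf_prod", "vrf_prod", PROD_RMAC),
    ("vrf_dev", "vrf_prod", PROD_RMAC),
]


def replay(events, key_by):
    """Build a (vrf, nexthop) -> RMAC table; the last write for a key wins."""
    if key_by not in ("route_vrf", "nexthop_vrf"):
        raise ValueError(f"unknown key_by: {key_by}")
    table = {}
    for route_vrf, nexthop_vrf, rmac in events:
        vrf = route_vrf if key_by == "route_vrf" else nexthop_vrf
        table[(vrf, NEXTHOP)] = rmac
    return table


for mode in ("route_vrf", "nexthop_vrf"):
    print(mode)
    for (vrf, nexthop), rmac in sorted(replay(EVENTS, mode).items()):
        print(f"  {vrf} {nexthop} -> {rmac}")
```

Keyed by the route's VRF, the leaked route overwrites the vrf_dev entry with the vrf_prod RMAC, which matches the symptom in the report. Keyed by the nexthop VRF, each VRF keeps its own RMAC.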

packet-time (Author) commented

> After looking at the EVPN code over the last couple of days: zebra currently tracks nexthops against the L3VNI it derives from the VRF the route exists in (for zebra, that is the VRF leaked to), not from the L3VNI bridge interface or the nexthop VRF.
>
> I also remember a discussion with some Broadcom SONiC folks a while back: while not prohibited by RFC 9135, running different RMACs on a single VTEP isn't recommended and breaks things on other systems.

I don’t understand the point about a single RMAC for a VTEP. There should be a unique RMAC per L3VNI and of course a single L3VNI per VRF, no? At least that’s the behavior FRR has today. In fact, the example EVPN configuration in the FRR documentation shows exactly this.

packet-time changed the title from "EVPN - Incorrect L3VNI routermac mac address installed" to "EVPN - Incorrect L3VNI routermac address installed" on Feb 18, 2025

chdxD1 commented Feb 18, 2025

The documentation does mention different MAC addresses for each L3VNI bridge. However, it is also quite common for devices (I see it with the ASR9ks and also with the SONiC switches we have) to use a single, system-wide RMAC rather than one per L3VNI or VRF. You still have one L3VNI per VRF, of course. In our FRR deployment we chose to do the same, especially since we had compatibility issues with networking gear.

IMHO this is still a bug, though: when an EVPN route is leaked, it should not modify the RMACs of the VRF it is leaked to.

packet-time (Author) commented

Interesting, so you’ve configured all L3VNIs with the same MAC address?


chdxD1 commented Feb 20, 2025

Yes, all L3(!)-VNIs on one system(!) share the same MAC address in our deployment.
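As a rough sketch of that workaround, the snippet below only builds the iproute2 commands that would give every L3VNI bridge the same MAC; it does not run them. It assumes the RMAC follows the MAC of the L3VNI bridge (L3-SVI) interface, as the mention of per-bridge MAC addresses earlier in the thread suggests. The interface names come from the report. The MAC value and the helper name are hypothetical, and how the deployment above actually sets its MAC is not stated in this thread.

```python
import re

MAC_RE = re.compile(r"(?:[0-9a-f]{2}:){5}[0-9a-f]{2}", re.IGNORECASE)


def shared_rmac_commands(svi_names, mac):
    """Return one `ip link set ... address` command per L3-SVI (not executed)."""
    if not MAC_RE.fullmatch(mac):
        raise ValueError(f"not a MAC address: {mac}")
    return [f"ip link set dev {name} address {mac.lower()}" for name in svi_names]


# Bridge names from the report; the MAC is a made-up locally administered address.
for cmd in shared_rmac_commands(["vrfbr_dev", "vrfbr_prod"], "02:00:00:00:00:01"):
    print(cmd)
```

The same MAC would have to be applied on each VTEP to its own bridges; different VTEPs would still use different system-wide MACs.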
