routes from iBGP not resolving next-hop using non-BGP #18188

Open
2 tasks done
jkroonza opened this issue Feb 16, 2025 · 2 comments

@jkroonza

Description

Hi,

All info trimmed, and IP addresses obfuscated where needed.

On router A:

kerberos# sh ip bgp sum
...
a.b.c.2     4     65512    566386    564281  2358673    0    0 5d19h36m           20   190980 cerberus
kerberos# sh ip route a.b.c.2
Routing entry for a.b.c.2/32
  Known via "ospf", distance 110, metric 30, best
  Last update 5d19h39m ago
  * 172.31.255.2, via bond0.2, weight 1
kerberos# sh ip bgp a.b.c.2
BGP routing table entry for a.b.c.2/32, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  Local
    a.b.c.2 (inaccessible) from a.b.c.2 (a.b.c.2)
      Origin IGP, metric 0, localpref 100, invalid, internal
      Community: .....
      Last update: Tue Feb 11 00:31:21 2025

As a result of a.b.c.2 being inaccessible, this prefix is not being advertised where it should be:

kerberos# sh ip bgp neigh a.b.c.137 advertised-routes 
BGP table version is 2358883, local router ID is a.b.c.1, vrf id 0
Default local pref 100, local AS 65512
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

    Network          Next Hop            Metric LocPrf Weight Path
 *> a.b.c.1/32   0.0.0.0                  0         65512 i

Total number of prefixes 1

a.b.c.137 is an eBGP "downstream" peer.

Version

kerberos# show version
FRRouting 10.0.2-gentoo (kerberos) on Linux(...).
Copyright 1996-2005 Kunihiro Ishiguro, et al.
configured with:
    '--prefix=/usr' '--build=x86_64-pc-linux-gnu' '--host=x86_64-pc-linux-gnu' '--mandir=/usr/share/man' '--infodir=/usr/share/info' '--datadir=/usr/share' '--sysconfdir=/etc' '--localstatedir=/var/lib' '--datarootdir=/usr/share' '--disable-dependency-tracking' '--disable-silent-rules' '--disable-static' '--docdir=/usr/share/doc/frr-10.0.2' '--htmldir=/usr/share/doc/frr-10.0.2/html' '--with-sysroot=/' 'LEX=flex' '--with-pkg-extra-version=-gentoo' '--enable-configfile-mask=0640' '--enable-logfile-mask=0640' '--libdir=/usr/lib/frr' '--sbindir=/usr/lib/frr' '--libexecdir=/usr/lib/frr' '--sysconfdir=/etc/frr' '--localstatedir=/run/frr' '--with-moduledir=/usr/lib/frr/modules' '--enable-user=frr' '--enable-group=frr' '--enable-vty-group=frr' '--enable-multipath=64' '--disable-doc' '--disable-fpm' '--disable-grpc' '--enable-realms' '--disable-nhrpd' '--disable-rpki' '--disable-snmp' 'build_alias=x86_64-pc-linux-gnu' 'host_alias=x86_64-pc-linux-gnu' 'PKG_CONFIG=/usr/bin/pkg-config' 'PKG_CONFIG_PATH=/var/tmp/portage/net-misc/frr-10.0.2/temp/python3.12/pkgconfig' 'PYTHON=/usr/bin/python3.12'

An upgrade to 10.0.3 is queued for our next change-window later in the week, but I don't think that will fix this.

How to reproduce

Peer two iBGP routers via their loopbacks, with both routers connected to the same subnet. Have each also advertise its loopback via iBGP so that, when they are up, these loopbacks can (if route-maps permit) be advertised over eBGP peerings.

Expected behavior

Since the remote loopback is reachable in the FIB, I expect the loopbacks originated with network statements to be advertised onward to eBGP peers, as determined by route-maps. For that to happen, the route has to be marked as valid and not inaccessible.
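
The expectation above can be sketched as a longest-prefix-match lookup: a BGP path counts as usable when some route in the RIB covers its next hop. This is only a minimal Python illustration of that expectation, not FRR's implementation. `RIB`, `resolve` and `path_is_valid` are hypothetical names, and 192.0.2.2 is a documentation address standing in for the obfuscated a.b.c.2.

```python
import ipaddress

# Hypothetical RIB for router A. 192.0.2.2 stands in for the obfuscated
# a.b.c.2; protocol, next hop and interface are copied from "sh ip route".
RIB = {
    ipaddress.ip_network("192.0.2.2/32"): ("ospf", "172.31.255.2", "bond0.2"),
}


def resolve(next_hop, rib):
    """Return (network, route) for the most specific match, or None."""
    addr = ipaddress.ip_address(next_hop)
    matches = [net for net in rib if addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return best, rib[best]


def path_is_valid(next_hop, rib):
    """Expected rule: a path is usable if its next hop resolves at all."""
    return resolve(next_hop, rib) is not None


print(path_is_valid("192.0.2.2", RIB))   # True: covered by the OSPF /32
print(path_is_valid("192.0.2.99", RIB))  # False: no covering route
```

Under this rule the iBGP path with next hop a.b.c.2 would be valid, which is the behavior expected here.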

Actual behavior

Valid routes are marked inaccessible, preventing them from being forward advertised.

This works if the routers peer using their Ethernet interface addresses rather than their loopback addresses. Since a number of fail-over paths are available, all our iBGP peerings use loopbacks.

Additional context

a.b.c.1 and a.b.c.2 peer via their loopbacks, which are exchanged using OSPF.

disable-connected-check looks like a plausible candidate for the problem, but the route here is received over iBGP, not eBGP, so it does not seem to apply.

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@jkroonza jkroonza added the triage Needs further investigation label Feb 16, 2025
@jkroonza
Author

This is the smallest config I could find that reproduces the problem. The GNS3 image here has a very old FRR version (8.2.2), but the same problem is seen on 10.0.

r1# sh run
Building configuration...

Current configuration:
!
frr version 8.2.2
frr defaults traditional
hostname frr
hostname r1
service integrated-vtysh-config
!
interface eth0
 ip address 192.168.1.1/24
exit
!
interface eth1
 ip address 192.168.0.1/24
exit
!
interface lo
 ip address 10.0.0.1/32
exit
!
router bgp 64512
 neighbor 10.0.0.2 remote-as 64512
 neighbor 10.0.0.2 description r2
 neighbor 10.0.0.2 update-source 10.0.0.1
 neighbor 192.168.0.2 remote-as 64513
 neighbor 192.168.0.2 description re
 !
 address-family ipv4 unicast
  network 10.0.0.1/32
  neighbor 192.168.0.2 route-map all in
  neighbor 192.168.0.2 route-map all out
 exit-address-family
exit
!
router ospf
 ospf router-id 10.0.0.1
 redistribute connected metric-type 1
 network 192.168.1.0/24 area 0
exit
!
route-map all permit 10
exit
!
end

r2# sh run
Building configuration...

Current configuration:
!
frr version 8.2.2
frr defaults traditional
hostname frr
hostname r2
service integrated-vtysh-config
!
interface eth0
 ip address 192.168.1.2/24
exit
!
interface lo
 ip address 10.0.0.2/32
exit
!
router bgp 64512
 neighbor 10.0.0.1 remote-as 64512
 neighbor 10.0.0.1 description r1
 neighbor 10.0.0.1 update-source 10.0.0.2
 !
 address-family ipv4 unicast
  network 10.0.0.2/32
 exit-address-family
exit
!
router ospf
 ospf router-id 10.0.0.2
 redistribute connected metric-type 1
 network 192.168.1.0/24 area 0
exit
!
end

And this is the "downstream" peer to which I want both 10.0.0.1/32 and 10.0.0.2/32 advertised:

re# sh run
Building configuration...

Current configuration:
!
frr version 8.2.2
frr defaults traditional
hostname frr
hostname re
service integrated-vtysh-config
!
interface eth0
 ip address 192.168.0.2/24
exit
!
router bgp 64513
 neighbor 192.168.0.1 remote-as 64512
 neighbor 192.168.0.1 description rX
 !
 address-family ipv4 unicast
  neighbor 192.168.0.1 route-map all in
  neighbor 192.168.0.1 route-map all out
 exit-address-family
exit
!
route-map all permit 10
exit
!
end

If I peer directly on 192.168.1.{1,2} then everything works. This leads me to believe that something goes wrong when the prefix and its next hop are the same address.
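
The suspicion above can be made concrete with a toy model. It assumes, purely as a hypothesis and not as a description of FRR's actual code, that a route for the very prefix being validated is not allowed to resolve that prefix's own next hop. `is_accessible` and the `reject_self` switch are hypothetical; the addresses come from the configs above.

```python
import ipaddress

# Non-BGP routes on r1 in the reproduction: r2's loopback learned via OSPF
# and the directly connected eth0 subnet.
RIB = [
    ipaddress.ip_network("10.0.0.2/32"),     # ospf, via 192.168.1.2
    ipaddress.ip_network("192.168.1.0/24"),  # connected, eth0
]


def is_accessible(prefix, next_hop, rib, reject_self=True):
    """Toy next-hop check for a BGP route `prefix` with the given next hop.

    With reject_self=True (the hypothesised rule), a covering route that is
    the same prefix as the BGP route itself does not count.
    """
    prefix = ipaddress.ip_network(prefix)
    addr = ipaddress.ip_address(next_hop)
    covering = [net for net in rib if addr in net]
    if reject_self:
        covering = [net for net in covering if net != prefix]
    return bool(covering)


# Loopback peering: next hop 10.0.0.2 is only covered by 10.0.0.2/32 itself.
print(is_accessible("10.0.0.2/32", "10.0.0.2", RIB))                     # False
# Direct peering: next hop 192.168.1.2 is covered by the connected subnet.
print(is_accessible("10.0.0.2/32", "192.168.1.2", RIB))                  # True
# Without the hypothesised rule, the loopback case would resolve via OSPF.
print(is_accessible("10.0.0.2/32", "10.0.0.2", RIB, reject_self=False))  # True
```

Under this assumption the model gives the same results as both peering setups described in this comment. Whether FRR actually applies such a rule is the open question.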

On r1:

r1# sh ip bgp
BGP table version is 1, local router ID is 10.0.0.1, vrf id 0
Default local pref 100, local AS 64512
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.0.1/32      0.0.0.0                  0         32768 i
  i10.0.0.2/32      10.0.0.2                 0    100      0 i

Displayed  2 routes and 2 total paths

r1# sh ip bgp 10.0.0.2/32
BGP routing table entry for 10.0.0.2/32, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  Local
    10.0.0.2 (inaccessible) from 10.0.0.2 (10.0.0.2)
      Origin IGP, metric 0, localpref 100, invalid, internal
      Last update: Mon Feb 17 15:35:17 2025

r1# sh ip route 10.0.0.2
Routing entry for 10.0.0.2/32
  Known via "ospf", distance 110, metric 120, best
  Last update 00:14:35 ago
  * 192.168.1.2, via eth0, weight 1
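
When checking more than one prefix, output like the above can be scanned for paths flagged as inaccessible. A minimal Python sketch, assuming the text layout shown in this issue (it may differ between FRR versions); `inaccessible_paths` is a hypothetical helper, not part of FRR.

```python
import re

# Sample copied from the "sh ip bgp 10.0.0.2/32" output on r1.
SAMPLE = """\
BGP routing table entry for 10.0.0.2/32, version 0
Paths: (1 available, no best path)
  Not advertised to any peer
  Local
    10.0.0.2 (inaccessible) from 10.0.0.2 (10.0.0.2)
      Origin IGP, metric 0, localpref 100, invalid, internal
      Last update: Mon Feb 17 15:35:17 2025
"""

ENTRY_RE = re.compile(r"^BGP routing table entry for (\S+),", re.M)
PATH_RE = re.compile(r"^[ \t]+(\S+) \(inaccessible\) from (\S+)", re.M)


def inaccessible_paths(text):
    """Return (prefix, next_hop, peer) for each path marked inaccessible."""
    entry = ENTRY_RE.search(text)
    prefix = entry.group(1) if entry else None
    return [(prefix, nh, peer) for nh, peer in PATH_RE.findall(text)]


print(inaccessible_paths(SAMPLE))
# [('10.0.0.2/32', '10.0.0.2', '10.0.0.2')]
```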

In comparison, after shutting down the 10.0.0.X peerings and peering directly between the 192.168.1.X addresses:

r1# sh ip bgp
BGP table version is 2, local router ID is 10.0.0.1, vrf id 0
Default local pref 100, local AS 64512
Status codes:  s suppressed, d damped, h history, * valid, > best, = multipath,
               i internal, r RIB-failure, S Stale, R Removed
Nexthop codes: @NNN nexthop's vrf id, < announce-nh-self
Origin codes:  i - IGP, e - EGP, ? - incomplete
RPKI validation codes: V valid, I invalid, N Not found

   Network          Next Hop            Metric LocPrf Weight Path
*> 10.0.0.1/32      0.0.0.0                  0         32768 i
*>i10.0.0.2/32      192.168.1.2              0    100      0 i

Displayed  2 routes and 2 total paths

r1# sh ip bgp 10.0.0.2/32
BGP routing table entry for 10.0.0.2/32, version 2
Paths: (1 available, best #1, table default)
  Advertised to non peer-group peers:
  192.168.0.2
  Local
    192.168.1.2 from 192.168.1.2 (10.0.0.2)
      Origin IGP, metric 0, localpref 100, valid, internal, best (First path received)
      Last update: Mon Feb 17 15:47:39 2025

r1# sh ip route 10.0.0.2
Routing entry for 10.0.0.2/32
  Known via "bgp", distance 200, metric 0
  Last update 00:01:33 ago
    192.168.1.2, via eth0, weight 1

Routing entry for 10.0.0.2/32
  Known via "ospf", distance 110, metric 120, best
  Last update 00:18:33 ago
  * 192.168.1.2, via eth0, weight 1

@riw777 riw777 self-assigned this Feb 18, 2025
@ton31337
Member

Please enable "debug bgp updates" and "debug bgp neighbor" and show the logs.
