FRR not advertising routes to BGP peers if a lot of netlink messages happen. #18098
Open
2 tasks done
Labels
triage
Needs further investigation
Description
We have hit a bug where FRR does not announce all routes to its BGP peer.
We suspect that this happens when a lot of routes are being removed and re-added on Linux (via netlink) and the advertisement timer fires at the same time.
In a reproducer test case, we start two FRR instances: one announces prefixes to the other, and the other is used to check which prefixes were received.
Then we run a binary that adds and deletes routes on demand (triggered via HTTP).
After the topology is set up, we add 2500 routes, remove 1000 of them, and then re-add those 1000.
In the end we would expect to see 2500 routes on the side that receives routes, but most of the time not all routes are there.
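For readers who want to approximate the add/remove/re-add sequence without the reproducer binary, here is a minimal sketch that generates the same pattern as `ip -batch` input. It is not the code from the reproducer repository: the function name `route_churn_commands`, the `10.0.0.0/8` base range, the use of /32 routes, and the `dummy0` device are all assumptions chosen for illustration.

```python
import ipaddress
from itertools import islice


def route_churn_commands(total=2500, churn=1000, base="10.0.0.0/8", dev="dummy0"):
    """Build `ip -batch` lines: add `total` routes, delete `churn`, re-add them.

    The base network and device name are hypothetical placeholders.
    """
    network = ipaddress.ip_network(base)
    # Take the first `total` host addresses and turn each into a /32 prefix.
    prefixes = [f"{addr}/32" for addr in islice(network.hosts(), total)]

    adds = [f"route add {p} dev {dev}" for p in prefixes]
    dels = [f"route del {p} dev {dev}" for p in prefixes[:churn]]
    readds = adds[:churn]  # the same prefixes that were just deleted

    return adds + dels + readds


if __name__ == "__main__":
    print("\n".join(route_churn_commands()))
```

Assuming the script is saved as `churn.py` and the device exists, the output could be applied with something like `python3 churn.py | sudo ip -batch -`. Whether this timing is enough to trigger the problem is not confirmed; the reproducer in the repository remains the reference.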
Version
How to reproduce
The following repository provides a reproducer, including a README that explains how to run it:
https://github.com/lukedirtwalker/frr_reproducer
Expected behavior
All 2500 routes are announced and present at the frr3 side.
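One way to check this expectation is to diff the expected prefixes against the BGP table on the receiving side. The sketch below assumes the table was dumped with `vtysh -c "show bgp ipv4 unicast json"` and that the JSON has a top-level `routes` object keyed by prefix; that layout should be verified against the FRR version in use. The helper name `missing_prefixes` is hypothetical.

```python
import json


def missing_prefixes(expected, bgp_json_text):
    """Return the expected prefixes that are absent from the BGP table, sorted.

    Assumes the JSON has a top-level "routes" object keyed by prefix string.
    """
    table = json.loads(bgp_json_text)
    received = set(table.get("routes", {}))
    return sorted(set(expected) - received)
```

An empty result means every expected prefix was received; anything else lists the prefixes that were never announced to the receiver.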
Actual behavior
Only a subset of the 2500 routes is announced to the frr3 side. All routes are present in the routing table of frr2, but for some of them it shows "Not advertised to any peer", as shown in the example below:
Additional context
No response
Checklist