-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bgpd: fix default instance when leaving the hidden state (backport 10.3) #18161
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Donatas Abraitis <[email protected]> (cherry picked from commit c9a2928)
If we have IPv6-only network and no IPv4 addresses at all, then by default 0.0.0.0 is created which is treated as malformed according to RFC 6286. Signed-off-by: Donatas Abraitis <[email protected]> (cherry picked from commit 739f2b5)
Signed-off-by: Donatas Abraitis <[email protected]> (cherry picked from commit 48560b5)
…/pr-17959 bgpd: Do not start BGP session if BGP identifier is not set (backport FRRouting#17959)
Memory is being leaked when processing the eoiu marker. BGP is creating a dummy dest to contain the data but it was never freed. As well as the eoiu info was not being freed either. Signed-off-by: Donald Sharp <[email protected]> (cherry picked from commit c6b7a99)
…/pr-18000 bgpd: Fix up memory leak in processing eoiu marker (backport FRRouting#18000)
msg_new takes a uint16_t, the length passed down variable is a unsigned int, thus 32 bit. It's possible, but highly unlikely, that the msglen could be greater than 16 bit. Let's just add some checks to ensure that this could not happen. Signed-off-by: Donald Sharp <[email protected]> (cherry picked from commit 283cc51)
Signed-off-by: Donald Sharp <[email protected]> (cherry picked from commit 4b96752)
Grab the count of streams in ibuf when it is protected by a mutex. Since this data is written to it in another pthread. Signed-off-by: Donald Sharp <[email protected]> (cherry picked from commit f94ad53)
The dg_update_list access is controlled by the dg_mutex in all other locations. Let's just add a mutex usage around the initialization of the dg_update_list even if it's part of the startup, just to keep things consistent. Signed-off-by: Donald Sharp <[email protected]> (cherry picked from commit 19af3f3)
Use a memory allocation specific type for filter names (to help detect memory leaks) and fix a memory leak when releasing peer memory. Signed-off-by: Rafael Zalamena <[email protected]> (cherry picked from commit d1440da)
…/pr-18038 pimd: fix memory leak and assign allocation type (backport FRRouting#18038)
…/pr-17865 Coverity 2024 new hotness (backport FRRouting#17865)
Issue: When there is no traffic for a group, the LHR and RP take the default KAT+Join timer expiry of a maximum of 480 seconds to clear the S,G . However, in the FHR, we update the state from JOINED to NOT Joined, downstream state from PPto NOINFO. This restarts the ET timer, causing S,G on FHR to take more than 10 minutes to age out. In other words, Consider a case where (S,G) is in Join state. When the traffic stops and the KAT (210) expires, the Join expiry timer restarts. At this time, if we receive a prune, the expectation is to set PPT to 0 (RFC 4601 sec 4.5.2). When the PPT expires, we move to the noinfo state and restart the expiry timer one more time. We remove the (S,G) entry only after ~10 minutes when there is no active traffic. Summary: KAT Join ET 210 + PP ET 210 + NOINFO ET 210. Solution: Delete the ifchannel when in noinfo state, and KAT is not running. Ticket: FRRouting#13703 Signed-off-by: Rajesh Varatharaj <[email protected]> (cherry picked from commit afed39e)
In case interface address is learnt during configuration, make sure to run DR election when configuring PIM/PIM passive on interface. Signed-off-by: Rafael Zalamena <[email protected]> (cherry picked from commit 8644524)
…/pr-14105 pimd: Fix for FHR mroute taking longer to age out (backport FRRouting#14105)
While the loop is currently exited in all cases after using nexthop, it is a footgun to have "nh" around to be reused in another iteration of the loop. This would leave nexthop with partial data from the previous use. Make it local where needed instead. Signed-off-by: David Lamparter <[email protected]> (cherry picked from commit ce7f5b2)
Zero out the 12 unused bytes (for the IPv6 address) when reading in an IPv4 address. Signed-off-by: David Lamparter <[email protected]> (cherry picked from commit 95cf0b2)
Doesn't seem to break anything but really poor style to pass potentially uninitialized data to hash_lookup. Signed-off-by: David Lamparter <[email protected]> (cherry picked from commit c88589f)
rmap_src wasn't initialized, so for IPv4 the unused 12 bytes would contain whatever junk is on the stack on function entry. Also move the IPv4 parse before the IPv6 parse so if it's successful we can be sure the other bytes haven't been touched. Signed-off-by: David Lamparter <[email protected]> (cherry picked from commit b666ee5)
When reading in a nexthop from ZAPI, only set the fields that actually have meaning. While it shouldn't happen to begin with, we can otherwise carry padding garbage into the unused leftover union bytes. Signed-off-by: David Lamparter <[email protected]> (cherry picked from commit 4a0e141)
We were hashing 4 bytes of the address. Even for IPv6 addresses. Oops. The reason this was done was to try to make it faster, but made a complex maze out of everything. Time for a refactor. Signed-off-by: David Lamparter <[email protected]> (cherry picked from commit 001fcfa)
In bgp route leak, when import vrf x is executed, it creates bgp instance as hidden with asn value as unspecified. When router bgp x is configured ensure the correct as, asnotation is applied otherwise running config shows asn value as 0. This can lead to frr-reload failure when any FRR config change. Fix: Move asn and asnotiation, as_pretty value in common done section, so when bgp_create gets existing instance but before returning update asn and required fields in common section. In bgp_create(): when returning for hidden at least update asn and required when bgp instance created implicitly due to vrf leak. if (hidden) { bgp = bgp_old; goto peer_init; <<< } Before fix: show running: router bgp 0 vrf purple bgp router-id 10.10.3.11 ! address-family ipv4 unicast redistribute static import vrf blue exit-address-family ! address-family ipv6 unicast import vrf blue exit-address-family ! address-family l2vpn evpn advertise ipv4 unicast advertise ipv6 unicast exit-address-family exit Testing: 1) following snippet config: router bgp 63420 vrf blue import vrf purple router bgp 63420 vrf purple import vrf blue 2) restart frr leads to the running config with 0 asn value. Signed-off-by: Chirag Shah <[email protected]> (cherry picked from commit 2ff08af)
Blocking all signals on non-main threads is not the way to go, at least the handlers for SIGSEGV, SIGBUS, SIGILL, SIGABRT and SIGFPE need to run so we get backtraces. Otherwise the process just exits. Signed-off-by: David Lamparter <[email protected]> (cherry picked from commit 13a6ac5)
…/pr-18081 bgpd: fix bgp vrf instance creation from implicit (backport FRRouting#18081)
Sometimes, NHRP receives L2 information on a cache entry with the 0.0.0.0 IP address. NHRP considers it as valid and updates the binding with the new IP address. > Feb 09 20:09:54 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x2 cache used 0 type 4 > Feb 09 20:10:35 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4 > Feb 09 20:10:48 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: del-neigh 10.2.114.238 dev dmvpn1 lladdr 162.251.180.10 nud 0x4 cache used 1 type 4 > Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: who-has 10.2.114.238 dev dmvpn1 lladdr (unspec) nud 0x1 cache used 1 type 4 > Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QVXNM-NVHEQ] Netlink: update binding for 10.2.114.238 dev dmvpn1 from c 162.251.180.10 peer.vc.nbma 162.251.180.10 to lladdr (unspec) > Feb 09 20:10:49 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x2 cache used 1 type 4 > Feb 09 20:11:30 aws-sin-vpn01 nhrpd[2695]: [QQ0NK-1H449] Netlink: new-neigh 10.2.114.238 dev dmvpn1 lladdr 0.0.0.0 nud 0x4 cache used 1 type 4 Actually, the 0.0.0.0 IP addressed mentiones in the 'who-has' message is wrong because the nud state value means that value is incomplete and should not be handled as a valid entry. Instead of considering it, fix this by by invalidating the current binding. This step is necessary in order to permit NHRP to trigger resolution requests again. Signed-off-by: Philippe Guibert <[email protected]> (cherry picked from commit 3202323)
When SRv6 is enabled and an SRv6 locator is specified in the BGP configuration, BGP may attempt to request SRv6 locator information from zebra before the connection is fully established. If this occurs, the request fails with the following error: ``` 2025/02/06 16:37:32 BGP: [HR66R-TWQYD][EC 100663302] srv6_manager_get_locator: invalid zclient socket ```` As a result, BGP is unable to obtain the locator information, preventing SRv6 VPN from working. This commit fixes the issue by ensuring BGP requests SRv6 locator information once the connection with zebra is successfully established. Signed-off-by: Carmine Scarpitta <[email protected]> (cherry picked from commit 16640b6)
…/pr-18069 bgpd: Request SRv6 locator after zebra connection (backport FRRouting#18069)
…/pr-18078 nhrpd: fix dont consider incomplete L2 entry (backport FRRouting#18078)
…/pr-18060 lib: crash handlers must be allowed on threads (backport FRRouting#18060)
The `show ipv6 route json` command displays the IPv6 routing table in JSON format, including SRv6 SIDs. For each SRv6 SID, it provides behavior and SID attributes. However, it does not include the SID structure. This commit adds the SID structure to the SRv6 SID JSON output. Signed-off-by: Carmine Scarpitta <[email protected]> (cherry picked from commit 312f7b3)
The `static_srv6_sids` topotest verifies that staticd correctly programs the SIDs in the zebra RIB. Currently, the topotest only validates the programmed behavior and SID attributes. This commit extends the topotest to also validate the SID structure. Signed-off-by: Carmine Scarpitta <[email protected]> (cherry picked from commit a6d02fe)
When a BGP instance with a manually assigned VPN label is deleted, the label is not released from the Zebra label registry. As a result, reapplying a configuration with the same manual label leads to VPN prefix export failures. For example, with the following configuration: > router bgp 65000 vrf BLUE > address-family ipv4 unicast > label vpn export <int> Release zebra label registry on unconfiguration. Fixes: d162d5f ("bgpd: fix hardset l3vpn label available in mpls pool") Signed-off-by: Louis Scalbert <[email protected]> (cherry picked from commit d636362)
…/pr-18121 bgpd: release manual vpn label on instance deletion (backport FRRouting#18121)
…/pr-18064 staticd: Fix SRv6 SID installation and deletion (backport FRRouting#18064)
…/pr-18023 lib: fix false context information for SRv6 route (backport FRRouting#18023)
When unconfiguring a default BGP instance with VPN SAFI configurations, the default BGP structure remains but enters a hidden state. Upon reconfiguration, the instance name incorrectly appears as "VIEW ?" instead of "VRF default". And the name_pretty pointer The name_pretty pointer is replaced by another one with the incorrect name. This also leads to a memory leak as the previous pointer is not properly freed. Do not rewrite the instance name. Fixes: 4d0e7a4 ("bgpd: VRF-Lite fix default bgp delete") Signed-off-by: Louis Scalbert <[email protected]>
bgp_process_queue_init() is not called in bgp_create() when leaving the BGP instance hidden state because of the following goto: > if (hidden) { > bgp = bgp_old; > goto peer_init; > } Upon reconfiguration of the default instance, the prefixes are never set into a meta queue by mq_add_handler(). They are never processed for zebra RIB installation and announcements of update/withdraw. Do not delete the BGP process_queue when hiding. Fixes: 4d0e7a4 ("bgpd: VRF-Lite fix default bgp delete") Signed-off-by: Louis Scalbert <[email protected]>
Upon reconfiguration of the default instance, free the previous pointer. > ================================================================= > ==1209420==ERROR: LeakSanitizer: detected memory leaks > > Direct leak of 4 byte(s) in 1 object(s) allocated from: > #0 0x7fbde0eaa037 in __interceptor_calloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:154 > #1 0x7fbde0874634 in qcalloc lib/memory.c:106 > #2 0x55dcca019263 in bgp_rtc_plist_entry_asn_new bgpd/bgp_rtc.c:474 > #3 0x55dcca0199f6 in bgp_rtc_plist_entry_add bgpd/bgp_rtc.c:556 > FRRouting#4 0x55dcca01b078 in bgp_rtc_plist_entry_set bgpd/bgp_rtc.c:700 > FRRouting#5 0x55dcca016421 in bgp_nlri_parse_rtc bgpd/bgp_rtc.c:56 > FRRouting#6 0x55dcc9f39f61 in bgp_nlri_parse bgpd/bgp_packet.c:352 > FRRouting#7 0x55dcc9f47628 in bgp_update_receive bgpd/bgp_packet.c:2485 > FRRouting#8 0x55dcc9f54867 in bgp_process_packet bgpd/bgp_packet.c:4114 > FRRouting#9 0x7fbde097aebc in event_call lib/event.c:1984 > FRRouting#10 0x7fbde084710f in frr_run lib/libfrr.c:1246 > FRRouting#11 0x55dcc9dd818b in main bgpd/bgp_main.c:557 > FRRouting#12 0x7fbde044fd79 in __libc_start_main ../csu/libc-start.c:308 Fixes: 4d0e7a4 ("bgpd: VRF-Lite fix default bgp delete") Signed-off-by: Louis Scalbert <[email protected]>
Test that leaving the hidden BGP instance state is working. Signed-off-by: Louis Scalbert <[email protected]>
Creates the default VRF instance after the other VRF instances. The default VRF instance is created in hidden state. Check that AS number in show run is correctly written. Signed-off-by: Louis Scalbert <[email protected]>
This pull request has conflicts, please resolve those before we can evaluate the pull request. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
backport of #18119