Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix issue: vrrp master switchover lead to disconnection of tcp connections #2255

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

maryoho
Copy link

@maryoho maryoho commented Mar 14, 2023

solve #2254
add new config param exec_before_vip to vrrp_instance
if this is set, notify_master script will be executed before set vip when master changed

@maryoho maryoho changed the title add configuration exec_before_vip to execute notify_master script before fix issue: vrrp master switchover lead to disconnection of tcp connections Mar 14, 2023
@pqarmitage
Copy link
Collaborator

Unfortunately we cannot merge this commit for a couple of reasons.

  1. The formatting of the code is not consistent with the style used for keepalived; not only does the added code not comply, but the formatting of some of the existing code has been modified so that it no longer complies. They key issue is that keepalived code uses tabs for indentation, with the tab width being 8 spaces. The code in the patch appears to be using 4 spaces per indent. This is cosmetic and straight forward to resolve.

In addition, system_call_script_0() should be before system_call_script() in the source file, since system_call_script() calls system_call_script_0(). I also think system_call_script_0() should be called something like system_call_script_wait() to be more descriptive.

  1. This point is fundamental. You cannot use wait() or any of its associated functions (e.g. waitpid()) if that might cause keepalived to be blocked. The reason for this is that keepalived might have several VRRP instances configured, and it needs to be able to continue operating all of them, even while scripts are being executed.

Also consider a VRRP instance with three servers, priorities say 100, 102, 104. If the system with priority 104 fails, the system with priority 102 should take over as master, but if there is a delay while a script is executed, then the system with priority 100 may take over as master initially, and then when the system with priority 102 completes executing the script, it would then become master.

system_call_script() already has the mechanism for handling actions to be taken after a script completes. If parameter func is not NULL, then func() is scheduled to be run once the script has completed executing. Suppose func is vrrp_master_after_notifies(), then it could be something like:

vrrp_master_after_notifies(thread_ref_t thread)
{
        vrrp_t vrrp = THREAD_ARG(thread);

        vrrp_state_become_master(vrrp);
        /*
          * If we catch the master transition
           * register a gratuitous arp thread delayed to garp_delay secs.
           */
        if (vrrp->garp_delay)
                 thread_add_timer(master, vrrp_gratuitous_arp_thread,
                                         vrrp, vrrp->garp_delay);
}

so that it has the actions in vrrp_master_state_tx() that would normally be executed in that function if exec_before_vip is not set. Obviously, vrrp_master_state_tx() would need to return after calling send_instance_notifies() if exec_before_vip is set.

BTW I prefer something like notify_script_before_master to your suggested exec_before_vip configuration option.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants