-
Notifications
You must be signed in to change notification settings - Fork 478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PS-9453: percona_telemetry causes a long wait on COND_thd_list due to the absence of the root user #5452
Conversation
78a0b16
to
cc6a23b
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
file permissions and typos need to be fixed, everything else - just suggestions for improvements.
cleaned up immediately by the caller. The caller does not take care of | ||
session_svc, because it is not aware of this structure. | ||
*/ | ||
std::shared_ptr<MYSQL_SESSION> mysql_session_close_guard( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was struggling with the same double indirection problem some time ago.
MYSQL_SESSION
is by itself a pointer to an opaque structure. So, I did this trick with
std::remove_pointer_t<>
;
https://github.com/percona/percona-server/blob/8.0/components/masking_functions/src/masking_functions/sql_context.cpp#L105
In my opinion this make the code more readable.
The only problem I see if you go my way, you won't be able to do mysql_session = nullptr
. You will need
to call something like mysql_session_close_guard.release()
but for this, in turns you will need std::unique_ptr<>
with more boilerplate instead of std::shared_ptr<>
.
In any case this is just a butification comment.
&mysql_h, [&srv = command_factory_service_](MYSQL_H *ptr) { | ||
srv.close(*ptr); | ||
}); | ||
|
||
mysql_service_status_t sstatus = command_factory_service_.init(&mysql_h); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another improvement I discovered recently, which may help simplify things a bit.
Oracle recently introduced a new MYSQL_COMMAND_LOCAL_THD_HANDLE
option for mysql_command_service
.
Basically, then you set this option with nullptr
value, it will create an internal THD object, initialize its session vars, calls my_thread_init() / my_thread_end(), sets current_thd
, etc. This THD object will be destroyed when you call mysql_command_service::close()
.
percona-ysorokin@c9a98d3#diff-3706ffbbf71661ebd0414b16d27fd6a1cc8f6960488d43d5ad51d9b15fe9ac00R51
… the absence of the root user https://perconadev.atlassian.net/browse/PS-9453 Problem: If there is no 'root' user (it was renamed) and Percona Telemetry is enabled, server shutdown stuck. Additionally SHOW PROCESSLIST reports increasing number of processes with root user in state 'login'. Cause: Percona Telemetry component used hard codded 'root' user used for querying the server for metrics. If the user is not present, mysql_command_factory service connect() method fails, however it leaves opened/orphaned internal MYSQL_SESSION. It is because after opening MYSQL_SESSION we check if the user exists. It doesn't exist, so the method returns with STATE_MACHINE_FAILED error. Caller does not know anything about underlying session, does cleanup, frees memory. Moreover, the same problem potentially exists even if the user exists, but cssm_begin_connect() exits with error for any reason. Creation of session registers THD object in Global_THD_manager. That's why increasing number of processes is visible in SHOW PROCESSLIST. Why it hangs during shutdown: During shutdown, the server waits for signal handler thread (signal_hand()) to join the main thread. Then the component infrastructure is deinitialized. At the same time signal_hand() waits for all threads to finish Global_THD_manager::wait_till_no_thd(). The above described bug related to orphaned mysql sessions cause that there are orphaned THDs that never end. This causes server to wait infinitively. Solution: 1. Handle the error in cssm_begin_connect() and close opened MYSQL_SESSION. This part fixes hangs caused by nonexistent user. 2. Percona Telemetry Component uses mysql.session user. When Percona Telemetry Component is enabled, the user is granted a few more permissions during the server startup. Note that even without this permissions, the component is able to work, but not able to report some metrics. Additionally fixed potential memory leak if connection setup in Percona Telemetry Component fails.
cc6a23b
to
ae464b1
Compare
… the absence of the root user https://perconadev.atlassian.net/browse/PS-9453 Adjusted MTR tests affected by the fix.
ae464b1
to
22edfb4
Compare
https://perconadev.atlassian.net/browse/PS-9453
Problem:
If there is no 'root' user (it was renamed) and Percona Telemetry is
enabled, server shutdown stuck.
Additionally SHOW PROCESSLIST reports increasing number of processes
with root user in state 'login'.
Cause:
Percona Telemetry component used hard codded 'root' user used for
querying the server for metrics. If the user is not present,
mysql_command_factory service connect() method fails, however it leaves
opened/orphaned internal MYSQL_SESSION. It is because after opening
MYSQL_SESSION we check if the user exists. It doesn't exist, so the
method returns with STATE_MACHINE_FAILED error.
Caller does not know anything about underlying session, does cleanup,
frees memory. Moreover, the same problem potentially exists even if
the user exists, but cssm_begin_connect() exits with error for any
reason.
Creation of session registers THD object in Global_THD_manager. That's
why increasing number of processes is visible in SHOW PROCESSLIST.
Why it hangs during shutdown:
During shutdown, the server waits for signal handler thread
(signal_hand()) to join the main thread. Then the component
infrastructure is deinitialized. At the same time signal_hand() waits
for all threads to finish Global_THD_manager::wait_till_no_thd().
The above described bug related to orphaned mysql sessions cause that
there are orphaned THDs that never end. This causes server to wait
infinitively.
Solution:
MYSQL_SESSION. This part fixes hangs caused by nonexistent user.
Telemetry Component is enabled, the user is granted a few more
permissions during the server startup. Note that even without this
permissions, the component is able to work, but not able to report some
metrics.
Additionally fixed potential memory leak if connection setup in Percona
Telemetry Component fails.