RUST-1439 Cursors not killed on drop when runtime is in the process of shutting down #719

Open
tgnottingham opened this issue Aug 5, 2022 · 3 comments
Labels
tracked-in-jira Ticket filed in Mongo's Jira system

Comments

@tgnottingham

Versions/Environment

  1. What version of Rust are you using?
    1.64.0-nightly (c0bbd42ce 2022-07-03)
  2. What operating system are you using?
    Debian Bullseye in Docker container
  3. What versions of the driver and its dependencies are you using? (Run cargo pkgid mongodb & cargo pkgid bson)
    2.3.0 and 2.3.0
  4. What version of MongoDB are you using? (Check with the MongoDB shell using db.version())
    4.4.15
  5. What is your MongoDB topology (standalone, replica set, sharded cluster, serverless)?
    Sharded cluster

Describe the bug

Cursors are not killed when they are dropped if the tokio runtime is in the process of shutting down (for example, while tokio::runtime::Runtime::shutdown_timeout or the tokio::runtime::Runtime drop implementation is executing). Shutting down the runtime causes all unfinished futures to be dropped, which in turn causes any cursors living in those tasks to be dropped. However, the cursors are not killed, presumably because the code that kills cursors attempts to spawn a new task to do so, and tokio is unwilling to comply because it's trying to shut down.
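A minimal standalone sketch of this failure mode (a guess at the mechanism, not the driver's actual code): a value stores the runtime handle when it is created and spawns its cleanup from its Drop impl, the way a cursor spawns killCursors; if the drop happens during shutdown, that cleanup task typically never runs.

use std::time::Duration;

// Hypothetical stand-in for a cursor: it stores the runtime handle at
// creation and spawns cleanup work from its Drop impl.
struct KillOnDrop {
    handle: tokio::runtime::Handle,
}

impl Drop for KillOnDrop {
    fn drop(&mut self) {
        // If the runtime is shutting down, this task is accepted but
        // typically never polled, so the cleanup never happens.
        self.handle.spawn(async {
            println!("cleanup ran");
        });
    }
}

fn main() {
    let rt = tokio::runtime::Runtime::new().unwrap();
    rt.spawn(async {
        let _guard = KillOnDrop {
            handle: tokio::runtime::Handle::current(),
        };
        // Park forever; the guard is only dropped when shutdown cancels this task.
        std::future::pending::<()>().await;
    });
    // Give the task a chance to start.
    std::thread::sleep(Duration::from_millis(100));
    // Dropping the pending task drops the guard; its cleanup task never gets
    // to run, so "cleanup ran" is usually not printed.
    rt.shutdown_timeout(Duration::from_secs(1));
}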

This is a particular problem when you have a large number of cursors open, as it causes them to persist and take up MongoDB cluster resources. If your process is restarted frequently, a very large number of open cursors can accumulate in a short period of time.

I should note that I've only tested this issue with cursors created by change streams.

(I'm also seeing these cursors persist for many hours, instead of being killed on the MongoDB side after 10 minutes of inactivity, but this may not be an issue specific to this driver.)

You can work around this problem by ensuring your cursors are dropped before shutting down the runtime. However, many users are unlikely to realize this needs to be done, so the issue would remain a continual hazard if not resolved. Also, ensuring the cursors are dropped before shutting down the runtime may be a significant challenge in some applications.
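One possible shape of that workaround is sketched below; the connection string, database, and collection names are placeholders, and whether the driver's spawned killCursors actually finishes before shutdown is still timing-dependent.

use std::time::Duration;
use mongodb::{bson::Document, Client};

#[tokio::main]
async fn main() -> mongodb::error::Result<()> {
    let client = Client::with_uri_str("mongodb://localhost:27017").await?;
    let coll = client.database("db").collection::<Document>("coll");
    {
        // Keep the change stream inside a scope that ends while the
        // runtime is still fully running...
        let _stream = coll.watch(None, None).await?;
        // ... consume the stream ...
    } // ...so the driver can still spawn its killCursors task here.

    // Best-effort: give that spawned cleanup a moment to run before main
    // returns and #[tokio::main] shuts the runtime down.
    tokio::time::sleep(Duration::from_millis(100)).await;
    Ok(())
}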

I'm unsure if a solution is possible, but it sure would be nice if it were. Thanks. 🙂

@kmahar
Contributor

kmahar commented Aug 10, 2022

Hi @tgnottingham, thank you for reaching out! We agree this is a problematic situation and we're currently discussing as a team what options we might have to alleviate this; I'll post an update here on that soon.

In the meantime I wanted to ask for some clarification on this point:

I'm also seeing these cursors persist for many hours, instead of being killed on the MongoDB side after 10 minutes of inactivity, but this may not be an issue specific to this driver.

Just to clarify, you're saying you see the following:

  1. Application creates a change stream
  2. Application is shut down
  3. Several hours later, the change stream from step 1 is still alive on the server? How are you verifying that it is still alive?

Regarding timeouts, in earlier versions of MongoDB there was a default cursor timeout of 10 minutes, though as of MongoDB 4.4.8 (SERVER-6036), the lifetime of a cursor is determined by the session used to create it. Any command you run with the Rust driver will use a session - either an "explicit" session, if you are creating and passing one in yourself when you call watch(), or an "implicit" session - if you do not provide one yourself the driver will handle obtaining one to use.
The server should automatically clean up any session that has not been used for a command after logicalSessionTimeoutMinutes. This value defaults to 30 minutes, making the new default cursor timeout 30 minutes as well. So I would expect that any change stream created by a process that is shut down and does not get a chance for killCursors to execute would persist for at most that long.
(It is possible to configure an alternate value for logicalSessionTimeoutMinutes however so that could be one possible explanation for what you're seeing here; you can check the value for your cluster by doing db.runCommand({hello: 1}) in the MongoDB shell and inspecting the value in the server reply.)
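For reference, the same check through the driver might look roughly like this (placeholder connection string):

use mongodb::{bson::doc, Client};

#[tokio::main]
async fn main() -> mongodb::error::Result<()> {
    let client = Client::with_uri_str("mongodb://localhost:27017").await?;
    let reply = client
        .database("admin")
        .run_command(doc! { "hello": 1 }, None)
        .await?;
    // Defaults to 30 (minutes) unless the cluster is configured otherwise.
    println!(
        "logicalSessionTimeoutMinutes = {:?}",
        reply.get("logicalSessionTimeoutMinutes")
    );
    Ok(())
}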

If you are seeing cursors from a shut-down application remain alive for longer than logicalSessionTimeoutMinutes, it sounds like a bug on the server side.

The driver does pool and re-use sessions while an application is running; however, when the application restarts, that pool is recreated from scratch, so sessions are not reused across application runs. Because of that, I do not believe a cursor could be incorrectly kept alive that way even if the killCursors failed to execute.

I'll also note that I discovered there is a spec requirement the Rust driver neglected to implement: on client shutdown, the endSessions command should be run to end all sessions the driver created. This would also kill all cursors associated with all sessions used by the application. However, ensuring the endSessions command executes and finishes before the async runtime shuts down seems to suffer from the same difficulty you describe above with getting killCursors to execute before the runtime shuts down, so we are considering the best way to address that as well.

@kmahar kmahar removed the triage label Aug 10, 2022
@kmahar
Contributor

kmahar commented Aug 10, 2022

Hi! Quick update on the cleanup issue.

We think we can improve this situation with a fix I describe in RUST-1439. To summarize, we will spawn a task for the purpose of cleanup at the time a cursor/change stream is created, and have that task wait on a message from a one-shot channel to proceed with cleanup. The drop implementation can then send a message on the channel. Since the task will already exist when the tokio runtime handle is dropped or when a call to shutdown_timeout begins, we won't run into the issue of not being able to spawn new tasks.
For anyone who just drops their handle to the runtime, the runtime will wait for the existing tasks to finish before dropping completes and the program shuts down. There's no way to fully "fix" this for anyone using shutdown_timeout, because we can't force the cleanup work to finish within some arbitrary timeout window, but this will at least prevent the issue of not even being able to start the cleanup work.
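To illustrate, a rough sketch of that pattern (made-up names, not the actual driver change):

use tokio::sync::oneshot;

struct CursorCleanup {
    // Used exactly once to signal the pre-spawned cleanup task.
    kill_signal: Option<oneshot::Sender<()>>,
}

impl CursorCleanup {
    fn new() -> Self {
        let (tx, rx) = oneshot::channel::<()>();
        // Spawned at cursor creation time, so it already exists when the
        // runtime later begins shutting down.
        tokio::spawn(async move {
            let _ = rx.await;
            // ... run killCursors here ...
        });
        CursorCleanup { kill_signal: Some(tx) }
    }
}

impl Drop for CursorCleanup {
    fn drop(&mut self) {
        if let Some(tx) = self.kill_signal.take() {
            let _ = tx.send(());
        }
    }
}

#[tokio::main]
async fn main() {
    let cursor = CursorCleanup::new();
    drop(cursor);
    // Yield so the already-running cleanup task gets a chance to proceed.
    tokio::task::yield_now().await;
}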

@bajanam bajanam changed the title Cursors not killed on drop when runtime is in the process of shutting down RUST-1439 Cursors not killed on drop when runtime is in the process of shutting down Aug 30, 2022
@bajanam bajanam added the tracked-in-jira Ticket filed in Mongo's Jira system label Aug 30, 2022
@tgnottingham
Author

Hi, thanks for the response!

Just to clarify, you're saying you see the following:

1. Application creates a change stream

2. Application is shut down

3. Several hours later, the change stream from step 1 is still alive on the server? How are you verifying that it is still alive?

I wouldn't say I've verified it exactly. 🙂 I just inferred it from Grafana and Prometheus metrics -- the query sum(mongodb_ss_metrics_cursor_open { csr_type = "total" }) in particular. I'm not too familiar with these metrics, so maybe that query is wrong, or maybe there's an issue with the metrics themselves, or maybe they don't get updated when there's no activity for hours and I was just seeing old numbers. I haven't dug any deeper, sadly.

We think we can improve this situation with a fix I describe in RUST-1439. To summarize, we will spawn a task for the purpose of cleanup at the time a cursor/change stream is created, and have that task wait on a message from a one-shot channel to proceed with cleanup.

That might work, but I do wonder if it's possible for the sending or receiving tasks to be terminated before they do what they need to do. This might not be what you had in mind, but I found that the following doesn't always print, for example:

#[tokio::main]
async fn main() {
    let _s = SendOnDrop::new();
}

struct SendOnDrop {
    tx: Option<tokio::sync::oneshot::Sender<i32>>,
}

impl SendOnDrop {
    fn new() -> SendOnDrop {
        let (tx, rx) = tokio::sync::oneshot::channel();

        // Receiver task: prints once the drop signal arrives.
        tokio::spawn(async move {
            println!("{}", rx.await.unwrap());
        });

        SendOnDrop {
            tx: Some(tx),
        }
    }
}

impl Drop for SendOnDrop {
    fn drop(&mut self) {
        // The send succeeds here, but main returns right afterwards, so the
        // runtime may shut down and drop the receiver task before it prints.
        self.tx.take().unwrap().send(1).unwrap();
    }
}

I also tried a variation where SendOnDrop is created inside its own spawned task, and the send(1) sometimes panics because the receiver has already hung up.
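Roughly, that variation looks like this (reusing the SendOnDrop type from the snippet above; the exact behavior varies from run to run):

#[tokio::main]
async fn main() {
    // Variation: SendOnDrop lives in a spawned task rather than in main.
    tokio::spawn(async {
        let _s = SendOnDrop::new();
        std::future::pending::<()>().await
    });
    // Let both tasks start before main returns.
    tokio::time::sleep(std::time::Duration::from_millis(10)).await;
    // When the runtime shuts down it drops both tasks in some order; if the
    // receiver task goes first, send(1) in Drop returns Err and unwrap() panics.
}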

Anyway, something to watch out for. 🙂

@kmahar kmahar removed their assignment Jan 27, 2023