Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

isValid hangs regardless timeout #1128

Open
PiotrKlimczak opened this issue Aug 9, 2019 · 9 comments
Open

isValid hangs regardless timeout #1128

PiotrKlimczak opened this issue Aug 9, 2019 · 9 comments
Labels
Bug A bug in the driver. A high priority item that one can expect to be addressed quickly.

Comments

@PiotrKlimczak
Copy link

PiotrKlimczak commented Aug 9, 2019

Driver version

Driver 7.2.2-jre8

SQL Server version

SQL 2016 Enterprise with multi-subnet HA

Client Operating System

Centos 7.6

JAVA/JVM version

Oracle JDK 1.8.192

Table schema

N/A

Problem description

  1. Expected behaviour:
    Invoking isValid operation on connection with the timeout parameter should not take longer than the timeout configured

  2. Actual behaviour:
    If DB failover happens, sometimes we are getting TCP socket read hanging forever in JDBC Driver (see stacktrace below).
    This causes our connection pool to completely hang, as write lock acquired on connection pool object is never released during testing if connections in pool are healthy.
    Currently as per implementation isValid timeout is set as query timeout, but if I understand the code correctly, this is server-side timeout, while our problem happens on the client side.
    Therefore the current implementation of isValid timeout is not handling correctly client-side problems.

  3. Error message/stack trace:
    N/A as everything hangs

  4. Any other details that can be helpful:
    I believe this is similar problem to that one: Database connection hangs on failover from one node to another #764
    Except we are getting it almost every time.

I think the solution might be to temporarily change TCP Socket timeout from the one set by connection parameter, to the one set in isValid parameter to make sure TCP timeout equals to isValid timeout, then after successful validation change TCP timeout to original value.

JDBC trace logs

Thread dump:

java.lang.Thread.State: RUNNABLE
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.socketRead(SocketInputStream.java:116)
at java.net.SocketInputStream.read(SocketInputStream.java:171)
at java.net.SocketInputStream.read(SocketInputStream.java:141)
at com.microsoft.sqlserver.jdbc.TDSChannel.read(IOBuffer.java:2023)
at com.microsoft.sqlserver.jdbc.TDSReader.readPacket(IOBuffer.java:6418)
- locked <0x00000007266a5fd0> (a com.microsoft.sqlserver.jdbc.TDSReader)
at com.microsoft.sqlserver.jdbc.TDSCommand.startResponse(IOBuffer.java:7579)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.doExecuteStatement(SQLServerStatement.java:866)
at com.microsoft.sqlserver.jdbc.SQLServerStatement$StmtExecCmd.doExecute(SQLServerStatement.java:768)
at com.microsoft.sqlserver.jdbc.TDSCommand.execute(IOBuffer.java:7194)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.executeCommand(SQLServerConnection.java:2935)
- locked <0x00000007266a62b0> (a java.lang.Object)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeCommand(SQLServerStatement.java:248)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeStatement(SQLServerStatement.java:223)
at com.microsoft.sqlserver.jdbc.SQLServerStatement.executeQueryInternal(SQLServerStatement.java:700)
at com.microsoft.sqlserver.jdbc.SQLServerConnection.isValid(SQLServerConnection.java:5591)
at com.microsoft.sqlserver.jdbc.SQLServerConnectionPoolProxy.isValid(SQLServerConnectionPoolProxy.java:480)
at org.apache.aries.transaction.jdbc.internal.ValidatingDelegatingManagedConnectionFactory.isValidConnection(ValidatingDelegatingManagedConnectionFactory.java:59)
at org.apache.aries.transaction.jdbc.internal.ValidatingDelegatingManagedConnectionFactory.getInvalidConnections(ValidatingDelegatingManagedConnectionFactory.java:84)
at org.apache.aries.transaction.jdbc.internal.ValidatingGenericConnectionManager$ValidatingTask.run(ValidatingGenericConnectionManager.java:155)
at java.util.TimerThread.mainLoop(Timer.java:555)
at java.util.TimerThread.run(Timer.java:505)

Reproduction code

  1. Establish approx 100 connections to DB
  2. Failover multisubnet HA installation.
  3. During failover, keep invoking isValid on all connections
@PiotrKlimczak PiotrKlimczak added the Bug A bug in the driver. A high priority item that one can expect to be addressed quickly. label Aug 9, 2019
@PiotrKlimczak PiotrKlimczak changed the title isValid hangs regardless isValid hangs regardless timeout Aug 9, 2019
@ulvii
Copy link
Contributor

ulvii commented Aug 9, 2019

Hi @PiotrKlimczak ,

Thanks for creating the issue, we will investigate and get back to you soon.

@PiotrKlimczak
Copy link
Author

Just to update our findings.
We have implemented the fix ourselves in a copy of your 7.2.2 tag and it works correctly now.

Our implementation of is valid:

public boolean isValid(int timeout) throws SQLException {
    boolean isValid = false;

    Boolean disableIsValidTimeoutFix = Boolean.valueOf(System.getProperty("mssql.jdbc.isValid.timeout.fix.disable", "false"));
    int defaultTimeout = Integer.valueOf(System.getProperty("mssql.jdbc.isValid.default.timeout", "500")); //Why would it take any longer?

    int originalNetworkTimeout = 0;

    if (!disableIsValidTimeoutFix) {
        originalNetworkTimeout = getNetworkTimeout();
    }

    loggerExternal.entering(getClassNameLogging(), "isValid", timeout);

    // Throw an exception if the timeout is invalid
    if (timeout < 0) {
        MessageFormat form = new MessageFormat(SQLServerException.getErrString("R_invalidQueryTimeOutValue"));
        Object[] msgArgs = {timeout};
        SQLServerException.makeFromDriverError(this, this, form.format(msgArgs), null, true);
    } else if (timeout == 0) {
        timeout = defaultTimeout;
    }

    // Return false if the connection is closed
    if (isSessionUnAvailable())
        return false;

    try {
        SQLServerStatement stmt = new SQLServerStatement(this, ResultSet.TYPE_FORWARD_ONLY,
                ResultSet.CONCUR_READ_ONLY, SQLServerStatementColumnEncryptionSetting.UseConnectionSetting);

        // If asked, limit the time to wait for the query to complete.
        if (0 != timeout)
            stmt.setQueryTimeout(timeout);

        if (!disableIsValidTimeoutFix) {
            setNetworkTimeout(null, timeout);
        }

        // Try to execute the query. If this succeeds, then the connection is valid.
        // If it fails (throws an exception), then the connection is not valid.
        // If a timeout was provided, execution throws an "query timed out" exception
        // if the query fails to execute in that time.
        stmt.executeQueryInternal("SELECT 1");
        stmt.close();
        isValid = true;
    } catch (SQLException e) {
        // Do not propagate SQLExceptions from query execution or statement closure.
        // The connection is considered to be invalid if the statement fails to close,
        // even though query execution succeeded.
        connectionlogger.fine(toString() + " Exception checking connection validity: " + e.getMessage());
    }

    try {
        if (!disableIsValidTimeoutFix) {
            setNetworkTimeout(null, originalNetworkTimeout);
        }
    } catch (Exception e) {
        // no-op
    }

    loggerExternal.exiting(getClassNameLogging(), "isValid", isValid);
    return isValid;
}

@ulvii
Copy link
Contributor

ulvii commented Aug 12, 2019

Hi @PiotrKlimczak ,
Thanks for the update.
Would you also try your application with 7.0.0 version of the driver and let me know if the issue still occurs? We made changes to query timeouts in 7.2.2, I am wondering if is related.

@ulvii
Copy link
Contributor

ulvii commented Aug 26, 2019

Hi @PiotrKlimczak ,
I was checking if you had a chance to try 7.0 version of the driver?

@bdw429s
Copy link

bdw429s commented Oct 28, 2020

@ulvii @PiotrKlimczak Was this ever fixed? I am experiencing the exact same issue above on version 8.4.1 of the MSSQL JDBC driver when connecting to a Azure managed DB. Intermittent network dropouts will cause the isValid checks in the connection pool manager to hang and it brings down the entire server due to open locks in the connection pool manager.

@ulvii
Copy link
Contributor

ulvii commented Oct 30, 2020

Hi @bdw429s ,

Thanks for reporting the details and confirming that the issue happens with newer versions of the driver. Is using socketTimeout connection property an option for you? This can prevent the "hang" and throw an exception immediately if the connection is invalid.

@bdw429s
Copy link

bdw429s commented Oct 30, 2020

Hi @ulvii I can't directly control the socket timeout in use, and I'm currently in discussion with the developers of the app server I use to confirm what the socket time out is that's being used.

https://luceeserver.atlassian.net/browse/LDEV-3124

@ionutioio
Copy link

Hello, experiencing the same issue with driver version 9.4.0. Are there any plans for a fix?

@VeryVerySpicy
Copy link
Contributor

hi @ionutioio, sorry but there is no timeline for this fix at the moment.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug A bug in the driver. A high priority item that one can expect to be addressed quickly.
Projects
Status: Backlog
Development

No branches or pull requests

5 participants