Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix exponential memory allocation in Exec and improve performance #1296

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

charlievieth
Copy link
Contributor

@charlievieth charlievieth commented Nov 7, 2024

This commit changes SQLiteConn.Exec to use the raw Go query string instead of repeatedly converting it to a C string (which it would do for every statement in the provided query). This yields a ~20% performance improvement for a query containing one statement and a significantly larger improvement when the query contains multiple statements as is common when importing a SQL dump (our benchmark shows a 5x improvement for handling 1k SQL statements).

Additionally, this commit improves the performance of Exec by 2x or more and makes number and size of allocations constant when there are no bind parameters (the performance improvement scales with the number of SQL statements in the query). This is achieved by having the entire query processed in C code thus requiring only one CGO call.

The speedup for Exec'ing single statement queries means that wrapping simple statements in a transaction is now twice as fast.

This commit also improves the test coverage of Exec, which previously failed to test that Exec could process multiple statements like INSERT. It also adds some Exec specific benchmarks that highlight both the improvements here and the overhead of using a cancellable Context.

This commit is a slimmed down and improved version of PR #1133:

goos: darwin
goarch: arm64
pkg: github.com/mattn/go-sqlite3
cpu: Apple M1 Max
                                       │    b.txt     │                n.txt                │
                                       │    sec/op    │   sec/op     vs base                │
Suite/BenchmarkExec/Params-10             1.434µ ± 1%   1.186µ ± 0%  -17.27% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10          1267.5n ± 0%   759.2n ± 1%  -40.10% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10      2.886µ ± 0%   2.517µ ± 0%  -12.80% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10    2.605µ ± 1%   1.829µ ± 1%  -29.81% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               1852.6µ ± 1%   582.3µ ± 0%  -68.57% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10        3053.3µ ± 3%   582.0µ ± 0%  -80.94% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                  4.126µ ± 2%   2.200µ ± 1%  -46.67% (p=0.000 n=10)
geomean                                   16.40µ        8.455µ       -48.44%

                                       │      b.txt      │                n.txt                │
                                       │      B/op       │    B/op     vs base                 │
Suite/BenchmarkExec/Params-10                 248.0 ± 0%   240.0 ± 0%    -3.23% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10              128.00 ± 0%   64.00 ± 0%   -50.00% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10          408.0 ± 0%   400.0 ± 0%    -1.96% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10        288.0 ± 0%   208.0 ± 0%   -27.78% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               5406674.50 ± 0%   64.00 ± 0%  -100.00% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10         5566758.5 ± 0%   208.0 ± 0%  -100.00% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                      712.0 ± 0%   520.0 ± 0%   -26.97% (p=0.000 n=10)
geomean                                     4.899Ki        189.7        -96.22%

                                       │     b.txt     │               n.txt                │
                                       │   allocs/op   │ allocs/op   vs base                │
Suite/BenchmarkExec/Params-10              10.000 ± 0%   9.000 ± 0%  -10.00% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10             7.000 ± 0%   4.000 ± 0%  -42.86% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10        12.00 ± 0%   11.00 ± 0%   -8.33% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10      9.000 ± 0%   6.000 ± 0%  -33.33% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               7000.000 ± 0%   4.000 ± 0%  -99.94% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10        9001.000 ± 0%   6.000 ± 0%  -99.93% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                    27.00 ± 0%   18.00 ± 0%  -33.33% (p=0.000 n=10)
geomean                                     74.60        7.224       -90.32%

@charlievieth charlievieth force-pushed the cev/exec-allocs branch 2 times, most recently from 74a60e6 to 9fb17fc Compare November 9, 2024 03:27
@charlievieth charlievieth changed the title Fix exponential memory allocation in SQLiteConn.Exec Fix exponential memory allocation in Exec and improve performance Nov 9, 2024
@charlievieth
Copy link
Contributor Author

Note: This PR has been updated to include logic for Exec'ing queries that have no bind parameters.

@charlievieth charlievieth force-pushed the cev/exec-allocs branch 10 times, most recently from 0a54f03 to 89769f9 Compare November 12, 2024 04:38
@charlievieth
Copy link
Contributor Author

@rittneje Any chance I could get this reviewed?

This commit changes SQLiteConn.Exec to use the raw Go query string
instead of repeatedly converting it to a C string (which it would do for
every statement in the provided query). This yields a ~20% performance
improvement for a query containing one statement and a significantly
larger improvement when the query contains multiple statements as is
common when importing a SQL dump (our benchmark shows a 5x improvement
for handling 1k SQL statements).

Additionally, this commit improves the performance of Exec by 2x or more
and makes number and size of allocations constant when there are no bind
parameters (the performance improvement scales with the number of SQL
statements in the query). This is achieved by having the entire query
processed in C code thus requiring only one CGO call.

The speedup for Exec'ing single statement queries means that wrapping
simple statements in a transaction is now twice as fast.

This commit also improves the test coverage of Exec, which previously
failed to test that Exec could process multiple statements like INSERT.
It also adds some Exec specific benchmarks that highlight both the
improvements here and the overhead of using a cancellable Context.

This commit is a slimmed down and improved version of PR mattn#1133:
  mattn#1133

```
goos: darwin
goarch: arm64
pkg: github.com/mattn/go-sqlite3
cpu: Apple M1 Max
                                       │    b.txt     │                n.txt                │
                                       │    sec/op    │   sec/op     vs base                │
Suite/BenchmarkExec/Params-10             1.434µ ± 1%   1.186µ ± 0%  -17.27% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10          1267.5n ± 0%   759.2n ± 1%  -40.10% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10      2.886µ ± 0%   2.517µ ± 0%  -12.80% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10    2.605µ ± 1%   1.829µ ± 1%  -29.81% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               1852.6µ ± 1%   582.3µ ± 0%  -68.57% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10        3053.3µ ± 3%   582.0µ ± 0%  -80.94% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                  4.126µ ± 2%   2.200µ ± 1%  -46.67% (p=0.000 n=10)
geomean                                   16.40µ        8.455µ       -48.44%

                                       │      b.txt      │                n.txt                │
                                       │      B/op       │    B/op     vs base                 │
Suite/BenchmarkExec/Params-10                 248.0 ± 0%   240.0 ± 0%    -3.23% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10              128.00 ± 0%   64.00 ± 0%   -50.00% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10          408.0 ± 0%   400.0 ± 0%    -1.96% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10        288.0 ± 0%   208.0 ± 0%   -27.78% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               5406674.50 ± 0%   64.00 ± 0%  -100.00% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10         5566758.5 ± 0%   208.0 ± 0%  -100.00% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                      712.0 ± 0%   520.0 ± 0%   -26.97% (p=0.000 n=10)
geomean                                     4.899Ki        189.7        -96.22%

                                       │     b.txt     │               n.txt                │
                                       │   allocs/op   │ allocs/op   vs base                │
Suite/BenchmarkExec/Params-10              10.000 ± 0%   9.000 ± 0%  -10.00% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10             7.000 ± 0%   4.000 ± 0%  -42.86% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10        12.00 ± 0%   11.00 ± 0%   -8.33% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10      9.000 ± 0%   6.000 ± 0%  -33.33% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               7000.000 ± 0%   4.000 ± 0%  -99.94% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10        9001.000 ± 0%   6.000 ± 0%  -99.93% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                    27.00 ± 0%   18.00 ± 0%  -33.33% (p=0.000 n=10)
geomean                                     74.60        7.224       -90.32%
```
@mattn
Copy link
Owner

mattn commented Apr 1, 2025

@rittneje Could you please code-review this?


// The unsafe.StringData function was made available in Go 1.20 but it
// was not until Go 1.21 that Go was changed to interpret the Go version
// in go.mod (1.19 as of writing this) as the minimum version required
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mattn Currently, the GitHub workflow tests 1.19, 1.20, and 1.21, so it is pretty out of date. And the readme links to the offical Golang release policy, which (at this moment) only supports 1.23 and 1.24. Should we just change go.mod and GitHub workflow to 1.23+?

//go:build go1.21
// +build go1.21

// The unsafe.StringData function was made available in Go 1.20 but it
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused - if it was added in 1.20, why isn't the build constraint go1.20? The go mod directive historically was about enabling language features, not library functions.

}
// The return value of unsafe.StringData
// is unspecified if the string is empty.
return &placeHolder[0]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't quite follow why this is necessary. If the length of the string is empty, why does it matter (to C) what pointer we give it?

} while (rv == SQLITE_ROW);

// Only record the number of changes made by the last statement.
*changes = sqlite3_changes64(db);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we really do this if there was an error?

sqlite3_finalize(stmt);
return rv;
}
rv = sqlite3_finalize(stmt);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the most recent evaluation of the statement encountered no errors or if the statement is never been evaluated, then sqlite3_finalize() returns SQLITE_OK. If the most recent evaluation of statement S failed, then sqlite3_finalize(S) returns the appropriate error code or extended error code.

This implies that there is no need to inspect the return value here, and it can be done unconditionally (before checking rv).

return rv;
}

nBytes -= tail - zSql;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here you are assuming tail != NULL but above in _sqlite3_prepare_v2 you have explicit handling for that. I think we should be consistent.

@@ -858,54 +913,119 @@ func (c *SQLiteConn) Exec(query string, args []driver.Value) (driver.Result, err
}

func (c *SQLiteConn) exec(ctx context.Context, query string, args []driver.NamedValue) (driver.Result, error) {
start := 0
// Trim the query. This is mostly important for getting rid
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why?

stmtArgs[i].Ordinal = i + 1
}
var err error
res, err = s.exec(ctx, stmtArgs)
if err != nil && err != driver.ErrSkip {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What does it mean if exec returns ErrSkip, and why is it safe to ignore it here?

}
query = strings.TrimSpace(query[sz:])
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why TrimSpace?

s.Close()
if tail == "" {
s.finalize()
if len(query) == 0 {
Copy link
Collaborator

@rittneje rittneje Apr 5, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there is an important nuance in play here. There are two ways to know you are done. The first is that pzTail is the empty string. The second is that ppStmt is null.

However, the flaw in the current code is that it assumes that just checking whether pzTail is whitespace is a sufficient stand-in. This is not the case, because you could have a trailing comment like "DELETE FROM MyTable WHERE ID = ?001; -- foo". In this case, pzTail will be " -- foo" after the first iteration, and then ppStmt will be null after the second iteration. But as written, I don't think that case is handled correctly by this code.

(Please also add a test case for the trailing comment situation.)

@@ -1090,6 +1091,67 @@ func TestExecer(t *testing.T) {
}
}

func TestExecDriverResult(t *testing.T) {
setup := func(t *testing.T) *sql.DB {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is only called once - can we just merge it into test?

Copy link
Collaborator

@rittneje rittneje left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see other comments

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants