Fix exponential memory allocation in Exec and improve performance #1296

charlievieth · 2024-11-07T04:19:52Z

This commit changes SQLiteConn.Exec to use the raw Go query string instead of repeatedly converting it to a C string (which it would do for every statement in the provided query). This yields a ~20% performance improvement for a query containing one statement and a significantly larger improvement when the query contains multiple statements as is common when importing a SQL dump (our benchmark shows a 5x improvement for handling 1k SQL statements).
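To give a rough sense of why converting the query to a C string for every statement gets expensive, here is a minimal model in pure Go. It is not the driver's code: `bytesCopied` is a hypothetical helper that assumes each statement triggers a copy of the entire remaining query text plus a NUL terminator. Under that assumption the total bytes copied grows quadratically with the number of statements.

```go
package main

import (
	"fmt"
	"strings"
)

// bytesCopied models an Exec loop that converts the remaining query text
// to a NUL-terminated C string before preparing each statement.
// It returns the total number of bytes such a loop would copy.
// This is an illustrative model, not code from go-sqlite3.
func bytesCopied(stmts []string) int {
	remaining := len(strings.Join(stmts, ""))
	total := 0
	for _, s := range stmts {
		total += remaining + 1 // copy of the tail plus its NUL terminator
		remaining -= len(s)    // the next iteration only sees the tail
	}
	return total
}

func main() {
	for _, n := range []int{1, 10, 100, 1000} {
		stmts := make([]string, n)
		for i := range stmts {
			stmts[i] = "SELECT 1; "
		}
		fmt.Printf("%4d statements: %d bytes copied\n", n, bytesCopied(stmts))
	}
}
```

Passing a pointer to the original Go string instead, as this commit describes, avoids these per-statement copies entirely.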

Additionally, this commit improves the performance of Exec by 2x or more and makes the number and size of allocations constant when there are no bind parameters (the performance improvement scales with the number of SQL statements in the query). This is achieved by processing the entire query in C code, which requires only one CGO call.

The speedup for Exec'ing single-statement queries means that wrapping simple statements in a transaction is now twice as fast.

This commit also improves the test coverage of Exec, which previously failed to test that Exec could process multiple statements like INSERT. It also adds some Exec-specific benchmarks that highlight both the improvements here and the overhead of using a cancellable Context.

This commit is a slimmed-down and improved version of PR #1133.

goos: darwin
goarch: arm64
pkg: github.com/mattn/go-sqlite3
cpu: Apple M1 Max
                                       │    b.txt     │                n.txt                │
                                       │    sec/op    │   sec/op     vs base                │
Suite/BenchmarkExec/Params-10             1.434µ ± 1%   1.186µ ± 0%  -17.27% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10          1267.5n ± 0%   759.2n ± 1%  -40.10% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10      2.886µ ± 0%   2.517µ ± 0%  -12.80% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10    2.605µ ± 1%   1.829µ ± 1%  -29.81% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               1852.6µ ± 1%   582.3µ ± 0%  -68.57% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10        3053.3µ ± 3%   582.0µ ± 0%  -80.94% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                  4.126µ ± 2%   2.200µ ± 1%  -46.67% (p=0.000 n=10)
geomean                                   16.40µ        8.455µ       -48.44%

                                       │      b.txt      │                n.txt                │
                                       │      B/op       │    B/op     vs base                 │
Suite/BenchmarkExec/Params-10                 248.0 ± 0%   240.0 ± 0%    -3.23% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10              128.00 ± 0%   64.00 ± 0%   -50.00% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10          408.0 ± 0%   400.0 ± 0%    -1.96% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10        288.0 ± 0%   208.0 ± 0%   -27.78% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               5406674.50 ± 0%   64.00 ± 0%  -100.00% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10         5566758.5 ± 0%   208.0 ± 0%  -100.00% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                      712.0 ± 0%   520.0 ± 0%   -26.97% (p=0.000 n=10)
geomean                                     4.899Ki        189.7        -96.22%

                                       │     b.txt     │               n.txt                │
                                       │   allocs/op   │ allocs/op   vs base                │
Suite/BenchmarkExec/Params-10              10.000 ± 0%   9.000 ± 0%  -10.00% (p=0.000 n=10)
Suite/BenchmarkExec/NoParams-10             7.000 ± 0%   4.000 ± 0%  -42.86% (p=0.000 n=10)
Suite/BenchmarkExecContext/Params-10        12.00 ± 0%   11.00 ± 0%   -8.33% (p=0.000 n=10)
Suite/BenchmarkExecContext/NoParams-10      9.000 ± 0%   6.000 ± 0%  -33.33% (p=0.000 n=10)
Suite/BenchmarkExecStep-10               7000.000 ± 0%   4.000 ± 0%  -99.94% (p=0.000 n=10)
Suite/BenchmarkExecContextStep-10        9001.000 ± 0%   6.000 ± 0%  -99.93% (p=0.000 n=10)
Suite/BenchmarkExecTx-10                    27.00 ± 0%   18.00 ± 0%  -33.33% (p=0.000 n=10)
geomean                                     74.60        7.224       -90.32%

charlievieth · 2024-11-09T03:29:48Z

Note: This PR has been updated to include logic for Exec'ing queries that have no bind parameters.

charlievieth · 2024-12-08T06:41:11Z

@rittneje Any chance I could get this reviewed?


mattn · 2025-04-01T14:01:04Z

@rittneje Could you please code-review this?

rittneje · 2025-04-05T19:04:33Z

unsafe_go121.go

+
+// The unsafe.StringData function was made available in Go 1.20 but it
+// was not until Go 1.21 that Go was changed to interpret the Go version
+// in go.mod (1.19 as of writing this) as the minimum version required


@mattn Currently, the GitHub workflow tests 1.19, 1.20, and 1.21, so it is pretty out of date. And the readme links to the official Golang release policy, which (at this moment) only supports 1.23 and 1.24. Should we just change go.mod and the GitHub workflow to 1.23+?

rittneje · 2025-04-05T19:05:52Z

unsafe_go121.go

+//go:build go1.21
+// +build go1.21
+
+// The unsafe.StringData function was made available in Go 1.20 but it


I'm confused - if it was added in 1.20, why isn't the build constraint go1.20? The go mod directive historically was about enabling language features, not library functions.

rittneje · 2025-04-05T19:07:47Z

unsafe_go121.go

+	}
+	// The return value of unsafe.StringData
+	// is unspecified if the string is empty.
+	return &placeHolder[0]


I don't quite follow why this is necessary. If the string is empty, why does it matter (to C) what pointer we give it?
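For context on the pattern being questioned, here is a minimal sketch of a helper that always returns a non-nil pointer. The `placeHolder` name comes from the diff above; the function name `stringData` and the exact shape of the helper are assumptions. The Go documentation for `unsafe.StringData` states that for an empty string the return value is unspecified and may be nil, which is the case the fallback covers. Requires Go 1.20 or later.

```go
package main

import (
	"fmt"
	"unsafe"
)

// placeHolder gives empty strings a valid, non-nil address to point at.
var placeHolder = [1]byte{0}

// stringData returns a pointer to the bytes of s. The name and signature
// are assumed for illustration; only placeHolder appears in the diff.
func stringData(s string) *byte {
	if len(s) != 0 {
		return unsafe.StringData(s)
	}
	// The return value of unsafe.StringData is unspecified
	// (and may be nil) if the string is empty.
	return &placeHolder[0]
}

func main() {
	fmt.Println(stringData("") != nil)
	fmt.Println(*stringData("SELECT 1") == 'S')
}
```

Whether the C side actually needs a non-nil pointer when the length is zero is the open question in this comment; the sketch only shows what the fallback does.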

rittneje · 2025-04-05T19:13:11Z

sqlite3.go

+		} while (rv == SQLITE_ROW);
+
+		// Only record the number of changes made by the last statement.
+		*changes = sqlite3_changes64(db);


Should we really do this if there was an error?

rittneje · 2025-04-05T19:14:50Z

sqlite3.go

+			sqlite3_finalize(stmt);
+			return rv;
+		}
+		rv = sqlite3_finalize(stmt);


"If the most recent evaluation of the statement encountered no errors or if the statement is never been evaluated, then sqlite3_finalize() returns SQLITE_OK. If the most recent evaluation of statement S failed, then sqlite3_finalize(S) returns the appropriate error code or extended error code."

This implies that there is no need to inspect the return value here, and it can be done unconditionally (before checking rv).

rittneje · 2025-04-05T19:16:23Z

sqlite3.go

+			return rv;
+		}
+
+		nBytes -= tail - zSql;


Here you are assuming tail != NULL but above in _sqlite3_prepare_v2 you have explicit handling for that. I think we should be consistent.

rittneje · 2025-04-05T19:16:47Z

sqlite3.go

@@ -858,54 +913,119 @@ func (c *SQLiteConn) Exec(query string, args []driver.Value) (driver.Result, err
 }

 func (c *SQLiteConn) exec(ctx context.Context, query string, args []driver.NamedValue) (driver.Result, error) {
-	start := 0
+	// Trim the query. This is mostly important for getting rid


rittneje · 2025-04-05T19:20:29Z

sqlite3.go

+				stmtArgs[i].Ordinal = i + 1
+			}
+			var err error
+			res, err = s.exec(ctx, stmtArgs)
 			if err != nil && err != driver.ErrSkip {


What does it mean if exec returns ErrSkip, and why is it safe to ignore it here?
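As background for this question, the `database/sql/driver` documentation describes `ErrSkip` as a way for optional interface methods to signal at runtime that the fast path is unavailable and the caller should continue as if the interface were not implemented. The sketch below shows that conventional use with hypothetical functions (`fastExec`, `generalExec`, `execWithFallback`); it does not reproduce this PR's code and does not settle whether ignoring `ErrSkip` is safe at this call site.

```go
package main

import (
	"database/sql/driver"
	"errors"
	"fmt"
)

// fastExec is a hypothetical fast path that only handles queries
// without bind parameters and declines everything else with ErrSkip.
func fastExec(query string, args []driver.Value) error {
	if len(args) > 0 {
		return driver.ErrSkip
	}
	return nil
}

// generalExec is a hypothetical slower path that handles any query.
func generalExec(query string, args []driver.Value) error {
	return nil
}

// execWithFallback tries the fast path and falls back when it is skipped.
// It reports which path handled the query.
func execWithFallback(query string, args ...driver.Value) (string, error) {
	err := fastExec(query, args)
	if errors.Is(err, driver.ErrSkip) {
		return "general", generalExec(query, args)
	}
	return "fast", err
}

func main() {
	path, _ := execWithFallback("DELETE FROM t")
	fmt.Println(path)
	path, _ = execWithFallback("DELETE FROM t WHERE id = ?", int64(1))
	fmt.Println(path)
}
```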

rittneje · 2025-04-05T19:23:25Z

sqlite3.go

 		}
+		query = strings.TrimSpace(query[sz:])


Why TrimSpace?

rittneje · 2025-04-05T19:35:22Z

sqlite3.go

-		s.Close()
-		if tail == "" {
+		s.finalize()
+		if len(query) == 0 {


I think there is an important nuance in play here. There are two ways to know you are done. The first is that pzTail is the empty string. The second is that ppStmt is null.

However, the flaw in the current code is that it assumes that just checking whether pzTail is whitespace is a sufficient stand-in. This is not the case, because you could have a trailing comment like "DELETE FROM MyTable WHERE ID = ?001; -- foo". In this case, pzTail will be " -- foo" after the first iteration, and then ppStmt will be null after the second iteration. But as written, I don't think that case is handled correctly by this code.

(Please also add a test case for the trailing comment situation.)
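The two termination conditions described above can be sketched with a toy model in pure Go. Here `prepare` is a hypothetical stand-in for `sqlite3_prepare_v2`: it returns an empty statement when the input holds only whitespace or `--` comments, mirroring the case where ppStmt is null. It splits naively on `;` and does not understand string literals or `/* */` comments, so it is only meant to illustrate the loop structure, not to serve as the requested driver test.

```go
package main

import (
	"fmt"
	"strings"
)

// prepare is a toy stand-in for sqlite3_prepare_v2. It returns the next
// statement and the unparsed tail. An empty stmt models a null ppStmt.
func prepare(query string) (stmt, tail string) {
	rest := query
	for {
		rest = strings.TrimLeft(rest, " \t\r\n")
		if !strings.HasPrefix(rest, "--") {
			break
		}
		i := strings.IndexByte(rest, '\n')
		if i < 0 {
			return "", "" // the comment runs to the end of the input
		}
		rest = rest[i+1:]
	}
	if rest == "" {
		return "", ""
	}
	if i := strings.IndexByte(rest, ';'); i >= 0 {
		return rest[:i+1], rest[i+1:]
	}
	return rest, ""
}

// countStatements runs the exec loop with both termination checks:
// a null statement and an empty tail.
func countStatements(query string) int {
	n := 0
	for {
		stmt, tail := prepare(query)
		if stmt == "" {
			break // nothing left but whitespace or comments
		}
		n++ // a real loop would bind, step, and finalize here
		if tail == "" {
			break
		}
		query = tail
	}
	return n
}

func main() {
	q := "DELETE FROM MyTable WHERE ID = ?001; -- foo"
	_, tail := prepare(q)
	// The tail is not blank, so a whitespace-only check would keep looping.
	fmt.Println(strings.TrimSpace(tail) == "")
	fmt.Println(countStatements(q))
}
```

A loop that only checks whether the tail is blank would attempt to execute the trailing comment as a second statement; checking for the null statement as well ends the loop cleanly.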

rittneje · 2025-04-05T19:39:20Z

sqlite3_test.go

@@ -1090,6 +1091,67 @@ func TestExecer(t *testing.T) {
 	}
 }

+func TestExecDriverResult(t *testing.T) {
+	setup := func(t *testing.T) *sql.DB {


This is only called once. Can we just merge it into the test?

rittneje

see other comments

charlievieth mentioned this pull request Nov 7, 2024

sqlite3: reduce C to Go string conversions in SQLiteConn.{query,exec} #1133

Open

charlievieth force-pushed the cev/exec-allocs branch 2 times, most recently from 74a60e6 to 9fb17fc on November 9, 2024 03:27

charlievieth changed the title from "Fix exponential memory allocation in SQLiteConn.Exec" to "Fix exponential memory allocation in Exec and improve performance" on Nov 9, 2024

charlievieth force-pushed the cev/exec-allocs branch 10 times, most recently from 0a54f03 to 89769f9 on November 12, 2024 04:38

charlievieth force-pushed the cev/exec-allocs branch from 89769f9 to 604aac6 on December 13, 2024 01:04

charlievieth added 2 commits December 12, 2024 20:04

test: add Exec tests and benchmarks

008b48b

charlievieth force-pushed the cev/exec-allocs branch from 604aac6 to 4974017 on December 13, 2024 01:04

rittneje reviewed Apr 5, 2025

rittneje requested changes Apr 5, 2025
