
Commit 51f0833

blaginin and alamb authored
Add insta / snapshot testing to CLI & set up AWS mock (#13672)
* Do not normalize values
* Fix tests & update docs
* Prettier
* Lowercase config params
* Add snap to CLI & set up AWS mock
* Refactor tests
* Unify transform and parse
* Fix tests
* Setup CLI
* Show minio output
* Format Cargo.toml
* Do not hardcode AWS params
* Test options parsing
* Add allow http
* Fix aws build
* Fix ip
* Remove slash ☠️
* Format cargo toml
* Remove integration_setup.bash
* Update docs
* Do not hardcode test names
* Relock cargo
* Remove aws sdk and set up minio in-place
* Nit: Add missing ready local to the docs
* Fix backslash test
* Add missing backslash
* put integration scripts in a separate folder
* Move s3 tests from extended to rust flow
* Reorganise the docs
* Prettier
* Do not use rust container to get docker
* Add missing protobuf
* revert change to extended.yml

---------

Co-authored-by: Andrew Lamb <[email protected]>
1 parent f31ddd6 commit 51f0833

24 files changed, +572 -25 lines changed

.github/workflows/rust.yml

+26 -6
@@ -171,21 +171,41 @@ jobs:
     name: cargo test (amd64)
     needs: linux-build-lib
     runs-on: ubuntu-latest
-    container:
-      image: amd64/rust
     steps:
       - uses: actions/checkout@v4
         with:
           submodules: true
           fetch-depth: 1
       - name: Setup Rust toolchain
-        uses: ./.github/actions/setup-builder
-        with:
-          rust-version: stable
+        run: rustup toolchain install stable
+      - name: Install Protobuf Compiler
+        run: sudo apt-get install -y protobuf-compiler
+      - name: Setup Minio - S3-compatible storage
+        run: |
+          docker run -d --name minio-container \
+            -p 9000:9000 \
+            -e MINIO_ROOT_USER=TEST-DataFusionLogin -e MINIO_ROOT_PASSWORD=TEST-DataFusionPassword \
+            -v $(pwd)/datafusion/core/tests/data:/source quay.io/minio/minio \
+            server /data
+          docker exec minio-container /bin/sh -c "\
+            mc ready local
+            mc alias set localminio http://localhost:9000 TEST-DataFusionLogin TEST-DataFusionPassword && \
+            mc mb localminio/data && \
+            mc cp -r /source/* localminio/data"
       - name: Run tests (excluding doctests)
-        run: cargo test --profile ci --exclude datafusion-examples --exclude ffi_example_table_provider --exclude datafusion-benchmarks --workspace --lib --tests --bins --features serde,avro,json,backtrace,integration-tests
+        env:
+          RUST_BACKTRACE: 1
+          AWS_ENDPOINT: http://127.0.0.1:9000
+          AWS_ACCESS_KEY_ID: TEST-DataFusionLogin
+          AWS_SECRET_ACCESS_KEY: TEST-DataFusionPassword
+          TEST_STORAGE_INTEGRATION: 1
+          AWS_ALLOW_HTTP: true
+        run: cargo test --profile ci --exclude datafusion-examples --exclude ffi_example_table_provider --exclude datafusion-benchmarks --workspace --lib --tests --bins --features avro,json,backtrace,integration-tests
       - name: Verify Working Directory Clean
         run: git diff --exit-code
+      - name: Minio Output
+        if: ${{ !cancelled() }}
+        run: docker logs minio-container
 
   linux-test-example:
     name: cargo examples (amd64)
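
The workflow step above waits for Minio with `mc ready local` before loading data. For a test harness that cannot run `mc`, a rough Rust equivalent is to poll the mapped port until it accepts a TCP connection. This is only a sketch: `wait_for_port` is a hypothetical helper that is not part of this commit, and a successful TCP connect is a weaker signal than the health check `mc ready` performs.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread::sleep;
use std::time::Duration;

/// Hypothetical helper (not in the commit): tries to open a TCP connection to
/// `addr` up to `attempts` times, sleeping `delay` after each failed attempt.
/// Returns true as soon as one connection succeeds.
fn wait_for_port(addr: SocketAddr, attempts: u32, delay: Duration) -> bool {
    for _ in 0..attempts {
        if TcpStream::connect_timeout(&addr, Duration::from_secs(1)).is_ok() {
            return true;
        }
        sleep(delay);
    }
    false
}

fn main() {
    // 127.0.0.1:9000 is the address the workflow exports as AWS_ENDPOINT.
    let addr: SocketAddr = "127.0.0.1:9000".parse().expect("valid socket address");
    let ready = wait_for_port(addr, 3, Duration::from_millis(200));
    println!("minio reachable: {ready}");
}
```

The attempt count and delay are arbitrary illustration values; a real harness would tune them to the container's startup time.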

Cargo.lock

+49
Generated file; diff not rendered.

datafusion-cli/CONTRIBUTING.md

+75
@@ -0,0 +1,75 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Development instructions
+
+## Running Tests
+
+Tests can be run using `cargo`
+
+```shell
+cargo test
+```
+
+## Running Storage Integration Tests
+
+By default, storage integration tests are not run. To run them you will need to set `TEST_STORAGE_INTEGRATION=1` and
+then provide the necessary configuration for that object store.
+
+For some of the tests, [snapshots](https://datafusion.apache.org/contributor-guide/testing.html#snapshot-testing) are used.
+
+### AWS
+
+To test the S3 integration against [Minio](https://github.com/minio/minio)
+
+First start up a container with Minio and load test files.
+
+```shell
+docker run -d \
+  --name datafusion-test-minio \
+  -p 9000:9000 \
+  -e MINIO_ROOT_USER=TEST-DataFusionLogin \
+  -e MINIO_ROOT_PASSWORD=TEST-DataFusionPassword \
+  -v $(pwd)/../datafusion/core/tests/data:/source \
+  quay.io/minio/minio server /data
+
+docker exec datafusion-test-minio /bin/sh -c "\
+  mc ready local
+  mc alias set localminio http://localhost:9000 TEST-DataFusionLogin TEST-DataFusionPassword && \
+  mc mb localminio/data && \
+  mc cp -r /source/* localminio/data"
+```
+
+Setup environment
+
+```shell
+export TEST_STORAGE_INTEGRATION=1
+export AWS_ACCESS_KEY_ID=TEST-DataFusionLogin
+export AWS_SECRET_ACCESS_KEY=TEST-DataFusionPassword
+export AWS_ENDPOINT=http://127.0.0.1:9000
+export AWS_ALLOW_HTTP=true
+```
+
+Note that `AWS_ENDPOINT` is set without slash at the end.
+
+Run tests
+
+```shell
+cargo test
+```
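
The guide above gates the storage tests on `TEST_STORAGE_INTEGRATION` and asks for `AWS_ENDPOINT` without a trailing slash. The sketch below illustrates both rules in Rust. `storage_tests_enabled` and `normalize_endpoint` are hypothetical helpers introduced here for illustration and are not part of this commit; the fallback endpoint string is likewise just an example value.

```rust
use std::env;

/// Hypothetical helper: mirrors the gate used by the tests in this commit,
/// where the variable only has to be set (its value is not inspected).
fn storage_tests_enabled(value: Option<&str>) -> bool {
    value.is_some()
}

/// Hypothetical helper: strips trailing slashes so the endpoint has the
/// form the guide asks for.
fn normalize_endpoint(endpoint: &str) -> &str {
    endpoint.trim_end_matches('/')
}

fn main() {
    let enabled =
        storage_tests_enabled(env::var("TEST_STORAGE_INTEGRATION").ok().as_deref());
    // Example fallback with a trailing slash, to show the normalization.
    let endpoint = env::var("AWS_ENDPOINT")
        .unwrap_or_else(|_| "http://127.0.0.1:9000/".to_string());

    println!("storage tests enabled: {enabled}");
    println!("endpoint: {}", normalize_endpoint(&endpoint));
}
```

Because the gate only checks presence, exporting `TEST_STORAGE_INTEGRATION=0` still enables the tests; unset the variable to skip them.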

datafusion-cli/Cargo.toml

+2
@@ -67,5 +67,7 @@ url = { workspace = true }
 [dev-dependencies]
 assert_cmd = "2.0"
 ctor = { workspace = true }
+insta = { version = "1.41.1", features = ["glob", "filters"] }
+insta-cmd = "0.6.0"
 predicates = "3.0"
 rstest = { workspace = true }

datafusion-cli/tests/cli_integration.rs

+104 -19
@@ -17,46 +17,131 @@
 
 use std::process::Command;
 
-use assert_cmd::prelude::{CommandCargoExt, OutputAssertExt};
-use predicates::prelude::predicate;
 use rstest::rstest;
 
+use insta::{glob, Settings};
+use insta_cmd::{assert_cmd_snapshot, get_cargo_bin};
+use std::{env, fs};
+
+fn cli() -> Command {
+    Command::new(get_cargo_bin("datafusion-cli"))
+}
+
+fn make_settings() -> Settings {
+    let mut settings = Settings::clone_current();
+    settings.set_prepend_module_to_snapshot(false);
+    settings.add_filter(r"Elapsed .* seconds\.", "[ELAPSED]");
+    settings.add_filter(r"DataFusion CLI v.*", "[CLI_VERSION]");
+    settings
+}
+
 #[cfg(test)]
 #[ctor::ctor]
 fn init() {
     // Enable RUST_LOG logging configuration for tests
     let _ = env_logger::try_init();
 }
 
-// Disabled due to https://github.com/apache/datafusion/issues/10793
-#[cfg(not(target_family = "windows"))]
 #[rstest]
-#[case::exec_from_commands(
-    ["--command", "select 1", "--format", "json", "-q"],
-    "[{\"Int64(1)\":1}]\n"
-)]
 #[case::exec_multiple_statements(
-    ["--command", "select 1; select 2;", "--format", "json", "-q"],
-    "[{\"Int64(1)\":1}]\n[{\"Int64(2)\":2}]\n"
+    "statements",
+    ["--command", "select 1; select 2;", "-q"],
 )]
 #[case::exec_backslash(
-    ["--file", "tests/data/backslash.txt", "--format", "json", "-q"],
-    "[{\"Utf8(\\\"\\\\\\\")\":\"\\\\\",\"Utf8(\\\"\\\\\\\\\\\")\":\"\\\\\\\\\",\"Utf8(\\\"\\\\\\\\\\\\\\\\\\\\\\\")\":\"\\\\\\\\\\\\\\\\\\\\\",\"Utf8(\\\"dsdsds\\\\\\\\\\\\\\\\\\\")\":\"dsdsds\\\\\\\\\\\\\\\\\",\"Utf8(\\\"\\\\t\\\")\":\"\\\\t\",\"Utf8(\\\"\\\\0\\\")\":\"\\\\0\",\"Utf8(\\\"\\\\n\\\")\":\"\\\\n\"}]\n"
+    "backslash",
+    ["--file", "tests/sql/backslash.sql", "--format", "json", "-q"],
 )]
 #[case::exec_from_files(
-    ["--file", "tests/data/sql.txt", "--format", "json", "-q"],
-    "[{\"Int64(1)\":1}]\n"
+    "files",
+    ["--file", "tests/sql/select.sql", "-q"],
 )]
 #[case::set_batch_size(
-    ["--command", "show datafusion.execution.batch_size", "--format", "json", "-q", "-b", "1"],
-    "[{\"name\":\"datafusion.execution.batch_size\",\"value\":\"1\"}]\n"
+    "batch_size",
+    ["--command", "show datafusion.execution.batch_size", "-q", "-b", "1"],
 )]
 #[test]
 fn cli_quick_test<'a>(
+    #[case] snapshot_name: &'a str,
     #[case] args: impl IntoIterator<Item = &'a str>,
-    #[case] expected: &str,
 ) {
-    let mut cmd = Command::cargo_bin("datafusion-cli").unwrap();
+    let mut settings = make_settings();
+    settings.set_snapshot_suffix(snapshot_name);
+    let _bound = settings.bind_to_scope();
+
+    let mut cmd = cli();
     cmd.args(args);
-    cmd.assert().stdout(predicate::eq(expected));
+
+    assert_cmd_snapshot!(cmd);
+}
+
+#[rstest]
+#[case("csv")]
+#[case("tsv")]
+#[case("table")]
+#[case("json")]
+#[case("nd-json")]
+#[case("automatic")]
+#[test]
+fn test_cli_format<'a>(#[case] format: &'a str) {
+    let mut settings = make_settings();
+    settings.set_snapshot_suffix(format);
+    let _bound = settings.bind_to_scope();
+
+    let mut cmd = cli();
+    cmd.args(["--command", "select 1", "-q", "--format", format]);
+
+    assert_cmd_snapshot!(cmd);
+}
+
+#[tokio::test]
+async fn test_cli() {
+    if env::var("TEST_STORAGE_INTEGRATION").is_err() {
+        eprintln!("Skipping external storages integration tests");
+        return;
+    }
+
+    let settings = make_settings();
+    let _bound = settings.bind_to_scope();
+
+    glob!("sql/integration/*.sql", |path| {
+        let input = fs::read_to_string(path).unwrap();
+        assert_cmd_snapshot!(cli().pass_stdin(input))
+    });
+}
+
+#[tokio::test]
+async fn test_aws_options() {
+    // Separate test is needed to pass aws as options in sql and not via env
+
+    if env::var("TEST_STORAGE_INTEGRATION").is_err() {
+        eprintln!("Skipping external storages integration tests");
+        return;
+    }
+
+    let settings = make_settings();
+    let _bound = settings.bind_to_scope();
+
+    let access_key_id =
+        env::var("AWS_ACCESS_KEY_ID").expect("AWS_ACCESS_KEY_ID is not set");
+    let secret_access_key =
+        env::var("AWS_SECRET_ACCESS_KEY").expect("AWS_SECRET_ACCESS_KEY is not set");
+    let endpoint_url = env::var("AWS_ENDPOINT").expect("AWS_ENDPOINT is not set");
+
+    let input = format!(
+        r#"CREATE EXTERNAL TABLE CARS
+STORED AS CSV
+LOCATION 's3://data/cars.csv'
+OPTIONS(
+    'aws.access_key_id' '{}',
+    'aws.secret_access_key' '{}',
+    'aws.endpoint' '{}',
+    'aws.allow_http' 'true'
+);
+
+SELECT * FROM CARS limit 1;
+"#,
+        access_key_id, secret_access_key, endpoint_url
+    );
+
+    assert_cmd_snapshot!(cli().env_clear().pass_stdin(input));
 }
@@ -0,0 +1,25 @@
+---
+source: tests/cli_integration.rs
+info:
+  program: datafusion-cli
+  args: []
+  stdin: "CREATE EXTERNAL TABLE CARS\nSTORED AS CSV\nLOCATION 's3://data/cars.csv'\nOPTIONS(\n    'aws.access_key_id' 'TEST-DataFusionLogin',\n    'aws.secret_access_key' 'TEST-DataFusionPassword',\n    'aws.endpoint' 'http://127.0.0.1:9000',\n    'aws.allow_http' 'true'\n);\n\nSELECT * FROM CARS limit 1;\n"
+---
+success: true
+exit_code: 0
+----- stdout -----
+[CLI_VERSION]
+0 row(s) fetched.
+[ELAPSED]
+
++-----+-------+---------------------+
+| car | speed | time                |
++-----+-------+---------------------+
+| red | 20.0  | 1996-04-12T12:05:03 |
++-----+-------+---------------------+
+1 row(s) fetched.
+[ELAPSED]
+
+\q
+
+----- stderr -----
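
The `[ELAPSED]` and `[CLI_VERSION]` placeholders in the snapshot come from the two `add_filter` calls in `make_settings`, which replace regex matches before the output is compared. The sketch below approximates those two filters using only the standard library, so it can run without the `insta` or `regex` crates. It is an illustration of the effect, not how `insta` implements filters; `redact_elapsed`, `redact_version`, and `apply_filters` are hypothetical names, and the version string in `main` is made up.

```rust
/// Approximates the filter regex `Elapsed .* seconds\.` on a single line:
/// replaces the span from the first "Elapsed " to the last " seconds."
/// (greedy, like `.*`) with the placeholder.
fn redact_elapsed(line: &str) -> String {
    const START: &str = "Elapsed ";
    const END: &str = " seconds.";
    if let Some(start) = line.find(START) {
        let rest = &line[start + START.len()..];
        if let Some(end) = rest.rfind(END) {
            let stop = start + START.len() + end + END.len();
            return format!("{}[ELAPSED]{}", &line[..start], &line[stop..]);
        }
    }
    line.to_string()
}

/// Approximates the filter regex `DataFusion CLI v.*`: everything from the
/// marker to the end of the line becomes the placeholder.
fn redact_version(line: &str) -> String {
    match line.find("DataFusion CLI v") {
        Some(start) => format!("{}[CLI_VERSION]", &line[..start]),
        None => line.to_string(),
    }
}

/// Applies both redactions line by line, in the order the filters were added.
fn apply_filters(output: &str) -> String {
    output
        .lines()
        .map(|line| redact_version(&redact_elapsed(line)))
        .collect::<Vec<_>>()
        .join("\n")
}

fn main() {
    // Made-up CLI output; the version and timing values are illustrative only.
    let raw = "DataFusion CLI v0.0.0\n1 row(s) fetched.\nElapsed 0.004 seconds.";
    println!("{}", apply_filters(raw));
}
```

Redacting the timing and version lines is what keeps the stored snapshot stable across runs and releases.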
