Commit 0796022

Add files via upload
0 parents  commit 0796022

10 files changed

+1391
-0
lines changed

.DS_Store

6 KB
Binary file not shown.

.gitignore

+14
@@ -0,0 +1,14 @@
target/
.classpath
.settings/org.eclipse.core.resources.prefs
.settings/org.eclipse.m2e.core.prefs
.settings/org.eclipse.jdt.core.prefs
.vscode
.project
.checkstyle
*.DS_Store
.idea
SQLancer.iml
dependency-reduced-pom.xml
database0.db
databaseconnectiontest.db

CODE_OF_CONDUCT.md

+76
@@ -0,0 +1,76 @@
1+
# Contributor Covenant Code of Conduct
2+
3+
## Our Pledge
4+
5+
In the interest of fostering an open and welcoming environment, we as
6+
contributors and maintainers pledge to making participation in our project and
7+
our community a harassment-free experience for everyone, regardless of age, body
8+
size, disability, ethnicity, sex characteristics, gender identity and expression,
9+
level of experience, education, socio-economic status, nationality, personal
10+
appearance, race, religion, or sexual identity and orientation.
11+
12+
## Our Standards
13+
14+
Examples of behavior that contributes to creating a positive environment
15+
include:
16+
17+
* Using welcoming and inclusive language
18+
* Being respectful of differing viewpoints and experiences
19+
* Gracefully accepting constructive criticism
20+
* Focusing on what is best for the community
21+
* Showing empathy towards other community members
22+
23+
Examples of unacceptable behavior by participants include:
24+
25+
* The use of sexualized language or imagery and unwelcome sexual attention or
26+
advances
27+
* Trolling, insulting/derogatory comments, and personal or political attacks
28+
* Public or private harassment
29+
* Publishing others' private information, such as a physical or electronic
30+
address, without explicit permission
31+
* Other conduct which could reasonably be considered inappropriate in a
32+
professional setting
33+
34+
## Our Responsibilities
35+
36+
Project maintainers are responsible for clarifying the standards of acceptable
37+
behavior and are expected to take appropriate and fair corrective action in
38+
response to any instances of unacceptable behavior.
39+
40+
Project maintainers have the right and responsibility to remove, edit, or
41+
reject comments, commits, code, wiki edits, issues, and other contributions
42+
that are not aligned to this Code of Conduct, or to ban temporarily or
43+
permanently any contributor for other behaviors that they deem inappropriate,
44+
threatening, offensive, or harmful.
45+
46+
## Scope
47+
48+
This Code of Conduct applies both within project spaces and in public spaces
49+
when an individual is representing the project or its community. Examples of
50+
representing a project or community include using an official project e-mail
51+
address, posting via an official social media account, or acting as an appointed
52+
representative at an online or offline event. Representation of a project may be
53+
further defined and clarified by project maintainers.
54+
55+
## Enforcement
56+
57+
Instances of abusive, harassing, or otherwise unacceptable behavior may be
58+
reported by contacting the project team at [email protected]. All
59+
complaints will be reviewed and investigated and will result in a response that
60+
is deemed necessary and appropriate to the circumstances. The project team is
61+
obligated to maintain confidentiality with regard to the reporter of an incident.
62+
Further details of specific enforcement policies may be posted separately.
63+
64+
Project maintainers who do not follow or enforce the Code of Conduct in good
65+
faith may face temporary or permanent repercussions as determined by other
66+
members of the project's leadership.
67+
68+
## Attribution
69+
70+
This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4,
71+
available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html
72+
73+
[homepage]: https://www.contributor-covenant.org
74+
75+
For answers to common questions about this code of conduct, see
76+
https://www.contributor-covenant.org/faq

CONTRIBUTING.md

+111
@@ -0,0 +1,111 @@
1+
# Development
2+
3+
## Working with Eclipse
4+
5+
Developing SQLancer using Eclipse is expected to work well. You can import SQLancer with a single step:
6+
7+
```
8+
File -> Import -> Existing Maven Projects -> Select the SQLancer directory as root directory -> Finish
9+
```
10+
If you do not find an option to import Maven projects, you might need to install the [M2Eclipse plugin](https://www.eclipse.org/m2e/).
11+
12+
13+
## Implementing Support for a New DBMS
14+
15+
The DuckDB implementation provides a good template for a new implementation. The `DuckDBProvider` class is the central class that manages the creation of the databases and executes the selected test oracles. Try to copy its structure for the new DBMS that you want to implement, and start by generating databases (without implementing a test oracle). As part of this, you will also need to implement the equivalent of `DuckDBSchema`, which represents the database schema of the generated database. After you can successfully generate databases, the next step is to implement one of the test oracles. For example, you might want to implement NoREC (see `DuckDBNoRECOracle`) or TLP (see `DuckDBQueryPartitioningWhereTester`). As part of this, you must also implement a random expression generator (see `DuckDBExpressionGenerator`) and a visitor to derive the textual representation of an expression (see `DuckDBToStringVisitor`).
16+
17+
Please consider the following suggestions when creating a PR to contribute a new DBMS:
18+
* Ensure that `mvn verify -DskipTests=true` does not result in style violations.
19+
* Add a [CI test](https://github.com/sqlancer/sqlancer/blob/master/.github/workflows/main.yml) to ensure that future changes to SQLancer are unlikely to break the newly-supported DBMS. It is reasonable to do this in a follow-up PR—please indicate whether you plan to do so in the PR description.
20+
* Add the DBMS' name to the [check_names.py](https://github.com/sqlancer/sqlancer/blob/master/src/check_names.py) script, which ensures adherence to a common prefix in the Java classes.
21+
* Add the DBMS' name to the [README.md](https://github.com/sqlancer/sqlancer/blob/master/README.md#supported-dbms) file.
22+
* It is easier to review multiple smaller PRs than one PR that contains the complete implementation. Consider contributing parts of your implementation as you complete them.
23+
24+
### Expected Errors
25+
26+
Most statements have an [ExpectedErrors](https://github.com/sqlancer/sqlancer/blob/aa0c0eccba4eefa75bfd518f608c9222c692c11d/src/sqlancer/common/query/ExpectedErrors.java) object associated with them. This object essentially contains a list of errors, one of which the database system might return if it cannot successfully execute the statement. These errors are typically added through a trial-and-error process while considering various tradeoffs. For example, consider the [DuckDBInsertGenerator](https://github.com/sqlancer/sqlancer/blob/aa0c0eccba4eefa75bfd518f608c9222c692c11d/src/sqlancer/duckdb/gen/DuckDBInsertGenerator.java#L38) class, whose expected errors are specified in [DuckDBErrors](https://github.com/sqlancer/sqlancer/blob/aa0c0eccba4eefa75bfd518f608c9222c692c11d/src/sqlancer/duckdb/DuckDBErrors.java#L90). When implementing such a generator, the list of expected errors might first be empty. When running the generator for the first time, you might receive an error such as "create unique index, table contains duplicate data", indicating that creating the index failed due to duplicate data. In principle, this error could be avoided by first checking whether the column contains any duplicate values. However, checking this would be expensive and error-prone (e.g., consider string similarity, which might depend on collations); thus, the obvious choice would be to add this string to the list of expected errors, and run the generator again to check for any other expected errors. In other cases, errors might be best addressed through improvements in the generators. For example, it is typically straightforward to generate syntactically valid statements, which is why syntax errors should not be ignored. This approach is also effective in uncovering internal errors; rather than adding them to the list of expected errors, report them (see also [Unfixed Bugs](#unfixed-bugs) below).
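
The trial-and-error workflow for expected errors can be illustrated with a minimal, self-contained sketch. It is not SQLancer's actual `ExpectedErrors` class: the names `ExpectedErrorsSketch` and `errorIsExpected` are hypothetical, and the sketch assumes that an error counts as expected when its message contains one of the registered fragments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an expected-errors list; not SQLancer's real API.
class ExpectedErrorsSketch {
    private final List<String> errors = new ArrayList<>();

    // Registers a message fragment that the DBMS may legitimately return.
    ExpectedErrorsSketch add(String fragment) {
        errors.add(fragment);
        return this;
    }

    // Assumption: an error is expected if its message contains a registered fragment.
    boolean errorIsExpected(String message) {
        if (message == null) {
            return false;
        }
        for (String fragment : errors) {
            if (message.contains(fragment)) {
                return true;
            }
        }
        return false;
    }
}

class ExpectedErrorsDemo {
    public static void main(String[] args) {
        ExpectedErrorsSketch errors = new ExpectedErrorsSketch()
                .add("create unique index, table contains duplicate data");
        // A known, uninteresting failure: the statement is simply skipped.
        System.out.println(errors.errorIsExpected(
                "Error: create unique index, table contains duplicate data")); // prints true
        // Syntax errors are deliberately not registered, so they surface as problems.
        System.out.println(errors.errorIsExpected("syntax error at or near \"CREAT\"")); // prints false
    }
}
```

Keeping the list as narrow as possible is the point of the design: every fragment that is added hides one class of failures, so anything that might indicate a generator flaw or an internal error is best left out.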
27+
28+
### Bailing Out While Generating a Statement
29+
30+
In some cases, it might be undesirable or even impossible to generate a specific statement type. For example, consider that SQLancer tries to execute a `DROP TABLE` statement (e.g., see [TiDBDropTableGenerator](https://github.com/sqlancer/sqlancer/blob/30948f34acc2354d6be18a70bdeeebff1e73fa48/src/sqlancer/tidb/gen/TiDBDropTableGenerator.java)), but the database contains only a single table. Dropping the table would cause all subsequent attempts to insert data or query it to fail. Thus, in such a case, it might be more efficient to "bail out" by abandoning the current attempt to generate the statement. This can be achieved by throwing an `IgnoreMeException`. Unlike for other exceptions, SQLancer silently continues execution rather than reporting this exception to the user.
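
A minimal sketch of the bail-out pattern follows. To stay self-contained, it defines its own `IgnoreMeSketchException` as a hypothetical stand-in for SQLancer's `IgnoreMeException`; the `DropTableSketch` class and its methods are likewise illustrative assumptions rather than code from the project.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for SQLancer's IgnoreMeException, so the sketch compiles on its own.
class IgnoreMeSketchException extends RuntimeException {
    private static final long serialVersionUID = 1L;
}

class DropTableSketch {
    // Bails out instead of dropping the last remaining table.
    static String generateDropTable(List<String> tables) {
        if (tables.size() <= 1) {
            throw new IgnoreMeSketchException();
        }
        return "DROP TABLE " + tables.get(0);
    }

    // Returns the generated statement, or null if the generator bailed out.
    static String tryGenerate(List<String> tables) {
        try {
            return generateDropTable(tables);
        } catch (IgnoreMeSketchException e) {
            return null; // silently skip; a bail-out is not reported as a bug
        }
    }

    public static void main(String[] args) {
        System.out.println(tryGenerate(Arrays.asList("t0", "t1"))); // prints DROP TABLE t0
        System.out.println(tryGenerate(Arrays.asList("t0")));       // prints null
    }
}
```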
31+
32+
33+
### Typed vs. Untyped Expression Generation
34+
35+
Each DBMS implementation provides an expression generator used, for example, to generate expressions used in `WHERE` clauses. We found that DBMS can be roughly classified into "permissive" ones, which apply implicit type conversions when needed, and "strict" ones, which provide only a few implicit conversions and output an error when the type is unexpected. For example, consider the following test case:
36+
37+
```sql
38+
CREATE TABLE t0(c0 TEXT);
39+
INSERT INTO t0 VALUES ('1');
40+
SELECT * FROM t0 WHERE c0;
41+
```
42+
43+
If the test case is executed using MySQL, which is a permissive DBMS, the `SELECT` fetches a single row, since the content of the `c0` value is interpreted as a boolean. If the test case is executed using PostgreSQL, which is a strict DBMS, the `SELECT` is not accepted as a valid query, and PostgreSQL outputs an error `"argument of WHERE must be type boolean"`. The implementation of the expression generator depends on whether we are dealing with a permissive or a strict DBMS. Since SQLancer's main goal is to find logic bugs, we want to generate as many valid queries as possible.
44+
45+
For a permissive DBMS, implementing the expression generator is easier: the expression generator does not need to care about the type of the expression, since the DBMS will apply any necessary conversions implicitly. For MySQL, the main `generateExpression` method thus does not accept any type as an argument (see [MySQLExpressionGenerator](https://github.com/sqlancer/sqlancer/blob/86647df8aa2dd8d167b5c3ce3297290f5b0b2bcd/src/sqlancer/mysql/gen/MySQLExpressionGenerator.java#L54)). This method can be called whenever an expression is required, for example, for a `WHERE` clause. In principle, this approach can also be used for strict DBMS, by adding errors such as `argument of WHERE must be type boolean` to the list of expected errors. However, using such an "untyped" expression generator for a strict DBMS will result in many semantically invalid queries being generated.
46+
47+
For a strict DBMS, the better approach is typically to attempt to generate expressions of the expected type. For PostgreSQL, the expression generator thus expects an additional type argument (see [PostgresExpressionGenerator](https://github.com/sqlancer/sqlancer/blob/86647df8aa2dd8d167b5c3ce3297290f5b0b2bcd/src/sqlancer/postgres/gen/PostgresExpressionGenerator.java#L251)). This type is propagated recursively. For example, if we require a predicate for the `WHERE` clause, we pass boolean as a type. The expression generator then calls a method `generateBooleanExpression` that attempts to produce a boolean expression, by, for example, generating a comparison (e.g., `<=`). For the comparison's operands, a random type is then selected and propagated. For example, if an integer type is selected, then `generateExpression` is called with this type once for the left operand, and once for the right operand. Note that this process does not guarantee that the expression will indeed have the expected type. For example, the expression generator might attempt to produce an integer value but produce a double value instead, namely when an integer overflow occurs and the DBMS implicitly converts the result to a floating-point value.
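
The recursive type propagation can be sketched as follows. This is a simplified illustration under stated assumptions, not the PostgreSQL generator itself: the class `TypedGeneratorSketch`, its two-member `Type` enum, the depth limit, and the produced string format are all hypothetical. Only the method names `generateExpression` and `generateBooleanExpression` are taken from the description above.

```java
import java.util.Random;

// Hypothetical typed expression generator; names and structure are illustrative only.
class TypedGeneratorSketch {
    enum Type { BOOLEAN, INT }

    private final Random random;
    private final int maxDepth;

    TypedGeneratorSketch(long seed, int maxDepth) {
        this.random = new Random(seed);
        this.maxDepth = maxDepth;
    }

    // Entry point: the requested type is propagated through the recursion.
    String generateExpression(Type type, int depth) {
        switch (type) {
        case BOOLEAN:
            return generateBooleanExpression(depth);
        case INT:
            return generateIntExpression(depth);
        default:
            throw new AssertionError(type);
        }
    }

    private String generateBooleanExpression(int depth) {
        if (depth >= maxDepth) {
            return random.nextBoolean() ? "TRUE" : "FALSE";
        }
        // A comparison yields a boolean; pick one operand type and use it for both sides.
        Type operandType = Type.values()[random.nextInt(Type.values().length)];
        String left = generateExpression(operandType, depth + 1);
        String right = generateExpression(operandType, depth + 1);
        return "(" + left + " <= " + right + ")";
    }

    private String generateIntExpression(int depth) {
        if (depth >= maxDepth || random.nextBoolean()) {
            return Integer.toString(random.nextInt(100));
        }
        return "(" + generateExpression(Type.INT, depth + 1) + " + "
                + generateExpression(Type.INT, depth + 1) + ")";
    }

    public static void main(String[] args) {
        TypedGeneratorSketch gen = new TypedGeneratorSketch(0L, 3);
        // The WHERE clause needs a predicate, so BOOLEAN is requested at the top level.
        System.out.println("SELECT * FROM t0 WHERE " + gen.generateExpression(Type.BOOLEAN, 0));
    }
}
```

An untyped generator for a permissive DBMS would drop the `Type` parameter entirely and pick any expression kind at each level.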
48+
49+
### Unfixed Bugs
50+
51+
Often, some bugs are fixed only after an extended period, meaning that SQLancer will repeatedly report the same bug. In such cases, it might be possible to avoid generating the problematic pattern, or to add an expected error with the internal error message. Rather than, for example, commenting out the code with the bug-inducing pattern, a pattern implemented by the [TiDBBugs class](https://github.com/sqlancer/sqlancer/blob/4c20a94b3ad2c037e1a66c0b637184f8c20faa7e/src/sqlancer/tidb/TiDBBugs.java) should be applied. The core idea is to use a public, static flag for each issue, which is set to true as long as the issue persists (e.g., see [bug35652](https://github.com/sqlancer/sqlancer/blob/4c20a94b3ad2c037e1a66c0b637184f8c20faa7e/src/sqlancer/tidb/TiDBBugs.java#L55)). If the flag is set to true, the work-around code is executed, or the problematic pattern is not generated (e.g., [an expected error is added for bug35652](https://github.com/sqlancer/sqlancer/blob/59564d818d991d54b32fa5a79c9f733799c090f2/src/sqlancer/tidb/TiDBErrors.java#L47)). This makes it easy to later identify and remove all such work-around code once the issue has been fixed.
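
The flag-per-issue pattern might look like the following sketch. The class names `ExampleDBBugsSketch` and `ExampleDBErrorsSketch`, the flag `bug12345`, and both error strings are made up for illustration; only the overall structure (one public static flag per open issue, checked wherever a work-around is needed) follows the description above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical bug-flag class; the flag name and issue number are invented.
final class ExampleDBBugsSketch {
    // Set to false, then remove the guarded work-arounds, once the upstream issue is fixed.
    public static boolean bug12345 = true;

    private ExampleDBBugsSketch() {
    }
}

class ExampleDBErrorsSketch {
    static List<String> getExpectedErrors() {
        List<String> errors = new ArrayList<>();
        errors.add("duplicate key");
        if (ExampleDBBugsSketch.bug12345) {
            // Work-around: tolerate the known internal error while the bug is open.
            errors.add("internal error: index out of range");
        }
        return errors;
    }

    public static void main(String[] args) {
        System.out.println(getExpectedErrors());
    }
}
```

Searching for usages of a flag then lists every work-around that depends on the corresponding issue.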
52+
53+
## Options
54+
55+
SQLancer uses [JCommander](https://jcommander.org/) for handling options. The `MainOptions` class contains options that are expected to be supported by all DBMS-testing implementations. Furthermore, each `*Provider` class provides a method to return an additional set of supported options.
56+
57+
An option name can consist of lowercase alphanumeric characters and hyphens. The format of the options is checked by a unit test.
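
A format check of this kind could be sketched as below. The exact rule enforced by SQLancer's unit test is not reproduced here; the sketch assumes that names start with `--` and consist of lowercase alphanumeric groups separated by single hyphens, and the class name `OptionNameCheckSketch` is hypothetical.

```java
import java.util.regex.Pattern;

class OptionNameCheckSketch {
    // Assumption: "--" followed by lowercase alphanumeric groups separated by single hyphens.
    private static final Pattern OPTION_NAME = Pattern.compile("--[a-z0-9]+(-[a-z0-9]+)*");

    static boolean isValidOptionName(String name) {
        return OPTION_NAME.matcher(name).matches();
    }

    public static void main(String[] args) {
        System.out.println(isValidOptionName("--num-threads")); // prints true
        System.out.println(isValidOptionName("--numThreads"));  // prints false
        System.out.println(isValidOptionName("--num_threads")); // prints false
    }
}
```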
58+
59+
## Continuous Integration and Test Suite
60+
61+
To improve and maintain SQLancer's code quality, we use multiple tools:
62+
* The [Eclipse code formatter](https://code.revelc.net/formatter-maven-plugin/), to ensure consistent formatting (run `mvn formatter:format` to format all files).
63+
* [Checkstyle](https://checkstyle.sourceforge.io/), to enforce a consistent coding standard.
64+
* [PMD](https://pmd.github.io/), which finds programming flaws using static analysis.
65+
* [SpotBugs](https://spotbugs.github.io/), which also uses static analysis to find bugs and programming flaws.
66+
67+
You can run them using the following command:
68+
69+
```
70+
mvn verify
71+
```
72+
73+
We use [GitHub Actions](https://github.com/sqlancer/sqlancer/blob/master/.github/workflows/main.yml) to automatically check PRs.
74+
75+
76+
## Testing
77+
78+
As part of the GitHub Actions check, we use smoke testing by running SQLancer on each supported DBMS for a few minutes, to test that nothing is obviously broken. For DBMS for which all bugs have been fixed, we verify that SQLancer cannot find any further bugs (i.e., the return code is zero).
79+
80+
In addition, we use [unit tests](https://github.com/sqlancer/sqlancer/tree/master/test/sqlancer) to test SQLancer's core functionality, such as random string and number generation as well as option passing. When fixing a bug, add a unit test if this is easily possible.
81+
82+
You can run the tests using the following command:
83+
84+
```
85+
mvn test
86+
```
87+
88+
Note that by default, the smoke testing is performed only for embedded DBMS (e.g., DuckDB and SQLite). To also run smoke tests for the other DBMS, you need to set environment variables. For example, you can run the MySQL smoke testing (and no other tests) using the following command:
89+
90+
```
91+
MYSQL_AVAILABLE=true mvn -Dtest=TestMySQL test
92+
```
93+
94+
For up-to-date testing commands, check out the `.github/workflows/main.yml` file.
95+
96+
## Reviewing
97+
98+
Reviewing is an effective way of improving code quality. Everyone is welcome to review any PRs. Currently, all PRs are reviewed at least by the main contributor, @mrigger. Contributions by @mrigger are currently not (necessarily) reviewed, which is not ideal. If you are willing to review PRs regularly and in a timely manner, indicate so in the SQLancer Slack workspace.
99+
100+
## Naming Conventions
101+
102+
Each class specific to a DBMS is prefixed by the DBMS name. For example, each class specific to SQLite is prefixed by `SQLite3`. The naming convention is [automatically checked](src/check_names.py).
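
The idea behind the prefix check can be sketched in a few lines. The project's actual check is the Python script linked above; the Java class `NamePrefixCheckSketch` below is a hypothetical illustration that works on a list of class names rather than on source files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class NamePrefixCheckSketch {
    // Returns the class names that do not start with the expected DBMS prefix.
    static List<String> findViolations(String prefix, List<String> classNames) {
        List<String> violations = new ArrayList<>();
        for (String name : classNames) {
            if (!name.startsWith(prefix)) {
                violations.add(name);
            }
        }
        return violations;
    }

    public static void main(String[] args) {
        List<String> names = Arrays.asList("SQLite3Provider", "SQLite3Schema", "ExpressionGenerator");
        System.out.println(findViolations("SQLite3", names)); // prints [ExpressionGenerator]
    }
}
```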
103+
104+
## Commit History
105+
106+
Please pay attention to good commit messages (in particular subject lines). As basic guidelines, we recommend the blog post [How to Write a Git Commit Message](https://chris.beams.io/posts/git-commit/) written by Chris Beams, which provides 7 useful rules. Implement at least the following of those rules:
107+
1. Capitalize the subject line. For example, write "**R**efactor the handling of indexes" rather than "**r**efactor the handling of indexes".
108+
2. Do not end the subject line with a period. For example, write "Refactor the handling of indexes" rather than "Refactor the handling of indexes.".
109+
3. Use the imperative mood in the subject line. For example, write "Refactor the handling of indexes" rather than "Refactoring" or "Refactor**ed** the handling of indexes".
110+
111+
Please also pay attention to a clean commit history. Rather than merging with the main branch, use `git rebase` to rebase your commits on the main branch. Sometimes, it might happen that you discover an issue only after having already created a commit, for example, when an issue is found by `mvn verify` in the CI checks. Do not introduce a separate commit for such issues. If the issue was introduced by the last commit, you can fix the issue, and use `git commit --amend` to change the latest commit. If the issue was introduced by an earlier commit, you can use `git rebase -i` to change the respective commit. If you already have a number of such fix-up commits, you can use the `squash` action of `git rebase -i` to "collapse" multiple commits into one. For more information, you might want to read [How (and Why!) to Keep Your Git Commit History Clean](https://about.gitlab.com/blog/2018/06/07/keeping-git-commit-history-clean/) written by Kushal Pandya.

Dockerfile

+9
@@ -0,0 +1,9 @@
FROM ubuntu:21.04

RUN apt-get update --yes && env DEBIAN_FRONTEND=noninteractive apt-get install openjdk-15-jdk maven --yes --no-install-recommends

# assumes that the project has already been built
COPY target/sqlancer-*.jar sqlancer.jar
COPY target/lib/*.jar /lib/

ENTRYPOINT ["java", "-jar", "sqlancer.jar"]

LICENSE.md

+8
@@ -0,0 +1,8 @@
1+
Copyright 2020 SQLancer Contributors (see https://github.com/sqlancer/sqlancer)
2+
3+
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
4+
5+
The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
6+
7+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
8+
