Add SQL Parser #1051

Open
wants to merge 2 commits into
base: main
45 changes: 45 additions & 0 deletions AutomationScripts/SQL Parser/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,45 @@
## SQL Parser

## Short description of package/script:
Extracts the column names and tables used by a query. Automatically resolves column aliases, subquery aliases, and table aliases.

Also provides a helper for normalizing SQL queries.
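
To give a rough idea of what "normalizing" a query can mean, here is a minimal standard-library sketch. It does not use sql-metadata and is not the library's implementation; `normalize_sql` is a hypothetical helper written only for this illustration. It assumes normalization means stripping comments, replacing literals with a placeholder, and collapsing whitespace.

```python
import re


def normalize_sql(sql: str) -> str:
    """Hypothetical helper: a simplified picture of query normalization."""
    # Drop /* ... */ block comments and -- line comments.
    sql = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)
    sql = re.sub(r"--[^\n]*", " ", sql)
    # Replace quoted strings, then standalone integers, with a placeholder.
    sql = re.sub(r"'[^']*'", "?", sql)
    sql = re.sub(r"\b\d+\b", "?", sql)
    # Collapse runs of whitespace into single spaces.
    return re.sub(r"\s+", " ", sql).strip()


if __name__ == "__main__":
    print(normalize_sql("select id from foo -- note\nwhere id = 42 and name = 'bob'"))
```

Two queries that differ only in their literal values normalize to the same string, which is what makes this kind of helper useful for grouping similar queries. A regex approach like this one will mishandle edge cases such as comment markers inside string literals.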

## List out the libraries imported:
````
pip install sql-metadata
````

## Example: extracting metadata from raw SQL
## Input

```sql
select id, name, sum(amount) as total_amt from schema.foo a
left join ( select id, name from schema.bar limit 10 ) b on a.id = b.id
-- left join schema_b.bars c on b.id = c.id
left join schema_b.foos c on b.id = c.id
group by id, name
limit 1000
```

## Output

### sql_parser.columns
````
['id', 'name', 'amount', 'schema.foo.id', 'schema_b.foos.id']
````
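
As a rough illustration of why `schema.foo.id` and `schema_b.foos.id` appear in the list above, the sketch below resolves the table aliases `a` and `c` from the example query back to their tables. The `TABLE_ALIASES` mapping and the `resolve_column` helper are hypothetical names made up for this example; this is a simplified picture of alias resolution, not how sql-metadata works internally.

```python
from typing import Dict

# Alias -> table pairs taken from the example query (assumed, hand-written).
TABLE_ALIASES: Dict[str, str] = {"a": "schema.foo", "c": "schema_b.foos"}


def resolve_column(column: str, aliases: Dict[str, str]) -> str:
    """Hypothetical helper: swap a known table alias prefix for the table name."""
    prefix, sep, name = column.rpartition(".")
    if sep and prefix in aliases:
        return f"{aliases[prefix]}.{name}"
    # Unqualified columns and unknown prefixes are returned unchanged.
    return column


if __name__ == "__main__":
    for col in ["a.id", "c.id", "name"]:
        print(col, "->", resolve_column(col, TABLE_ALIASES))
```

The subquery alias `b` is deliberately left out of the mapping: it names a subquery rather than a table, so it needs separate handling.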

### sql_parser.tables
````
['schema.foo', 'schema.bar', 'schema_b.foos']
````

### sql_parser.columns_aliases
````
{'total_amt': 'amount'}
````

### sql_parser.subqueries
````
{'b': 'select id, name from schema.bar limit 10'}
````
1 change: 1 addition & 0 deletions AutomationScripts/SQL Parser/requirements.txt
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
sql-metadata
32 changes: 32 additions & 0 deletions AutomationScripts/SQL Parser/sql_parser.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,32 @@
from sql_metadata import Parser

rawsql = """
select id, name, sum(amount) as total_amt from schema.foo a
left join ( select id, name from schema.bar limit 10 ) b on a.id = b.id
-- left join schema_b.bars c on b.id = c.id
left join schema_b.foos c on b.id = c.id
group by id, name
limit 1000
"""

# initialize the Parser with the raw SQL string
sql_parser = Parser(rawsql)

# columns referenced by the query, with table aliases resolved
sql_parser_columns = sql_parser.columns
print("## extract columns from sql")
print(sql_parser_columns)

# tables (including schema) referenced by the query
sql_parser_tables = sql_parser.tables
print("## extract schema and table from sql")
print(sql_parser_tables)

# mapping of each column alias to the column it refers to
sql_parser_columns_aliases = sql_parser.columns_aliases
print("## extract columns_aliases from sql")
print(sql_parser_columns_aliases)

# mapping of each subquery alias to the subquery text
sql_parser_subqueries = sql_parser.subqueries
print("## extract subqueries from sql")
print(sql_parser_subqueries)