Feature: Union schema compatibility #21

Merged
merged 15 commits into from
Jul 24, 2024
3 changes: 2 additions & 1 deletion .buildkite/hooks/pre-command
@@ -21,4 +21,5 @@ export CI_SNOWFLAKE_DBT_USER=$(gcloud secrets versions access latest --secret="C
export CI_SNOWFLAKE_DBT_WAREHOUSE=$(gcloud secrets versions access latest --secret="CI_SNOWFLAKE_DBT_WAREHOUSE" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_HOST=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HOST" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_HTTP_PATH=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_HTTP_PATH" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_TOKEN=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_TOKEN" --project="dbt-package-testing-363917")
export CI_DATABRICKS_DBT_CATALOG=$(gcloud secrets versions access latest --secret="CI_DATABRICKS_DBT_CATALOG" --project="dbt-package-testing-363917")
1 change: 1 addition & 0 deletions .buildkite/pipeline.yml
@@ -69,5 +69,6 @@ steps:
- "CI_DATABRICKS_DBT_HOST"
- "CI_DATABRICKS_DBT_HTTP_PATH"
- "CI_DATABRICKS_DBT_TOKEN"
- "CI_DATABRICKS_DBT_CATALOG"
commands: |
bash .buildkite/scripts/run_models.sh databricks
11 changes: 11 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,14 @@
# dbt_app_reporting v0.4.0
[PR #21](https://github.com/fivetran/dbt_app_reporting/pull/21) includes the following updates:

## 🚨 Breaking Changes 🚨
- Identifier variables for the following packages have been updated for consistency with the source name and compatibility with the union schema feature. See each package's changelog for a full list of changes.
- [dbt_apple_store](https://github.com/fivetran/dbt_apple_store/blob/main/CHANGELOG.md#dbt_apple_store-v040)
- [dbt_google_play](https://github.com/fivetran/dbt_google_play/blob/main/CHANGELOG.md#dbt_google_play-v040)

## Feature update 🎉
- Unioning capability! This adds the ability to union source data from multiple app_reporting connectors. Refer to the [README](https://github.com/fivetran/dbt_app_reporting/blob/main/README.md#union-multiple-connectors) for more details.

# dbt_app_reporting v0.3.2
## Bug Fixes
[PR #19](https://github.com/fivetran/dbt_app_reporting/pull/19) includes the following update:
30 changes: 23 additions & 7 deletions README.md
@@ -45,7 +45,7 @@ Include the following github package version in your `packages.yml`
```yaml
packages:
- package: fivetran/app_reporting
version: [">=0.3.0", "<0.4.0"] # we recommend using ranges to capture non-breaking changes automatically
version: [">=0.4.0", "<0.5.0"] # we recommend using ranges to capture non-breaking changes automatically
```

Do NOT include the individual app platform packages in this file. The app reporting package itself has dependencies on these packages and will install them as well.
@@ -114,15 +114,31 @@ models:
> Provide a blank `+schema: ` to write to the `target_schema` without any suffix.

## (Optional) Step 7: Additional configurations
<details><summary>Expand to view configurations</summary>
<details open><summary>Expand/collapse configurations</summary>

### Union multiple connectors
If you have multiple app reporting connectors in Fivetran and would like to use this package on all of them simultaneously, we have provided functionality to do so. The package will union all of the data together and pass the unioned table into the transformations. You will be able to see which source it came from in the `source_relation` column of each model. To use this functionality, you will need to set either the `<package_name>_union_schemas` OR `<package_name>_union_databases` variables (cannot do both) in your root `dbt_project.yml` file. Below are the variables and examples for each connector:

```yml
vars:
apple_store_union_schemas: ['apple_store_usa','apple_store_canada']
apple_store_union_databases: ['apple_store_usa','apple_store_canada']

google_play_union_schemas: ['google_play_usa','google_play_canada']
google_play_union_databases: ['google_play_usa','google_play_canada']
```
Please be aware that the native `source.yml` connection set up in the package will not function when the union schema/database feature is utilized. Although the data will be correctly combined, you will not observe the sources linked to the package models in the Directed Acyclic Graph (DAG). This happens because the package includes only one defined `source.yml`.

To connect your multiple schema/database sources to the package models, follow the steps outlined in the [Union Data Defined Sources Configuration](https://github.com/fivetran/dbt_fivetran_utils/tree/releases/v0.4.latest#union_data-source) section of the Fivetran Utils documentation for the union_data macro. This will ensure a proper configuration and correct visualization of connections in the DAG.

### Change the source table references
If an individual source table has a different name than the package expects, add the table name as it appears in your destination to the respective variable:
> IMPORTANT: See the Apple Store [`dbt_project.yml`](https://github.com/fivetran/dbt_apple_store_source/blob/main/dbt_project.yml) and Google Play [`dbt_project.yml`](https://github.com/fivetran/dbt_google_play_source/blob/main/dbt_project.yml) variable declarations to see the expected names.

```yml
vars:
<default_source_table_name>_identifier: your_table_name
apple_store_<default_source_table_name>_identifier: your_table_name
google_play_<default_source_table_name>_identifier: your_table_name
```
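
As a concrete illustration of the prefixed pattern, here is a minimal sketch using two variable names that appear in this PR's `integration_tests/dbt_project.yml`. The table names on the right (`app_custom`, `earnings_custom`) are hypothetical placeholders, not names the package expects:

```yml
# Root dbt_project.yml (illustrative sketch; the table names on the right are hypothetical)
vars:
  apple_store_app_identifier: "app_custom"            # destination table backing the Apple Store `app` source
  google_play_earnings_identifier: "earnings_custom"  # destination table backing the Google Play `earnings` source
```

Any identifier you do not override should keep the default declared in the source packages' `dbt_project.yml` files linked above.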

</details>
@@ -143,16 +159,16 @@ This dbt package is dependent on the following dbt packages. For more informatio
```yml
packages:
- package: fivetran/apple_store
version: [">=0.3.0", "<0.4.0"]
version: [">=0.4.0", "<0.5.0"]

- package: fivetran/apple_store_source
version: [">=0.3.0", "<0.4.0"]
version: [">=0.4.0", "<0.5.0"]

- package: fivetran/google_play
version: [">=0.3.0", "<0.4.0"]
version: [">=0.4.0", "<0.5.0"]

- package: fivetran/google_play_source
version: [">=0.3.0", "<0.4.0"]
version: [">=0.4.0", "<0.5.0"]

- package: fivetran/fivetran_utils
version: [">=0.4.0", "<0.5.0"]
2 changes: 1 addition & 1 deletion dbt_project.yml
@@ -1,5 +1,5 @@
name: 'app_reporting'
version: '0.3.2'
version: '0.4.0'
config-version: 2
models:
app_reporting:
2 changes: 1 addition & 1 deletion integration_tests/ci/sample.profiles.yml
@@ -45,7 +45,7 @@ integration_tests:
schema: app_reporting_integrations_test_5
threads: 8
databricks:
catalog: null
catalog: "{{ env_var('CI_DATABRICKS_DBT_CATALOG') }}"
host: "{{ env_var('CI_DATABRICKS_DBT_HOST') }}"
http_path: "{{ env_var('CI_DATABRICKS_DBT_HTTP_PATH') }}"
schema: app_reporting_integrations_test_5
69 changes: 35 additions & 34 deletions integration_tests/dbt_project.yml
@@ -1,46 +1,46 @@
name: 'app_reporting_integration_tests'
version: '0.3.2'
version: '0.4.0'
profile: 'integration_tests'
config-version: 2
vars:
google_play_schema: app_reporting_integrations_test_5
apple_store_schema: app_reporting_integrations_test_5
google_play_source:
stats_installs_app_version_identifier: "stats_installs_app_version"
stats_crashes_app_version_identifier: "stats_crashes_app_version"
stats_ratings_app_version_identifier: "stats_ratings_app_version"
stats_installs_device_identifier: "stats_installs_device"
stats_ratings_device_identifier: "stats_ratings_device"
stats_installs_os_version_identifier: "stats_installs_os_version"
stats_ratings_os_version_identifier: "stats_ratings_os_version"
stats_crashes_os_version_identifier: "stats_crashes_os_version"
stats_installs_country_identifier: "stats_installs_country"
stats_ratings_country_identifier: "stats_ratings_country"
stats_store_performance_country_identifier: "stats_store_performance_country"
stats_store_performance_traffic_source_identifier: "stats_store_performance_traffic_source"
stats_installs_overview_identifier: "stats_installs_overview"
stats_crashes_overview_identifier: "stats_crashes_overview"
stats_ratings_overview_identifier: "stats_ratings_overview"
earnings_identifier: "earnings"
financial_stats_subscriptions_country_identifier: "financial_stats_subscriptions_country"
google_play_stats_installs_app_version_identifier: "stats_installs_app_version"
google_play_stats_crashes_app_version_identifier: "stats_crashes_app_version"
google_play_stats_ratings_app_version_identifier: "stats_ratings_app_version"
google_play_stats_installs_device_identifier: "stats_installs_device"
google_play_stats_ratings_device_identifier: "stats_ratings_device"
google_play_stats_installs_os_version_identifier: "stats_installs_os_version"
google_play_stats_ratings_os_version_identifier: "stats_ratings_os_version"
google_play_stats_crashes_os_version_identifier: "stats_crashes_os_version"
google_play_stats_installs_country_identifier: "stats_installs_country"
google_play_stats_ratings_country_identifier: "stats_ratings_country"
google_play_stats_store_performance_country_identifier: "stats_store_performance_country"
google_play_stats_store_performance_traffic_source_identifier: "stats_store_performance_traffic_source"
google_play_stats_installs_overview_identifier: "stats_installs_overview"
google_play_stats_crashes_overview_identifier: "stats_crashes_overview"
google_play_stats_ratings_overview_identifier: "stats_ratings_overview"
google_play_earnings_identifier: "earnings"
google_play_financial_stats_subscriptions_country_identifier: "financial_stats_subscriptions_country"

apple_store_source:
app_identifier: "app"
app_store_platform_version_source_type_report_identifier: "app_store_platform_version_source_type"
app_store_source_type_device_report_identifier: "app_store_source_type_device"
app_store_territory_source_type_report_identifier: "app_store_territory_source_type"
crashes_app_version_device_report_identifier: "crashes_app_version"
crashes_platform_version_device_report_identifier: "crashes_platform_version"
downloads_platform_version_source_type_report_identifier: "downloads_platform_version_source_type"
downloads_source_type_device_report_identifier: "downloads_source_type_device"
downloads_territory_source_type_report_identifier: "downloads_territory_source_type"
sales_account_identifier: "sales_account"
sales_subscription_event_summary_identifier: "sales_subscription_events"
sales_subscription_summary_identifier: "sales_subscription_summary"
usage_app_version_source_type_report_identifier: "usage_app_version_source_type"
usage_platform_version_source_type_report_identifier: "usage_platform_version_source_type"
usage_source_type_device_report_identifier: "usage_source_type_device"
usage_territory_source_type_report_identifier: usage_territory_source_type
apple_store_app_identifier: "app"
apple_store_app_store_platform_version_source_type_report_identifier: "app_store_platform_version_source_type"
apple_store_app_store_source_type_device_report_identifier: "app_store_source_type_device"
apple_store_app_store_territory_source_type_report_identifier: "app_store_territory_source_type"
apple_store_crashes_app_version_device_report_identifier: "crashes_app_version"
apple_store_crashes_platform_version_device_report_identifier: "crashes_platform_version"
apple_store_downloads_platform_version_source_type_report_identifier: "downloads_platform_version_source_type"
apple_store_downloads_source_type_device_report_identifier: "downloads_source_type_device"
apple_store_downloads_territory_source_type_report_identifier: "downloads_territory_source_type"
apple_store_sales_account_identifier: "sales_account"
apple_store_sales_subscription_event_summary_identifier: "sales_subscription_events"
apple_store_sales_subscription_summary_identifier: "sales_subscription_summary"
apple_store_usage_app_version_source_type_report_identifier: "usage_app_version_source_type"
apple_store_usage_platform_version_source_type_report_identifier: "usage_platform_version_source_type"
apple_store_usage_source_type_device_report_identifier: "usage_source_type_device"
apple_store_usage_territory_source_type_report_identifier: usage_territory_source_type

apple_store__subscription_events:
- 'Renew'
@@ -55,6 +55,7 @@ models:
+persist_docs:
relation: "{{ false if target.type in ('spark','databricks') else true }}"
columns: "{{ false if target.type in ('spark','databricks') else true }}"
+schema: "app_reporting_{{ var('directed_schema','dev') }}" ## To be used for validation testing

seeds:
app_reporting_integration_tests:
@@ -0,0 +1,46 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

-- this test ensures the app_reporting__app_version_report end model matches the prior version
with prod as (
select *
from {{ target.schema }}_app_reporting_prod.app_reporting__app_version_report
),

dev as (
select *
from {{ target.schema }}_app_reporting_dev.app_reporting__app_version_report
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final
@@ -0,0 +1,46 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

-- this test ensures the app_reporting__country_report end model matches the prior version
with prod as (
select *
from {{ target.schema }}_app_reporting_prod.app_reporting__country_report
),

dev as (
select *
from {{ target.schema }}_app_reporting_dev.app_reporting__country_report
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final
@@ -0,0 +1,46 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

-- this test ensures the app_reporting__device_report end model matches the prior version
with prod as (
select *
from {{ target.schema }}_app_reporting_prod.app_reporting__device_report
),

dev as (
select *
from {{ target.schema }}_app_reporting_dev.app_reporting__device_report
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final
@@ -0,0 +1,46 @@
{{ config(
tags="fivetran_validations",
enabled=var('fivetran_validation_tests_enabled', false)
) }}

-- this test ensures the app_reporting__os_version_report end model matches the prior version
with prod as (
select *
from {{ target.schema }}_app_reporting_prod.app_reporting__os_version_report
),

dev as (
select *
from {{ target.schema }}_app_reporting_dev.app_reporting__os_version_report
),

prod_not_in_dev as (
-- rows from prod not found in dev
select * from prod
except distinct
select * from dev
),

dev_not_in_prod as (
-- rows from dev not found in prod
select * from dev
except distinct
select * from prod
),

final as (
select
*,
'from prod' as source
from prod_not_in_dev

union all -- union since we only care if rows are produced

select
*,
'from dev' as source
from dev_not_in_prod
)

select *
from final
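
The four validation tests above are disabled by default. Each one compares a model in `<target.schema>_app_reporting_prod` against the same model in `<target.schema>_app_reporting_dev`. The sketch below shows one way the two variables read in this PR (`fivetran_validation_tests_enabled` and `directed_schema`) might be set. It assumes they are set in the project file rather than passed on the command line, and that dbt's default schema naming (target schema plus custom suffix) is in effect:

```yml
# integration_tests/dbt_project.yml (sketch; assumes vars are set here rather than via --vars)
vars:
  fivetran_validation_tests_enabled: true  # the tests default to disabled (false)
  directed_schema: prod                    # models build into <target.schema>_app_reporting_prod; omit to use the 'dev' default
```

Under these assumptions, the prior package version would be built with `directed_schema: prod` and the new version with the `dev` default before the tests run. Any rows a test returns are rows present in one schema but not the other.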