[5/n][dagster-fivetran] Implement FivetranWorkspaceData to FivetranConnectorTableProps method #25797
some more design stuff
@@ -0,0 +1,30 @@
import uuid
nit; why is this in translator tests? Shouldn't it be in the same place as the previous workspace method tests?
Done in 697e460
raise NotImplementedError()
data: List[FivetranConnectorTableProps] = []

for connector_id, connector_data in self.connectors_by_id.items():
In airlift, from the raw API data we retrieved, we constructed a "lookup" object which built up the actual usable asset data from a set of cacheable properties. See dagster/examples/experimental/dagster-airlift/dagster_airlift/core/serialization/compute.py, line 114 in 0a0c77b:
class FetchedAirflowData:
Since this is over the "cacheable" boundary, I'm wondering whether it's likely to be called a bunch; if so, I think the cacheable structure makes sense. At the very least, I think the conversion to FivetranConnectorTableProps should probably be cached, but it might also make sense to cache the stuff from 73-83 here as properties if there's potential for reuse (but if not, then maybe that's not worth the effort).
That's a good point. This method is meant to be used only in FivetranWorkspaceDefsLoader.defs_from_state; see the implementation in the next PR, #25807.
But I think caching it makes sense.
I cached the method in b272d07
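For readers following the caching discussion, here is a minimal sketch of the idea, assuming the conversion is a pure function of the workspace data. `WorkspaceDataSketch` and `TablePropsSketch` are hypothetical stand-ins rather than the classes in this PR, the `"tables"` key is an invented simplification, and `functools.cached_property` stands in for whatever caching helper the commit actually uses.

```python
from dataclasses import dataclass, field
from functools import cached_property
from typing import Any, Dict, List


@dataclass(frozen=True)
class TablePropsSketch:
    """Hypothetical stand-in for FivetranConnectorTableProps."""

    table: str
    connector_id: str


@dataclass
class WorkspaceDataSketch:
    """Hypothetical stand-in for FivetranWorkspaceData."""

    connectors_by_id: Dict[str, Dict[str, Any]]
    # Counts how often the conversion really runs; for illustration only.
    conversions: int = field(default=0, init=False)

    @cached_property
    def table_props(self) -> List[TablePropsSketch]:
        # Runs once per instance; later accesses reuse the stored list.
        self.conversions += 1
        return [
            TablePropsSketch(table=table, connector_id=connector_id)
            for connector_id, connector in self.connectors_by_id.items()
            for table in connector["tables"]  # assumed, simplified layout
        ]
```

Caching on the instance only pays off if the same workspace data object is reused; if a new object is built on every call, the cache never gets hit.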
for schema in schemas_data.values():
    if schema["enabled"]:
        schema_name = schema["name_in_destination"]
        schema_tables: Dict[str, Dict[str, Any]] = cast(
strong code smell from this one. The complexity from the previous PR feels like it's spilling over into this one with all the casting and raw string munging we need to do. Let's strongly type all of these objects, I think things will feel much better.
Agreed. This comment applies here as well - this is taken from the legacy code to replicate the behavior. It will be updated in the same PR refactoring how we are storing destinations and connectors.
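To make the "strongly type all of these objects" suggestion concrete, one possible shape is sketched below. This is not the refactor promised for the follow-up PR: `TableConfigSketch` and `SchemaConfigSketch` are hypothetical names, and both the `"tables"` key and a per-table `"enabled"` flag are assumptions about the raw payload. Only `enabled` and `name_in_destination` on schemas, and `name_in_destination` on tables, appear in the diff above.

```python
from dataclasses import dataclass
from typing import Any, Dict, Mapping


@dataclass(frozen=True)
class TableConfigSketch:
    """Hypothetical typed view of one table entry."""

    enabled: bool
    name_in_destination: str

    @classmethod
    def from_api(cls, raw: Mapping[str, Any]) -> "TableConfigSketch":
        return cls(
            enabled=bool(raw["enabled"]),  # assumed field
            name_in_destination=str(raw["name_in_destination"]),
        )


@dataclass(frozen=True)
class SchemaConfigSketch:
    """Hypothetical typed view of one schema entry."""

    enabled: bool
    name_in_destination: str
    tables: Dict[str, TableConfigSketch]

    @classmethod
    def from_api(cls, raw: Mapping[str, Any]) -> "SchemaConfigSketch":
        return cls(
            enabled=bool(raw["enabled"]),
            name_in_destination=str(raw["name_in_destination"]),
            # Parse nested tables once so callers never need cast().
            tables={
                key: TableConfigSketch.from_api(table)
                for key, table in raw.get("tables", {}).items()  # assumed key
            },
        )
```

With parsing done at the boundary, the loop body can read `schema.enabled` and `table.name_in_destination` directly, and a malformed payload fails at construction time instead of deep inside the conversion.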
table_name = table["name_in_destination"]
data.append(
    FivetranConnectorTableProps(
        table=f"{schema_name}.{table_name}",
This feels like it should be a standalone method / something reusable so that we're consistent about how it's constructed.
Done in 0040d59
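A reusable helper along the lines requested might look like the sketch below. The function names are hypothetical and not taken from commit 0040d59. The inverse function is included because `get_asset_spec` later splits the same string on `"."`, so building and parsing are worth keeping side by side.

```python
from typing import Tuple


def build_table_name(schema_name: str, table_name: str) -> str:
    """Hypothetical helper: join schema and table into one qualified name."""
    return f"{schema_name}.{table_name}"


def split_table_name(qualified_name: str) -> Tuple[str, str]:
    """Hypothetical inverse of build_table_name.

    Splits on the first dot only, so a table name that itself contains a
    dot survives the round trip.
    """
    schema_name, sep, table_name = qualified_name.partition(".")
    if not sep or not schema_name or not table_name:
        raise ValueError(f"Expected '<schema>.<table>', got {qualified_name!r}")
    return schema_name, table_name
```

Note that a plain `split(".")` with two-target unpacking, as in the diff, raises if either part contains an extra dot; splitting on the first dot is one way to avoid that, assuming schema names never contain dots.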
@@ -80,7 +117,7 @@ def get_asset_spec(self, props: FivetranConnectorTableProps) -> AssetSpec:
 schema_name, table_name = props.table.split(".")
 schema_entry = next(
 schema
- for schema in props.schemas["schemas"].values()
+ for schema in props.schema_config["schemas"].values()
why the name change?
To reflect Fivetran's ontology, see here
nits for your consideration.
@@ -381,6 +382,16 @@
}


@pytest.fixture(name="api_key")
def api_key_fixture() -> str:
nit; I feel like this makes more sense to be a constant than a fixture, personally. I feel like when I see a mountain of fixtures, I get overwhelmed by a test. I know that this doesn't originate with this PR, but I personally prefer TEST_API_KEY, TEST_API_SECRET, etc
That's fair - done in ad9fd4e
api_key: str,
api_secret: str,
connector_id: str,
destination_id: str,
group_id: str,
yea these all feel like they could be constants and then the test feels a little less overwhelming, lol.
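As an illustration of the constants-over-fixtures preference, a test module could be shaped roughly as follows. The constant values and the `basic_auth_header` helper are invented for this sketch and are not code from the PR; only the `TEST_API_KEY` / `TEST_API_SECRET` naming comes from the review comment.

```python
import base64

# Module-level constants replace one-line fixtures that only return a string.
TEST_API_KEY = "test_api_key"
TEST_API_SECRET = "test_api_secret"


def basic_auth_header(api_key: str, api_secret: str) -> str:
    """Hypothetical helper under test: build an HTTP Basic auth header value."""
    token = base64.b64encode(f"{api_key}:{api_secret}".encode()).decode()
    return f"Basic {token}"


def test_basic_auth_header() -> None:
    # No fixture parameters: the inputs are visible at the call site.
    header = basic_auth_header(TEST_API_KEY, TEST_API_SECRET)
    scheme, token = header.split(" ", 1)
    assert scheme == "Basic"
    assert base64.b64decode(token).decode() == f"{TEST_API_KEY}:{TEST_API_SECRET}"
```

The test signature shrinks to nothing, which is the readability gain the reviewer is after; fixtures remain useful for values that need setup or teardown.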
Summary & Motivation
This PR implements `FivetranWorkspaceData.to_fivetran_connector_table_props_data()`, a method that converts a `FivetranWorkspaceData` object to a list of `FivetranConnectorTableProps`.
To create the asset spec, we need one `FivetranConnectorTableProps` object per connector table. This method parses the raw API data to create the `FivetranConnectorTableProps` objects, which are compatible with the `DagsterFivetranTranslator`.
This will be used in the `defs_from_state` method of the state-backed defs loader in a subsequent PR.
How I Tested These Changes
Additional unit test
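The "one props object per connector table" expansion described in the summary can be sketched as below. This is a simplified illustration, not the method added in this PR: `TablePropsSketch` and `to_table_props` are hypothetical, the real props class carries more fields, and the nested `"tables"` layout with a per-table `"enabled"` flag is an assumption about the raw payload.

```python
from typing import Any, List, Mapping, NamedTuple


class TablePropsSketch(NamedTuple):
    """Hypothetical, reduced stand-in for FivetranConnectorTableProps."""

    table: str
    connector_id: str


def to_table_props(
    connectors_by_id: Mapping[str, Any],
    schema_configs_by_connector_id: Mapping[str, Mapping[str, Any]],
) -> List[TablePropsSketch]:
    """Emit one props object per enabled table of each enabled schema."""
    props: List[TablePropsSketch] = []
    for connector_id in connectors_by_id:
        schemas = schema_configs_by_connector_id[connector_id]["schemas"]
        for schema in schemas.values():
            if not schema["enabled"]:
                continue
            for table in schema["tables"].values():  # assumed key
                if not table["enabled"]:  # assumed flag
                    continue
                qualified = f"{schema['name_in_destination']}.{table['name_in_destination']}"
                props.append(TablePropsSketch(table=qualified, connector_id=connector_id))
    return props
```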