docs: ♻️ convert from PlantUML into pseudocode #1019
extracts properties from the file into a `ResourceProperties` object. This
function is often followed by `edit_resource_properties()` to fill in any
remaining missing fields, like the `path` property field. Usually, you use
(We'll have to think about how this works exactly, but it would be nice if the path
was assigned automatically when linking the resource (properties) to a package (properties).)
Yea, agreed. I think once we use it in the guides/examples, we'll have a better idea for it.
data_path: The path to a raw data file of a supported format.

Returns:
Outputs a `ResourceProperties` object. Use `write_resource_properties()`
We'll just have to make sure the flow still works if we do #995
Yes, for sure, I already have it in my mind to refactor those once we've discussed/decided! 😀
check_is_supported_format(data_path)
# Make use of frictionless here?
properties = extract_properties_from_file(data_path)
return check_resource_properties(properties)
And probably ignore some expected errors like a missing `path`.
Or maybe there is also an argument for allowing bad resource properties to be returned by this function in general? If, for whatever reason, the extraction doesn't yield a correct set of properties, it is still useful to me as a user to have something I can fix/edit and not have to type it all out by hand.
Hmmm, that's a very good point! But yea, ignore some things. Maybe check some basic things that should definitely be there after extracting.
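To make the "ignore some things" idea concrete, here is a rough sketch, not the project's actual check. The `ResourceProperties` class below is a simplified, hypothetical stand-in, and `EXPECTED_MISSING`, `find_missing()`, and `check_extracted_properties()` are made-up names used only for illustration. It assumes that a missing `path` is the only gap we expect right after extraction.

```python
from dataclasses import dataclass, field
from typing import Optional


# Hypothetical, simplified stand-in for the project's ResourceProperties class.
@dataclass
class ResourceProperties:
    name: Optional[str] = None
    path: Optional[str] = None
    fields: list = field(default_factory=list)


# Assumption: properties that are allowed to be empty right after extraction.
EXPECTED_MISSING = {"path"}


def find_missing(properties: ResourceProperties) -> list:
    """Return the names of all properties that are still empty."""
    return [key for key, value in vars(properties).items() if not value]


def check_extracted_properties(properties: ResourceProperties) -> ResourceProperties:
    """Fail only on gaps that extraction should have filled itself."""
    unexpected = [key for key in find_missing(properties) if key not in EXPECTED_MISSING]
    if unexpected:
        raise ValueError(f"Extraction did not yield: {', '.join(unexpected)}")
    return properties
```

If we instead decide to always return whatever was extracted, the same `find_missing()` helper could feed a warning rather than an error, so the user still gets something to fix up.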
"""
check_is_file(data_path)
check_is_supported_format(data_path)
# Make use of frictionless here?
Yeah this is a good question!
If we're okay with using frictionless, we could get around the task of writing a converter from their classes to our properties classes by having them output a `dict` and feeding that `dict` to our `ResourceProperties.from_dict`. On paper it sounds great!
The other, unquestionably more arduous, road would be to turn the data into a pandas dataframe (polars not yet supported for this), use pandera to extract the metadata, and convert that output to `ResourceProperties`.
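As a rough illustration of the `dict` route, the sketch below feeds a plain dictionary into a `from_dict` constructor. Everything here is an assumption: the `ResourceProperties` class is a simplified stand-in, not our real one, and the `descriptor` is written by hand in the general shape of a resource descriptor rather than produced by frictionless.

```python
from dataclasses import dataclass, field, fields as dataclass_fields
from typing import Optional


# Hypothetical, simplified stand-in for the project's ResourceProperties class.
@dataclass
class ResourceProperties:
    name: Optional[str] = None
    path: Optional[str] = None
    format: Optional[str] = None
    schema: dict = field(default_factory=dict)

    @classmethod
    def from_dict(cls, data: dict) -> "ResourceProperties":
        """Build properties from a dict, dropping keys the class doesn't model."""
        known = {f.name for f in dataclass_fields(cls)}
        return cls(**{key: value for key, value in data.items() if key in known})


# Hand-written example descriptor (an assumption, not real frictionless output).
descriptor = {
    "name": "patients",
    "path": "patients.csv",
    "format": "csv",
    "encoding": "utf-8",  # not modelled by the stand-in class, so it is dropped
    "schema": {"fields": [{"name": "id", "type": "integer"}]},
}

properties = ResourceProperties.from_dict(descriptor)
```

One thing this highlights: we'd need to decide whether unknown keys are silently dropped, as here, or flagged, since their descriptor may carry fields our classes don't have.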
Description
This PR converts the `.puml` file into Python pseudocode.
This PR needs an in-depth review.
Checklist