Pipelines with multiple inputs #29
Comments
@lschr Thanks for this great initiative! I would personally start extending the |
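This should be easy enough. The only problem I have come across so far is attribute propagation. A few possibilities:
* Propagate only attributes from the first ancestor.
* Propagate attributes from all ancestors as long as there are no name conflicts. If there are conflicts, use the attribute from the first ancestor that has the attribute.
* Return a tuple containing the respective attribute values from all ancestors.
* Do some name mangling to avoid conflicts.
I'd strongly prefer the first or second solution. The third option is cumbersome to use in my opinion; the propagated attribute would need different treatment from the original attribute. The fourth option is messy altogether. Anyone needing non-trivial treatment of attributes can do so by subclassing Pipeline.
Any thoughts?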
Since there are so many options, maybe use a kwarg to specify how this is done, a la pandas’ join()? Defaults to “right” but you could also implement “left.” Then if you discover cases where others are needed you could add them.
Nathan
On Oct 10, 2017, 6:47 AM -0700, lschr wrote:
This should be easy enough. The only problem I have come across so far is attribute propagation. A few possibilities:
* Propagate only attributes from the first ancestor
* Propagate attributes from all ancestors as long as there are no name conflicts. If there are conflicts, use the attribute from the first ancestor that has the attribute.
* Return a tuple containing the respective attribute values from all ancestors.
* Do some name mangling to avoid conflicts.
I'd strongly prefer the first or second solution. The third option is cumbersome to use in my opinion; the propagated attribute would need different treatment from the original attribute. The fourth option is messy altogether. Anyone needing non-trivial treatment of attributes can do so by subclassing Pipeline.
Any thoughts?
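To make the options above concrete, here is a minimal sketch of how the first three propagation strategies could be selected with a keyword, in the spirit of the `how` argument suggested earlier in the thread. The helper name `collect_attrs`, the `how` values, and the example objects are all hypothetical and are not part of the library; name mangling (the fourth option) is left out.

```python
from types import SimpleNamespace


def collect_attrs(ancestors, names, how="first_match"):
    """Gather attribute values from several ancestors (hypothetical helper)."""
    result = {}
    for name in names:
        if how == "first":
            # Option 1: only the first ancestor is consulted.
            if hasattr(ancestors[0], name):
                result[name] = getattr(ancestors[0], name)
        elif how == "first_match":
            # Option 2: on a conflict, the first ancestor that has the
            # attribute wins; later ancestors only fill in missing names.
            for anc in ancestors:
                if hasattr(anc, name):
                    result[name] = getattr(anc, name)
                    break
        elif how == "tuple":
            # Option 3: one value per ancestor, None where it is missing.
            result[name] = tuple(getattr(anc, name, None) for anc in ancestors)
        else:
            raise ValueError(f"unknown propagation mode: {how!r}")
    return result


# Two stand-in ancestors with one conflicting attribute name.
mask = SimpleNamespace(frame_shape=(256, 256))
video = SimpleNamespace(frame_shape=(512, 512), fps=30)

first = collect_attrs([mask, video], ["frame_shape", "fps"], how="first")
merged = collect_attrs([mask, video], ["frame_shape", "fps"], how="first_match")
both = collect_attrs([mask, video], ["frame_shape", "fps"], how="tuple")
```

With the tuple mode, every propagated attribute changes type compared to the original, which is the extra handling the comment above calls cumbersome.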
#30 is an initial implementation. I suggest moving the discussion over there.
The implementation in #30 is finished now.
I was thinking about implementing support for multiple inputs (and maybe outputs) to pipelines. For example, it would be nice to be able to
Before I start I wanted to ask how to go about it. A few possibilities, decreasing in elegance (in my opinion) but increasing in API compatibility:
* `Pipeline` class (and `pipeline` decorator)
* `__init__` arguments: `def __init__(proc_func, *ancestors, propagate_attrs=None)`. This breaks the API.
* `__init__` arguments in order: `def __init__(*args, propagate_attrs=None)`, where `args[-1]` is `proc_func`. This turns `propagate_attrs` into a keyword-only argument and `proc_func` and `ancestor` into positional arguments but otherwise preserves the API.
* `def __init__(ancestor, proc_func, propagate_attrs=None, *other_ancestors)`
Which is the preferred way?
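For illustration, the `*args` variant could look roughly like the sketch below. This is a simplified stand-in class written for this discussion, not the library's actual implementation: the indexing behaviour, the error messages, and the private attribute names are assumptions.

```python
class Pipeline:
    """Simplified stand-in showing the ``*args`` signature (hypothetical)."""

    def __init__(self, *args, propagate_attrs=None):
        if len(args) < 2:
            raise TypeError("expected one or more ancestors followed by proc_func")
        # The last positional argument is proc_func, everything before it
        # is an ancestor, so Pipeline(ancestor, proc_func) keeps working.
        *ancestors, proc_func = args
        if not callable(proc_func):
            raise TypeError("the last positional argument must be callable")
        self._ancestors = tuple(ancestors)
        self._proc_func = proc_func
        self._propagate_attrs = set(propagate_attrs or ())

    def __getitem__(self, i):
        # Lazily apply proc_func to the i-th item of every ancestor.
        return self._proc_func(*(anc[i] for anc in self._ancestors))


# Old-style call with a single ancestor.
doubled = Pipeline([1, 2, 3], lambda a: a * 2)
# New-style call with two ancestors.
summed = Pipeline([1, 2, 3], [10, 20, 30], lambda a, b: a + b)
```

The trade-off is that the signature no longer documents itself: the roles of the positional arguments are only visible in the docstring and in the unpacking inside `__init__`.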