
ENH: Check input dimensions against the initialized model_ #143

Open · adriangb wants to merge 21 commits into master
Conversation

adriangb (Owner)

Closes #106
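
For context, a minimal sketch of the kind of check this PR is about: comparing the feature dimensions of X against the input layer of the already-built model_. The helper name, error message, and exact comparison below are illustrative assumptions, not the PR's actual code.

import numpy as np

def _check_input_dims(self, X: np.ndarray) -> None:
    # Illustrative only: a Keras input spec is (batch, *features),
    # with the batch dimension left as None.
    expected = tuple(self.model_.inputs[0].shape[1:])
    got = X.shape[1:]
    # Skip None dimensions so models with flexible input sizes pass.
    if len(expected) != len(got) or any(
        e is not None and e != g for e, g in zip(expected, got)
    ):
        raise ValueError(
            f"model_ was built with input shape {expected}, "
            f"but X has feature shape {tuple(got)}"
        )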

codecov-io commented Nov 29, 2020

Codecov Report

Merging #143 (56918eb) into master (e419d8e) will increase coverage by 0.01%.
The diff coverage is 100.00%.


@@            Coverage Diff             @@
##           master     #143      +/-   ##
==========================================
+ Coverage   99.52%   99.53%   +0.01%     
==========================================
  Files           5        5              
  Lines         627      646      +19     
==========================================
+ Hits          624      643      +19     
  Misses          3        3              
Impacted Files         Coverage            Δ
scikeras/_utils.py     100.00% <100.00%>   (ø)
scikeras/wrappers.py   99.22% <100.00%>    (+0.03%) ⬆️

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update e419d8e...56918eb.

adriangb marked this pull request as ready for review on November 30, 2020
adriangb (Owner, Author)

@stsievert tagging you to take a look whenever you have a chance. This addresses an issue you opened (#106).

# generic transformers just to make them SciKeras compatible
n_out_expect = getattr(self, "n_outputs_expected_", None)
if n_out_expect:
    if n_out_expect != len(self.model_.outputs):
stsievert (Collaborator)

Style nit: I'd collapse this if-statement:

n_out_expect = getattr(self, "n_outputs_expected_", None)
if n_out_expect and n_out_expect != len(self.model_.outputs):
    ...

adriangb (Owner, Author)

Yep, I'll implement that.

Do you think we should redefine n_outputs_expected_: int -> expected_output_shape_: Union[List[int], Dict[str, int]] so that each element/key can map to the expected output shape for each output?

stsievert (Collaborator)

> Do you think we should redefine n_outputs_expected_: int -> expected_output_shape_: Union[List[int], Dict[str, int]] so that each element/key can map to the expected output shape for each output?

I don't think so. An integer suffices for basic use cases (regressors and classifiers). I can only see output shape being relevant for autoencoders.

adriangb (Owner, Author) commented Nov 30, 2020

I'm thinking of multi-input or multi-output models (as to why it would be a list/dict).

As to why we'd need the expected output shape specifically: for classifiers, the output shape depends on the classes and the type of problem (e.g., Dense(1, activation="sigmoid") and Dense(2, activation="softmax") are generally equivalent), which is why we can't just check the shape of the data against the output layer the way we check X against the input layer.
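
For concreteness, a minimal Keras sketch of the two "generally equivalent" heads mentioned above, assuming a binary classification problem with four features (layer sizes are arbitrary):

from tensorflow import keras

# One sigmoid unit: output shape (n_samples, 1), emitting P(class 1).
clf_sigmoid = keras.Sequential([
    keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    keras.layers.Dense(1, activation="sigmoid"),
])

# Two softmax units: output shape (n_samples, 2),
# emitting [P(class 0), P(class 1)].
clf_softmax = keras.Sequential([
    keras.layers.Dense(8, activation="relu", input_shape=(4,)),
    keras.layers.Dense(2, activation="softmax"),
])

# Same decision function, different output shapes: the output layer
# cannot be validated from the target's shape alone.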

stsievert (Collaborator)

> I'm thinking of multi-input or multi-output models (as to why it would be a list/dict).

Maybe that's not a bad idea. I think the type of n_outputs_expected should be Union[int, List[int], Dict[str, int]]. That supports more complex models and simple user-defined transformers.

In this function, that would mean something like this:

n_out = getattr(self, "n_outputs_expected", None)
if isinstance(n_out, int):
    n_out = [n_out]
if n_out and len(n_out) != len(self.model_.outputs):
    ...
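
Extending that to the full proposed Union[int, List[int], Dict[str, int]] type, the value might first be normalized to a list with one entry per model output. The helper below is hypothetical, not code from this PR:

from typing import Dict, List, Optional, Union

def _to_output_list(
    n_out: Optional[Union[int, List[int], Dict[str, int]]]
) -> List[int]:
    """Coerce the proposed Union type to one entry per model output."""
    if n_out is None:
        return []
    if isinstance(n_out, int):
        return [n_out]
    if isinstance(n_out, dict):
        # Keyed by output name; order follows dict insertion order.
        return list(n_out.values())
    return list(n_out)

# The check then compares one entry per model output:
# n_out = _to_output_list(getattr(self, "n_outputs_expected", None))
# if n_out and len(n_out) != len(self.model_.outputs):
#     raise ValueError(...)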

adriangb (Owner, Author)

Another alternative is of course to revert to c7e6723 and continue this conversation in its own issue/PR.

stsievert (Collaborator)

I think I like checking the model better. But why is n_outputs_expected_ necessary?

> Another alternative is of course to revert to c7e6723

Do you mean delete the release from PyPI/GitHub? If I were handling releases, I think I'd rather delay any checks on n_outputs_expected_ to a later release.

adriangb (Owner, Author)

It's not required. There are other (perhaps better) ways to achieve the same thing.

That commit is in this branch. I was suggesting we revert to that commit in this branch and table this discussion for another issue/PR since the rest of the changes here stand on their own.

stsievert (Collaborator)

> That commit is in this branch. I was suggesting we revert to that commit in this branch and table this discussion for another issue/PR since the rest of the changes here stand on their own.

Oh, I see. I think reverting and a separate issue is a good idea.

> It's not required. There are other (perhaps better) ways to achieve the same thing.

I think those (perhaps better) ideas are worth exploring.

adriangb (Owner, Author)

Sounds good. I reverted to c7e6723 and implemented the collapsed logic from #143 (comment). I'll resolve this thread and we can continue the discussion on the rest of the PR. I also opened #148 to track the discussion surrounding output validation. Thank you for the feedback thus far!

Development
Successfully merging this pull request may close: Validate inputs to Keras model (#106).
3 participants