TST: compare to sklearn.neural_network #155
Conversation
Codecov Report
@@            Coverage Diff             @@
##             develop     #155   +/-  ##
==========================================
  Coverage           ?   99.35%
  Files              ?        5
  Lines              ?      618
  Branches           ?        0
==========================================
  Hits               ?      614
  Misses             ?        4
  Partials           ?        0

Continue to review full report at Codecov.
Co-authored-by: Scott Sievert <[email protected]>
.github/workflows/tests.yaml (outdated)

-        tf-version: [2.2.0]
+        tf-version: [2.4.0]
These version bumps are needed in order to get rid of `_windows_upcast_ints` and have the same tests run on all platforms (otherwise there would've been a mess of `pytest.skipif` markers).
This also coincides with the same version bump in #128, so I think it makes sense to do. We could perhaps do the version bump in a separate PR from those two to keep the diffs saner.
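For context, the rejected alternative would have looked roughly like the guard below. This is only my illustration: `needs_skip_for_old_tf` and its version parsing are not SciKeras code, just a sketch of the condition that every affected test would have had to carry.

```python
import sys

# Sketch of the rejected alternative: instead of bumping TensorFlow,
# every affected test would carry a platform/version skip condition.
RUNNING_ON_WINDOWS = sys.platform == "win32"

def needs_skip_for_old_tf(tf_version: str) -> bool:
    """True when the int-upcasting workaround would still be active:
    Windows combined with a TensorFlow older than 2.4."""
    major, minor = (int(p) for p in tf_version.split(".")[:2])
    return RUNNING_ON_WINDOWS and (major, minor) < (2, 4)

# In a real suite this condition would feed pytest.mark.skipif, e.g.:
# @pytest.mark.skipif(needs_skip_for_old_tf(tf.__version__), reason="...")
```

Multiplied across every dtype-sensitive test, this is the "mess" the version bump avoids.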
What will fail when this PR is run with TF v2.2 on Windows? Will Windows users notice a difference?
`test_input_dtype_conversion` will fail because on Windows `_windows_upcast_ints` converts all ints to int64 before passing them to Keras. This could be worked around by checking the dtype at a point before that (but the very fact that that might work is an argument for not doing it), or by using something like `pytest.mark.skipif(os.name == "nt")`.
Windows users won't notice any difference (I doubt the type conversion mattered in terms of performance).
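As a hedged sketch of the workaround being removed (the real `_windows_upcast_ints` lives in SciKeras; this is only a reconstruction of the behavior described above, not the actual helper):

```python
import sys

import numpy as np

def windows_upcast_ints(arr: np.ndarray) -> np.ndarray:
    """Reconstruction (not SciKeras's actual helper) of the described
    behavior: on Windows, upcast any integer array to int64 before it
    reaches Keras; leave all other dtypes untouched."""
    if sys.platform == "win32" and np.issubdtype(arr.dtype, np.integer):
        return arr.astype(np.int64)
    return arr
```

A conversion like this is invisible to users as long as data only flows inward, which is why removing it changes nothing observable.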
I'm going to make a `develop` branch where I'll do the version bump separately and then redirect this PR, #128 and #143 to `develop`.
So the only difference a Windows user will notice with TF 2.2 is that the input and output dtypes aren't exactly the same? They'll input `int32` and get out `int64`, or vice versa?
Is that fixed under TF 2.4?
Previously, when users/the data transformer handed us data of dtype `int32`, we transformed it to `int64` since TF crashed with `int32`. Since we never returned data to the users, they did not see the difference. That seems to be fixed in 2.4.0, so we can remove the extra conversion without users noticing any difference.
This behavior/conversion never affected the output dtypes; that is completely separate. The reason it was originally in this PR (I have since moved it to a separate commit to keep this PR focused) is that this PR implements a test that was not compatible with the ad-hoc dtype conversion being done in `_windows_upcast_ints`.
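The kind of check that clashes with ad-hoc upcasting can be sketched like this. It is a toy stand-in: `RecordingModel` and `check_dtype_passthrough` are illustrative names, not the actual test code in the PR.

```python
import numpy as np

class RecordingModel:
    """Toy stand-in for a Keras model that records the dtype it is fed."""
    def __init__(self):
        self.seen_dtype = None

    def fit(self, X):
        self.seen_dtype = X.dtype

def check_dtype_passthrough(transform, dtype=np.int32) -> bool:
    """Sketch of what a test like test_input_dtype_conversion asserts:
    the dtype the backend sees equals the dtype the user passed in,
    i.e. no ad-hoc upcasting happened along the way."""
    model = RecordingModel()
    X = np.zeros((4, 2), dtype=dtype)
    model.fit(transform(X))
    return model.seen_dtype == dtype
```

Any silent `int32` to `int64` conversion in `transform` makes the check fail, which is exactly why the helper had to go before the test could land.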
Oh, good. If there was never a test for `_windows_upcast_ints`, don't worry about it. I can't parse all the test changes, which is why I asked.
I'm not sure if this answers your question on test deletion.

> Why not delete a test if the [private function] has been removed? (paraphrased)

Why do you write tests? I write tests to ensure the software works as expected for the end user when they pick it up, and in every corner case too.
There's an argument for testing private behavior, or unit testing, but that's only appropriate for behemoth projects with many developers.
One such corner case for SciKeras is Windows. The fact that code has been deleted is irrelevant to test deletion. Again, why do you write tests? To ensure SciKeras works well for Windows users.
I do realize the test diff is pretty large and not particularly easy to parse. I am happy to rework this into multiple PRs, write out the new tests in pseudocode, or anything else that might make review easier; just let me know what you think would help.
Regarding the question about test deletion, thank you for answering it even though it was a pretty poor question on my part. I think we both agree that writing unit tests for private functions like `_windows_upcast_ints` is not a good approach. Luckily, you already convinced me that that approach to testing is not optimal, so we never implemented `test_windows_upcast_ints`.
Because of your help and suggestions, I was able to remove `_windows_upcast_ints` in 1c1e17b (after bumping the TF version in 294183a) without having to change any tests. Thus all of the test changes in this PR are purely to re-organize our tests and implement comparisons to sklearn `MLP{Classifier,Regressor}`.
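The shape of those comparison tests can be sketched as follows. The helper name is hypothetical; the real tests live in `test_input_outputs.py`, but the idea is to fit sklearn's estimator as a reference and check the same API surface on the SciKeras side.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

def reference_predictions(X, y):
    """Fit sklearn's MLPClassifier as the reference implementation and
    return its predictions; a SciKeras classifier under test would be
    expected to match this output contract (shape, class labels) on
    the same data."""
    clf = MLPClassifier(hidden_layer_sizes=(5,), max_iter=200, random_state=0)
    clf.fit(X, y)
    return clf.predict(X)
```

Comparing against sklearn's own `MLPClassifier`/`MLPRegressor` pins the expected behavior to a well-tested reference rather than to hand-written expectations.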
> I do realize the test diff is pretty large, and not particularly easy to parse.
Don't worry about the diff. I only wrote that to let you know I only skimmed the tests. It sure sounds like you've got the testing under control.
Just checking, would you like me to leave this open for a more in-depth review of the tests, or should I move forward with this and/or #128, which are both internal refactors? No problem either way, just let me know.
Co-authored-by: Scott Sievert <[email protected]>
…scikeras into mlp-input-output-tests
* TST: compare to sklearn.neural_network (#155)
* MAINT/ENH: SaveModel based serialization (#128)
* Bump min TensorFlow version to 2.4.0 to accommodate other changes

Co-authored-by: Scott <[email protected]>
Co-authored-by: Scott Sievert <[email protected]>
Closes #119.
An overhaul of `test_input_outputs.py` to combine tests and compare to `MLPRegressor`/`MLPClassifier`.

A couple of interesting notes:
* 0/1. But there is something to be said for returning the same dtype as the input.
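The dtype note above can be illustrated with a tiny sketch (the helper name is my own, not part of the PR): class predictions naturally come out as 0/1 integers, and one option is to cast them back to the dtype of the original targets.

```python
import numpy as np

def match_target_dtype(preds: np.ndarray, y: np.ndarray) -> np.ndarray:
    """One way to make the output dtype mirror the input dtype: cast
    the 0/1 class predictions back to the dtype of the targets the
    user originally supplied."""
    return preds.astype(y.dtype)
```

The trade-off is between a predictable canonical output (always integer 0/1) and round-trip consistency with whatever dtype the user passed in.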