No default loss for KerasClassifier #206
Comments
Keras itself does not provide a default loss. The loss depends on the dataset, so I don't think it makes sense to provide a concrete loss as a default. At most, we could accept a string like But this also sounds like something users could implement, or that could be shown in notebooks. If we had to implement a default, I guess it would be Do you have any suggestions for a default loss you would like for
Why should the user specify the Aren't particular classes of loss functions pretty commonly used for classification regardless of dataset (like cross entropy)? How does the Scikit-learn API choose the loss for its classifiers when the loss function can vary? (e.g., SGDClassifier)
Because the loss function depends not only on the input data (which is not too hard to parse, especially since Scikit-Learn has tools like
Scikit-Learn has full control of the internals. They can choose/build a model to match a loss function; SciKeras/Keras have to work with whatever Model users create. That said, I don't really know what Scikit-Learn does internally, so I could be missing something. I don't mean to be overly negative on the idea. I do think that abstracting things like the loss function from users is good. The simpler it is for users to get started, the better. But I would like to work on core parts of SciKeras (like #167) that enable Scikit-Learn/Keras interoperability first before adding more convenience layers.
To be clear, I think
Correct me where I'm wrong:
My question: why can't you perform some heuristic processing to determine the right loss function? I am more than okay with
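A rough sketch of the kind of heuristic being proposed might look like the following. The function name and the inference rules are mine, not SciKeras API, and the sketch deliberately covers only the common cases:

```python
import numpy as np

def guess_loss(y):
    """Naively pick a Keras loss name from the target's shape and values.

    Illustrative heuristic only, not SciKeras code: it handles the
    common cases and silently guesses on ambiguous ones (e.g. a 2D
    0/1 matrix could be one-hot multiclass or multilabel).
    """
    y = np.asarray(y)
    if y.ndim == 1 or (y.ndim == 2 and y.shape[1] == 1):
        # 1D targets: class indices.
        if len(np.unique(y)) <= 2:
            return "binary_crossentropy"
        return "sparse_categorical_crossentropy"
    if y.ndim == 2:
        # A 2D 0/1 matrix with exactly one 1 per row *looks* one-hot...
        if np.isin(y, [0, 1]).all() and (y.sum(axis=1) == 1).all():
            return "categorical_crossentropy"
        # ...otherwise we guess multilabel / independent binary outputs.
        return "binary_crossentropy"
    raise ValueError(f"cannot infer a loss for target of shape {y.shape}")
```

For example, `guess_loss([0, 1, 1, 0])` returns `"binary_crossentropy"` and `guess_loss(np.eye(3))` returns `"categorical_crossentropy"`; the next comment explains why rules like these are less robust than they look.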
You are mostly right. But the devil is in the details. One-hot encoded targets are a good example. Some other sticky points I can think of:
So to answer your question, there is no particular reason why we can't implement a heuristic for this; I just think the implementation is harder than it sounds, and it will be hard to anticipate all the use cases and sharp edges.
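The one-hot sticking point can be made concrete: the same 2D 0/1 target matrix is simultaneously a valid one-hot multiclass encoding and a valid multilabel target, and nothing in the array itself distinguishes the two readings (illustration only, not SciKeras code):

```python
import numpy as np

# Three samples, three columns of 0/1 values.
y = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, 1]])

# Reading 1: one-hot multiclass labels -> categorical_crossentropy.
as_class_indices = y.argmax(axis=1)  # array([0, 1, 2])

# Reading 2: three independent binary labels -> binary_crossentropy.
# Every row is also a perfectly legal multilabel assignment.
is_valid_multilabel = bool(np.isin(y, [0, 1]).all())  # True

# Both readings are internally consistent, so a heuristic that only
# sees `y` has to guess which loss the user actually meant.
print(as_class_indices, is_valid_multilabel)
```

This is the kind of ambiguity that makes a "sensible default" hard to get right without knowing the user's intent or the Model's output layer.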
I'm not proposing to cover every possible scenario. I think covering the majority of use cases would really enhance the usability of SciKeras. Wouldn't
I agree that we don't need to cover every scenario, but we at least have to think of how the failure might happen, and what the errors might look like to users. The last thing we want is to make it more confusing for users. If you want to tackle this in the short term, a PR would be welcome. You might have ideas about implementation that I can't think of.
Theoretically, for a single output, yes. In practice,
Why does KerasClassifier not have a default loss?
scikeras/scikeras/wrappers.py
Lines 1229 to 1231 in d83bffa
Shouldn't there be at least a sensible default that works in most use cases?
Reference issues/PRs
dask/dask-ml#794