[WIP] distribution classes #390
Open
pavanramkumar wants to merge 24 commits into glm-tools:master from pavanramkumar:distributions
Commits
78ef281 migrate gaussian distr to subclass of BaseDistribution (pavanramkumar)
5cf1a01 add Poisson class (pavanramkumar)
5592298 add PoissonSoftplus class (pavanramkumar)
f53e5f3 add NegBinomialSoftplus class (pavanramkumar)
a93165d add Binomial and Probit classes (pavanramkumar)
be79bdc add GammaSoftplus class (pavanramkumar)
9e43a02 TST add test for distributions (jasmainak)
3818291 DOC: add example of cloglog (jasmainak)
f6536d1 Simplify Poisson (jasmainak)
63ee2ca No need of init (jasmainak)
4cfe839 revert API of gradhess_log_likelihood_1d to accept z (pavanramkumar)
6a8b7f4 make most tests pass (pavanramkumar)
051c1d9 re-add gradhess_log_likelihood_1d for softplus (pavanramkumar)
e0c2f31 bug in cdfast update for z (pavanramkumar)
e6e6d22 style fix for test_compare_sklearn (pavanramkumar)
888e5b8 private method _set_distr derives self.distr_ object from self.distr str (pavanramkumar)
d33c6ca remove legacy functions (pavanramkumar)
c495dfc remove more comments (pavanramkumar)
be4cd0b simulate methods for distribution classes (pavanramkumar)
b2a9cce flake8 (pavanramkumar)
179e1c6 Flake8 and style (jasmainak)
615fcf9 DOC update api and refactor softplus (jasmainak)
9bdab76 FIX probit (jasmainak)
081fa04 TST make basedistribution happy (jasmainak)
@@ -0,0 +1,76 @@
"""
======================
Rolling out custom GLM
======================

This is an example demonstrating rolling out your custom
GLM class using Pyglmnet.
"""
########################################################

# Author: Pavan Ramkumar <[email protected]>
# License: MIT

from sklearn.model_selection import train_test_split
from pyglmnet import GLMCV, datasets

########################################################
# Download and preprocess data files

X, y = datasets.fetch_community_crime_data()
n_samples, n_features = X.shape

########################################################
# Split the data into training and test sets

X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.33, random_state=0)

########################################################
# Now we define our own distribution class. This must
# inherit from BaseDistribution. The BaseDistribution
# class requires defining the following methods:
#     - mu: inverse link function
#     - grad_mu: gradient of the inverse link function
#     - log_likelihood: the log likelihood function
#     - grad_log_likelihood: the gradient of the log
#       likelihood.
# All distributions in pyglmnet inherit from BaseDistribution.
#
# This is really powerful. For instance, we can start from
# the existing Binomial distribution and override mu and grad_mu
# if we want to use the cloglog link function.

import numpy as np  # noqa: E402
from pyglmnet.distributions import Binomial  # noqa: E402


class CustomBinomial(Binomial):
    """Custom binomial distribution."""

    def mu(self, z):
        """cloglog inverse link"""
        mu = 1 - np.exp(-np.exp(z))
        return mu

    def grad_mu(self, z):
        """Gradient of inverse link."""
        # d/dz [1 - exp(-exp(z))] = exp(z) * exp(-exp(z))
        grad_mu = np.exp(z - np.exp(z))
        return grad_mu


distr = CustomBinomial()

########################################################
# Now we pass it to the GLMCV class just as before.

# use the default value for reg_lambda
glm = GLMCV(distr=distr, alpha=0.05, score_metric='pseudo_R2', cv=3,
            tol=1e-4)

# fit model
glm.fit(X_train, y_train)

# score the test set prediction
y_test_hat = glm.predict_proba(X_test)
print("test set pseudo $R^2$ = %f" % glm.score(X_test, y_test))
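
A hand-written `grad_mu` is easy to get wrong, so a quick numerical check may be worth running before fitting. For the inverse cloglog link mu(z) = 1 - exp(-exp(z)), the chain rule gives d mu / dz = exp(z - exp(z)). The sketch below compares that expression against a central finite difference. It uses only NumPy; the standalone function names `cloglog_mu` and `cloglog_grad_mu` are illustrative and are not part of pyglmnet.

```python
import numpy as np


def cloglog_mu(z):
    """Inverse cloglog link: mu = 1 - exp(-exp(z))."""
    return 1 - np.exp(-np.exp(z))


def cloglog_grad_mu(z):
    """Analytic derivative of the inverse link: exp(z - exp(z))."""
    return np.exp(z - np.exp(z))


# Evaluate on a small grid of linear predictor values (arbitrary choice).
z = np.linspace(-3.0, 2.0, 11)
eps = 1e-6

# Central finite-difference approximation of d mu / dz.
numeric = (cloglog_mu(z + eps) - cloglog_mu(z - eps)) / (2 * eps)

max_abs_err = np.max(np.abs(numeric - cloglog_grad_mu(z)))
print("max abs error:", max_abs_err)
```

A small maximum error suggests the analytic gradient matches the link; a large one usually points to a slip in the derivative.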
@pavanramkumar I'm not sure this is legit. The `log_likelihood` function for `Binomial` does not call `self.mu`, or does it? Is the log likelihood the log likelihood taking into account the link function?
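
The concern can be illustrated with a toy sketch. The classes below are hypothetical and are not pyglmnet's implementation; they only contrast a log likelihood that recomputes the logistic mean inline with one that goes through `self.mu`. Overriding `mu` with the cloglog inverse link changes the result only in the second case.

```python
import numpy as np


class HardcodedBinomial:
    """Hypothetical: log likelihood recomputes the logistic mean inline."""

    def mu(self, z):
        return 1.0 / (1.0 + np.exp(-z))

    def log_likelihood(self, y, z):
        p = 1.0 / (1.0 + np.exp(-z))  # does not call self.mu
        return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))


class LinkAwareBinomial(HardcodedBinomial):
    """Hypothetical: log likelihood obtains the mean from self.mu."""

    def log_likelihood(self, y, z):
        p = self.mu(z)
        return np.sum(y * np.log(p) + (1 - y) * np.log(1 - p))


def cloglog_mu(self, z):
    """Inverse cloglog link, used to override mu in the subclasses."""
    return 1 - np.exp(-np.exp(z))


class HardcodedCloglog(HardcodedBinomial):
    mu = cloglog_mu


class LinkAwareCloglog(LinkAwareBinomial):
    mu = cloglog_mu


# Toy targets and linear predictor values (made up for illustration).
y = np.array([0.0, 1.0, 1.0, 0.0])
z = np.array([-1.0, 0.5, 1.0, -0.3])

# Does overriding mu change the log likelihood?
hard_changed = not np.isclose(
    HardcodedCloglog().log_likelihood(y, z),
    HardcodedBinomial().log_likelihood(y, z))
aware_changed = not np.isclose(
    LinkAwareCloglog().log_likelihood(y, z),
    LinkAwareBinomial().log_likelihood(y, z))
print(hard_changed, aware_changed)
```

If the real `Binomial.log_likelihood` behaves like the hardcoded variant, a subclass that overrides only `mu` and `grad_mu` would still be scored under the original link, which is the situation the question is probing.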