[WIP] distribution classes #390

Open
wants to merge 24 commits into
base: master
24 commits
78ef281
migrate gaussian distr to subclass of BaseDistribution
pavanramkumar Aug 16, 2020
5cf1a01
add Poisson class
pavanramkumar Aug 18, 2020
5592298
add PoissonSoftplus class
pavanramkumar Aug 18, 2020
f53e5f3
add NegBinomialSoftplus class
pavanramkumar Aug 18, 2020
a93165d
add Binomial and Probit classes
pavanramkumar Aug 18, 2020
be79bdc
add GammaSoftplus class
pavanramkumar Aug 18, 2020
9e43a02
TST add test for distributions
jasmainak Aug 18, 2020
3818291
DOC: add example of cloglog
jasmainak Aug 18, 2020
f6536d1
Simplify Poisson
jasmainak Aug 18, 2020
63ee2ca
No need of init
jasmainak Aug 18, 2020
4cfe839
revert API of gradhess_log_likelihood_1d to accept z
pavanramkumar Aug 19, 2020
6a8b7f4
make most tests pass
pavanramkumar Aug 19, 2020
051c1d9
re-add gradhess_log_likelihood_1d for softplus
pavanramkumar Aug 19, 2020
e0c2f31
bug in cdfast update for z
pavanramkumar Aug 19, 2020
e6e6d22
style fix for test_compare_sklearn
pavanramkumar Aug 19, 2020
888e5b8
private method _set_distr derives self.distr_ object from self.distr str
pavanramkumar Aug 19, 2020
d33c6ca
remove legacy functions
pavanramkumar Aug 19, 2020
c495dfc
remove more comments
pavanramkumar Aug 19, 2020
be4cd0b
simulate methods for distribution classes
pavanramkumar Aug 19, 2020
b2a9cce
flake8
pavanramkumar Aug 19, 2020
179e1c6
Flake8 and style
jasmainak Aug 19, 2020
615fcf9
DOC update api and refactor softplus
jasmainak Aug 19, 2020
9bdab76
FIX probit
jasmainak Aug 19, 2020
081fa04
TST make basedistribution happy
jasmainak Aug 19, 2020
30 changes: 24 additions & 6 deletions doc/api.rst
@@ -6,14 +6,32 @@ API Documentation

 .. currentmodule:: pyglmnet

-Classes
-=======
+GLM Classes
+===========

-.. autoclass:: GLM
-   :members:
+.. currentmodule:: pyglmnet
+
+.. autosummary::
+   :toctree: generated/
+
+   GLM
+   GLMCV
+
+Distribution Classes
+====================
+
+.. currentmodule:: pyglmnet.distributions
+
+.. autosummary::
+   :toctree: generated/

-.. autoclass:: GLMCV
-   :members:
+   BaseDistribution
+   Poisson
+   PoissonSoftplus
+   NegBinomialSoftplus
+   Binomial
+   Probit
+   GammaSoftplus


 Datasets
76 changes: 76 additions & 0 deletions examples/plot_custom_distributions.py
@@ -0,0 +1,76 @@
"""
======================
Rolling out custom GLM
======================

This is an example demonstrating rolling out your custom
GLM class using Pyglmnet.
"""
########################################################

# Author: Pavan Ramkumar <[email protected]>
# License: MIT

from sklearn.model_selection import train_test_split
from pyglmnet import GLMCV, datasets

########################################################
# Download and preprocess data files

X, y = datasets.fetch_community_crime_data()
n_samples, n_features = X.shape

########################################################
# Split the data into training and test sets

X_train, X_test, y_train, y_test = \
    train_test_split(X, y, test_size=0.33, random_state=0)

########################################################
# Now we define our own distribution class. This must
# inherit from BaseDistribution. The BaseDistribution
# class requires defining the following methods:
#     - mu: inverse link function
#     - grad_mu: gradient of the inverse link function
#     - log_likelihood: the log likelihood function
#     - grad_log_likelihood: the gradient of the log
#       likelihood.
# All distributions in pyglmnet inherit from BaseDistribution.
#
# This makes customization straightforward. For instance, we can
# start from the existing Binomial distribution and override mu
# and grad_mu if we want to use the cloglog link function.

import numpy as np # noqa: E402
from pyglmnet.distributions import Binomial # noqa: E402


class CustomBinomial(Binomial):
Review comment (Member):

@pavanramkumar I'm not sure this is legit. Does the log_likelihood function for Binomial call self.mu? In other words, does the log likelihood take the link function into account?

    """Custom binomial distribution."""

    def mu(self, z):
        """cloglog inverse link."""
        mu = 1 - np.exp(-np.exp(z))
        return mu

    def grad_mu(self, z):
        """Gradient of inverse link."""
        # d/dz [1 - exp(-exp(z))] = exp(z) * exp(-exp(z)) = exp(z - exp(z))
        grad_mu = np.exp(z - np.exp(z))
        return grad_mu


distr = CustomBinomial()

########################################################
# Now we pass it to the GLMCV class just as before.

# use the default value for reg_lambda
glm = GLMCV(distr=distr, alpha=0.05, score_metric='pseudo_R2', cv=3,
            tol=1e-4)

# fit model
glm.fit(X_train, y_train)

# score the test set prediction
y_test_hat = glm.predict_proba(X_test)
print("test set pseudo $R^2$ = %f" % glm.score(X_test, y_test))
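
The example's premise (overriding only `mu` and `grad_mu` to change the link) works only if the likelihood methods are written in terms of `self.mu(z)`, which is the question raised in the review comment. The following is a minimal, self-contained sketch of that design, not pyglmnet's implementation: `SketchBinomial`, `SketchCloglog`, and `max_fd_error` are hypothetical names, and the class only mirrors the four method names listed in the example. It also checks numerically that the cloglog inverse link `1 - exp(-exp(z))` has derivative `exp(z - exp(z))`.

```python
import numpy as np


class SketchBinomial:
    """Hypothetical stand-in for a Binomial distribution with a logit link.

    Not pyglmnet's class: it assumes the likelihood is expressed through
    self.mu(z), so that subclasses can swap the link by overriding mu.
    """

    def mu(self, z):
        """Inverse link: logistic sigmoid."""
        return 1.0 / (1.0 + np.exp(-z))

    def grad_mu(self, z):
        """Derivative of the sigmoid with respect to z."""
        m = self.mu(z)
        return m * (1.0 - m)

    def log_likelihood(self, y, z):
        """Bernoulli log likelihood, summed over samples."""
        m = np.clip(self.mu(z), 1e-12, 1 - 1e-12)
        return np.sum(y * np.log(m) + (1 - y) * np.log(1 - m))

    def grad_log_likelihood(self, y, z):
        """Per-sample gradient w.r.t. z, via the chain rule through mu."""
        m = np.clip(self.mu(z), 1e-12, 1 - 1e-12)
        return (y - m) / (m * (1 - m)) * self.grad_mu(z)


class SketchCloglog(SketchBinomial):
    """Same likelihood, complementary log-log link."""

    def mu(self, z):
        return 1.0 - np.exp(-np.exp(z))

    def grad_mu(self, z):
        return np.exp(z - np.exp(z))


def max_fd_error(f, grad, z, eps=1e-6):
    """Largest gap between grad(z) and a central finite difference of f."""
    numeric = (f(z + eps) - f(z - eps)) / (2 * eps)
    return np.max(np.abs(numeric - grad(z)))


z = np.linspace(-3.0, 1.5, 10)
y = (np.arange(10) % 2).astype(float)
cloglog = SketchCloglog()

# Should be tiny if grad_mu matches mu.
print("max |finite difference - grad_mu|:",
      max_fd_error(cloglog.mu, cloglog.grad_mu, z))
# The two links give different likelihoods for the same y and z.
print("logit log likelihood:  ", SketchBinomial().log_likelihood(y, z))
print("cloglog log likelihood:", cloglog.log_likelihood(y, z))
```

If the real `Binomial.log_likelihood` hard-codes the logistic function instead of calling `self.mu`, a subclass like `CustomBinomial` would have to override `log_likelihood` and `grad_log_likelihood` as well.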
2 changes: 1 addition & 1 deletion pyglmnet/__init__.py
@@ -19,7 +19,7 @@
 __version__ = '1.2.dev0'


-from .pyglmnet import GLM, GLMCV, _grad_L2loss, _L2loss, simulate_glm, _gradhess_logloss_1d, _loss, ALLOWED_DISTRS
+from .pyglmnet import GLM, GLMCV, _grad_L2loss, _L2loss, simulate_glm, _loss, ALLOWED_DISTRS
 from .utils import softmax, label_binarizer, set_log_level
 from .datasets import fetch_tikhonov_data, fetch_rgc_spike_trains, fetch_community_crime_data, fetch_group_lasso_data
 from . import externals