[MRG] Uniformize initialization for all algorithms #195

Merged
56 commits
a2ae9e1
initiate PR
Apr 23, 2019
5e626d5
Revert "initiate PR"
Apr 24, 2019
ffcfa2d
FEAT: uniformize init for NCA and RCA
Apr 24, 2019
27eb74b
Let the check of num_dims be done in the other PR
Apr 24, 2019
4395c13
Add metric initialization for algorithms that learn a mahalanobis matrix
May 2, 2019
09fda87
Add initialization for MLKR
May 2, 2019
0e59d72
FIX: fix error message for dimension
May 2, 2019
60ca662
FIX fix StringRepr for MLKR
May 2, 2019
71a75ed
FIX tests by reshaping to the right dataset size
May 3, 2019
1b2d296
Remove lda in docstring of MLKR
May 3, 2019
bd709e9
MAINT: Add deprecation for previous initializations
May 9, 2019
e162e6a
Update tests with new initialization
May 9, 2019
d1e88af
Make random init for mahalanobis metric generate an SPD matrix
May 9, 2019
eb98eff
Ensure the input mahalanobis metric initialization is symmetric, and …
May 9, 2019
508d94e
various fixes
May 9, 2019
bbf31cb
MAINT: various refactoring
May 9, 2019
aafa8e2
FIX fix default covariance for SDML in tests
May 9, 2019
748459e
Enhance docstring
May 10, 2019
06a55da
Set random state for SDML
May 10, 2019
d321319
Merge branch 'master' into feat/uniformized_initial_metric
May 13, 2019
26fb9e7
Fix merge remove_spaces that was forgotten
May 13, 2019
5e3daa4
Fix indent
May 13, 2019
e86b61b
XP: try to change the way we choose n_components to see if it fixes t…
May 13, 2019
0b69e7e
Revert "XP: try to change the way we choose n_components to see if it…
May 13, 2019
95a86a9
Be more tolerant in test
May 13, 2019
d622fae
Add test for singular covariance matrix
May 13, 2019
d2cc7ce
Fix test_singular_covariance_init
May 14, 2019
a7d2791
DOC: update docstring saying pseudo-inverse
May 14, 2019
3590cfa
Revert "Fix test_singular_covariance_init"
May 14, 2019
503a715
Ensure definiteness before returning the inverse
May 15, 2019
32bbdf3
wip deal with non definiteness
May 15, 2019
fdad8c2
Rename init to prior for SDML and LSML
May 16, 2019
5b048b4
Update error messages with either prior or init
May 17, 2019
d96930d
Remove message
May 17, 2019
2de3d4c
A few nitpicks
May 18, 2019
499a296
PEP8 errors + change init in test
May 18, 2019
c371d0c
STY: PEP8 fixes
May 18, 2019
b63d017
Address and remove TODOs
May 20, 2019
a5a6af8
Replace init by prior for ITML
Jun 3, 2019
9c4d70d
TST: fix ITML test with init changed into prior
Jun 3, 2019
8cb9c42
Add precision for MMC
Jun 4, 2019
b40e75e
Add ChangedBehaviorWarning for the algorithms that changed
Jun 5, 2019
0f5b9ed
Merge branch 'master' into feat/uniformized_initial_metric
Jun 5, 2019
cec35ab
Address https://github.com/metric-learn/metric-learn/pull/195#pullreq…
Jun 5, 2019
617ab0a
Remove the warnings check since we now have a ChangedBehaviorWarning
Jun 5, 2019
a5b13f2
Be more precise: it should not raise any ConvergenceWarningError
Jun 5, 2019
bd43168
Merge branch 'master' into feat/uniformized_initial_metric
Jun 5, 2019
0ea0aa6
Address https://github.com/metric-learn/metric-learn/pull/195#pullreq…
Jun 6, 2019
6e452ed
FIX remaining comment
Jun 6, 2019
4f822a8
TST: update test error message
Jun 6, 2019
c19ca4c
Improve readability
Jun 6, 2019
d8181d0
Address https://github.com/metric-learn/metric-learn/pull/195#pullreq…
Jun 7, 2019
21e20c6
Merge branch 'master' into feat/uniformized_initial_metric
Jun 7, 2019
e27d8a1
TST: Fix docsting lmnn
Jun 7, 2019
4a861c8
Fix warning messages
Jun 7, 2019
dd2b8c7
Fix warnings messages changed
Jun 7, 2019
123 changes: 121 additions & 2 deletions metric_learn/_util.py
@@ -1,10 +1,13 @@
import warnings
import numpy as np
import six
from numpy.linalg import LinAlgError
from sklearn.decomposition import PCA
from sklearn.utils import check_array
from sklearn.utils.validation import check_X_y
from sklearn.utils.validation import check_X_y, check_random_state
from metric_learn.exceptions import PreprocessorError
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
import sys
import time

# hack around lack of axis kwarg in older numpy versions
try:
@@ -405,3 +408,119 @@ def validate_vector(u, dtype=None):
if u.ndim > 1:
raise ValueError("Input vector should be 1-D.")
return u


def _initialize_transformer(num_dims, X, y=None, init='auto', verbose=False,
random_state=None):
"""Returns the initial transformer to be used depending on the arguments.

Parameters
----------
num_dims : int
The number of components to take. (Note: it should have been checked
before, meaning it should not be None and it should be a value in
[1, X.shape[1]])

X : array-like
The input samples.

y : array-like or None
The input labels, or None if there are no labels.

init : array-like or None or str
The initial matrix.

verbose : bool
Whether to print the details of the initialization or not.

random_state : int or `numpy.RandomState` or None, optional (default=None)
A pseudo random number generator object or a seed for it if int. If
``init='random'``, ``random_state`` is used to initialize the random
transformation. If ``init='pca'``, ``random_state`` is passed as an
argument to PCA when initializing the transformation.

Returns
-------
init_transformer : `numpy.ndarray`
The initial transformer to use.
"""

if isinstance(init, np.ndarray):
init = check_array(init)

# Assert that init.shape[1] = X.shape[1]
if init.shape[1] != X.shape[1]:
raise ValueError('The input dimensionality ({}) of the given '
'linear transformation `init` must match the '
'dimensionality of the given inputs `X` ({}).'
.format(init.shape[1], X.shape[1]))

# Assert that init.shape[0] <= init.shape[1]
if init.shape[0] > init.shape[1]:
raise ValueError('The output dimensionality ({}) of the given '
'linear transformation `init` cannot be '
'greater than its input dimensionality ({}).'
.format(init.shape[0], init.shape[1]))

if num_dims is not None:
# Assert that self.num_dims = init.shape[0]
if num_dims != init.shape[0]:
raise ValueError('The preferred dimensionality of the '
'projected space `num_dims` ({}) does'
' not match the output dimensionality of '
'the given linear transformation '
'`init` ({})!'
.format(num_dims,
init.shape[0]))
elif init in ['auto', 'pca', 'lda', 'identity', 'random']:
pass
else:
raise ValueError(
"`init` must be 'auto', 'pca', 'lda', 'identity', 'random' "
"or a numpy array of shape (num_dims, n_features).")

random_state = check_random_state(random_state)
transformation = init
if isinstance(init, np.ndarray):
pass
else:
Contributor: Especially for long functions with lots of nesting like this one, I prefer the "no else" style:

if condition:
  return foo
# everything else at original indent

Member Author: That's right, it's better, done.

n_samples, n_features = X.shape
num_dims = num_dims or n_features
if init == 'auto':
Contributor: This might be simpler to test if we broke out pieces into standalone functions. For example, the "auto-select" logic could be its own function.

Member Author: I agree; done, and tested the function.
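For reference, a minimal sketch of what such a standalone auto-select helper could look like (the name _auto_select_init and its exact signature are assumptions for illustration, not necessarily what was merged):

def _auto_select_init(X, y, num_dims):
    # Hypothetical helper mirroring the 'auto' branch below.
    n_samples, n_features = X.shape
    n_classes = len(np.unique(y))
    if num_dims <= min(n_features, n_classes - 1):
        return 'lda'  # LDA can exploit the label information
    if num_dims < min(n_features, n_samples):
        return 'pca'  # keep the directions of highest variance
    return 'identity'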

n_classes = len(np.unique(y))
if num_dims <= min(n_features, n_classes - 1):
init = 'lda'
elif num_dims < min(n_features, n_samples):
init = 'pca'
else:
init = 'identity'
if init == 'identity':
transformation = np.eye(num_dims, X.shape[1])
elif init == 'random':
transformation = random_state.randn(num_dims,
X.shape[1])
elif init in {'pca', 'lda'}:
init_time = time.time()
if init == 'pca':
pca = PCA(n_components=num_dims,
random_state=random_state)
if verbose:
print('Finding principal components... ')
sys.stdout.flush()
pca.fit(X)
transformation = pca.components_
elif init == 'lda':
lda = LinearDiscriminantAnalysis(n_components=num_dims)
if verbose:
print('Finding most discriminative components... ')
sys.stdout.flush()
lda.fit(X, y)
transformation = lda.scalings_.T[:num_dims]
if verbose:
print('done in {:5.2f}s'.format(time.time() - init_time))
return transformation
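
A quick usage sketch of the function above (hedged: _initialize_transformer is a private utility, and this call is only an illustration of the documented behavior):

import numpy as np
from metric_learn._util import _initialize_transformer

X = np.random.RandomState(42).randn(100, 5)
y = np.random.RandomState(42).randint(0, 3, 100)

# num_dims=2 <= n_classes - 1 = 2, so init='auto' should resolve to 'lda'.
L = _initialize_transformer(num_dims=2, X=X, y=y, init='auto',
                            random_state=42)
print(L.shape)  # expected: (2, 5)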


def _initialize_metric_mahalanobis():
"""Returns the initial metric from arguments"""
raise NotImplementedError
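
Later commits in this PR fill this stub in (e.g. "Make random init for mahalanobis metric generate an SPD matrix" and "Ensure the input mahalanobis metric initialization is symmetric"). A rough sketch of the random-SPD idea only, assuming a helper of this kind is wanted; this is not the merged implementation:

def _random_spd(n_features, random_state=None):
    # A.T.dot(A) is symmetric positive semi-definite for any A, and almost
    # surely positive definite when A is a square Gaussian matrix; the small
    # ridge guards against numerical rank deficiency.
    rng = check_random_state(random_state)
    A = rng.randn(n_features, n_features)
    return A.T.dot(A) + 1e-10 * np.eye(n_features)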
2 changes: 1 addition & 1 deletion metric_learn/base_metric.py
@@ -288,7 +288,7 @@ def get_mahalanobis_matrix(self):

Returns
-------
M : `numpy.ndarray`, shape=(n_components, n_features)
M : `numpy.ndarray`, shape=(num_dims, n_features)
Member: shouldn't this always be (n_features, n_features)?

Member Author: That's right, I didn't pay attention, thanks.

The copy of the learned Mahalanobis matrix.
"""
return self.transformer_.T.dot(self.transformer_)
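
The shape fix above follows from the return expression: if transformer_ L has shape (num_dims, n_features), then M = L.T.dot(L) is (n_features, n_features) whatever num_dims is. A hedged sanity check, using Covariance only because it is quick to fit:

import numpy as np
from metric_learn import Covariance

X = np.random.RandomState(0).randn(30, 4)
cov = Covariance().fit(X)
L = cov.transformer_              # shape (n_features, n_features) here
M = cov.get_mahalanobis_matrix()  # always (n_features, n_features)
assert M.shape == (L.shape[1], L.shape[1])
assert np.allclose(M, M.T)        # L.T.dot(L) is symmetric by construction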
2 changes: 1 addition & 1 deletion metric_learn/covariance.py
@@ -21,7 +21,7 @@ class Covariance(MahalanobisMixin, TransformerMixin):

Attributes
----------
transformer_ : `numpy.ndarray`, shape=(num_dims, n_features)
transformer_ : `numpy.ndarray`, shape=(n_features, n_features)
The linear transformation ``L`` deduced from the learned Mahalanobis
metric (See function `transformer_from_metric`.)
"""
4 changes: 2 additions & 2 deletions metric_learn/itml.py
@@ -145,7 +145,7 @@ class ITML(_BaseITML, _PairsClassifierMixin):
n_iter_ : `int`
The number of iterations the solver has run.

transformer_ : `numpy.ndarray`, shape=(num_dims, n_features)
transformer_ : `numpy.ndarray`, shape=(n_features, n_features)
The linear transformation ``L`` deduced from the learned Mahalanobis
metric (See function `transformer_from_metric`.)

@@ -213,7 +213,7 @@ class ITML_Supervised(_BaseITML, TransformerMixin):
n_iter_ : `int`
The number of iterations the solver has run.

transformer_ : `numpy.ndarray`, shape=(num_dims, n_features)
transformer_ : `numpy.ndarray`, shape=(n_features, n_features)
The linear transformation ``L`` deduced from the learned Mahalanobis
metric (See function `transformer_from_metric`.)
"""
68 changes: 62 additions & 6 deletions metric_learn/lmnn.py
@@ -16,18 +16,61 @@
from six.moves import xrange
from sklearn.metrics import euclidean_distances
from sklearn.base import TransformerMixin

from metric_learn._util import _initialize_transformer
Contributor: from ._util import ...

Member Author: Thanks, done.

from .base_metric import MahalanobisMixin


# commonality between LMNN implementations
class _base_LMNN(MahalanobisMixin, TransformerMixin):
def __init__(self, k=3, min_iter=50, max_iter=1000, learn_rate=1e-7,
regularization=0.5, convergence_tol=0.001, use_pca=True,
verbose=False, preprocessor=None):
def __init__(self, init='auto', k=3, min_iter=50, max_iter=1000,
learn_rate=1e-7, regularization=0.5, convergence_tol=0.001,
use_pca=True, num_dims=None,
verbose=False, preprocessor=None, random_state=None):
"""Initialize the LMNN object.

Parameters
----------
init : string or numpy array, optional (default='auto')
Initialization of the linear transformation. Possible options are
'auto', 'pca', 'lda', 'identity', 'random', and a numpy array of shape
(n_features_a, n_features_b).

'auto'
Depending on ``num_dims``, the most reasonable initialization
will be chosen. If ``num_dims <= min(n_features, n_classes - 1)``
we use 'lda', as it uses label information. If not, but
``num_dims < min(n_features, n_samples)``, we use 'pca', as
it projects data onto meaningful directions (those of higher
variance). Otherwise, we just use 'identity'.

'pca'
``num_dims`` principal components of the inputs passed
to :meth:`fit` will be used to initialize the transformation.
(See `sklearn.decomposition.PCA`)

'lda'
``min(num_dims, n_classes)`` most discriminative
components of the inputs passed to :meth:`fit` will be used to
initialize the transformation. (If ``num_dims > n_classes``,
the rest of the components will be zero.) (See
`sklearn.discriminant_analysis.LinearDiscriminantAnalysis`)

'identity'
If ``num_dims`` is strictly smaller than the
dimensionality of the inputs passed to :meth:`fit`, the identity
matrix will be truncated to the first ``num_dims`` rows.

'random'
The initial transformation will be a random array of shape
`(num_dims, n_features)`. Each value is sampled from the
standard normal distribution.

numpy array
n_features_b must match the dimensionality of the inputs passed to
:meth:`fit` and n_features_a must be less than or equal to that.
If ``num_dims`` is not None, n_features_a must match it.

k : int, optional
Number of neighbors to consider, not including self-edges.

@@ -37,15 +80,24 @@ def __init__(self, k=3, min_iter=50, max_iter=1000, learn_rate=1e-7,
preprocessor : array-like, shape=(n_samples, n_features) or callable
The preprocessor to call to get tuples from indices. If array-like,
tuples will be formed like this: X[indices].

random_state : int or numpy.RandomState or None, optional (default=None)
A pseudo random number generator object or a seed for it if int. If
``init='random'``, ``random_state`` is used to initialize the random
transformation. If ``init='pca'``, ``random_state`` is passed as an
argument to PCA when initializing the transformation.
"""
self.init = init
self.k = k
self.min_iter = min_iter
self.max_iter = max_iter
self.learn_rate = learn_rate
self.regularization = regularization
self.convergence_tol = convergence_tol
self.use_pca = use_pca
self.num_dims = num_dims # FIXME Tmp fix waiting for #167 to be merged:
self.verbose = verbose
self.random_state = random_state
super(_base_LMNN, self).__init__(preprocessor)
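
A hedged usage sketch of the new keywords (the data and parameter values are illustrative only; the exact behavior, including the temporary num_dims support, is defined by the diff above):

import numpy as np
from metric_learn import LMNN

X = np.random.RandomState(0).randn(40, 4)
y = np.random.RandomState(0).randint(0, 2, 40)

# 'pca' init: the top num_dims principal components of X, reproducible
# through random_state.
lmnn = LMNN(init='pca', k=3, num_dims=2, random_state=42)
lmnn.fit(X, y)
print(lmnn.transformer_.shape)  # expected: (2, 4)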


@@ -60,13 +112,15 @@ def fit(self, X, y):
X, y = self._prepare_inputs(X, y, dtype=float,
ensure_min_samples=2)
num_pts, num_dims = X.shape
# FIXME Tmp fix waiting for #167 to be merged:
n_dims = self.num_dims if self.num_dims is not None else num_dims
unique_labels, label_inds = np.unique(y, return_inverse=True)
if len(label_inds) != num_pts:
raise ValueError('Must have one label per point.')
self.labels_ = np.arange(len(unique_labels))
if self.use_pca:
warnings.warn('use_pca does nothing for the python_LMNN implementation')
self.transformer_ = np.eye(num_dims)
self.transformer_ = _initialize_transformer(n_dims, X, y, self.init,
self.verbose,
self.random_state)
required_k = np.bincount(label_inds).min()
if self.k > required_k:
raise ValueError('not enough class labels for specified k'
@@ -99,6 +153,8 @@ def fit(self, X, y):
self._loss_grad(X, L, dfG, impostors, 1, k, reg, target_neighbors, df,
a1, a2))

it = 1 # we already made one iteration
Contributor: This seems like a no-op line. Maybe just update the "main loop" comment?

Member Author: I am not sure I understand; I think the it = 1 is still useful for coherence: if one puts max_iter=1, it would otherwise break (variable not defined).

Contributor: Ah, I didn't see that we're referencing it after the loop.
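
A minimal sketch of the point being made, assuming only that it is read after the loop, as in the code above:

max_iter = 1
it = 1  # we already made one iteration
for it in range(2, max_iter):  # empty range whenever max_iter <= 2
    pass
print(it)  # without the pre-assignment, this would raise a NameError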


# main loop
for it in xrange(2, self.max_iter):
# then at each iteration, we try to find a value of L that has better
4 changes: 2 additions & 2 deletions metric_learn/lsml.py
@@ -139,7 +139,7 @@ class LSML(_BaseLSML, _QuadrupletsClassifierMixin):
n_iter_ : `int`
The number of iterations the solver has run.

transformer_ : `numpy.ndarray`, shape=(num_dims, n_features)
transformer_ : `numpy.ndarray`, shape=(n_features, n_features)
The linear transformation ``L`` deduced from the learned Mahalanobis
metric (See function `transformer_from_metric`.)
"""
@@ -175,7 +175,7 @@ class LSML_Supervised(_BaseLSML, TransformerMixin):
n_iter_ : `int`
The number of iterations the solver has run.

transformer_ : `numpy.ndarray`, shape=(num_dims, n_features)
transformer_ : `numpy.ndarray`, shape=(n_features, n_features)
The linear transformation ``L`` deduced from the learned Mahalanobis
metric (See function `transformer_from_metric`.)
"""
4 changes: 2 additions & 2 deletions metric_learn/mmc.py
@@ -356,7 +356,7 @@ class MMC(_BaseMMC, _PairsClassifierMixin):
n_iter_ : `int`
The number of iterations the solver has run.

transformer_ : `numpy.ndarray`, shape=(num_dims, n_features)
transformer_ : `numpy.ndarray`, shape=(n_features, n_features)
The linear transformation ``L`` deduced from the learned Mahalanobis
metric (See function `transformer_from_metric`.)

@@ -406,7 +406,7 @@ class MMC_Supervised(_BaseMMC, TransformerMixin):
n_iter_ : `int`
The number of iterations the solver has run.

transformer_ : `numpy.ndarray`, shape=(num_dims, n_features)
transformer_ : `numpy.ndarray`, shape=(n_features, n_features)
The linear transformation ``L`` deduced from the learned Mahalanobis
metric (See function `transformer_from_metric`.)
"""