[MRG] Uniformize initialization for all algorithms #195
Conversation
I am thinking for algorithms that learn a transformation
By the way I was thinking it could be cool to have an option to give any estimator as init, as long as it has an
For algorithms learning the transformation, it seems a good idea to reuse as much as possible of the code you wrote for NCA. These algorithms are typically nonconvex, so a good initialization can make a large difference. I remember that some benchmarks you had done for NCA seemed to show that LDA init was good, so let's go for that. For algorithms learning the metric matrix: these are typically convex, so the init is not as important (it mostly affects convergence speed). We also have to be careful with initializations that would lead to singular matrices (such as LDA), which could potentially make some solvers fail. My 2 cents: I would focus on 'identity', 'covariance' (corresponding to the inverse covariance), 'random' and a numpy array, with the default set to 'identity' unless some algorithm has a more natural init (e.g., SDML could be initialized to 'covariance'). I would not introduce LDA without a proper benchmark showing concrete benefits on convergence speed and ensuring that it does not cause any issues.
I guess you mean that this estimator would be fitted within our own fit method? It could be convenient indeed (and PCA/LDA init could be obtained that way), but I think it is not a high priority and there may be some implementation subtleties. Better to keep this for later.
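To make the options concrete, here is a minimal sketch of how the unified init parameter could be used once this PR lands (values and shapes are illustrative; the keyword follows the test diffs further down):

```python
import numpy as np
from metric_learn import NCA

# String shortcuts discussed above: 'identity', 'random', ...
nca = NCA(init='identity', max_iter=30)

# Or an explicit (n_features, n_features) numpy array:
A = np.eye(4)  # e.g. 4 features, as in the iris dataset
nca = NCA(init=A, max_iter=30)
```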
This reverts commit a2ae9e1.
test/test_mahalanobis_mixin.py
Outdated
@@ -138,9 +139,6 @@ def test_embed_dim(estimator, build_dataset):
    assert str(raised_error.value) == err_msg
    # we test that the shape is also OK when doing dimensionality reduction
    if type(model).__name__ in {'LFDA', 'MLKR', 'NCA', 'RCA'}:
      # TODO:
This will be changed (for good) in #193 (see https://github.com/metric-learn/metric-learn/pull/193/files#diff-2d2fa8dfd3b4bf0d149166d5e3efa526)
@@ -475,9 +475,7 @@ def test_singleton_class(self):
    EPS = np.finfo(float).eps
    A = np.zeros((X.shape[1], X.shape[1]))
    np.fill_diagonal(A,
I have to double-check here; I don't think that's what I should have done.
Done, I updated it with a random matrix as init.
test/metric_learn_test.py
Outdated
    np.fill_diagonal(A,
                     1. / (np.maximum(X.max(axis=0) - X.min(axis=0), EPS)))
    nca = NCA(max_iter=30, num_dims=X.shape[1])
    nca = NCA(init=A, max_iter=30, num_dims=X.shape[1])
Same here
done
test/metric_learn_test.py
Outdated
@@ -488,9 +486,7 @@ def test_one_class(self):
    y = self.iris_labels[self.iris_labels == 0]
    EPS = np.finfo(float).eps
    A = np.zeros((X.shape[1], X.shape[1]))
    np.fill_diagonal(A,
                     1. / (np.maximum(X.max(axis=0) - X.min(axis=0), EPS)))
    nca = NCA(max_iter=30, num_dims=X.shape[1])
Same here
done
    mmc = MMC(convergence_threshold=0.01)
    mmc.fit(*wrap_pairs(self.iris_points, [a,b,c,d]))
    n_features = self.iris_points.shape[1]
    mmc = MMC(convergence_threshold=0.01, init=np.eye(n_features) / 10)
The previous default init was the identity divided by 10.
…say it should be SPD
- MLKR: update default test init
- SDML: refactor prior_inv
@@ -366,7 +366,7 @@ def test_sdml_works_on_non_spd_pb_with_skggm(self):
    X, y = load_iris(return_X_y=True)
    sdml = SDML_Supervised(balance_param=0.5, sparsity_param=0.01,
                           init='covariance')
    sdml.fit(X, y)
    sdml.fit(X, y, random_state=np.random.RandomState(42))
Fixing this random state makes the test pass (otherwise there is the error RuntimeError: There was a problem in SDML when using skggm's graphical lasso solver.). I tried to understand why, but I couldn't even reproduce the "no-bug" from this travis build (https://travis-ci.org/metric-learn/metric-learn/jobs/527783562) locally, so I must be doing something wrong (I created a virtualenv with the same versions of python and packages, but it still fails, so I don't know what could go wrong).
I just pushed a commit that warns the user with a
# Conflicts:
#   metric_learn/itml.py
#   test/metric_learn_test.py
A few minor corrections (I did not have time to look at the tests). Thanks for the huge work; it will be very nice to initialize in a uniform way, and it will also facilitate the development of future algorithms.
The only remaining point (besides making the tests pass) is to have both prior and init for LSML and SDML (with the default init set to the value of prior).
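A minimal sketch of what that defaulting could look like, with a hypothetical helper name (illustrative, not the final implementation):

```python
def resolve_init(init, prior):
    # Hypothetical helper: if the user leaves init unset (None),
    # fall back to the prior, as proposed above.
    if init is not None:
        return init
    return prior

# e.g. SDML with prior='covariance' and init left unset:
assert resolve_init(None, 'covariance') == 'covariance'
# an explicit init always wins:
assert resolve_init('identity', 'covariance') == 'identity'
```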
metric_learn/_util.py
Outdated
@@ -335,28 +341,37 @@ def check_collapsed_pairs(pairs):
def _check_sdp_from_eigen(w, tol=None):
  """Checks if some of the eigenvalues given are negative, up to a tolerance
  level, with a default value of the tolerance depending on the eigenvalues.
  It also returns whether the matrix is definite.
should say "positive definite", and in fact this is also up to tolerance?
That's right, done. Yes, I used the same tolerance as for checking whether the eigenvalues are negative (the tolerance is a sort of threshold to detect zero eigenvalues, which can be used on either side of 0, + or -). Do you think I should allow setting a different threshold for returning whether the matrix is PSD?
No, I think it is fine like this.
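For reference, a minimal sketch of such an eigenvalue-based check, using one tolerance on both sides of zero as discussed (illustrative, not the library's exact code):

```python
import numpy as np

def check_sdp_from_eigen(w, tol=None):
    # w: eigenvalues of a symmetric matrix (e.g. from np.linalg.eigvalsh)
    if tol is None:
        # default tolerance scaled from the largest eigenvalue magnitude
        tol = np.abs(w).max() * len(w) * np.finfo(w.dtype).eps
    if tol < 0:
        raise ValueError("tol should be positive.")
    if np.any(w < -tol):
        raise ValueError("Matrix is not positive semidefinite (PSD).")
    # positive definite iff no eigenvalue is numerically zero
    # (same tolerance on the other side of 0)
    return not np.any(np.abs(w) <= tol)

w = np.linalg.eigvalsh(np.array([[2., 0.], [0., 1e-20]]))
print(check_sdp_from_eigen(w))  # False: one eigenvalue is ~0, PSD but not PD
```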
metric_learn/_util.py
Outdated
  Returns
  -------
  is_definite : bool
      Whether the matrix is definite or not.
same here
Thanks, done
metric_learn/_util.py
Outdated
'auto'
    Depending on ``num_dims``, the most reasonable initialization will
    be chosen. If ``num_dims <= n_classes`` we use 'lda' (if possible,
remove "if possible, "?
Yes, done
metric_learn/_util.py
Outdated
    be chosen. If ``num_dims <= n_classes`` we use 'lda' (if possible,
    see the description of 'lda' init), as it uses labels information.
    If not, but ``num_dims < min(n_features, n_samples)``, we use
    'pca', as it projects data in meaningful directions (those of
in --> onto
thanks, done
metric_learn/_util.py
Outdated
    This initialization is possible only if `has_classes == True`.

'identity'
    If ``num_dims`` is strictly smaller than the
although it should be obvious, maybe start by clearly saying that this uses the identity matrix
I agree, done
metric_learn/mmc.py
Outdated
    The (pseudo-)inverse of the covariance matrix.

'random'
    The initial transformation will be a random SPD matrix of shape
again
done
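As a side note, a random SPD matrix like the one this docstring describes can be generated, for instance, with scikit-learn (a sketch of one possible way, not necessarily what the PR does):

```python
import numpy as np
from sklearn.datasets import make_spd_matrix
from sklearn.utils import check_random_state

rng = check_random_state(42)
n_features = 4
# a symmetric positive definite matrix, usable as a 'random' init
A = make_spd_matrix(n_features, random_state=rng)
assert np.allclose(A, A.T)                # symmetric
assert np.all(np.linalg.eigvalsh(A) > 0)  # strictly positive spectrum
```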
metric_learn/mmc.py
Outdated
random_state : int or numpy.RandomState or None, optional (default=None)
    A pseudo random number generator object or a seed for it if int. If
    ``init='random'``, ``random_state`` is used to initialize the random
    transformation.
again
done
metric_learn/sdml.py
Outdated
    The inverse covariance matrix.

'random'
    The initial transformation will be a random PD matrix of shape
again
done
metric_learn/sdml.py
Outdated
msg = ("Warning, as of version 0.5.0, the default prior is now " | ||
"'identity', instead of 'covariance'. If you still want to use " | ||
"the inverse of the covariance matrix as a prior, " | ||
"set 'prior'=='covariance' (it was the default in previous " |
The content of the parentheses ("it was the default...") is redundant; you can remove it.
that's right, done
metric_learn/mmc.py
Outdated
    self.A_ /= 10.0
  else:
    self.A_ = check_array(self.A0)
  msg = ("Warning, as of version 0.5.0, the default prior is now "
init instead of prior
thanks, done
# Conflicts:
#   test/metric_learn_test.py
@bellet Regarding the init that defaults to the prior if possible: I think I have to parse the algorithms a bit to understand which part of the computations is the actual prior and which part is the initial matrix for the iterative solvers. I'd be more comfortable making a guess at that in a new PR, so that it's easier to comment on and discuss without polluting this one. What do you think about merging this PR first and doing that in a follow-up PR? Also, this PR has become quite big now, so I guess that would improve the readability of the next changes.
If you agree, I have addressed all your comments, so I guess we can merge this PR.
metric_learn/_util.py
Outdated
if tol < 0: | ||
raise ValueError("tol should be positive.") | ||
if any(w < - tol): | ||
raise ValueError("Matrix is not positive semidefinite (PSD).") | ||
raise NonPSDError |
You're missing parens here: raise NonPSDError()
I tried with a Python 2 and a Python 3 command line and it worked without the parentheses; is putting the parentheses a best practice? Done.
Without parens you're raise-ing the class itself, rather than an instance of the class.
That's right, it makes sense
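For the record, both spellings run, because Python instantiates an exception class that is raised bare; the parens just make the instantiation explicit. A self-contained demonstration (with a stand-in definition of NonPSDError):

```python
class NonPSDError(ValueError):
    def __init__(self):
        super(NonPSDError, self).__init__(
            "Matrix is not positive semidefinite (PSD).")

try:
    raise NonPSDError    # bare class: Python instantiates it implicitly
except NonPSDError as e:
    print(e)

try:
    raise NonPSDError()  # explicit instance: clearer, preferred style
except NonPSDError as e:
    print(e)
```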
metric_learn/_util.py
Outdated
random_state = check_random_state(random_state)
if isinstance(init, np.ndarray):
  return init
else:
Especially for long functions with lots of nesting like this one, I prefer the "no else" style:
if condition:
    return foo
# everything else at original indent
That's right, it's better, done
metric_learn/_util.py
Outdated
  return init
else:
  n_samples = input.shape[0]
  if init == 'auto':
This might be simpler to test if we broke out pieces into standalone functions. For example, the "auto-select" logic could be its own function.
I agree; done, and I tested the function.
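For instance, the 'auto' rules quoted in the docstring hunks above could live in a small standalone helper like this (a sketch; the name and signature are illustrative):

```python
def auto_select_init(has_classes, n_features, n_samples, n_components,
                     n_classes):
    # Follow the 'auto' rules from the docstring above.
    if has_classes and n_components <= n_classes:
        return 'lda'       # uses the labels information
    if n_components < min(n_features, n_samples):
        return 'pca'       # projects onto directions of highest variance
    return 'identity'

# e.g. iris (150 samples, 4 features, 3 classes) reduced to 2 components:
print(auto_select_init(True, 4, 150, 2, 3))  # 'lda'
```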
metric_learn/lmnn.py
Outdated
@@ -19,18 +19,61 @@
from six.moves import xrange
from sklearn.metrics import euclidean_distances
from sklearn.base import TransformerMixin

from metric_learn._util import _initialize_transformer
from ._util import ...
Thanks, done
@@ -102,6 +156,8 @@ def fit(self, X, y):
        self._loss_grad(X, L, dfG, impostors, 1, k, reg, target_neighbors, df,
                        a1, a2))

    it = 1  # we already made one iteration
This seems like a no-op line. Maybe just update the "main loop" comment?
I am not sure I understand; I think the it=1 is still useful for coherence: if one puts max_iter=1, it would break otherwise (variable not defined).
Ah, I didn't see that we're referencing it after the loop.
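In a self-contained form, the pattern looks like this (illustrative; the real LMNN loop body is omitted):

```python
max_iter = 1

it = 1  # one iteration was already done before entering the loop
for it in range(2, max_iter + 1):
    pass  # gradient step, step-size adaptation, convergence check, ...

# `it` is read after the loop; with max_iter=1 the loop body never runs,
# so without the pre-assignment the next line would raise a NameError
print('finished after %d iterations' % it)
```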
"the inverse of the covariance matrix as a prior, " | ||
"set 'prior'=='covariance'. This warning will disappear in " | ||
"v0.6.0.") | ||
warnings.warn(msg, ChangedBehaviorWarning) |
This warning will be annoying for users who want to use identity initialization intentionally. We could instead keep using None as the default and only warn when the user hasn't set the prior explicitly.
I agree. Let me know what you think of the new warning messages (they say that the parameter was not set because it is None, that in the next version it will be set to 'identity' for instance, and that the warning will disappear).
Yes, this is good. One small nit: instead of "set 'prior'=='covariance'.", we should do "set prior='covariance'."
Agreed, will do
done
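The agreed-upon pattern, sketched: keep None as the signature default, so the warning only fires for users who relied on the old implicit default (function name illustrative; ChangedBehaviorWarning is sklearn's, as used in this PR's diffs):

```python
import warnings

from sklearn.exceptions import ChangedBehaviorWarning

def resolve_prior(prior):
    if prior is not None:
        return prior  # explicit user choice: no warning
    warnings.warn(
        "Warning, as of version 0.5.0, the default prior is now "
        "'identity', instead of 'covariance'. If you still want to use "
        "the inverse of the covariance matrix as a prior, "
        "set prior='covariance'. This warning will disappear in v0.6.0.",
        ChangedBehaviorWarning)
    return 'identity'
```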
"'auto', instead of 'pca'. If you still want to use " | ||
"PCA as an init, set 'init'=='pca'. This warning will " | ||
"disappear in v0.6.0.") | ||
warnings.warn(msg, ChangedBehaviorWarning) |
Same issue here with intentional / explicit parameter vs default value.
Agreed, done
set init='pca'
Agreed, will do
done
@perimosocordiae thanks for your review! I addressed all your comments (except #195 (comment)).
Sounds good. This is not a major feature anyway (in most cases it is natural to use the prior as the init).
LGTM, once the conflicts from merging #193 are fixed
# Conflicts:
#   metric_learn/_util.py
#   metric_learn/covariance.py
#   metric_learn/itml.py
#   metric_learn/lmnn.py
#   metric_learn/lsml.py
#   metric_learn/mlkr.py
#   metric_learn/mmc.py
#   metric_learn/nca.py
#   metric_learn/sdml.py
#   test/metric_learn_test.py
#   test/test_base_metric.py
#   test/test_utils.py
+1 to merge once minor docstring tweaks are made.
@bellet @perimosocordiae I addressed all your comments, so I guess the PR is ready to merge now :)
Merged!
Fixes #124
Fixes #202
TODO:
[ ] Remove the remaining num_dims if there are any, after merging with "Dimensionality reduction for algorithms learning Mahalanobis matrix M" (#167); we should first merge this PR.