
[MRG] Uniformize initialization for all algorithms #195

Merged
56 commits
a2ae9e1
initiate PR
Apr 23, 2019
5e626d5
Revert "initiate PR"
Apr 24, 2019
ffcfa2d
FEAT: uniformize init for NCA and RCA
Apr 24, 2019
27eb74b
Let the check of num_dims be done in the other PR
Apr 24, 2019
4395c13
Add metric initialization for algorithms that learn a mahalanobis matrix
May 2, 2019
09fda87
Add initialization for MLKR
May 2, 2019
0e59d72
FIX: fix error message for dimension
May 2, 2019
60ca662
FIX fix StringRepr for MLKR
May 2, 2019
71a75ed
FIX tests by reshaping to the right dataset size
May 3, 2019
1b2d296
Remove lda in docstring of MLKR
May 3, 2019
bd709e9
MAINT: Add deprecation for previous initializations
May 9, 2019
e162e6a
Update tests with new initialization
May 9, 2019
d1e88af
Make random init for mahalanobis metric generate an SPD matrix
May 9, 2019
eb98eff
Ensure the input mahalanobis metric initialization is symmetric, and …
May 9, 2019
508d94e
various fixes
May 9, 2019
bbf31cb
MAINT: various refactoring
May 9, 2019
aafa8e2
FIX fix default covariance for SDML in tests
May 9, 2019
748459e
Enhance docstring
May 10, 2019
06a55da
Set random state for SDML
May 10, 2019
d321319
Merge branch 'master' into feat/uniformized_initial_metric
May 13, 2019
26fb9e7
Fix merge remove_spaces that was forgotten
May 13, 2019
5e3daa4
Fix indent
May 13, 2019
e86b61b
XP: try to change the way we choose n_components to see if it fixes t…
May 13, 2019
0b69e7e
Revert "XP: try to change the way we choose n_components to see if it…
May 13, 2019
95a86a9
Be more tolerant in test
May 13, 2019
d622fae
Add test for singular covariance matrix
May 13, 2019
d2cc7ce
Fix test_singular_covariance_init
May 14, 2019
a7d2791
DOC: update docstring saying pseudo-inverse
May 14, 2019
3590cfa
Revert "Fix test_singular_covariance_init"
May 14, 2019
503a715
Ensure definiteness before returning the inverse
May 15, 2019
32bbdf3
wip deal with non definiteness
May 15, 2019
fdad8c2
Rename init to prior for SDML and LSML
May 16, 2019
5b048b4
Update error messages with either prior or init
May 17, 2019
d96930d
Remove message
May 17, 2019
2de3d4c
A few nitpicks
May 18, 2019
499a296
PEP8 errors + change init in test
May 18, 2019
c371d0c
STY: PEP8 fixes
May 18, 2019
b63d017
Address and remove TODOs
May 20, 2019
a5a6af8
Replace init by prior for ITML
Jun 3, 2019
9c4d70d
TST: fix ITML test with init changed into prior
Jun 3, 2019
8cb9c42
Add precision for MMC
Jun 4, 2019
b40e75e
Add ChangedBehaviorWarning for the algorithms that changed
Jun 5, 2019
0f5b9ed
Merge branch 'master' into feat/uniformized_initial_metric
Jun 5, 2019
cec35ab
Address https://github.com/metric-learn/metric-learn/pull/195#pullreq…
Jun 5, 2019
617ab0a
Remove the warnings check since we now have a ChangedBehaviorWarning
Jun 5, 2019
a5b13f2
Be more precise: it should not raise any ConvergenceWarningError
Jun 5, 2019
bd43168
Merge branch 'master' into feat/uniformized_initial_metric
Jun 5, 2019
0ea0aa6
Address https://github.com/metric-learn/metric-learn/pull/195#pullreq…
Jun 6, 2019
6e452ed
FIX remaining comment
Jun 6, 2019
4f822a8
TST: update test error message
Jun 6, 2019
c19ca4c
Improve readability
Jun 6, 2019
d8181d0
Address https://github.com/metric-learn/metric-learn/pull/195#pullreq…
Jun 7, 2019
21e20c6
Merge branch 'master' into feat/uniformized_initial_metric
Jun 7, 2019
e27d8a1
TST: Fix docsting lmnn
Jun 7, 2019
4a861c8
Fix warning messages
Jun 7, 2019
dd2b8c7
Fix warnings messages changed
Jun 7, 2019
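Taken together, the diffs below replace SDML's boolean `use_cov` flag with a string-or-matrix `init` parameter (`'identity'`, `'covariance'`, etc.). A minimal numpy sketch of what the two string options plausibly correspond to — the function name `initial_metric` is hypothetical, and metric-learn's actual implementation may differ (e.g. in how it handles a singular covariance; see the pseudo-inverse commits above):

```python
import numpy as np

def initial_metric(X, init="identity"):
    """Hypothetical sketch of the string init options discussed in this PR.

    'identity'   -> identity matrix of size n_features
    'covariance' -> (pseudo-)inverse of the data covariance matrix
    """
    n_features = X.shape[1]
    if init == "identity":
        return np.eye(n_features)
    elif init == "covariance":
        # pinv rather than inv, so a singular covariance does not crash
        return np.linalg.pinv(np.cov(X, rowvar=False))
    raise ValueError("init must be 'identity' or 'covariance'")

X = np.random.RandomState(42).randn(50, 4)
M_id = initial_metric(X, "identity")
M_cov = initial_metric(X, "covariance")
```

This is why the old `use_cov=False` call sites become `init='identity'` and the `use_cov=True` ones become `init='covariance'` throughout the test diffs below.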
2 changes: 1 addition & 1 deletion metric_learn/sdml.py
@@ -119,7 +119,7 @@ def _fit(self, pairs, y):
"positive semi-definite (PSD). The algorithm may diverge, "
"and lead to degenerate solutions. "
"To prevent that, try to decrease the balance parameter "
- "`balance_param` and/or to set use_cov=False.",
+ "`balance_param` and/or to set init='identity'.",
ConvergenceWarning)
w -= min_eigval # we translate the eigenvalues to make them all positive
w += 1e-10 # we add a small offset to avoid definiteness problems
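The eigenvalue translation in the sdml.py hunk above (`w -= min_eigval`, then `w += 1e-10`) is a standard way to repair a non-PSD symmetric matrix. A standalone sketch of that idea — illustrative only, not metric-learn's exact code (the function name `shift_to_psd` is made up here):

```python
import numpy as np

def shift_to_psd(M, offset=1e-10):
    """Translate the spectrum of a symmetric matrix so it becomes PSD.

    Mirrors the two lines in the SDML hunk above: subtract the smallest
    eigenvalue when it is negative, then add a small offset to avoid
    definiteness problems.
    """
    w, V = np.linalg.eigh(M)          # eigendecomposition of symmetric M
    min_eigval = w.min()
    if min_eigval < 0:
        w = w - min_eigval            # make all eigenvalues non-negative
    w = w + offset                    # small offset: strictly positive
    return V @ np.diag(w) @ V.T

M = np.array([[1., 2.], [2., 1.]])    # eigenvalues -1 and 3: not PSD
M_psd = shift_to_psd(M)
```

The repaired matrix stays symmetric and its smallest eigenvalue moves to roughly the offset value, at the cost of shifting the whole spectrum.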
31 changes: 16 additions & 15 deletions test/metric_learn_test.py
@@ -231,7 +231,7 @@ def test_sdml_raises_warning_msg_not_installed_skggm(self):
# because it will return a non SPD matrix
pairs = np.array([[[-10., 0.], [10., 0.]], [[0., 50.], [0., -60]]])
y_pairs = [1, -1]
- sdml = SDML(use_cov=False, balance_param=100, verbose=True)
+ sdml = SDML(init='identity', balance_param=100, verbose=True)

msg = ("There was a problem in SDML when using scikit-learn's graphical "
"lasso solver. skggm's graphical lasso can sometimes converge on "
@@ -254,7 +254,7 @@ def test_sdml_raises_warning_msg_installed_skggm(self):
# because it will return non finite values
pairs = np.array([[[-10., 0.], [10., 0.]], [[0., 50.], [0., -60]]])
y_pairs = [1, -1]
- sdml = SDML(use_cov=False, balance_param=100, verbose=True)
+ sdml = SDML(init='identity', balance_param=100, verbose=True)

msg = ("There was a problem in SDML when using skggm's graphical "
"lasso solver.")
@@ -277,7 +277,7 @@ def test_sdml_supervised_raises_warning_msg_installed_skggm(self):
# pathological case)
X = np.array([[-10., 0.], [10., 0.], [5., 0.], [3., 0.]])
y = [0, 0, 1, 1]
- sdml_supervised = SDML_Supervised(balance_param=0.5, use_cov=False,
+ sdml_supervised = SDML_Supervised(balance_param=0.5, init='identity',
sparsity_param=0.01)
msg = ("There was a problem in SDML when using skggm's graphical "
"lasso solver.")
@@ -295,11 +295,11 @@ def test_raises_no_warning_installed_skggm(self):
y_pairs = [1, -1]
X, y = make_classification(random_state=42)
with pytest.warns(None) as record:
- sdml = SDML()
+ sdml = SDML(init='covariance')
sdml.fit(pairs, y_pairs)
assert len(record) == 0
with pytest.warns(None) as record:
- sdml = SDML_Supervised(use_cov=False, balance_param=1e-5)
+ sdml = SDML_Supervised(init='identity', balance_param=1e-5)
sdml.fit(X, y)
assert len(record) == 0

@@ -308,7 +308,7 @@ def test_iris(self):
# TODO: un-flake it!
rs = np.random.RandomState(5555)

- sdml = SDML_Supervised(num_constraints=1500, use_cov=False,
+ sdml = SDML_Supervised(num_constraints=1500, init='identity',
balance_param=5e-5)
sdml.fit(self.iris_points, self.iris_labels, random_state=rs)
csep = class_separation(sdml.transform(self.iris_points),
@@ -320,7 +320,7 @@ def test_deprecation_num_labeled(self):
# initialization
# TODO: remove in v.0.6
X, y = make_classification(random_state=42)
- sdml_supervised = SDML_Supervised(num_labeled=np.inf, use_cov=False,
+ sdml_supervised = SDML_Supervised(num_labeled=np.inf, init='identity',
balance_param=5e-5)
msg = ('"num_labeled" parameter is not used.'
' It has been deprecated in version 0.5.0 and will be'
@@ -337,7 +337,7 @@ def test_sdml_raises_warning_non_psd(self):
"positive semi-definite (PSD). The algorithm may diverge, "
"and lead to degenerate solutions. "
"To prevent that, try to decrease the balance parameter "
- "`balance_param` and/or to set use_cov=False.")
+ "`balance_param` and/or to set init='identity'.")
with pytest.warns(ConvergenceWarning) as raised_warning:
try:
sdml.fit(pairs, y)
@@ -352,7 +352,7 @@ def test_sdml_converges_if_psd(self):
pseudo-covariance matrix is PSD"""
pairs = np.array([[[-10., 0.], [10., 0.]], [[0., -55.], [0., -60]]])
y = [1, -1]
- sdml = SDML(use_cov=True, sparsity_param=0.01, balance_param=0.5)
+ sdml = SDML(init='covariance', sparsity_param=0.01, balance_param=0.5)
sdml.fit(pairs, y)
assert np.isfinite(sdml.get_mahalanobis_matrix()).all()

@@ -365,7 +365,7 @@ def test_sdml_works_on_non_spd_pb_with_skggm(self):
it should work, but scikit-learn's graphical_lasso does not work"""
X, y = load_iris(return_X_y=True)
sdml = SDML_Supervised(balance_param=0.5, sparsity_param=0.01,
- use_cov=True)
+ init='covariance')
sdml.fit(X, y)

def test_deprecation_use_cov(self):
@@ -400,7 +400,7 @@ def test_verbose_has_installed_skggm_sdml(capsys):
# TODO: remove if we don't need skggm anymore
pairs = np.array([[[-10., 0.], [10., 0.]], [[0., -55.], [0., -60]]])
y_pairs = [1, -1]
- sdml = SDML(verbose=True)
+ sdml = SDML(verbose=True, init='covariance')
sdml.fit(pairs, y_pairs)
out, _ = capsys.readouterr()
assert "SDML will use skggm's graphical lasso solver." in out
@@ -414,7 +414,7 @@ def test_verbose_has_installed_skggm_sdml_supervised(capsys):
# skggm's solver is used (when they use SDML_Supervised)
# TODO: remove if we don't need skggm anymore
X, y = make_classification(random_state=42)
- sdml = SDML_Supervised(verbose=True)
+ sdml = SDML_Supervised(verbose=True, init='covariance')
sdml.fit(X, y)
out, _ = capsys.readouterr()
assert "SDML will use skggm's graphical lasso solver." in out
@@ -443,7 +443,7 @@ def test_verbose_has_not_installed_skggm_sdml_supervised(capsys):
# skggm's solver is used (when they use SDML_Supervised)
# TODO: remove if we don't need skggm anymore
X, y = make_classification(random_state=42)
- sdml = SDML_Supervised(verbose=True, balance_param=1e-5, use_cov=False)
+ sdml = SDML_Supervised(verbose=True, balance_param=1e-5, init='identity')
sdml.fit(X, y)
out, _ = capsys.readouterr()
assert "SDML will use scikit-learn's graphical lasso solver." in out
@@ -646,8 +646,9 @@ def test_iris(self):
c, d = np.nonzero(np.triu(~mask, k=1))

# Full metric
- mmc = MMC(convergence_threshold=0.01)
- mmc.fit(*wrap_pairs(self.iris_points, [a,b,c,d]))
+ n_features = self.iris_points.shape[1]
+ mmc = MMC(convergence_threshold=0.01, init=np.eye(n_features) / 10)

[Review comment, Member Author: The previous default init was identity divided by 10]

+ mmc.fit(*wrap_pairs(self.iris_points, [a, b, c, d]))
expected = [[+0.000514, +0.000868, -0.001195, -0.001703],
[+0.000868, +0.001468, -0.002021, -0.002879],
[-0.001195, -0.002021, +0.002782, +0.003964],
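Per the review comment above, the test now pins MMC's `init` to the old default, identity divided by 10. A quick standalone check of what that metric does to distances (the vectors here are illustrative, not from the iris test):

```python
import numpy as np

n_features = 4                     # the iris dataset has 4 features
M0 = np.eye(n_features) / 10       # the old MMC default init, per the comment

# Under M0, squared Mahalanobis distances are scaled Euclidean ones:
# d(x, y)^2 = (x - y)^T M0 (x - y) = ||x - y||^2 / 10
x = np.array([1., 0., 0., 0.])
y = np.array([0., 1., 0., 0.])
d2 = (x - y) @ M0 @ (x - y)        # ||x - y||^2 = 2, so d2 = 0.2
```

Pinning the init this way keeps the test's expected matrix values valid even though the PR changes MMC's default.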
4 changes: 2 additions & 2 deletions test/test_fit_transform.py
@@ -65,13 +65,13 @@ def test_lmnn(self):
def test_sdml_supervised(self):
seed = np.random.RandomState(1234)
sdml = SDML_Supervised(num_constraints=1500, balance_param=1e-5,
- use_cov=False)
+ init='identity')
sdml.fit(self.X, self.y, random_state=seed)
res_1 = sdml.transform(self.X)

seed = np.random.RandomState(1234)
sdml = SDML_Supervised(num_constraints=1500, balance_param=1e-5,
- use_cov=False)
+ init='identity')
res_2 = sdml.fit_transform(self.X, self.y, random_state=seed)

assert_array_almost_equal(res_1, res_2)
2 changes: 1 addition & 1 deletion test/test_sklearn_compat.py
@@ -85,7 +85,7 @@ def stable_init(self, sparsity_param=0.01, num_labeled='deprecated',
num_constraints=num_constraints,
verbose=verbose,
preprocessor=preprocessor,
- balance_param=1e-5, use_cov=False)
+ balance_param=1e-5, init='identity')
dSDML.__init__ = stable_init
check_estimator(dSDML)

2 changes: 1 addition & 1 deletion test/test_transformer_metric_conversion.py
@@ -49,7 +49,7 @@ def test_lmnn(self):

def test_sdml_supervised(self):
seed = np.random.RandomState(1234)
- sdml = SDML_Supervised(num_constraints=1500, use_cov=False,
+ sdml = SDML_Supervised(num_constraints=1500, init='identity',
balance_param=1e-5)
sdml.fit(self.X, self.y, random_state=seed)
L = sdml.transformer_
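The test above reads back `sdml.transformer_` after fitting; a Mahalanobis learner's transformer L and its metric M are related by M = Lᵀ L. A standalone sketch of that conversion using an arbitrary SPD matrix (not metric-learn's actual output; the Cholesky route shown is one of several valid factorizations):

```python
import numpy as np

rng = np.random.RandomState(1234)
A = rng.randn(4, 4)
M = A.T @ A + 1e-3 * np.eye(4)   # build an SPD "metric" for the demo

# np.linalg.cholesky returns lower-triangular C with M = C @ C.T,
# so L = C.T is an upper-triangular transformer with M = L.T @ L.
L = np.linalg.cholesky(M).T
```

Transforming points by L turns Mahalanobis distances under M into plain Euclidean distances, which is why recovering L from M (or vice versa) is a meaningful round-trip to test.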
4 changes: 2 additions & 2 deletions test/test_utils.py
@@ -106,7 +106,7 @@ def build_quadruplets(with_preprocessor=False):
# be solved
# TODO: remove this comment when #175 is solved
(MMC(max_iter=2), build_pairs), # max_iter=2 to be faster
- (SDML(use_cov=False, balance_param=1e-5), build_pairs)]
+ (SDML(init='identity', balance_param=1e-5), build_pairs)]
ids_pairs_learners = list(map(lambda x: x.__class__.__name__,
[learner for (learner, _) in
pairs_learners]))
@@ -120,7 +120,7 @@ def build_quadruplets(with_preprocessor=False):
(LSML_Supervised(), build_classification),
(MMC_Supervised(max_iter=5), build_classification),
(RCA_Supervised(num_chunks=10), build_classification),
- (SDML_Supervised(use_cov=False, balance_param=1e-5),
+ (SDML_Supervised(init='identity', balance_param=1e-5),
build_classification)]
ids_classifiers = list(map(lambda x: x.__class__.__name__,
[learner for (learner, _) in