1. score_pairs refactor #333

Merged
merged 22 commits into from
Oct 21, 2021
2 changes: 1 addition & 1 deletion doc/introduction.rst
@@ -141,7 +141,7 @@ to the following resources:
.. :math:`D`-dimensional learned metric space :math:`X L^{\top}`,
.. in which standard Euclidean distances may be used.
.. - ``transform(X)``, which applies the aforementioned transformation.
.. - ``score_pairs(pairs)`` which returns the distance between pairs of
.. - ``pair_distance(pairs)`` which returns the distance between pairs of
.. points. ``pairs`` should be a 3D array-like of pairs of shape ``(n_pairs,
.. 2, n_features)``, or it can be a 2D array-like of pairs indicators of
.. shape ``(n_pairs, 2)`` (see section :ref:`preprocessor_section` for more
30 changes: 27 additions & 3 deletions doc/supervised.rst
@@ -69,10 +69,10 @@ Also, as explained before, our metric learners has learn a distance between
points. You can use this distance in two main ways:

- You can either return the distance between pairs of points using the
`score_pairs` function:
`pair_distance` function:

>>> nca.score_pairs([[[3.5, 3.6], [5.6, 2.4]], [[1.2, 4.2], [2.1, 6.4]]])
array([0.49627072, 3.65287282])
>>> nca.pair_distance([[[3.5, 3.6], [5.6, 2.4]], [[1.2, 4.2], [2.1, 6.4]], [[3.3, 7.8], [10.9, 0.1]]])
array([0.49627072, 3.65287282, 6.06079877])

- Or you can return a function that will return the distance (in the new
space) between two 1D arrays (the coordinates of the points in the original
@@ -82,6 +82,29 @@ array([0.49627072, 3.65287282])
>>> metric_fun([3.5, 3.6], [5.6, 2.4])
0.4962707194621285

- Alternatively, you can use `pair_similarity` to return the **score** between
points: the higher the **score**, the closer the two points of a pair, and
vice versa. For Mahalanobis learners, it is equal to the opposite (negative)
of the distance.

>>> score = nca.pair_similarity([[[3.5, 3.6], [5.6, 2.4]], [[1.2, 4.2], [2.1, 6.4]], [[3.3, 7.8], [10.9, 0.1]]])
>>> score
array([-0.49627072, -3.65287282, -6.06079877])

This is useful because `pair_similarity` matches the **score** semantics of
scikit-learn's `classification metrics <https://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics>`_.
For instance, given labeled data, you can pass the labels and the
**score** of your data to get the ROC curve.

>>> from sklearn.metrics import roc_curve
>>> fpr, tpr, thresholds = roc_curve(['dog', 'cat', 'dog'], score, pos_label='dog')
>>> fpr
array([0., 0., 1., 1.])
>>> tpr
array([0. , 0.5, 0.5, 1. ])
>>>
>>> thresholds
array([ 0.50372928, -0.49627072, -3.65287282, -6.06079877])
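
The relationship between a distance and a score for a Mahalanobis learner can be sketched with plain NumPy. This is only an illustration under stated assumptions: the matrix ``L`` and the helper names ``mahalanobis_pair_distance`` and ``mahalanobis_pair_score`` are hypothetical and are not part of metric-learn's API.

```python
import numpy as np


def mahalanobis_pair_distance(pairs, L):
    """Euclidean distance between the two points of each pair after x -> L @ x.

    Hypothetical helper for illustration only, not metric-learn's code.
    """
    pairs = np.asarray(pairs, dtype=float)    # shape (n_pairs, 2, n_features)
    diffs = pairs[:, 0, :] - pairs[:, 1, :]   # shape (n_pairs, n_features)
    embedded = diffs @ L.T                    # shape (n_pairs, n_components)
    return np.sqrt((embedded ** 2).sum(axis=1))


def mahalanobis_pair_score(pairs, L):
    """Score with 'higher means closer' semantics: the negated distance."""
    return -mahalanobis_pair_distance(pairs, L)


# Assumed (made-up) linear transformation; a fitted learner would supply its own.
L = np.array([[2.0, 0.0],
              [0.0, 0.5]])
pairs = [[[0.0, 0.0], [3.0, 0.0]],
         [[1.0, 1.0], [1.0, 5.0]]]

distances = mahalanobis_pair_distance(pairs, L)
scores = mahalanobis_pair_score(pairs, L)
print(distances)  # [6. 2.]
print(scores)     # [-6. -2.]
```

Negating the distance keeps the ordering of pairs but flips its direction, so the closest pair receives the largest score.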

.. note::

If the metric learner that you use learns a :ref:`Mahalanobis distance
@@ -105,6 +128,7 @@ All supervised algorithms are scikit-learn estimators
scikit-learn model selection routines
(`sklearn.model_selection.cross_val_score`,
`sklearn.model_selection.GridSearchCV`, etc).
You can also use methods from `sklearn.metrics` that rely on y_scores.
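
As a rough illustration of why score semantics matter for such metrics, the sketch below passes negated distances and raw distances to `sklearn.metrics.roc_auc_score`. The distances and labels are made-up values chosen for the example, not the output of a fitted learner.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Hypothetical distances for four pairs (illustrative values only);
# 1 marks a similar pair, 0 a dissimilar one.
distances = np.array([0.5, 0.9, 3.7, 6.1])
y_true = np.array([1, 1, 0, 0])

# Score semantics: larger values should indicate the positive (similar) class.
scores = -distances

auc_from_scores = roc_auc_score(y_true, scores)
auc_from_distances = roc_auc_score(y_true, distances)
print(auc_from_scores)     # 1.0
print(auc_from_distances)  # 0.0
```

Here the similar pairs have the smallest distances, so the negated distances rank them perfectly, while the raw distances rank them exactly backwards.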

Algorithms
==========
33 changes: 28 additions & 5 deletions doc/weakly_supervised.rst
@@ -160,9 +160,9 @@ Also, as explained before, our metric learner has learned a distance between
points. You can use this distance in two main ways:

- You can either return the distance between pairs of points using the
`score_pairs` function:
`pair_distance` function:

>>> mmc.score_pairs([[[3.5, 3.6, 5.2], [5.6, 2.4, 6.7]],
>>> mmc.pair_distance([[[3.5, 3.6, 5.2], [5.6, 2.4, 6.7]],
... [[1.2, 4.2, 7.7], [2.1, 6.4, 0.9]]])
array([7.27607365, 0.88853014])

@@ -175,6 +175,29 @@ array([7.27607365, 0.88853014])
>>> metric_fun([3.5, 3.6, 5.2], [5.6, 2.4, 6.7])
7.276073646278203

- Alternatively, you can use `pair_similarity` to return the **score** between
points: the higher the **score**, the closer the two points of a pair, and
vice versa. For Mahalanobis learners, it is equal to the opposite (negative)
of the distance.

>>> score = mmc.pair_similarity([[[3.5, 3.6], [5.6, 2.4]], [[1.2, 4.2], [2.1, 6.4]], [[3.3, 7.8], [10.9, 0.1]]])
>>> score
array([-0.49627072, -3.65287282, -6.06079877])

This is useful because `pair_similarity` matches the **score** semantics of
scikit-learn's `classification metrics <https://scikit-learn.org/stable/modules/model_evaluation.html#classification-metrics>`_.
For instance, given labeled data, you can pass the labels and the
**score** of your data to get the ROC curve.

>>> from sklearn.metrics import roc_curve
>>> fpr, tpr, thresholds = roc_curve(['dog', 'cat', 'dog'], score, pos_label='dog')
>>> fpr
array([0., 0., 1., 1.])
>>> tpr
array([0. , 0.5, 0.5, 1. ])
>>>
>>> thresholds
array([ 0.50372928, -0.49627072, -3.65287282, -6.06079877])

.. note::

If the metric learner that you use learns a :ref:`Mahalanobis distance
@@ -344,7 +367,7 @@ returns the `sklearn.metrics.roc_auc_score` (which is threshold-independent).

.. note::
See :ref:`fit_ws` for more details on metric learners functions that are
not specific to learning on pairs, like `transform`, `score_pairs`,
not specific to learning on pairs, like `transform`, `pair_distance`,
`get_metric` and `get_mahalanobis_matrix`.

Algorithms
@@ -691,7 +714,7 @@ of triplets that have the right predicted ordering.

.. note::
See :ref:`fit_ws` for more details on metric learners functions that are
not specific to learning on pairs, like `transform`, `score_pairs`,
not specific to learning on pairs, like `transform`, `pair_distance`,
`get_metric` and `get_mahalanobis_matrix`.


@@ -859,7 +882,7 @@ of quadruplets have the right predicted ordering.

.. note::
See :ref:`fit_ws` for more details on metric learners functions that are
not specific to learning on pairs, like `transform`, `score_pairs`,
not specific to learning on pairs, like `transform`, `pair_distance`,
`get_metric` and `get_mahalanobis_matrix`.

