Copy file name to clipboardexpand all lines: README.md
+9-7
Original file line number
Diff line number
Diff line change
@@ -40,13 +40,13 @@ We provide a short usage guide for the library in [short_usage_guide.ipynb](http
40
40
You can also check the documentation for more details.
41
41
42
42
43
-
## Methods, usage, and how to cite
43
+
## Methods implemented in xCOLUMNs
44
44
45
45
The library implements the following methods:
46
46
47
47
### Instance-wise weighted prediction
48
48
49
-
The library implements a set of methods for instance-wise weighted prediction, that include optimal prediction strategies for different metrics, such as:
49
+
The library implements a set of methods for instance-wise weighted prediction, which include optimal inference strategies for some metrics, such as:
50
50
- Precision at k
51
51
- Propensity-scored precision at k
52
52
- Macro-averaged recall at k
@@ -55,12 +55,14 @@ The library implements a set of methods for instance-wise weighted prediction, t
55
55
56
56
### Optimization of prediction for a given test set using Block Coordinate Ascent/Descent (BCA/BCD)
57
57
58
-
The method aims to optimize the prediction for a given test set using the block coordinate ascent/descent algorithm.
58
+
The method aims to optimize the prediction for a given metric and test set using the block coordinate ascent/descent algorithm.
59
59
60
60
The method was first introduced and described in the paper:
61
61
> [Erik Schultheis, Marek Wydmuch, Wojciech Kotłowski, Rohit Babbar, Krzysztof Dembczyński. Generalized test utilities for long-tail performance in extreme multi-label classification. NeurIPS 2023.](https://arxiv.org/abs/2311.05081)
62
62
63
-
### Finding optimal population classifier via Frank-Wolfe (FW)
63
+
### Finding optimal population classifier using Frank-Wolfe (FW)
64
+
65
+
The method finds the optimal population classifier for a given metric using the Frank-Wolfe optimization algorithm on the provided training set.
64
66
65
67
The method was first introduced and described in the paper:
66
68
> [Erik Schultheis, Wojciech Kotłowski, Marek Wydmuch, Rohit Babbar, Strom Borman, Krzysztof Dembczyński. Consistent algorithms for multi-label classification with macro-at-k metrics. ICLR 2024.](https://arxiv.org/abs/2401.16594)
@@ -69,9 +71,9 @@ The method was first introduced and described in the paper:
69
71
## Repository structure
70
72
71
73
The repository is organized as follows:
72
-
-`docs/` - Sphinx documentation (work in progress)
73
-
-`experiments/` - a code for reproducing experiments from the papers, see the README.md file in the directory for details
74
-
-`xcolumns/` - Python package with the library
74
+
-`docs/` - Sphinx documentation
75
+
-`experiments/` - code for reproducing the experiments from the papers; see the README.md file in the directory for more details
76
+
-`xcolumns/` - the library source code
75
77
-`tests/` - tests for the library (the coverage is a bit limited at the moment, but these tests should guarantee that the main components of the library work as expected)
The `xcolumns.block_coordinate` module implements methods for finding the optimal prediction for a given test set using the Block Coordinate Ascent/Descent algorithm with a 0-th order approximation of the expected utility.
4
4
The method was first introduced and described in the paper:
@@ -7,15 +7,15 @@ The method was first introduced and described in the paper:
7
7
Note: BCA/BCD with 0-approximation uses the tp, fp, fn, tn matrix parametrization of the confusion matrix,
8
8
as opposed to the algorithms presented in the paper, which use the :math:`t, q, p` parametrization. However, both algorithms are equivalent.
9
9
10
-
The main function of the module is [**predict_using_bc_with_0approx**](#xcolumns.block_coordinate.predict_using_bc_with_0approx):
10
+
The main function of the module is {func}`predict_using_bc_with_0approx() <xcolumns.block_coordinate.predict_using_bc_with_0approx>`:
The module provides the wrapper functions for specific metrics that can be used as arguments for the `predict_using_bc_with_0approx` function as well as factory function for creating such wrapper functions.
18
+
The module provides wrapper functions for specific metrics that can be used as arguments for the {func}`predict_using_bc_with_0approx() <xcolumns.block_coordinate.predict_using_bc_with_0approx>` function, as well as a factory function for creating such wrapper functions.
19
19
20
20
```{eval-rst}
21
21
.. automodule:: xcolumns.block_coordinate
@@ -28,7 +28,7 @@ The module provides the wrapper functions for specific metrics that can be used
28
28
29
29
## Special function for optimization of coverage
30
30
31
-
The module provides the special function for optimization of coverage metric that use other way of estimating the expected value of the metric than `predict_using_bc_with_0approx` function.e
31
+
The module provides a special function for optimizing the coverage metric that uses a different way of estimating the expected value of the metric than the {func}`predict_using_bc_with_0approx() <xcolumns.block_coordinate.predict_using_bc_with_0approx>` function.
Copy file name to clipboardexpand all lines: docs/api/confusion_matrix.md
+3-1
@@ -1,7 +1,9 @@
1
-
# Confusion Matrix
1
+
# Confusion Matrix (`xcolumns.confusion_matrix`)
2
2
3
3
The `xcolumns.confusion_matrix` module implements a confusion matrix object and functions that can be used to calculate it.
4
4
In xCOLUMNs, the confusion matrix is parametrized by four matrices: true positive (tp), false positive (fp), false negative (fn), and true negative (tn).
5
+
The confusion matrix object can be used to calculate the metrics based on the confusion matrix.
6
+
xCOLUMNs implements popular metrics in the [`xcolumns.metrics`](metrics) module.
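The tp/fp/fn/tn parametrization described in this file can be illustrated with a minimal NumPy sketch. It assumes dense binary indicator matrices of shape (instances, labels) and per-label counts; the function name `confusion_counts` is hypothetical and is not part of the xCOLUMNs API.

```python
import numpy as np


def confusion_counts(y_true, y_pred):
    """Hypothetical helper: per-label tp, fp, fn, tn counts.

    Both inputs are binary indicator matrices of shape (n_instances, n_labels).
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    tp = (y_true & y_pred).sum(axis=0)    # relevant and predicted
    fp = (~y_true & y_pred).sum(axis=0)   # predicted but not relevant
    fn = (y_true & ~y_pred).sum(axis=0)   # relevant but not predicted
    tn = (~y_true & ~y_pred).sum(axis=0)  # neither relevant nor predicted
    return tp, fp, fn, tn


y_true = np.array([[1, 0, 1], [0, 1, 1]])
y_pred = np.array([[1, 1, 0], [0, 1, 1]])
tp, fp, fn, tn = confusion_counts(y_true, y_pred)
```

For every label the four counts sum to the number of instances, which is a quick sanity check on any such parametrization.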
Copy file name to clipboardexpand all lines: docs/api/frank_wolfe.md
+4-5
@@ -1,11 +1,10 @@
1
-
# Finding population classifiers using Frank Wolfe-based method
1
+
# Finding population classifiers using Frank Wolfe-based method (`xcolumns.frank_wolfe`)
2
2
3
3
`xcolumns.frank_wolfe` module implements the methods for finding the optimal population classifier using the Frank-Wolfe algorithm.
4
4
The method was first introduced and described in the paper:
5
5
> [Erik Schultheis, Wojciech Kotłowski, Marek Wydmuch, Rohit Babbar, Strom Borman, Krzysztof Dembczyński. Consistent algorithms for multi-label classification with macro-at-k metrics. ICLR 2024.](https://arxiv.org/abs/2401.16594)
6
6
7
-
The main function of the module is [**find_classifier_using_fw**](#xcolumns.frank_wolfe.find_classifier_using_fw):
8
-
7
+
The main function of the module is {func}`find_classifier_using_fw() <xcolumns.frank_wolfe.find_classifier_using_fw>`:
@@ -14,7 +13,7 @@ The main function of the module is [**find_classifier_using_fw**](#xcolumns.fran
14
13
15
14
The function returns the RandomizedWeightedClassifier object that can be used for prediction.
16
15
The RandomizedWeightedClassifier is a set of weighted classifiers as defined in
17
-
The module also provides the function [**predict_using_randomized_weighted_classifier**](#xcolumns.frank_wolfe.predict_using_randomized_weighted_classifier) for predicting the labels using the RandomizedWeightedClassifier object.
16
+
The module also provides the function {func}`predict_using_randomized_weighted_classifier() <xcolumns.frank_wolfe.predict_using_randomized_weighted_classifier>` for predicting labels using the RandomizedWeightedClassifier object.
18
17
19
18
20
19
```{eval-rst}
@@ -28,7 +27,7 @@ The module also provides the function [**predict_using_randomized_weighted_class
28
27
29
28
## Wrapper functions for specific metrics
30
29
31
-
The module provides the wrapper functions for specific metrics that can be used as arguments for the `find_classifier_using_fw` function as well as factory function for creating such wrapper functions.
30
+
The module provides wrapper functions for specific metrics that can be used as arguments for the {func}`find_classifier_using_fw() <xcolumns.frank_wolfe.find_classifier_using_fw>` function, as well as a factory function for creating such wrapper functions.
Copy file name to clipboardexpand all lines: docs/api/metrics.md
+3-2
@@ -1,7 +1,8 @@
1
-
# Metrics
1
+
# Metrics (`xcolumns.metrics`)
2
2
3
3
`xcolumns.metrics` module implements a set of methods for calculating the metrics based on both the confusion matrix and the true and predicted labels.
4
-
The methods calculating the metrics on the entries of the confusion matrix can be also used as arguments for the methods in the [`xcolumns.block_coordinate`](api/block_coordinate) and [`xcolumns.frank_wolfe`](api/frank_wolfe) modules.
4
+
The methods calculating the metrics on the entries of the confusion matrix can also be used as arguments for the methods in the
5
+
[`xcolumns.block_coordinate`](block_coordinate) and [`xcolumns.frank_wolfe`](frank_wolfe) modules.
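As an illustration of a metric computed on the entries of the confusion matrix, here is a minimal sketch assuming per-label count vectors. The function names and the `(tp, fp, fn, tn)` argument order are assumptions made for this example, not the signatures used in `xcolumns.metrics`.

```python
import numpy as np


def macro_recall_on_counts(tp, fp, fn, tn):
    """Hypothetical metric: recall averaged over labels."""
    tp = np.asarray(tp, dtype=float)
    fn = np.asarray(fn, dtype=float)
    # Labels with no positives get recall 0 (tp is 0 there as well).
    return float(np.mean(tp / np.maximum(tp + fn, 1.0)))


def macro_precision_on_counts(tp, fp, fn, tn):
    """Hypothetical metric: precision averaged over labels."""
    tp = np.asarray(tp, dtype=float)
    fp = np.asarray(fp, dtype=float)
    # Labels that were never predicted get precision 0.
    return float(np.mean(tp / np.maximum(tp + fp, 1.0)))


# Per-label counts for three labels (illustrative values).
tp, fp, fn, tn = [1, 1, 1], [0, 1, 0], [0, 0, 1], [1, 0, 0]
recall = macro_recall_on_counts(tp, fp, fn, tn)
precision = macro_precision_on_counts(tp, fp, fn, tn)
```

Because both functions share one signature, either can be handed to an optimization routine as a plain callable, which is the pattern the paragraph above describes.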
`xcolumns.weighted_prediction` module provides the methods for calculating the weighted prediction for each instance based on the conditional probabilities of labels.
4
-
The main function of the module is [**predict_weighted_per_instance**](#xcolumns.weighted_prediction.predict_weighted_per_instance).
4
+
The main function of the module is {func}`predict_weighted_per_instance() <xcolumns.weighted_prediction.predict_weighted_per_instance>`.
5
5
6
6
7
7
```{eval-rst}
@@ -11,7 +11,7 @@ The main function of the module is [**predict_weighted_per_instance**](#xcolumns
11
11
12
12
## Prediction strategies based on weighted predictions
13
13
14
-
Based on [**predict_weighted_per_instance**](#xcolumns.weighted_prediction.predict_weighted_per_instance) function the module provides few additional functions for calculating the predictions
14
+
Based on the {func}`predict_weighted_per_instance() <xcolumns.weighted_prediction.predict_weighted_per_instance>` function, the module provides a few additional functions for calculating predictions
15
15
that are optimal for some specific metrics or that arbitrarily upweight labels with smaller prior probabilities.
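The idea of weighted per-instance prediction can be sketched as follows, assuming each label's score is an affine function `a * p + b` of its conditional probability `p` and that the `k` highest-scoring labels are predicted for each instance. The function `predict_weighted_top_k` and its parameters are hypothetical and only illustrate the idea; they do not reproduce the signature of `predict_weighted_per_instance`.

```python
import numpy as np


def predict_weighted_top_k(y_proba, k, a=None, b=None):
    """Hypothetical sketch: pick the k labels with the largest a * p + b."""
    y_proba = np.asarray(y_proba, dtype=float)
    n, m = y_proba.shape
    a = np.ones(m) if a is None else np.asarray(a, dtype=float)
    b = np.zeros(m) if b is None else np.asarray(b, dtype=float)
    gains = y_proba * a + b
    # Indices of the k largest gains in each row (order within the k is arbitrary).
    top_k = np.argpartition(-gains, k - 1, axis=1)[:, :k]
    y_pred = np.zeros((n, m), dtype=int)
    np.put_along_axis(y_pred, top_k, 1, axis=1)
    return y_pred


y_proba = np.array([[0.9, 0.5, 0.1], [0.2, 0.3, 0.6]])

# Unit weights: plain top-k by probability.
plain = predict_weighted_top_k(y_proba, k=1)

# Inverse-prior weights (an arbitrary choice for illustration) favor rare labels.
priors = np.array([0.5, 0.3, 0.05])
upweighted = predict_weighted_top_k(y_proba, k=1, a=1.0 / priors)
```

With unit weights this reduces to selecting the top-k labels by probability; the inverse-prior weights shift the first instance's prediction to the rarest label.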
Copy file name to clipboardexpand all lines: docs/intro/overview.md
+8-9
@@ -1,10 +1,6 @@
1
1
# Overview of xCOLUMNs library
2
2
3
-
xCOLUMNs stands for x**Consistent Optimization of Label-wise Utilities in Multi-label classificatioN**s.
4
-
It is a small Python library that aims to implement different methods for the optimization of a general family of
5
-
metrics that can be defined on multi-label classification matrices.
6
-
These include, but are not limited to, label-wise metrics (see below for details).
7
-
The library provides an efficient implementation of the different optimization methods that easily scale to the extreme multi-label classification (XMLC) - problems with a very large number of labels and instances.
3
+
8
4
9
5
10
6
## What is multi-label classification?
@@ -43,9 +39,12 @@ In this sense xCOLUMNs implements plug-in inference methods, that can be used on
43
39
44
40
The aim of xCOLUMNs is to provide methods for the optimization of the general family of label-wise utilities. Currently, the following methods are implemented:
45
41
46
-
- Prediction for provided test set using **Block Coordinate Ascent/Descent (BC)** method, described in [1].
47
-
- Search for optimal population classifier using **Frank-Wolfe (FW)** method, described in [2].
42
+
- Weighted instance-wise prediction that includes optimal inference strategies for some metrics. Implemented in the [`xcolumns.weighted_prediction`](../api/weighted_prediction) module.
43
+
44
+
- Prediction for a provided label-wise metric and test set using the **Block Coordinate Ascent/Descent (BC)** method. Implemented in the [`xcolumns.block_coordinate`](../api/block_coordinate) module. It was first introduced and described in:
45
+
> [Erik Schultheis, Marek Wydmuch, Wojciech Kotłowski, Rohit Babbar, Krzysztof Dembczyński. Generalized test utilities for long-tail performance in extreme multi-label classification. NeurIPS 2023.](https://arxiv.org/abs/2311.05081)
48
46
49
-
[1][Erik Schultheis, Marek Wydmuch, Wojciech Kotłowski, Rohit Babbar, Krzysztof Dembczyński. Generalized test utilities for long-tail performance in extreme multi-label classification. NeurIPS 2023.](https://arxiv.org/abs/2311.05081)
47
+
- Search for the optimal population classifier for a provided metric defined on the multi-label confusion matrix, using the **Frank-Wolfe (FW)** method and a provided training set. Implemented in the [`xcolumns.frank_wolfe`](../api/frank_wolfe) module. It was first introduced and described in:
48
+
> [Erik Schultheis, Wojciech Kotłowski, Marek Wydmuch, Rohit Babbar, Strom Borman, Krzysztof Dembczyński. Consistent algorithms for multi-label classification with macro-at-k metrics. ICLR 2024.](https://arxiv.org/abs/2401.16594)
50
49
51
-
[2][Erik Schultheis, Wojciech Kotłowski, Marek Wydmuch, Rohit Babbar, Strom Borman, Krzysztof Dembczyński. Consistent algorithms for multi-label classification with macro-at-k metrics. ICLR 2024.](https://arxiv.org/abs/2401.16594)
50
+
The library also implements a set of methods for calculating metrics based on both the confusion matrix and the true and predicted labels. These are implemented in the [`xcolumns.confusion_matrix`](../api/confusion_matrix) and [`xcolumns.metrics`](../api/metrics) modules.
Copy file name to clipboardexpand all lines: docs/intro/quick_start.md
+1-1
@@ -16,4 +16,4 @@ However, the PyTorch is not a required dependency, so you need to install it sep
16
16
17
17
## Usage
18
18
19
-
We provide a short usage guide for the library in [short_usage_guide.ipynb](https://github.com/mwydmuch/xCOLUMNs/blob/master/short_usage_guide.ipynb) notebook.
19
+
We provide a short usage guide (with examples) for the library in the [short_usage_guide.ipynb](https://github.com/mwydmuch/xCOLUMNs/blob/master/short_usage_guide.ipynb) notebook.