Commit 7fe82c3

Update docs
1 parent d233ebd commit 7fe82c3

14 files changed

+103
-70
lines changed

README.md

+9-7
Original file line numberDiff line numberDiff line change
@@ -40,13 +40,13 @@ We provide a short usage guide for the library in [short_usage_guide.ipynb](http
4040
You can also check the documentation for more details.
4141

4242

43-
## Methods, usage, and how to cite
43+
## Methods implemented in xCOLUMNs
4444

4545
The library implements the following methods:
4646

4747
### Instance-wise weighted prediction
4848

49-
The library implements a set of methods for instance-wise weighted prediction, that include optimal prediction strategies for different metrics, such as:
49+
The library implements a set of methods for instance-wise weighted prediction, including optimal inference strategies for some metrics, such as:
5050
- Precision at k
5151
- Propensity-scored precision at k
5252
- Macro-averaged recall at k
@@ -55,12 +55,14 @@ The library implements a set of methods for instance-wise weighted prediction, t
5555

5656
### Optimization of prediction for a given test set using Block Coordinate Ascent/Descent (BCA/BCD)
5757

58-
The method aims to optimize the prediction for a given test set using the block coordinate ascent/descent algorithm.
58+
The method aims to optimize the prediction for a given metric and test set using the block coordinate ascent/descent algorithm.
5959

6060
The method was first introduced and described in the paper:
6161
> [Erik Schultheis, Marek Wydmuch, Wojciech Kotłowski, Rohit Babbar, Krzysztof Dembczyński. Generalized test utilities for long-tail performance in extreme multi-label classification. NeurIPS 2023.](https://arxiv.org/abs/2311.05081)
6262
63-
### Finding optimal population classifier via Frank-Wolfe (FW)
63+
### Finding optimal population classifier using Frank-Wolfe (FW)
64+
65+
The method finds the optimal population classifier for a given metric using the Frank-Wolfe optimization algorithm on the provided training set.
6466

6567
The method was first introduced and described in the paper:
6668
> [Erik Schultheis, Wojciech Kotłowski, Marek Wydmuch, Rohit Babbar, Strom Borman, Krzysztof Dembczyński. Consistent algorithms for multi-label classification with macro-at-k metrics. ICLR 2024.](https://arxiv.org/abs/2401.16594)
@@ -69,9 +71,9 @@ The method was first introduced and described in the paper:
6971
## Repository structure
7072

7173
The repository is organized as follows:
72-
- `docs/` - Sphinx documentation (work in progress)
73-
- `experiments/` - a code for reproducing experiments from the papers, see the README.md file in the directory for details
74-
- `xcolumns/` - Python package with the library
74+
- `docs/` - Sphinx documentation
75+
- `experiments/` - code for reproducing experiments from the papers; see the README.md file in the directory for more details
76+
- `xcolumns/` - the library source code
7577
- `tests/` - tests for the library (the coverage is a bit limited at the moment, but these tests should guarantee that the main components of the library work as expected)
7678

7779

docs/api/block_coordinate.md

+4-4
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# Block Coordinate-based prediction methods
1+
# Block Coordinate-based prediction methods (`xcolumns.block_coordinate`)
22

33
`xcolumns.block_coordinate` module implements the methods for finding the optimal prediction for a given test set using the Block Coordinate Ascent/Descent algorithm with 0-th order approximation of expected utility.
44
The method was first introduced and described in the paper:
@@ -7,15 +7,15 @@ The method was first introduced and described in the paper:
77
Note: BCA/BCD with 0-approximation uses the tp, fp, fn, tn matrix parametrization of the confusion matrix,
88
as opposed to the algorithms presented in the paper, which use the :math:`t, q, p` parametrization. However, both algorithms are equivalent.
99

10-
The main function of the module is [**predict_using_bc_with_0approx**](#xcolumns.block_coordinate.predict_using_bc_with_0approx):
10+
The main function of the module is {func}`predict_using_bc_with_0approx() <xcolumns.block_coordinate.predict_using_bc_with_0approx>`:
1111

1212
```{eval-rst}
1313
.. autofunction:: xcolumns.block_coordinate.predict_using_bc_with_0approx
1414
```
1515

1616
## Wrapper functions for specific metrics
1717

18-
The module provides the wrapper functions for specific metrics that can be used as arguments for the `predict_using_bc_with_0approx` function as well as factory function for creating such wrapper functions.
18+
The module provides wrapper functions for specific metrics that can be used as arguments for the {func}`predict_using_bc_with_0approx() <xcolumns.block_coordinate.predict_using_bc_with_0approx>` function, as well as a factory function for creating such wrapper functions.
1919

2020
```{eval-rst}
2121
.. automodule:: xcolumns.block_coordinate
@@ -28,7 +28,7 @@ The module provides the wrapper functions for specific metrics that can be used
2828

2929
## Special function for optimization of coverage
3030

31-
The module provides the special function for optimization of coverage metric that use other way of estimating the expected value of the metric than `predict_using_bc_with_0approx` function.e
31+
The module provides a special function for optimizing the coverage metric that uses a different way of estimating the expected value of the metric than the {func}`predict_using_bc_with_0approx() <xcolumns.block_coordinate.predict_using_bc_with_0approx>` function.
3232

3333
```{eval-rst}
3434
.. autofunction:: xcolumns.block_coordinate.predict_optimizing_coverage_using_bc

docs/api/confusion_matrix.md

+3-1
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,9 @@
1-
# Confusion Matrix
1+
# Confusion Matrix (`xcolumns.confusion_matrix`)
22

33
`xcolumns.confusion_matrix` module implements confusion matrix object and functions that can be used to calculate it.
44
In xCOLUMNs, the confusion matrix is parametrized by four matrices: true positive (tp), false positive (fp), false negative (fn), and true negative (tn).
5+
The confusion matrix object can be used to calculate the metrics based on the confusion matrix.
6+
xCOLUMNs implements the popular metrics in [`xcolumns.metrics`](metrics) module.
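
The tp/fp/fn/tn parametrization described above can be illustrated with a minimal NumPy sketch. It does not use the xCOLUMNs API: the function names `confusion_entries` and `macro_precision`, the per-label (one value per label) layout, and the normalization by the number of instances are all assumptions made for this example.

```python
import numpy as np


def confusion_entries(y_true, y_pred):
    """Per-label tp, fp, fn, tn rates from binary (instances x labels) matrices.

    Hypothetical helper for illustration only; not part of xCOLUMNs.
    """
    y_true = np.asarray(y_true, dtype=bool)
    y_pred = np.asarray(y_pred, dtype=bool)
    n = y_true.shape[0]
    tp = (y_true & y_pred).sum(axis=0) / n    # label relevant and predicted
    fp = (~y_true & y_pred).sum(axis=0) / n   # label predicted but not relevant
    fn = (y_true & ~y_pred).sum(axis=0) / n   # label relevant but not predicted
    tn = (~y_true & ~y_pred).sum(axis=0) / n  # label neither relevant nor predicted
    return tp, fp, fn, tn


def macro_precision(tp, fp, eps=1e-9):
    """Macro-averaged precision computed only from confusion matrix entries."""
    return float(np.mean(tp / (tp + fp + eps)))


y_true = [[1, 0, 1], [0, 1, 1]]
y_pred = [[1, 1, 0], [0, 1, 1]]
tp, fp, fn, tn = confusion_entries(y_true, y_pred)
print(macro_precision(tp, fp))
```

Because the metric depends only on the four entries, the same function could in principle be evaluated on estimated entries rather than observed ones, which is the idea behind passing metric functions to the optimization methods.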
57

68
```{eval-rst}
79
.. automodule:: xcolumns.confusion_matrix

docs/api/frank_wolfe.md

+4-5
Original file line numberDiff line numberDiff line change
@@ -1,11 +1,10 @@
1-
# Finding population classifiers using Frank Wolfe-based method
1+
# Finding population classifiers using Frank Wolfe-based method (`xcolumns.frank_wolfe`)
22

33
`xcolumns.frank_wolfe` module implements the methods for finding the optimal population classifier using the Frank-Wolfe algorithm.
44
The method was first introduced and described in the paper:
55
> [Erik Schultheis, Wojciech Kotłowski, Marek Wydmuch, Rohit Babbar, Strom Borman, Krzysztof Dembczyński. Consistent algorithms for multi-label classification with macro-at-k metrics. ICLR 2024.](https://arxiv.org/abs/2401.16594)
66
7-
The main function of the module is [**find_classifier_using_fw**](#xcolumns.frank_wolfe.find_classifier_using_fw):
8-
7+
The main function of the module is {func}`find_classifier_using_fw() <xcolumns.frank_wolfe.find_classifier_using_fw>`:
98

109
```{eval-rst}
1110
.. autofunction:: xcolumns.frank_wolfe.find_classifier_using_fw
@@ -14,7 +13,7 @@ The main function of the module is [**find_classifier_using_fw**](#xcolumns.fran
1413

1514
The function returns the RandomizedWeightedClassifier object that can be used for prediction.
1615
The RandomizedWeightedClassifier is a set of weighted classifiers as defined in
17-
The module also provides the function [**predict_using_randomized_weighted_classifier**](#xcolumns.frank_wolfe.predict_using_randomized_weighted_classifier) for predicting the labels using the RandomizedWeightedClassifier object.
16+
The module also provides the function {func}`predict_using_randomized_weighted_classifier() <xcolumns.frank_wolfe.predict_using_randomized_weighted_classifier>` for predicting the labels using the RandomizedWeightedClassifier object.
1817

1918

2019
```{eval-rst}
@@ -28,7 +27,7 @@ The module also provides the function [**predict_using_randomized_weighted_class
2827

2928
## Wrapper functions for specific metrics
3029

31-
The module provides the wrapper functions for specific metrics that can be used as arguments for the `find_classifier_using_fw` function as well as factory function for creating such wrapper functions.
30+
The module provides wrapper functions for specific metrics that can be used as arguments for the {func}`find_classifier_using_fw() <xcolumns.frank_wolfe.find_classifier_using_fw>` function, as well as a factory function for creating such wrapper functions.
3231

3332
```{eval-rst}
3433
.. automodule:: xcolumns.frank_wolfe

docs/api/metrics.md

+3-2
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,8 @@
1-
# Metrics
1+
# Metrics (`xcolumns.metrics`)
22

33
`xcolumns.metrics` module implements a set of methods for calculating the metrics based on both the confusion matrix and the true and predicted labels.
4-
The methods calculating the metrics on the entries of the confusion matrix can be also used as arguments for the methods in the [`xcolumns.block_coordinate`](api/block_coordinate) and [`xcolumns.frank_wolfe`](api/frank_wolfe) modules.
4+
The methods calculating the metrics on the entries of the confusion matrix can also be used as arguments for the methods in the
5+
[`xcolumns.block_coordinate`](block_coordinate) and [`xcolumns.frank_wolfe`](frank_wolfe) modules.
56

67
```{eval-rst}
78
.. automodule:: xcolumns.metrics

docs/api/weighted_prediction.md

+3-3
Original file line numberDiff line numberDiff line change
@@ -1,7 +1,7 @@
1-
# Weighted predictions
1+
# Weighted predictions (`xcolumns.weighted_prediction`)
22

33
`xcolumns.weighted_prediction` module provides the methods for calculating the weighted prediction for each instance based on the conditional probabilities of labels.
4-
The main function of the module is [**predict_weighted_per_instance**](#xcolumns.weighted_prediction.predict_weighted_per_instance).
4+
The main function of the module is {func}`predict_weighted_per_instance() <xcolumns.weighted_prediction.predict_weighted_per_instance>`.
55

66

77
```{eval-rst}
@@ -11,7 +11,7 @@ The main function of the module is [**predict_weighted_per_instance**](#xcolumns
1111

1212
## Prediction strategies based on weighted predictions
1313

14-
Based on [**predict_weighted_per_instance**](#xcolumns.weighted_prediction.predict_weighted_per_instance) function the module provides few additional functions for calculating the predictions
14+
Based on the {func}`predict_weighted_per_instance() <xcolumns.weighted_prediction.predict_weighted_per_instance>` function, the module provides a few additional functions for calculating the predictions
1515
that are optimal for some specific metrics or that arbitrarily upweight labels with smaller prior probabilities.
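
As a rough illustration of instance-wise weighted prediction, the sketch below multiplies each label's conditional probability by a per-label weight and keeps the k labels with the highest weighted score for every instance. It is plain NumPy, not the library's implementation: the name `weighted_top_k`, the `weights` argument, and the purely multiplicative form of the weighting are assumptions.

```python
import numpy as np


def weighted_top_k(y_proba, weights, k):
    """Select, for each instance, the k labels with the highest weighted probability.

    Hypothetical helper for illustration only; not part of xCOLUMNs.
    Assumes 1 <= k <= number of labels.
    """
    y_proba = np.asarray(y_proba, dtype=float)
    gains = y_proba * np.asarray(weights, dtype=float)  # per-label weights broadcast over rows
    # Indices of the k largest gains in each row (order within the top k is arbitrary).
    top = np.argpartition(-gains, k - 1, axis=1)[:, :k]
    y_pred = np.zeros_like(y_proba, dtype=int)
    np.put_along_axis(y_pred, top, 1, axis=1)
    return y_pred


y_proba = np.array([[0.9, 0.5, 0.1], [0.2, 0.3, 0.4]])
print(weighted_top_k(y_proba, weights=[1.0, 1.0, 1.0], k=1))   # plain top-k
print(weighted_top_k(y_proba, weights=[1.0, 1.0, 10.0], k=1))  # upweights the last label
```

With uniform weights this reduces to ordinary top-k prediction; larger weights on labels with small prior probabilities shift predictions toward rare labels.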
1616

1717
```{eval-rst}

docs/index.md

+7
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,13 @@ lastpage:
1010

1111
# Welcome to xCOLUMNs documentation!
1212

13+
xCOLUMNs stands for x**Consistent Optimization of Label-wise Utilities in Multi-label classificatioN**s.
14+
It is a small Python library that aims to implement different methods for the optimization of a general family of
15+
metrics that can be defined on multi-label classification matrices.
16+
These include, but are not limited to, label-wise metrics.
17+
The library provides an efficient implementation of the different optimization methods
18+
that easily scale to the extreme multi-label classification (XMLC) - problems with a very large number of labels and instances.
19+
1320

1421
```{toctree}
1522
:hidden:

docs/intro/overview.md

+8-9
Original file line numberDiff line numberDiff line change
@@ -1,10 +1,6 @@
11
# Overview of xCOLUMNs library
22

3-
xCOLUMNs stands for x**Consistent Optimization of Label-wise Utilities in Multi-label classificatioN**s.
4-
It is a small Python library that aims to implement different methods for the optimization of a general family of
5-
metrics that can be defined on multi-label classification matrices.
6-
These include, but are not limited to, label-wise metrics (see below for details).
7-
The library provides an efficient implementation of the different optimization methods that easily scale to the extreme multi-label classification (XMLC) - problems with a very large number of labels and instances.
3+
84

95

106
## What is multi-label classification?
@@ -43,9 +39,12 @@ In this sense xCOLUMNs implements plug-in inference methods, that can be used on
4339

4440
The aim of xCOLUMNs is to provide methods for the optimization of the general family of label-wise utilities. Currently, the following methods are implemented:
4541

46-
- Prediction for provided test set using **Block Coordinate Ascent/Descent (BC)** method, described in [1].
47-
- Search for optimal population classifier using **Frank-Wolfe (FW)** method, described in [2].
42+
- Weighted instance-wise prediction that includes optimal inference strategies for some metrics. Implemented in [`xcolumns.weighted_prediction`](../api/weighted_prediction) module.
43+
44+
- Prediction for a provided label-wise metric and test set using the **Block Coordinate Ascent/Descent (BC)** method. Implemented in [`xcolumns.block_coordinate`](../api/block_coordinate) module. It was first introduced and described in:
45+
> [Erik Schultheis, Marek Wydmuch, Wojciech Kotłowski, Rohit Babbar, Krzysztof Dembczyński. Generalized test utilities for long-tail performance in extreme multi-label classification. NeurIPS 2023.](https://arxiv.org/abs/2311.05081)
4846
49-
[1] [Erik Schultheis, Marek Wydmuch, Wojciech Kotłowski, Rohit Babbar, Krzysztof Dembczyński. Generalized test utilities for long-tail performance in extreme multi-label classification. NeurIPS 2023.](https://arxiv.org/abs/2311.05081)
47+
- Search for the optimal population classifier for a provided metric defined on the multi-label confusion matrix using the **Frank-Wolfe (FW)** method and a provided training set. Implemented in [`xcolumns.frank_wolfe`](../api/frank_wolfe) module. It was first introduced and described in:
48+
> [Erik Schultheis, Wojciech Kotłowski, Marek Wydmuch, Rohit Babbar, Strom Borman, Krzysztof Dembczyński. Consistent algorithms for multi-label classification with macro-at-k metrics. ICLR 2024.](https://arxiv.org/abs/2401.16594)
5049
51-
[2] [Erik Schultheis, Wojciech Kotłowski, Marek Wydmuch, Rohit Babbar, Strom Borman, Krzysztof Dembczyński. Consistent algorithms for multi-label classification with macro-at-k metrics. ICLR 2024.](https://arxiv.org/abs/2401.16594)
50+
The library also implements a set of methods for calculating the metrics based on both the confusion matrix and the true and predicted labels. Implemented in [`xcolumns.confusion_matrix`](../api/confusion_matrix) and [`xcolumns.metrics`](../api/metrics) modules.

docs/intro/quick_start.md

+1-1
Original file line numberDiff line numberDiff line change
@@ -16,4 +16,4 @@ However, the PyTorch is not a required dependency, so you need to install it sep
1616

1717
## Usage
1818

19-
We provide a short usage guide for the library in [short_usage_guide.ipynb](https://github.com/mwydmuch/xCOLUMNs/blob/master/short_usage_guide.ipynb) notebook.
19+
We provide a short usage guide (with examples) for the library in [short_usage_guide.ipynb](https://github.com/mwydmuch/xCOLUMNs/blob/master/short_usage_guide.ipynb) notebook.

short_usage_guide.ipynb

+2-3
Original file line numberDiff line numberDiff line change
@@ -96,7 +96,7 @@
9696
"metadata": {},
9797
"outputs": [],
9898
"source": [
99-
"!pip install sklearn"
99+
"!pip install sklearn matplotlib"
100100
]
101101
},
102102
{
@@ -152,6 +152,7 @@
152152
"metadata": {},
153153
"outputs": [],
154154
"source": [
155+
"# Cast the data to the desired type\n",
155156
"target_type = \"csr_matrix\"\n",
156157
"\n",
157158
"if target_type == \"torch\":\n",
@@ -267,7 +268,6 @@
267268
"metadata": {},
268269
"outputs": [],
269270
"source": [
270-
"# Test top-k prediction\n",
271271
"from xcolumns.weighted_prediction import predict_top_k\n",
272272
"\n",
273273
"y_pred = predict_top_k(y_proba_test, k=3)\n",
@@ -312,7 +312,6 @@
312312
"metadata": {},
313313
"outputs": [],
314314
"source": [
315-
"# Frank Wolfe\n",
316315
"from xcolumns.frank_wolfe import find_classifier_using_fw\n",
317316
"\n",
318317
"rnd_clf, meta = find_classifier_using_fw(\n",

xcolumns/__init__.py

+1-1
Original file line numberDiff line numberDiff line change
@@ -1 +1 @@
1-
__version__ = "0.0.2"
1+
__version__ = "0.0.3"
