This repository has been archived by the owner on Mar 19, 2024. It is now read-only.

supervised tutorial and autotune documentation with python tabs
Summary: Docusaurus now allows multiple language tabs. This commit adds a python tab for supervised and autotune examples. It also includes a snippet that activates the same language tab for the whole page.

Reviewed By: EdouardGrave

Differential Revision: D17091834

fbshipit-source-id: 6e6f76aa9408baa08fcd6c0bfd011de2cb477dfb
Celebio authored and facebook-github-bot committed Aug 28, 2019
1 parent 4aca28c commit 38350a5
Showing 4 changed files with 365 additions and 10 deletions.
68 changes: 65 additions & 3 deletions docs/autotune.md
@@ -13,28 +13,55 @@ In order to activate hyperparameter optimization, we must provide a validation file

For example, using the same data as our [tutorial example](/docs/en/supervised-tutorial.html#our-first-classifier), the autotune can be used in the following way:

<!--DOCUSAURUS_CODE_TABS-->
<!--Command line-->
```sh
>> ./fasttext supervised -input cooking.train -output model_cooking -autotune-validation cooking.valid
```
<!--Python-->
```py
>>> import fasttext
>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid')
```
<!--END_DOCUSAURUS_CODE_TABS-->


Then, fastText will search for the hyperparameters that give the best f1-score on the `cooking.valid` file:
```sh
Progress: 100.0% Trials: 27 Best score: 0.406763 ETA: 0h 0m 0s
```
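Each trial in the progress line above is one full training run with a candidate set of hyperparameters, scored against the validation file. The following is only a toy sketch of such a search loop, with a hypothetical stand-in objective instead of a real fastText training run; fastText's actual search strategy is more sophisticated than plain random sampling:

```python
import random

def autotune_sketch(objective, search_space, trials=27, seed=0):
    """Random search: sample candidate hyperparameters, keep the best score."""
    rng = random.Random(seed)
    best_score, best_params = float("-inf"), None
    for _ in range(trials):
        # Draw one candidate value for each hyperparameter.
        params = {name: rng.choice(values) for name, values in search_space.items()}
        score = objective(params)  # in reality: f1-score on the validation file
        if score > best_score:
            best_score, best_params = score, params
    return best_score, best_params

space = {"lr": [0.05, 0.1, 0.5], "epoch": [5, 25, 50], "wordNgrams": [1, 2, 3]}
# Stand-in objective; a real run would train and evaluate a model here:
score, params = autotune_sketch(lambda p: p["lr"] * p["epoch"] / 10, space)
```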

Now we can test the obtained model with:
<!--DOCUSAURUS_CODE_TABS-->
<!--Command line-->
```sh
>> ./fasttext test model_cooking.bin cooking.valid
N 3000
P@1 0.666
R@1 0.288
```
<!--Python-->
```py
>>> model.test("cooking.valid")
(3000L, 0.666, 0.288)
```
<!--END_DOCUSAURUS_CODE_TABS-->
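`P@1` and `R@1` are precision and recall at one. As a rough, self-contained illustration of what those numbers mean (a hypothetical helper, not fastText's implementation): precision@k asks what fraction of the predicted labels are correct, recall@k what fraction of the true labels were retrieved:

```python
def precision_recall_at_k(true_labels, predicted, k=1):
    """Micro-averaged precision@k and recall@k over a validation set.

    true_labels: one set of gold labels per example.
    predicted:   one ranked list of predicted labels per example.
    """
    hits = retrieved = relevant = 0
    for gold, ranked in zip(true_labels, predicted):
        topk = ranked[:k]
        hits += sum(1 for label in topk if label in gold)
        retrieved += len(topk)
        relevant += len(gold)
    return hits / retrieved, hits / relevant

# Tiny made-up validation set with cooking-style labels:
gold = [{"__label__baking", "__label__bread"}, {"__label__equipment"}]
pred = [["__label__baking", "__label__food-safety"], ["__label__bread"]]
p1, r1 = precision_recall_at_k(gold, pred, k=1)
print(p1, r1)  # → 0.5 0.3333333333333333
```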


By default, the search will take 5 minutes. You can set the timeout in seconds with the `-autotune-duration` argument. For example, if you want to set the limit to 10 minutes:

<!--DOCUSAURUS_CODE_TABS-->
<!--Command line-->
```sh
>> ./fasttext supervised -input cooking.train -output model_cooking -autotune-validation cooking.valid -autotune-duration 600
```
<!--Python-->
```py
>>> import fasttext
>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid', autotuneDuration=600)
```
<!--END_DOCUSAURUS_CODE_TABS-->


While autotuning, fastText displays the best f1-score found so far. If we decide to stop the tuning before the time limit, we can send a `SIGINT` signal (via `CTRL-C`, for example). FastText will then finish the current training, and retrain with the best parameters found so far.

@@ -46,23 +73,42 @@ As you may know, fastText can compress the model with [quantization](/docs/en/ch

Fortunately, autotune can also find the hyperparameters for this compression task while targeting the desired model size. To this end, we can set the `-autotune-modelsize` argument:

<!--DOCUSAURUS_CODE_TABS-->
<!--Command line-->
```sh
>> ./fasttext supervised -input cooking.train -output model_cooking -autotune-validation cooking.valid -autotune-modelsize 2M
```

This will produce a `.ftz` file with the best accuracy having the desired size:
```sh
>> ls -la model_cooking.ftz
-rw-r--r--. 1 celebio users 1990862 Aug 25 05:39 model_cooking.ftz
>> ./fasttext test model_cooking.ftz cooking.valid
N 3000
P@1 0.57
R@1 0.246
```
<!--Python-->
```py
>>> import fasttext
>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid', autotuneModelSize="2M")
```
If you save the model, you will obtain a model file with the desired size:
```py
>>> model.save_model("model_cooking.ftz")
>>> import os
>>> os.stat("model_cooking.ftz").st_size
1990862
>>> model.test("cooking.valid")
(3000L, 0.57, 0.246)
```
<!--END_DOCUSAURUS_CODE_TABS-->
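The saved `.ftz` file above (1990862 bytes) indeed fits the requested `2M` budget. A minimal sketch of that size check, assuming the suffix is read decimally (`K` = 10^3, `M` = 10^6; the exact convention fastText applies is an assumption here):

```python
def parse_size(text):
    """Parse a size string such as '2M' or '500K' into a byte count."""
    suffixes = {"K": 10**3, "M": 10**6, "G": 10**9}
    suffix = text[-1].upper()
    if suffix in suffixes:
        return int(float(text[:-1]) * suffixes[suffix])
    return int(text)

budget = parse_size("2M")
actual = 1990862  # st_size of model_cooking.ftz reported above
print(actual <= budget)  # → True
```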


# How to set the optimization metric?

<!--DOCUSAURUS_CODE_TABS-->
<!--Command line-->
<br />
By default, autotune tests the validation file you provide, exactly the same way as `./fasttext test model_cooking.bin cooking.valid`, and optimizes for the highest [f1-score](https://en.wikipedia.org/wiki/F1_score).

But, if we want to optimize the score of a specific label, say `__label__baking`, we can set the `-autotune-metric` argument:
@@ -74,3 +120,19 @@ But, if we want to optimize the score of a specific label, say `__label__baking`
This is equivalent to manually optimizing the f1-score we get when testing with `./fasttext test-label model_cooking.bin cooking.valid | grep __label__baking` in the command line.

Sometimes, you may be interested in predicting more than one label. For example, if you were optimizing the hyperparameters manually to get the best score when predicting two labels, you would test with `./fasttext test model_cooking.bin cooking.valid 2`. You can also tell autotune to optimize the parameters by testing two labels with the `-autotune-predictions` argument.
<!--Python-->
<br />
By default, autotune tests the validation file you provide, exactly the same way as `model.test("cooking.valid")`, and optimizes for the highest [f1-score](https://en.wikipedia.org/wiki/F1_score).

But, if we want to optimize the score of a specific label, say `__label__baking`, we can set the `autotuneMetric` argument:

```py
>>> import fasttext
>>> model = fasttext.train_supervised(input='cooking.train', autotuneValidationFile='cooking.valid', autotuneMetric="f1:__label__baking")
```

This is equivalent to manually optimizing the f1-score we get when testing with `model.test_label('cooking.valid')['__label__baking']`.

Sometimes, you may be interested in predicting more than one label. For example, if you were optimizing the hyperparameters manually to get the best score when predicting two labels, you would test with `model.test("cooking.valid", k=2)`. You can also tell autotune to optimize the parameters by testing two labels with the `autotunePredictions` argument.
<!--END_DOCUSAURUS_CODE_TABS-->
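Both tabs above optimize the same quantity: the f1-score of one label, derived from that label's precision and recall. A small sketch of that derivation, using made-up numbers shaped like a per-label report (the exact structure `test_label` returns is not reproduced here):

```python
def label_f1(per_label, label):
    """f1-score for one label from a {label: {'precision': p, 'recall': r}} mapping."""
    p = per_label[label]["precision"]
    r = per_label[label]["recall"]
    return 0.0 if p + r == 0 else 2 * p * r / (p + r)

# Hypothetical per-label scores for the cooking validation set:
scores = {"__label__baking": {"precision": 0.6, "recall": 0.3}}
print(round(label_f1(scores, "__label__baking"), 3))  # → 0.4
```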

