Please note that some of the script examples (e.g., the pretrain_*.sh scripts directly under the ```Megatron-DeepSpeed/examples/``` folder) come from NVIDIA's original Megatron-LM and do not have DeepSpeed integration (scripts with DeepSpeed integration should include the ```deepspeed``` keyword). Below we list various examples that do have DeepSpeed integration.
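For instance, a quick way to tell the two kinds of scripts apart is to search for the ```deepspeed``` keyword. A minimal sketch, assuming GNU grep and that you run it from the repo root:

```bash
# List example scripts that mention "deepspeed" (i.e., have DeepSpeed integration)
grep -rl "deepspeed" examples/ --include="*.sh"

# List example scripts that never mention it (original Megatron-LM scripts)
grep -rL "deepspeed" examples/ --include="*.sh"
```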
### Azure
We strongly recommend starting with the AzureML recipe in the ```azureml``` folder.
If you have custom infrastructure (e.g., HPC clusters) or an Azure VM/VMSS based environment, please refer to the bash scripts in the ```azure``` folder.
### MoE
Please see the ```MoE``` folder for various training recipes and scripts for Mixture-of-Experts (MoE) based models as well as dense models. These recipes are for GPT-style NLG models.
### Data Efficiency
The ```data_efficiency``` folder includes GPT-3 and BERT pretraining examples for the DeepSpeed Data Efficiency Library. Please refer to the detailed tutorials in ```data_efficiency/README.MD```.
### Curriculum Learning
Curriculum learning recipes are in the ```curriculum_learning``` folder. Please refer to the detailed tutorials linked inside. These recipes are for GPT-style NLG models.
Note that the DeepSpeed Data Efficiency Library above includes more general curriculum learning support. This legacy curriculum learning feature still works, but we recommend using the DeepSpeed Data Efficiency Library instead.
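As a rough illustration, the legacy feature is driven by a ```curriculum_learning``` section in the DeepSpeed config JSON. The sketch below follows the key names from the DeepSpeed curriculum learning tutorial; the ```ds_config.json``` filename, the values, and the launch command are placeholders, and the actual recipes in this folder set many more options:

```bash
# Hypothetical sketch: write a DeepSpeed config enabling the legacy
# sequence-length based curriculum, then point the pretraining script at it.
cat > ds_config.json <<'EOF'
{
  "train_batch_size": 256,
  "curriculum_learning": {
    "enabled": true,
    "curriculum_type": "seqlen",
    "min_difficulty": 64,
    "max_difficulty": 1024,
    "schedule_type": "fixed_linear",
    "schedule_config": {
      "total_curriculum_step": 15000,
      "difficulty_step": 8
    }
  }
}
EOF

# Launch with the config (plus the usual Megatron-LM arguments, omitted here)
deepspeed pretrain_gpt.py --deepspeed --deepspeed_config ds_config.json
```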
### Model Compression
The ```compression``` folder includes examples of layer reduction for task-agnostic compression. Please refer to [this tutorial](https://www.deepspeed.ai/tutorials/model-compression/#11-layer-reduction) on the DeepSpeed Model Compression Library. These recipes are for GPT-style NLG models.
### BERT Example
The ```bert_with_pile``` folder includes examples of BERT-style model pre-training (using the public Pile data or your own data) with DeepSpeed integration. Please refer to the README in that folder for a tutorial.
Finally, note that the example scripts that come from the original NVIDIA/Megatron-LM repo do NOT have DeepSpeed technologies integrated, and some of them may not work due to changes in this Megatron-DeepSpeed repo. Thus we recommend going to the ```../examples_deepspeed/``` folder, which includes examples that have DeepSpeed technologies integrated and are tested by the DeepSpeed team.