ougrid/SuperAI_LLM_FineTune

 
 

Installation

Repository preparation

git clone https://github.com/boat1603/SuperAI_LLM_FineTune.git
cd ./SuperAI_LLM_FineTune

Install using Conda

ml Mamba
conda create -p ./env python=3.10.0 -y
conda activate ./env
pip install -e .

Install using Apptainer (Optional)

ml Apptainer
apptainer build ./llm-finetune.sif docker://boat1603/llm-finetune:latest

Submit a training job

sbatch submit_multinode.sh

For Apptainer

sbatch submit_multinode_apptainer.sh

Note:

  • Change the training config in ./smultinode.sh, or in ./smultinode_apptainer.sh when using Apptainer.
  • When training with DeepSpeed, the scheduler follows the DeepSpeed config.
  • Set the training spec in ./submit_multinode.sh or ./submit_multinode_apptainer.sh, following our guideline.
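To illustrate the note about the scheduler, here is a minimal sketch of a DeepSpeed config with a scheduler section. The scheduler type and its parameter names come from DeepSpeed's config schema; the concrete values and the ZeRO stage are hypothetical and are not taken from this repository's config.

```python
import json

# Hypothetical example values; the repository's actual DeepSpeed config may differ.
ds_config = {
    "zero_optimization": {"stage": 2},
    "scheduler": {
        # WarmupDecayLR warms the learning rate up, then decays it linearly.
        "type": "WarmupDecayLR",
        "params": {
            "warmup_min_lr": 0.0,
            "warmup_max_lr": 2e-5,
            "warmup_num_steps": 100,
            "total_num_steps": 1000,
        },
    },
}

# Print the config as JSON, the format DeepSpeed reads from a config file.
print(json.dumps(ds_config, indent=2))
```

When a section like this is present, the learning-rate schedule is defined by it, so it is the place to look when the schedule does not match the values in the submit script.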

Convert DeepSpeed checkpoint to FP32

sbatch ./submit_zero_to_fp32.sh
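For reference, the same kind of conversion can be sketched in Python with DeepSpeed's `zero_to_fp32` utility. This is an assumption about what the submit script does, not a copy of it; the helper name `convert_to_fp32` and the paths in the usage note are hypothetical.

```python
from pathlib import Path


def convert_to_fp32(checkpoint_dir: str, output_file: str) -> None:
    """Consolidate a DeepSpeed ZeRO checkpoint into one FP32 state dict file.

    checkpoint_dir is the directory DeepSpeed saved into (hypothetical helper).
    """
    ckpt = Path(checkpoint_dir)
    if not ckpt.is_dir():
        raise FileNotFoundError(f"checkpoint directory not found: {ckpt}")

    # Imported lazily so the path check above works without torch or deepspeed.
    import torch
    from deepspeed.utils.zero_to_fp32 import (
        get_fp32_state_dict_from_zero_checkpoint,
    )

    # Gather the sharded ZeRO weights into a single FP32 state dict on CPU.
    state_dict = get_fp32_state_dict_from_zero_checkpoint(str(ckpt))
    torch.save(state_dict, output_file)
```

A call such as `convert_to_fp32("./output/checkpoint", "./output/pytorch_model.bin")` would write the consolidated weights. The conversion runs on CPU and needs enough host memory to hold the full model in FP32.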
