
Update Domino for Llama3 #959

Open · shenzheyu wants to merge 8 commits into base: master

Conversation

shenzheyu (Contributor)

No description provided.

@GuanhuaWang commented:

@hwchen2017, please follow up on this PR. Thank you!

zhangsmallshark and others added 7 commits March 5, 2025 17:57

* add domino

* use transformer from deepspeed

* clean args

* mega opt

* add opt & timer

* add opt

* fix loss

* folder name

* Change argument in pretrain script

* Add readme for domino

* Update readme for domino

* Fixing usage issues

* update dataset

* megatron dependencies

* path

* Update README.md

* remove imports

* update import

* Update README.md

* Minor example script changes

* train bash

* require

* Update README.md

---------

Co-authored-by: chengming-zhang <[email protected]>
Co-authored-by: Zheyu SHEN <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Co-authored-by: Logan Adams <[email protected]>
Signed-off-by: Zheyu SHEN <[email protected]>
* add benchmarking for offloading states

* fix api names

Signed-off-by: Zheyu SHEN <[email protected]>
* Add label_smoothing while calculating step2 DPO loss in DeepSpeed-Chat.

* Add training scripts for step2 DPO in DeepSpeed-Chat.

* Remove unused packages and format the code of step2 DPO in DeepSpeed-Chat.

* Update training scripts of step2 DPO in DeepSpeed-Chat.

* Follow upstream fixes.

* Update README.md for Step2 DPO finetuning.

* Add opt 350M training log demo for step 2 dpo finetuning in DeepSpeed-Chat.

* Address the formatting issue in step2 dpo finetuning in DeepSpeed-Chat.

---------

Co-authored-by: Logan Adams <[email protected]>
Co-authored-by: Olatunji Ruwase <[email protected]>
Signed-off-by: Zheyu SHEN <[email protected]>
Signed-off-by: Zheyu SHEN <[email protected]>
Labels: None yet
Projects: None yet
Development: Successfully merging this pull request may close these issues: None yet

7 participants