ValueError: Please make sure to properly initialize your accelerator via accelerator = Accelerator() before using any functionality from the accelerate library. #4

Open
Git-ycy opened this issue Jan 26, 2025 · 2 comments

Comments

Git-ycy commented Jan 26, 2025

I copied ../results/sft_crawler directly to ../output/sft_crawler and got the following error:
Some weights of Qwen2ForSequenceClassification were not initialized from the model checkpoint at ../output/sft_crawler and are newly initialized: ['score.0.weight', 'score.1.weight']
How do I obtain ../output/sft_crawler?
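
One way to see what this message refers to is to list which parameter names the copied checkpoint actually stores. The following is a minimal sketch, assuming the checkpoint is sharded and contains a `model.safetensors.index.json` file with a `weight_map` entry (a single-file checkpoint has no such index, and the function then raises). The helper name `missing_keys` is hypothetical and not part of the repository.

```python
import json
from pathlib import Path


def missing_keys(checkpoint_dir, expected):
    """Return the names in `expected` that the checkpoint index does not list."""
    index_path = Path(checkpoint_dir) / "model.safetensors.index.json"
    if not index_path.is_file():
        # Assumption: only sharded checkpoints ship this index file.
        raise FileNotFoundError(f"no sharded index found at {index_path}")
    with index_path.open(encoding="utf-8") as fh:
        weight_map = json.load(fh).get("weight_map", {})
    # Keep the caller's order so the result is easy to compare with the warning.
    return [name for name in expected if name not in weight_map]
```

For example, `missing_keys("../output/sft_crawler", ["score.0.weight", "score.1.weight"])` returning both names would agree with the message above: the checkpoint holds no `score.*` weights, so they are newly initialized when the model is loaded as Qwen2ForSequenceClassification.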

hyc2026 (Collaborator) commented Jan 27, 2025

You need to run SFT training first to obtain sft_crawler:
https://github.com/bytedance/pasa?tab=readme-ov-file#crawler-sft-training

Git-ycy (Author) commented Jan 29, 2025

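../results/sft_crawler is already the sft_crawler produced by SFT training, and I copied it to ../output/sft_crawler.
However, besides the error above, I ran into the following, more serious problem during PPO training: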
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/student/ycy/RL-Train/trl/examples/scripts/ppo/ppo_tldr.py", line 71, in <module>
[rank0]: trainer = FixZero3CheckpointPPOTrainer(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/student/ycy/RL-Train/trl/trl/trainer/ppo_trainer.py", line 145, in __init__
[rank0]: accelerator = Accelerator(gradient_accumulation_steps=args.gradient_accumulation_steps)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/student/.conda/envs/rltrain/lib/python3.12/site-packages/accelerate/accelerator.py", line 302, in __init__
[rank0]: deepspeed_plugins = AcceleratorState().deepspeed_plugins
[rank0]: ^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/student/.conda/envs/rltrain/lib/python3.12/site-packages/accelerate/state.py", line 887, in __init__
[rank0]: raise ValueError(
[rank0]: ValueError: Please make sure to properly initialize your accelerator via accelerator = Accelerator() before using any functionality from the accelerate library.

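So I modified the following part of ppo_tldr.py:

[Image: screenshot of the modified code]

After that, the following error appeared. How can I resolve it?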
[rank0]: Traceback (most recent call last):
[rank0]: File "/home/student/ycy/RL-Train/trl/examples/scripts/ppo/ppo_tldr.py", line 71, in <module>
[rank0]: trainer = FixZero3CheckpointPPOTrainer(
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/student/ycy/RL-Train/trl/trl/trainer/ppo_trainer.py", line 145, in __init__
[rank0]: accelerator = Accelerator(gradient_accumulation_steps=args.gradient_accumulation_steps)
[rank0]: ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/student/.conda/envs/rltrain/lib/python3.12/site-packages/accelerate/accelerator.py", line 302, in __init__
[rank0]: deepspeed_plugins = AcceleratorState().deepspeed_plugins
[rank0]: ^^^^^^^^^^^^^^^^^^
[rank0]: File "/home/student/.conda/envs/rltrain/lib/python3.12/site-packages/accelerate/state.py", line 887, in __init__
[rank0]: raise ValueError(
[rank0]: ValueError: Please make sure to properly initialize your accelerator via accelerator = Accelerator() before using any functionality from the accelerate library.

My training parameters are:

accelerate launch \
    --config_file examples/accelerate_configs/deepspeed_zero3.yaml \
    --num_processes 2 \
    --main_process_port 2501 \
    --machine_rank 0 \
    --main_process_ip 127.0.0.1 \
    examples/scripts/ppo/ppo_tldr.py \
    --dataset_name ../data/AutoScholarQuery/train.jsonl \
    --dataset_test_split validation \
    --output_dir ../results/ppo_crawler \
    --learning_rate 1e-6 \
    --per_device_train_batch_size 1 \
    --gradient_accumulation_steps 4 \
    --total_episodes 16000 \
    --paper_db ../data/paper_database/cs_paper_2nd.zip \
    --paper_id ../data/paper_database/id2paper.json \
    --model_name_or_path ../output/sft_crawler \
    --sft_model_path ../output/sft_crawler \
    --reward_model_path ../output/sft_crawler \
    --local_rollout_forward_batch_size 4 \
    --num_sample_generations 0 \
    --attn_implementation "flash_attention_2" \
    --response_length 1024 \
    --stop_token eos \
    --gamma1 0.1 \
    --save_steps 10 \
    --rounds 3 \
    --use_vm True \
    --use_selector False \
    --vf_coef 10.0 \
    --expand_select_score 1.5 \
    --expand_cost 0.1 \
    --search_select_score 1.5 \
    --search_cost 0.1 \
    --num_ppo_epochs 2 \
    --kl_coef 0.1

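The relevant library versions are as follows (I have already tried downgrading accelerate to 0.28.0 and 0.27.2, and changing transformers to 4.48.1, but the same error still occurs):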
accelerate 1.3.0
transformers 4.47.0.dev0
trl 0.12.0.dev0
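
Since several versions were swapped in and out, it may help to confirm which versions the interpreter used by `accelerate launch` actually sees. A minimal sketch using only the standard library; the helper name `report_versions` is hypothetical, and including `deepspeed` in the list is an assumption based on the deepspeed_zero3.yaml config above (its version was not reported here).

```python
from importlib import metadata


def report_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    # "deepspeed" is an assumed addition; the other names come from the list above.
    names = ["accelerate", "transformers", "trl", "deepspeed"]
    for name, version in report_versions(names).items():
        print(f"{name} {version or 'not installed'}")
```

Running this inside the rltrain conda environment should show whether an editable or leftover install is shadowing the version that was meant to be tested.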

Git-ycy changed the title from "What is ../output/sft_crawler in Crawler PPO Training" to "ValueError: Please make sure to properly initialize your accelerator via accelerator = Accelerator() before using any functionality from the accelerate library." on Jan 29, 2025