Adding support for hpu device #2595

Open
wants to merge 8 commits into
base: main
Conversation

arathi-hlab
Contributor

Adds support for a new device named 'hpu'.
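For context, test names of the form `test_<model>_<mode>_<device>` can be generated per device. The following is a minimal sketch of that idea, not the code in this PR: `DEVICES`, `MODELS`, `MODES`, and `run_model` are hypothetical names chosen for illustration. It also shows one way a `NotImplementedError` could be reported as a skip instead of an error.

```python
import unittest

# Hypothetical names for illustration only; the real benchmark suite may differ.
DEVICES = ["cpu", "cuda", "hpu"]
MODELS = ["alexnet", "moco"]
MODES = ["train", "eval"]


def run_model(model: str, mode: str, device: str) -> None:
    """Stand-in for running one benchmark; raises if the combination is unsupported."""
    if model == "moco" and device == "hpu":
        raise NotImplementedError("hpu not supported")


class TestBenchmark(unittest.TestCase):
    """Test methods are attached dynamically below."""


def _make_test(model: str, mode: str, device: str):
    def test(self):
        try:
            run_model(model, mode, device)
        except NotImplementedError as e:
            # Report unsupported combinations as skips rather than errors.
            self.skipTest(
                f'Method {mode} on {device} is not implemented because "{e}", skipping...'
            )
    return test


# One test method per (model, mode, device) combination.
for _model in MODELS:
    for _mode in MODES:
        for _device in DEVICES:
            setattr(
                TestBenchmark,
                f"test_{_model}_{_mode}_{_device}",
                _make_test(_model, _mode, _device),
            )
```

With this layout, adding a device name to the list is enough to produce a new family of `_hpu` tests; whether the actual suite works this way is an assumption.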

@bsochack bsochack left a comment

All my comments were resolved. It still needs to be reviewed by a maintainer.

@jeromean jeromean left a comment

Can you paste a snapshot of the results?

@arathi-hlab
Contributor Author

@jeromean, here are the results for both train and eval modes:

test_BERT_pytorch_train_hpu (main.TestBenchmark) ... ok
test_Background_Matting_train_hpu (main.TestBenchmark) ... ERROR
test_LearningToPaint_train_hpu (main.TestBenchmark) ... ok
test_Super_SloMo_train_hpu (main.TestBenchmark) ... ok
test_alexnet_train_hpu (main.TestBenchmark) ... ok
test_basic_gnn_edgecnn_train_hpu (main.TestBenchmark) ... ERROR
test_basic_gnn_gcn_train_hpu (main.TestBenchmark) ... ok
test_basic_gnn_gin_train_hpu (main.TestBenchmark) ... ok
test_basic_gnn_sage_train_hpu (main.TestBenchmark) ... ok
test_cm3leon_generate_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_dcgan_train_hpu (main.TestBenchmark) ... ok
test_demucs_train_hpu (main.TestBenchmark) ... ok
test_densenet121_train_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_101_c4_train_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_101_dc5_train_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_101_fpn_train_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_50_c4_train_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_50_dc5_train_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_50_fpn_train_hpu (main.TestBenchmark) ... ok
test_detectron2_fcos_r_50_fpn_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "FCOS train is not supported by upstream detectron2. See GH Issue: facebookresearch/detectron2#4369.", skipping...'
test_detectron2_maskrcnn_r_101_c4_train_hpu (main.TestBenchmark) ... ok
test_detectron2_maskrcnn_r_101_fpn_train_hpu (main.TestBenchmark) ... ok
test_detectron2_maskrcnn_r_50_c4_train_hpu (main.TestBenchmark) ... ok
test_detectron2_maskrcnn_r_50_fpn_train_hpu (main.TestBenchmark) ... ok
test_detectron2_maskrcnn_train_hpu (main.TestBenchmark) ... ok
test_dlrm_train_hpu (main.TestBenchmark) ... ok
test_doctr_det_predictor_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_doctr_reco_predictor_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_drq_train_hpu (main.TestBenchmark) ... ok
test_fastNLP_Bert_train_hpu (main.TestBenchmark) ... ERROR
test_functorch_dp_cifar10_train_hpu (main.TestBenchmark) ... ok
test_functorch_maml_omniglot_train_hpu (main.TestBenchmark) ... ok
test_hf_Albert_train_hpu (main.TestBenchmark) ... ok
test_hf_Bart_train_hpu (main.TestBenchmark) ... ok
test_hf_Bert_large_train_hpu (main.TestBenchmark) ... ok
test_hf_Bert_train_hpu (main.TestBenchmark) ... ok
test_hf_BigBird_train_hpu (main.TestBenchmark) ... ok
test_hf_DistilBert_train_hpu (main.TestBenchmark) ... ok
test_hf_GPT2_large_train_hpu (main.TestBenchmark) ... ok
test_hf_GPT2_train_hpu (main.TestBenchmark) ... ok
test_hf_Longformer_train_hpu (main.TestBenchmark) ... ok
test_hf_Reformer_train_hpu (main.TestBenchmark) ... ok
test_hf_Roberta_base_train_hpu (main.TestBenchmark) ... ok
test_hf_T5_base_train_hpu (main.TestBenchmark) ... ok
test_hf_T5_generate_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_hf_T5_large_train_hpu (main.TestBenchmark) ... ok
test_hf_T5_train_hpu (main.TestBenchmark) ... ok
test_hf_Whisper_train_hpu (main.TestBenchmark) ... skipped 'This test is skipped by its metadata'
test_hf_clip_train_hpu (main.TestBenchmark) ... ok
test_hf_distil_whisper_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Training is not implemented.", skipping...'
test_lennard_jones_train_hpu (main.TestBenchmark) ... ok
test_llama_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_llama_v2_7b_16h_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Make sure to set HUGGING_FACE_HUB_TOKEN so you can download weights", skipping...'
test_llava_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_maml_omniglot_train_hpu (main.TestBenchmark) ... ok
test_maml_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "MAML model doesn't support train.", skipping...'
test_microbench_unbacked_tolist_sum_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Train test is not implemented.", skipping...'
test_mnasnet1_0_train_hpu (main.TestBenchmark) ... ok
test_mobilenet_v2_train_hpu (main.TestBenchmark) ... ok
test_mobilenet_v3_large_train_hpu (main.TestBenchmark) ... ok
test_moco_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "hpu not supported", skipping...'
test_moondream_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_nanogpt_train_hpu (main.TestBenchmark) ... ok
test_nvidia_deeprecommender_train_hpu (main.TestBenchmark) ... ERROR
test_opacus_cifar10_train_hpu (main.TestBenchmark) ... ok
test_phlippe_densenet_train_hpu (main.TestBenchmark) ... ok
test_phlippe_resnet_train_hpu (main.TestBenchmark) ... ok
test_pyhpc_equation_of_state_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_pyhpc_isoneutral_mixing_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_pyhpc_turbulent_kinetic_energy_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_pytorch_CycleGAN_and_pix2pix_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "DataParallel is currently not supported on HPU. Please use torch.nn.DistributedDataParallel instead.", skipping...'
test_pytorch_stargan_train_hpu (main.TestBenchmark) ... ok
test_pytorch_unet_train_hpu (main.TestBenchmark) ... ok
test_resnet152_train_hpu (main.TestBenchmark) ... ok
test_resnet18_train_hpu (main.TestBenchmark) ... ok
test_resnet50_train_hpu (main.TestBenchmark) ... ok
test_resnext50_32x4d_train_hpu (main.TestBenchmark) ... ok
test_sam_fast_train_hpu (main.TestBenchmark) ... ERROR
test_sam_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_shufflenet_v2_x1_0_train_hpu (main.TestBenchmark) ... ok
test_simple_gpt_tp_manual_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_simple_gpt_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Model's DEFAULT_TRAIN_BSIZE is not implemented.", skipping...'
test_soft_actor_critic_train_hpu (main.TestBenchmark) ... ok
test_speech_transformer_train_hpu (main.TestBenchmark) ... ok
test_squeezenet1_1_train_hpu (main.TestBenchmark) ... ok
test_stable_diffusion_text_encoder_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Make sure to set HUGGING_FACE_HUB_TOKEN so you can download weights", skipping...'
test_stable_diffusion_unet_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "Make sure to set HUGGING_FACE_HUB_TOKEN so you can download weights", skipping...'
test_tacotron2_train_hpu (main.TestBenchmark) ... ok
test_timm_efficientdet_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "The original model code forces the use of CUDA.", skipping...'
test_timm_efficientnet_train_hpu (main.TestBenchmark) ... ok
test_timm_nfnet_train_hpu (main.TestBenchmark) ... ok
test_timm_regnet_train_hpu (main.TestBenchmark) ... ok
test_timm_resnest_train_hpu (main.TestBenchmark) ... ok
test_timm_vision_transformer_large_train_hpu (main.TestBenchmark) ... ok
test_timm_vision_transformer_train_hpu (main.TestBenchmark) ... ok
test_timm_vovnet_train_hpu (main.TestBenchmark) ... ok
test_torch_multimodal_clip_train_hpu (main.TestBenchmark) ... ok
test_tts_angular_train_hpu (main.TestBenchmark) ... ok
test_vgg16_train_hpu (main.TestBenchmark) ... ok
test_vision_maskrcnn_train_hpu (main.TestBenchmark) ... ok
test_yolov3_train_hpu (main.TestBenchmark) ... ok

Ran 101 tests in 860.358s

FAILED (errors=5, skipped=24)
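To check totals like these, or to see which skip reasons dominate, a log in this format can be tallied with a short script. This is a rough sketch that assumes each result line looks exactly like the ones pasted here; the `tally` helper and the sample text are illustrative and not part of the benchmark suite.

```python
import re
from collections import Counter

# Matches lines such as: test_alexnet_train_hpu (main.TestBenchmark) ... ok
LINE_RE = re.compile(r"^(test_\w+) \([\w.]+\) \.\.\. (ok|ERROR|FAIL|skipped)\b(.*)$")


def tally(log_text: str):
    """Return (status counts, skip-reason counts, names of erroring tests)."""
    statuses, reasons, errors = Counter(), Counter(), []
    for line in log_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # ignore summary lines and blanks
        name, status, rest = m.groups()
        statuses[status] += 1
        if status == "ERROR":
            errors.append(name)
        elif status == "skipped":
            # Pull out the inner double-quoted reason when there is one.
            inner = re.search(r'because "(.*)", skipping', rest)
            reasons[inner.group(1) if inner else rest.strip(" '")] += 1
    return statuses, reasons, errors


sample = """\
test_alexnet_train_hpu (main.TestBenchmark) ... ok
test_sam_fast_train_hpu (main.TestBenchmark) ... ERROR
test_moco_train_hpu (main.TestBenchmark) ... skipped 'Method train on hpu is not implemented because "hpu not supported", skipping...'
test_hf_Whisper_train_hpu (main.TestBenchmark) ... skipped 'This test is skipped by its metadata'
"""
statuses, reasons, errors = tally(sample)
print(dict(statuses), errors)
print(reasons.most_common())
```

Running it over the full train log should reproduce the reported counts if the line format assumption holds.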

========== EVAL MODE ==========
test_BERT_pytorch_eval_hpu (main.TestBenchmark) ... ok
test_Background_Matting_eval_hpu (main.TestBenchmark) ... skipped 'Method eval on hpu is not implemented because "", skipping...'
test_LearningToPaint_eval_hpu (main.TestBenchmark) ... ok
test_Super_SloMo_eval_hpu (main.TestBenchmark) ... ok
test_alexnet_eval_hpu (main.TestBenchmark) ... ok
test_basic_gnn_edgecnn_eval_hpu (main.TestBenchmark) ... ok
test_basic_gnn_gcn_eval_hpu (main.TestBenchmark) ... ok
test_basic_gnn_gin_eval_hpu (main.TestBenchmark) ... ok
test_basic_gnn_sage_eval_hpu (main.TestBenchmark) ... ok
test_cm3leon_generate_eval_hpu (main.TestBenchmark) ... ok
test_dcgan_eval_hpu (main.TestBenchmark) ... ok
test_demucs_eval_hpu (main.TestBenchmark) ... ok
test_densenet121_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_101_c4_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_101_dc5_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_101_fpn_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_50_c4_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_50_dc5_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_fasterrcnn_r_50_fpn_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_fcos_r_50_fpn_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_maskrcnn_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_maskrcnn_r_101_c4_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_maskrcnn_r_101_fpn_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_maskrcnn_r_50_c4_eval_hpu (main.TestBenchmark) ... ok
test_detectron2_maskrcnn_r_50_fpn_eval_hpu (main.TestBenchmark) ... ok
test_dlrm_eval_hpu (main.TestBenchmark) ... ok
test_doctr_det_predictor_eval_hpu (main.TestBenchmark) ... ERROR
test_doctr_reco_predictor_eval_hpu (main.TestBenchmark) ... ERROR
test_drq_eval_hpu (main.TestBenchmark) ... ok
test_fastNLP_Bert_eval_hpu (main.TestBenchmark) ... ERROR
test_functorch_dp_cifar10_eval_hpu (main.TestBenchmark) ... ok
test_functorch_maml_omniglot_eval_hpu (main.TestBenchmark) ... ok
test_hf_Albert_eval_hpu (main.TestBenchmark) ... ok
test_hf_Bart_eval_hpu (main.TestBenchmark) ... ok
test_hf_Bert_eval_hpu (main.TestBenchmark) ... ok
test_hf_Bert_large_eval_hpu (main.TestBenchmark) ... ok
test_hf_BigBird_eval_hpu (main.TestBenchmark) ... ok
test_hf_DistilBert_eval_hpu (main.TestBenchmark) ... ok
test_hf_GPT2_eval_hpu (main.TestBenchmark) ... ok
test_hf_GPT2_large_eval_hpu (main.TestBenchmark) ... ok
test_hf_Longformer_eval_hpu (main.TestBenchmark) ... ok
test_hf_Reformer_eval_hpu (main.TestBenchmark) ... ok
test_hf_Roberta_base_eval_hpu (main.TestBenchmark) ... ok
test_hf_T5_base_eval_hpu (main.TestBenchmark) ... ok
test_hf_T5_eval_hpu (main.TestBenchmark) ... ok
test_hf_T5_generate_eval_hpu (main.TestBenchmark) ... ok
test_hf_T5_large_eval_hpu (main.TestBenchmark) ... ok
test_hf_Whisper_eval_hpu (main.TestBenchmark) ... ok
test_hf_clip_eval_hpu (main.TestBenchmark) ... ok
test_hf_distil_whisper_eval_hpu (main.TestBenchmark) ... ok
test_lennard_jones_eval_hpu (main.TestBenchmark) ... ok
test_llama_eval_hpu (main.TestBenchmark) ... ERROR
test_llama_v2_7b_16h_eval_hpu (main.TestBenchmark) ... skipped 'Method eval on hpu is not implemented because "Make sure to set HUGGING_FACE_HUB_TOKEN so you can download weights", skipping...'
test_llava_eval_hpu (main.TestBenchmark) ... ok
test_maml_eval_hpu (main.TestBenchmark) ... ok
test_maml_omniglot_eval_hpu (main.TestBenchmark) ... ok
test_microbench_unbacked_tolist_sum_eval_hpu (main.TestBenchmark) ... ok
test_mnasnet1_0_eval_hpu (main.TestBenchmark) ... ok
test_mobilenet_v2_eval_hpu (main.TestBenchmark) ... ok
test_mobilenet_v3_large_eval_hpu (main.TestBenchmark) ... ok
test_moco_eval_hpu (main.TestBenchmark) ... skipped 'Method eval on hpu is not implemented because "hpu not supported", skipping...'
test_moondream_eval_hpu (main.TestBenchmark) ... ok
test_nanogpt_eval_hpu (main.TestBenchmark) ... ok
test_nvidia_deeprecommender_eval_hpu (main.TestBenchmark) ... ERROR
test_opacus_cifar10_eval_hpu (main.TestBenchmark) ... ok
test_phlippe_densenet_eval_hpu (main.TestBenchmark) ... ok
test_phlippe_resnet_eval_hpu (main.TestBenchmark) ... ok
test_pyhpc_equation_of_state_eval_hpu (main.TestBenchmark) ... ok
test_pyhpc_isoneutral_mixing_eval_hpu (main.TestBenchmark) ... ok
test_pyhpc_turbulent_kinetic_energy_eval_hpu (main.TestBenchmark) ... ok
test_pytorch_CycleGAN_and_pix2pix_eval_hpu (main.TestBenchmark) ... skipped 'Method eval on hpu is not implemented because "DataParallel is currently not supported on HPU. Please use torch.nn.DistributedDataParallel instead.", skipping...'
test_pytorch_stargan_eval_hpu (main.TestBenchmark) ... ok
test_pytorch_unet_eval_hpu (main.TestBenchmark) ... ok
test_resnet152_eval_hpu (main.TestBenchmark) ... ok
test_resnet18_eval_hpu (main.TestBenchmark) ... ok
test_resnet50_eval_hpu (main.TestBenchmark) ... ok
test_resnext50_32x4d_eval_hpu (main.TestBenchmark) ... ok
test_sam_eval_hpu (main.TestBenchmark) ... ERROR
test_sam_fast_eval_hpu (main.TestBenchmark) ... ERROR
test_shufflenet_v2_x1_0_eval_hpu (main.TestBenchmark) ... ok
test_simple_gpt_eval_hpu (main.TestBenchmark) ... skipped 'Method eval on hpu is not implemented because "Model requires CUDA", skipping...'
test_simple_gpt_tp_manual_eval_hpu (main.TestBenchmark) ... skipped 'Method eval on hpu is not implemented because "Model requires CUDA", skipping...'
test_soft_actor_critic_eval_hpu (main.TestBenchmark) ... ok
test_speech_transformer_eval_hpu (main.TestBenchmark) ... ok
test_squeezenet1_1_eval_hpu (main.TestBenchmark) ... ok
test_stable_diffusion_text_encoder_eval_hpu (main.TestBenchmark) ... skipped 'Method eval on hpu is not implemented because "Make sure to set HUGGING_FACE_HUB_TOKEN so you can download weights", skipping...'
test_stable_diffusion_unet_eval_hpu (main.TestBenchmark) ... skipped 'Method eval on hpu is not implemented because "Make sure to set HUGGING_FACE_HUB_TOKEN so you can download weights", skipping...'
test_tacotron2_eval_hpu (main.TestBenchmark) ... ok
test_timm_efficientdet_eval_hpu (main.TestBenchmark) ... skipped 'Method eval on hpu is not implemented because "The original model code forces the use of CUDA.", skipping...'
test_timm_efficientnet_eval_hpu (main.TestBenchmark) ... ok
test_timm_nfnet_eval_hpu (main.TestBenchmark) ... ok
test_timm_regnet_eval_hpu (main.TestBenchmark) ... ok
test_timm_resnest_eval_hpu (main.TestBenchmark) ... ok
test_timm_vision_transformer_eval_hpu (main.TestBenchmark) ... ok
test_timm_vision_transformer_large_eval_hpu (main.TestBenchmark) ... ok
test_timm_vovnet_eval_hpu (main.TestBenchmark) ... ok
test_torch_multimodal_clip_eval_hpu (main.TestBenchmark) ... ok
test_tts_angular_eval_hpu (main.TestBenchmark) ... ok
test_vgg16_eval_hpu (main.TestBenchmark) ... ok
test_vision_maskrcnn_eval_hpu (main.TestBenchmark) ... ok
test_yolov3_eval_hpu (main.TestBenchmark) ... ok

Ran 101 tests in 974.877s

FAILED (errors=7, skipped=9)
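A quick way to see which models error in both modes is to compare the two sets of erroring models. The sketch below copies the model names marked ERROR in the train and eval listings above; it says nothing about root causes, since no tracebacks were posted.

```python
# Models marked ERROR in the train listing above.
train_errors = {
    "Background_Matting",
    "basic_gnn_edgecnn",
    "fastNLP_Bert",
    "nvidia_deeprecommender",
    "sam_fast",
}

# Models marked ERROR in the eval listing above.
eval_errors = {
    "doctr_det_predictor",
    "doctr_reco_predictor",
    "fastNLP_Bert",
    "llama",
    "nvidia_deeprecommender",
    "sam",
    "sam_fast",
}

both = sorted(train_errors & eval_errors)        # error in train and eval
train_only = sorted(train_errors - eval_errors)  # error only in train
eval_only = sorted(eval_errors - train_errors)   # error only in eval

print("both modes:", both)
print("train only:", train_only)
print("eval only:", eval_only)
```

Note that some eval-only errors (for example `llama` and `sam`) correspond to models whose train test was skipped, so the train column gives no signal for them.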

@arathi-hlab
Contributor Author

@jeromean, can you review and approve the change?

@EikanWang

@xuzhao9, I wonder if you could help review this PR?

6 participants