[Examples] Revise GS-GB-SM example (#1171)
*Issue #, if available:*

*Description of changes:*
Revise the README, change the image names, and fix a typo.

By submitting this pull request, I confirm that you can use, modify,
copy, and redistribute this contribution, under the terms of your
choice.
RonaldBXu authored Feb 14, 2025
1 parent a68a032 commit ea6dff4
Showing 3 changed files with 15 additions and 11 deletions.
18 changes: 11 additions & 7 deletions examples/sagemaker-pipelines-graphbolt/README.md
@@ -148,6 +148,8 @@ This will create the tabular graph data on S3 which you can verify by running

```bash
aws s3 ls s3://$BUCKET_NAME/ogb-arxiv-input/
+```
+```
PRE edges/
PRE nodes/
PRE splits/
@@ -254,11 +256,11 @@ docker -v

cd ~/graphstorm

-bash ./docker/build_graphstorm_image.sh --environment sagemaker --device cpu
+bash ./docker/build_graphstorm_image.sh --environment sagemaker --device cpu --image graphstorm-example-sagemaker-pipeline

-bash docker/push_graphstorm_image.sh -e sagemaker -r $REGION -a $ACCOUNT_ID -d cpu
+bash docker/push_graphstorm_image.sh -e sagemaker -r $REGION -a $ACCOUNT_ID -d cpu -i graphstorm-example-sagemaker-pipeline
# This will push an image to
-# ${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/graphstorm:sagemaker-cpu
+# ${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/graphstorm-example-sagemaker-pipeline:sagemaker-cpu
```

Next, you will create a SageMaker Pipeline to run the jobs that are necessary to train GNN models with GraphStorm.
@@ -274,7 +276,7 @@ In this section, you will create a [Sagemaker Pipeline](https://docs.aws.amazon.
```bash
PIPELINE_NAME="ogbn-arxiv-gs-pipeline"

-bash deploy_papers100M_pipeline.sh \
+bash deploy_arxiv_pipeline.sh \
--account $ACCOUNT_ID \
--bucket-name $BUCKET_NAME --role $SAGEMAKER_EXECUTION_ROLE_ARN \
--pipeline-name $PIPELINE_NAME \
@@ -321,7 +323,7 @@ Every pipeline execution that shares the same input arguments will be under a ra
Note that the particular execution subpath might be different in your case.

```bash
-aws s3 ls s3://$BUCKET_NAME/pipelines-output/ogbn-arxiv-gs-pipeline/
+aws s3 ls s3://$BUCKET_NAME/pipelines-output/ogbn-arxiv-gs-pipeline/

# 761a4ff194198d49469a3bb223d5f26e

@@ -439,9 +441,11 @@ For this job you will use large GPU instances, so you will build and push the GP
```bash
cd ~/graphstorm

-bash ./docker/build_graphstorm_image.sh --environment sagemaker --device gpu
+bash ./docker/build_graphstorm_image.sh --environment sagemaker --device gpu --image graphstorm-example-sagemaker-pipeline

-bash docker/push_graphstorm_image.sh -e sagemaker -r $REGION -a $ACCOUNT_ID -d gpu
+bash docker/push_graphstorm_image.sh -e sagemaker -r $REGION -a $ACCOUNT_ID -d gpu -i graphstorm-example-sagemaker-pipeline
+# This will push an image to
+# ${ACCOUNT_ID}.dkr.ecr.us-east-1.amazonaws.com/graphstorm-example-sagemaker-pipeline:sagemaker-gpu
```

### Deploy and execute pipelines for papers-100M
@@ -88,8 +88,8 @@ TASK_TYPE="node_classification"
INFERENCE_MODEL_SNAPSHOT="epoch-9"

JOBS_TO_RUN="gconstruct train inference"
-GSF_CPU_IMAGE_URI=${ACCOUNT}.dkr.ecr.$REGION.amazonaws.com/graphstorm:sagemaker-cpu
-GSF_GPU_IMAGE_URI=${ACCOUNT}.dkr.ecr.$REGION.amazonaws.com/graphstorm:sagemaker-gpu
+GSF_CPU_IMAGE_URI=${ACCOUNT}.dkr.ecr.$REGION.amazonaws.com/graphstorm-example-sagemaker-pipeline:sagemaker-cpu
+GSF_GPU_IMAGE_URI=${ACCOUNT}.dkr.ecr.$REGION.amazonaws.com/graphstorm-example-sagemaker-pipeline:sagemaker-gpu
VOLUME_SIZE=50

if [[ -z "${PIPELINE_NAME-}" ]]; then
@@ -88,8 +88,8 @@ TRAIN_GPU_INSTANCE="ml.g5.48xlarge"
GCONSTRUCT_INSTANCE="ml.r5.24xlarge"
NUM_TRAINERS=8

-GSF_CPU_IMAGE_URI=${ACCOUNT}.dkr.ecr.$REGION.amazonaws.com/graphstorm:sagemaker-cpu
-GSF_GPU_IMAGE_URI=${ACCOUNT}.dkr.ecr.$REGION.amazonaws.com/graphstorm:sagemaker-gpu
+GSF_CPU_IMAGE_URI=${ACCOUNT}.dkr.ecr.$REGION.amazonaws.com/graphstorm-example-sagemaker-pipeline:sagemaker-cpu
+GSF_GPU_IMAGE_URI=${ACCOUNT}.dkr.ecr.$REGION.amazonaws.com/graphstorm-example-sagemaker-pipeline:sagemaker-gpu

GCONSTRUCT_CONFIG="gconstruct_config_papers100m.json"
GRAPH_CONSTRUCTION_ARGS="--num-processes 16"
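
The image URIs changed in this commit all follow the pattern `<account>.dkr.ecr.<region>.amazonaws.com/<image>:<tag>`, with the image name now `graphstorm-example-sagemaker-pipeline` instead of `graphstorm`. A minimal sketch of composing them in one place is shown here; the helper `ecr_image_uri`, the `IMAGE_NAME` variable, and the placeholder account ID are illustrative assumptions and are not part of the repository's scripts.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not in the GraphStorm repo): build an ECR image URI
# from an account ID, a region, an image (repository) name, and a tag.
ecr_image_uri() {
    local account="$1" region="$2" image="$3" tag="$4"
    printf '%s.dkr.ecr.%s.amazonaws.com/%s:%s\n' "$account" "$region" "$image" "$tag"
}

# Image name introduced by this commit; the account ID below is a placeholder.
IMAGE_NAME="graphstorm-example-sagemaker-pipeline"
ACCOUNT="${ACCOUNT:-123456789012}"
REGION="${REGION:-us-east-1}"

GSF_CPU_IMAGE_URI="$(ecr_image_uri "$ACCOUNT" "$REGION" "$IMAGE_NAME" sagemaker-cpu)"
GSF_GPU_IMAGE_URI="$(ecr_image_uri "$ACCOUNT" "$REGION" "$IMAGE_NAME" sagemaker-gpu)"

echo "$GSF_CPU_IMAGE_URI"
echo "$GSF_GPU_IMAGE_URI"
```

Keeping the image name in a single variable would mean a future rename touches one line per script rather than two URI definitions.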