diff --git a/docker/README.md b/docker/README.md
index e9fd237c6f..3d6e287006 100644
--- a/docker/README.md
+++ b/docker/README.md
@@ -7,74 +7,5 @@ recommend that our users use Docker as the base running environment to use Graph
 For users who want to create their own GraphStorm Docker images because they want to add additional functions, e.g. graph data building, you can use the provided scripts to build your own GraphStorm Docker images.

-## Prerequisites
------------------
-You need to install Docker in your environment as the [Docker documentation](https://docs.docker.com/get-docker/)
-suggests.
-
-For example, in an AWS EC2 instance created with Deep Learning AMI GPU PyTorch 1.13.0, you can run
-the following commands to install Docker.
-```shell
-sudo apt-get update
-sudo apt update
-sudo apt install Docker.io
-```
-
-## Build a Docker image from source
----------------
-
-Once you have the GraphStorm repository cloned, please use the following command to build a Docker image from source:
-```shell
-cd /path-to-graphstorm/docker/
-
-bash /path-to-graphstorm/docker/build_docker_oss4local.sh /path-to-graphstorm/ image-name image-tag device
-```
-
-There are four arguments of the `build_docker_oss4local.sh`:
-
-1. **path-to-graphstorm**(required), is the absolute path of the "graphstorm" folder, where you
-cloned the GraphStorm source code. For example, the path could be "/code/graphstorm".
-2. **docker-name**(optional), is the assigned name of the to be built Docker image. Default is
-"graphstorm".
-3. **docker-tag**(optional), is the assigned tag name of the to be built docker image. Default is
-"local".
-4. **device**(optional), is the intended execution device for the image. Should be one of `cpu` or `gpu`, default is
-`gpu`.
-
-If Docker requires you to run it as a root user and you don't want to preface all docker commands with sudo, you can check the solution available [here](https://docs.docker.com/engine/install/linux-postinstall/#manage-docker-as-a-non-root-user).
-
-You can use the below command to check if the new image exists.
-```shell
-docker image ls
-```
-If the build succeeds, there should be a new Docker image, named `<docker-name>:<docker-tag>-<device>`, e.g., "graphstorm:local-gpu".
-
-To push the image to ECR you can use the `push_gsf_container.sh` script.
-It takes 4 positional arguments, `image-name` `image-tag-device`, `region`, and `account`.
-For example to push the local GPU image to the us-west-2 on AWS account `1234567890` use:
-
-```bash
-bash docker/push_gsf_container.sh graphstorm local-gpu us-west-2 1234567890
-```
-
-## Using a customer DGL codebase
----------------
-To use a local DGL codebase, you'll need to modify the build script and Dockerfile.local.
-
-
-You can add the following to the build_docker_oss4local.sh:
-
-```bash
-mkdir -p code/dgl
-rsync -qr "${GSF_HOME}/../dgl/" code/dgl/ --exclude .venv --exclude dist --exclude ".*/" \
-    --exclude "*__pycache__" --exclude "third_party"
-```
-
-and in `local/Dockerfile.local` replace the line `RUN cd /root; git clone --branch v${DGL_VERSION} https://github.com/dmlc/dgl.git`
-with the following lines:
-
-```Dockerfile
-COPY code/dgl /root/dgl
-ENV PYTHONPATH="/root/dgl/python/:${PYTHONPATH}"
-ENV LD_LIBRARY_PATH="/opt/gs-venv/lib/python3.9/site-packages/dgl/:$LD_LIBRARY_PATH"
-```
\ No newline at end of file
+For instructions refer to the
+[GraphStorm documentation](https://graphstorm.readthedocs.io/en/latest/install/env-setup.html#setup-graphstorm-docker-environment)
diff --git a/docker/build_docker_oss4local.sh b/docker/build_docker_oss4local.sh
index 15f848f6e5..469c18fed6 100644
--- a/docker/build_docker_oss4local.sh
+++ b/docker/build_docker_oss4local.sh
@@ -33,6 +33,13 @@ else
     DEVICE_TYPE="$4"
 fi

+# process argument 5: support for parmetis
+if [ -z "$5" ]; then
+    USE_PARMETIS="false"
+else
+    USE_PARMETIS="$5"
+fi
+
 # Copy scripts and tools codes to the docker folder
 mkdir -p $GSF_HOME"/docker/code"
 cp $SCRIPT_DIR"/local/fetch_and_run.sh" $GSF_HOME"/docker/code/"
@@ -42,7 +49,6 @@ cp -r $GSF_HOME"/inference_scripts" $GSF_HOME"/docker/code/inference_scripts"
 cp -r $GSF_HOME"/tools" $GSF_HOME"/docker/code/tools"
 cp -r $GSF_HOME"/training_scripts" $GSF_HOME"/docker/code/training_scripts"

-
 # Build OSS docker for EC2 instances that an pull ECR docker images
 DOCKER_FULLNAME="${IMAGE_NAME}:${TAG}-${DEVICE_TYPE}"

@@ -55,7 +61,7 @@ elif [[ $DEVICE_TYPE = "cpu" ]]; then
     docker login --username AWS --password-stdin public.ecr.aws
     SOURCE_IMAGE="public.ecr.aws/ubuntu/ubuntu:22.04_stable"
 else
-    echo >&2 -e "Image type can only be \"gpu\" or \"cpu\", but got \""$DEVICE_TYPE"\""
+    echo >&2 -e "Image type can only be \"gpu\" or \"cpu\", but got '$DEVICE_TYPE'"
     # remove the temporary code folder
     rm -rf code
     exit 1
@@ -65,6 +71,7 @@ fi
 DOCKER_BUILDKIT=1 docker build \
     --build-arg DEVICE=$DEVICE_TYPE \
     --build-arg SOURCE=${SOURCE_IMAGE} \
+    --build-arg USE_PARMETIS=${USE_PARMETIS} \
     -f "${GSF_HOME}/docker/local/Dockerfile.local" . -t $DOCKER_FULLNAME

 # remove the temporary code folder
diff --git a/docker/build_graphstorm_image.sh b/docker/build_graphstorm_image.sh
index fc94933453..cb6753e537 100644
--- a/docker/build_graphstorm_image.sh
+++ b/docker/build_graphstorm_image.sh
@@ -19,8 +19,9 @@ Available options:
 -d, --device        Device type, must be one of 'cpu' or 'gpu'. Default is 'gpu'.
 -p, --path          Path to graphstorm root directory, default is one level above this script's location.
 -i, --image         Docker image name, default is 'graphstorm'.
--s, --suffix        Suffix for the image tag, can be used to push custom image tags. Default is "<environment>-<device>".
+-s, --suffix        Suffix for the image tag, can be used to push custom image tags. Default tag is "<environment>-<device>".
 -b, --build         Docker build directory prefix, default is '/tmp/graphstorm-build/docker'.
+--use-parmetis      When this flag is set we add the ParMETIS dependencies to the local image. ParMETIS partitioning is not available on SageMaker.

 Example:

@@ -49,6 +50,7 @@ parse_params() {
   IMAGE_NAME='graphstorm'
   BUILD_DIR='/tmp/graphstorm-build/docker'
   SUFFIX=""
+  USE_PARMETIS=false

   while :; do
     case "${1-}" in
@@ -78,6 +80,9 @@ parse_params() {
       SUFFIX="${2-}"
       shift
       ;;
+    --use-parmetis)
+      USE_PARMETIS=true
+      ;;
     -?*) die "Unknown option: $1" ;;
     *) break ;;
     esac
@@ -113,6 +118,7 @@ msg "- DEVICE_TYPE: ${DEVICE_TYPE}"
 msg "- GSF_HOME: ${GSF_HOME}"
 msg "- IMAGE_NAME: ${IMAGE_NAME}"
 msg "- SUFFIX: ${SUFFIX}"
+msg "- USE_PARMETIS: ${USE_PARMETIS}"

 # Prepare Docker build directory
 if [[ -d ${BUILD_DIR} ]]; then
@@ -121,13 +127,15 @@ fi
 mkdir -p "${BUILD_DIR}"

 # Authenticate to ECR to be able to pull source SageMaker or public.ecr.aws image
-msg "Authenticating to public ECR registry"
 if [[ ${EXEC_ENV} == "sagemaker" ]]; then
-    # Pulling SageMaker image, login to public SageMaker ECR registry
+    if [[ ${USE_PARMETIS} == true ]]; then
+        die "ParMETIS partitioning is not supported for SageMaker execution environment"
+    fi
+    msg "Authenticating to public SageMaker ECR registry"
     aws ecr get-login-password --region us-east-1 | docker login --username AWS --password-stdin 763104351884.dkr.ecr.us-east-1.amazonaws.com
 else
-    # Pulling local image, login to Amazon ECR Public Gallery
+    msg "Authenticating to Amazon ECR Public Gallery"
     aws ecr-public get-login-password --region us-east-1 | docker login --username AWS --password-stdin public.ecr.aws
 fi

@@ -179,4 +187,5 @@ echo "Building Docker image: ${DOCKER_FULLNAME}"
 DOCKER_BUILDKIT=1 docker build \
     --build-arg DEVICE="$DEVICE_TYPE" \
     --build-arg SOURCE="${SOURCE_IMAGE}" \
+    --build-arg USE_PARMETIS="${USE_PARMETIS}" \
     -f "$DOCKERFILE" "${BUILD_DIR}" -t "$DOCKER_FULLNAME"
diff --git a/docker/local/Dockerfile.local b/docker/local/Dockerfile.local
index ad2c8a4bd1..533d862740 100644
--- a/docker/local/Dockerfile.local
+++ b/docker/local/Dockerfile.local
@@ -1,4 +1,5 @@
 ARG DEVICE=gpu
+ARG USE_PARMETIS=false
 ARG SOURCE

 FROM ${SOURCE} as base
@@ -46,9 +47,17 @@ RUN pip install \
 ARG DGL_VERSION=2.3.0
 ARG DGL_CUDA_VERSION=121
 ARG OGB_VERSION=1.3.6
-ARG TORCH_VERSION=2.3
+ARG TORCH_VERSION=2.3.0
 ARG TRANSFORMERS_VERSION=4.28.1

+# Download dgl files
+RUN cd /root && \
+    git clone --branch v${DGL_VERSION} --single-branch https://github.com/dmlc/dgl.git && \
+    rm -rf /root/dgl/.git
+ENV DGL_HOME=/root/dgl
+ENV DGLBACKEND=pytorch
+ENV PYTHONPATH="/root/dgl/tools/:${PYTHONPATH}"
+
 FROM base as base-cpu

 # Install torch, DGL, and GSF deps that require torch
@@ -78,18 +87,53 @@ RUN TORCH_MAJOR_MINOR=$(echo $TORCH_VERSION | cut -c1-3) && \
     transformers==${TRANSFORMERS_VERSION} \
     && rm -rf /root/.cache

-FROM base-${DEVICE} as runtime
+FROM base-${DEVICE} as parmetis-true

-ENV PYTHONPATH="/root/dgl/tools/:${PYTHONPATH}"
+# Install MPI and dependencies
+RUN apt update && apt install -y --no-install-recommends \
+    build-essential \
+    cmake \
+    libopenmpi-dev \
+    openmpi-bin \
+    && rm -rf /var/lib/apt/lists/*

-# Download DGL source code
-RUN cd /root; git clone --branch v${DGL_VERSION} https://github.com/dmlc/dgl.git
+RUN pip install \
+    pyyaml \
+    && rm -rf /root/.cache

-# Copy GraphStorm source and add to PYTHONPATH
-RUN mkdir -p /graphstorm
-COPY code/python/graphstorm /graphstorm/python/graphstorm
-ENV PYTHONPATH="/graphstorm/python/:${PYTHONPATH}"
+# Install GKLib
+RUN cd /root && \
+    git clone --single-branch --branch master https://github.com/KarypisLab/GKlib && \
+    cd GKlib && \
+    make && \
+    make install && \
+    rm -rf .git

+# Install Metis
+RUN cd /root && \
+    git clone --single-branch --branch master https://github.com/KarypisLab/METIS.git && \
+    cd METIS && \
+    make config shared=1 cc=gcc prefix=/root/local i64=1 && \
+    make install && \
+    rm -rf .git
+
+# Install Parmetis
+RUN cd /root && \
+    git clone --single-branch --branch main https://github.com/KarypisLab/PM4GNN.git && \
+    cd PM4GNN && \
+    make config cc=mpicc prefix=/root/local && \
+    make install && \
+    rm -rf .git
+
+ENV PATH=$PATH:/root/local/bin
+ENV LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/root/local/lib/
+RUN cp /root/local/bin/pm_dglpart /root/local/bin/pm_dglpart3
+
+FROM base-${DEVICE} as parmetis-false
+
+# No additional dependencies when not supporting ParMETIS
+
+FROM parmetis-${USE_PARMETIS} as runtime

 # Set up SSH access
 ENV SSH_PORT=2222
@@ -104,6 +148,12 @@ RUN mkdir -p ${SSHDIR} \

 EXPOSE ${SSH_PORT}

+# Copy GraphStorm source and add to PYTHONPATH
+RUN mkdir -p /graphstorm
+COPY code/python/graphstorm /graphstorm/python/graphstorm
+ENV PYTHONPATH="/graphstorm/python/:${PYTHONPATH}"
+
+
 # Copy GraphStorm scripts and tools
 COPY code/examples /graphstorm/examples
 COPY code/inference_scripts /graphstorm/inference_scripts
diff --git a/docs/source/install/env-setup.rst b/docs/source/install/env-setup.rst
index a10c4d216f..4d98ff91e3 100644
--- a/docs/source/install/env-setup.rst
+++ b/docs/source/install/env-setup.rst
@@ -176,6 +176,7 @@ tag and other aspects of the build. We list the full argument list below:
 * ``-i, --image`` Docker image name, default is 'graphstorm'.
 * ``-s, --suffix`` Suffix for the image tag, can be used to push custom image tags. Default is "<environment>-<device>".
 * ``-b, --build`` Docker build directory prefix, default is '/tmp/graphstorm-build/docker'.
+* ``--use-parmetis`` When this flag is set we add the ParMETIS dependencies to the local image. ParMETIS partitioning is not available on SageMaker.

 For example you can build an image to support CPU-only execution using:

@@ -184,6 +185,13 @@ For example you can build an image to support CPU-only execution using:
     bash docker/build_graphstorm_image.sh --environment local --device cpu
     # Will build an image named 'graphstorm:local-cpu'

+Or to build and tag an image to run ParMETIS with EC2 instances:
+
+.. code-block:: bash
+
+    bash docker/build_graphstorm_image.sh --environment local --device cpu --use-parmetis --suffix "-parmetis"
+    # Will build an image named 'graphstorm:local-cpu-parmetis'
+
 See ``bash docker/build_graphstorm_image.sh --help`` for more information.


@@ -211,12 +219,14 @@ In addition to ``-e/--environment``, the script supports several optional argume
 * ``-s, --suffix`` Suffix for the image tag, can be used to push custom image tags. Default is "<environment>-<device>".

-Example:
+Examples:

 .. code-block:: bash

-    bash docker/push_graphstorm_image.sh -e local -r "us-east-1" -a "123456789012"
-    # Will push an image to '123456789012.dkr.ecr.us-east-1.amazonaws.com/graphstorm:local-gpu'
+    # Push an image to '123456789012.dkr.ecr.us-east-1.amazonaws.com/graphstorm:local-cpu'
+    bash docker/push_graphstorm_image.sh -e local -r "us-east-1" -a "123456789012" --device cpu
+    # Push a ParMETIS-capable image to '123456789012.dkr.ecr.us-east-1.amazonaws.com/graphstorm:local-cpu-parmetis'
+    bash docker/push_graphstorm_image.sh -e local -r "us-east-1" -a "123456789012" --device cpu --suffix "-parmetis"

 Create a GraphStorm Container