TensorFlow MaskRCNN Training

Description

This document has instructions for running MaskRCNN training with BFloat16 and FP32 precisions using Intel® Extension for TensorFlow on Intel® Data Center GPU Max Series.

Datasets

Download and preprocess the COCO 2017 dataset using the instructions here. After running the conversion script, you should have a directory containing the COCO 2017 dataset in TFRecord format.

Set the DATASET_DIR environment variable to point to this TFRecord directory when running MaskRCNN.
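
Before pointing DATASET_DIR at the converted data, it can help to confirm that the conversion script actually wrote files there. The sketch below is only an illustration: `count_dataset_files` is a hypothetical helper, and it makes no assumption about how the conversion script names its output files, so it counts every regular file in the directory.

```shell
# Hypothetical helper, not part of the container or the conversion script.
# Prints the number of regular files found under the given directory,
# or fails with status 1 when the directory does not exist.
count_dataset_files() {
    local dataset_dir="$1"
    if [ ! -d "${dataset_dir}" ]; then
        echo "Not a directory: ${dataset_dir}" >&2
        return 1
    fi
    # wc may pad its output with spaces on some platforms, so strip them.
    find "${dataset_dir}" -type f | wc -l | tr -d ' '
}
```

For example, `count_dataset_files "${DATASET_DIR}"` printing `0` would suggest the preprocessing step did not complete.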

Quick Start Scripts

Script name	Description
`run_model.sh`	Runs MaskRCNN BF16 and FP32 training on a single tile or on two tiles

Requirements:

Host has Intel® Data Center GPU Max Series
Follow instructions to install GPU-compatible drivers
Docker
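
The requirements above can be checked quickly from a shell on the host. This is a rough sketch under stated assumptions: `have_command` and `has_dri_nodes` are hypothetical helpers, and the check assumes that a correctly installed GPU driver exposes Linux DRM render nodes named `renderD*` under `/dev/dri`. It does not verify the driver version or that the device is a Max Series GPU.

```shell
# Hypothetical helpers, not part of the container or Intel tooling.

# Succeeds when the named command is available on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Succeeds when the directory (default /dev/dri) contains at least one
# render node. Assumes the usual Linux DRM naming, for example renderD128.
has_dri_nodes() {
    local dri_dir="${1:-/dev/dri}"
    local node
    [ -d "${dri_dir}" ] || return 1
    for node in "${dri_dir}"/renderD*; do
        # When the glob matches nothing, the literal pattern fails this test.
        if [ -e "${node}" ]; then
            return 0
        fi
    done
    return 1
}
```

A host check could then read `have_command docker && has_dri_nodes || echo "missing prerequisite" >&2`.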

Docker pull command:

docker pull intel/image-segmentation:tf-max-gpu-maskrcnn-training

The MaskRCNN training container includes the scripts, models and libraries needed to run BFloat16/FP32 training. To run the run_model.sh quickstart script using this container, you must volume mount the preprocessed COCO 2017 dataset into the container. You will also need to provide an output directory to store logs.

# Optional
export BATCH_SIZE=<provide batch size. Default is 4>

# Required
export DATASET_DIR=<path to pre-processed COCO 2017 dataset>
export MULTI_TILE=<specify True for Multi-tile training and False for single-tile training>
export PRECISION=<provide either bfloat16 or fp32 precision>
export OUTPUT_DIR=<path to output directory>

IMAGE_NAME=intel/image-segmentation:tf-max-gpu-maskrcnn-training

if [[ ${MULTI_TILE} == "False" ]]; then
    SCRIPT="mpirun -np 1 -prepend-rank -ppn 1 bash run_model.sh"
else 
    SCRIPT="bash run_model.sh"
fi

DOCKER_ARGS="--rm --init -it"
docker run \
--device /dev/dri \
--env BATCH_SIZE=${BATCH_SIZE} \
--env MULTI_TILE=${MULTI_TILE} \
--env DATASET_DIR=${DATASET_DIR} \
--env PRECISION=${PRECISION} \
--env OUTPUT_DIR=${OUTPUT_DIR} \
--volume /dev/dri:/dev/dri \
--volume ${OUTPUT_DIR}:${OUTPUT_DIR} \
--volume ${DATASET_DIR}:${DATASET_DIR} \
${DOCKER_ARGS} \
${IMAGE_NAME} \
$SCRIPT
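
The commands above pass the exported variables to `docker run` without checking them first. A minimal pre-flight sketch follows; `validate_settings` is a hypothetical helper, not part of the container or `run_model.sh`, and it assumes the accepted values are exactly those listed above (`bfloat16` or `fp32`, `True` or `False`). The branch above compares MULTI_TILE against the literal string `False`, so any other spelling, such as `false`, falls through to the `else` branch.

```shell
# Hypothetical pre-flight check, not part of the container or run_model.sh.
# Arguments: precision, multi_tile, dataset_dir, output_dir
# Returns 0 when every setting looks usable, 1 otherwise.
validate_settings() {
    local precision="$1"
    local multi_tile="$2"
    local dataset_dir="$3"
    local output_dir="$4"
    local dir
    local status=0

    case "${precision}" in
        bfloat16|fp32) ;;
        *) echo "PRECISION must be bfloat16 or fp32, got '${precision}'" >&2
           status=1 ;;
    esac

    case "${multi_tile}" in
        True|False) ;;
        *) echo "MULTI_TILE must be True or False, got '${multi_tile}'" >&2
           status=1 ;;
    esac

    for dir in "${dataset_dir}" "${output_dir}"; do
        case "${dir}" in
            /*) ;;
            *) echo "Path must be absolute for a bind mount: '${dir}'" >&2
               status=1 ;;
        esac
        if [ ! -d "${dir}" ]; then
            echo "Directory does not exist: '${dir}'" >&2
            status=1
        fi
    done

    return "${status}"
}
```

Calling `validate_settings "${PRECISION}" "${MULTI_TILE}" "${DATASET_DIR}" "${OUTPUT_DIR}" || exit 1` before `docker run` would stop early on a bad value. The absolute-path check is there because Docker treats a non-absolute `--volume` source as a named volume rather than a host directory.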

Documentation and Sources

GitHub* Repository

Support

Support for Intel® Extension for TensorFlow* is available at Intel® AI Analytics Toolkit. Additionally, the Intel® Extension for TensorFlow* team tracks both bugs and enhancement requests using GitHub issues. Before submitting a suggestion or bug report, please search the GitHub issues to see if your issue has already been reported.

License Agreement

LEGAL NOTICE: By accessing, downloading or using this software and any required dependent software (the “Software Package”), you agree to the terms and conditions of the software license agreements for the Software Package, which may also include notices, disclaimers, or license terms for third party software included with the Software Package. Please refer to the license file for additional details.
