- Introduction
- Getting Started
- Acknowledging Our Work
- Getting in Touch
This repository contains the source code for LightRW, a project that accelerates graph dynamic random walks using AMD FPGAs. LightRW uses the proposed efficient parallel weighted reservoir sampling algorithm and its hardware implementation to speed up random walk algorithms whose transition probabilities vary at run time. For more details, please refer to our paper.
LightRW supports MetaPath and Node2Vec random walk algorithms, and the implementations are based on AMD Alveo U250 FPGAs. This documentation provides the fundamental instructions to compile and execute LightRW.
To ensure the smooth operation of our code, we have outlined the necessary system requirements and dependencies below. Please ensure your environment aligns with the following specifications:
Dependency | Description |
---|---|
OS | Ubuntu 20.04 |
Linux Kernel | 5.4.x |
FPGA Platform | AMD Alveo U250 |
FPGA Development | Vitis Core Development Kit 2020.2 |
FPGA Shell | xilinx_u250_gen3x16_xdma_3_1_202020_1 (modified) |
FPGA Runtime | XRT 2.14.354 |
NOTE: The FPGA shell has been modified to suit the specific needs of our project. Follow the instructions below to make sure you have the correct version.
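Before installing anything, it can help to compare the host against the OS and kernel rows of the table above. The following is a minimal sketch; kernel_is_supported and os_is_supported are hypothetical helper names that are not part of LightRW, and the os-release check assumes the standard Ubuntu ID and VERSION_ID fields.

```shell
# Sketch only: these helpers are illustrative and not shipped with LightRW.

# Returns 0 if the given kernel release string belongs to the 5.4.x series.
kernel_is_supported() {
    case "$1" in
        5.4.*) return 0 ;;
        *)     return 1 ;;
    esac
}

# Returns 0 if the given os-release file describes Ubuntu 20.04.
os_is_supported() {
    grep -q '^ID=ubuntu$' "$1" && grep -q '^VERSION_ID="20.04"$' "$1"
}

# Example usage on the build host:
#   kernel_is_supported "$(uname -r)" || echo "kernel $(uname -r) is not 5.4.x"
#   os_is_supported /etc/os-release   || echo "OS is not Ubuntu 20.04"
```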
The official U250 shell limits the number of FPGA kernels that can be implemented. To accommodate our design, we have modified the official U250 shell. The following instructions will guide you through setting up a development environment that can reproduce our implementations.
We recommend using the HACC@NUS cluster, as it already provides the required environment.
- Download and install the Vitis Core Development Kit 2020.2 from the AMD official website.
- After installation, execute the following script to install the necessary dependencies:
sudo /path/to/Xilinx/Vitis/2020.2/scripts/installLibs.sh
- Follow this guide to patch the Vivado toolchain to ensure the export IP functionality works correctly.
Download and install the Xilinx Runtime Library (XRT) from the AMD official website using the following commands:
wget -O xrt.deb https://www.xilinx.com/bin/public/openDownload?filename=xrt_202220.2.14.354_20.04-amd64-xrt.deb
sudo apt install ./xrt.deb
Note: Higher versions of the Linux kernel may result in a failed installation of XRT.
AMD/Xilinx currently does not provide a download link for the shell xilinx_u250_gen3x16_xdma_3_1_202020_1. We have shared the related files, which you can install using the following commands:
wget -O shell.tar.gz https://www.dropbox.com/s/rnf7z219xijch3h/xilinx-u250-gen3x16-xdma-all_3.1-3063142.deb_2.tar.gz?dl=0
tar -zxvf shell.tar.gz
wget -O dev.deb https://www.dropbox.com/s/zly45955gexip9j/xilinx-u250-gen3x16-xdma-3.1-202020-1-dev_1-3061241_all.deb?dl=0
sudo apt install ./*.deb
We have also modified the official development shell to support a larger number of kernels that can be deployed on a single FPGA. To use our modified shell:
wget -O hw.xsa https://www.dropbox.com/s/sks6o254lflt7qo/hw.xsa?dl=0
sudo cp /opt/xilinx/platforms/xilinx_u250_gen3x16_xdma_3_1_202020_1/hw/hw.xsa /opt/xilinx/platforms/xilinx_u250_gen3x16_xdma_3_1_202020_1/hw/hw_backup.xsa
sudo cp hw.xsa /opt/xilinx/platforms/xilinx_u250_gen3x16_xdma_3_1_202020_1/hw/hw.xsa
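Running the two cp commands above a second time would overwrite hw_backup.xsa with the already modified file. As a hedged sketch, the swap can be wrapped so that the first backup is preserved; install_modified_xsa is a hypothetical helper name, not part of LightRW, and it needs to be run with sufficient privileges to write to the platform directory.

```shell
# Sketch only: swap in the modified hw.xsa without overwriting an existing backup.
install_modified_xsa() {
    local new_xsa="$1"   # path to the downloaded hw.xsa
    local hw_dir="$2"    # e.g. /opt/xilinx/platforms/xilinx_u250_gen3x16_xdma_3_1_202020_1/hw

    [ -f "$new_xsa" ] || { echo "missing $new_xsa" >&2; return 1; }
    [ -f "$hw_dir/hw.xsa" ] || { echo "no hw.xsa in $hw_dir" >&2; return 1; }

    # Keep the first backup so that a repeated run does not replace the
    # saved official file with the modified one.
    if [ ! -e "$hw_dir/hw_backup.xsa" ]; then
        cp "$hw_dir/hw.xsa" "$hw_dir/hw_backup.xsa" || return 1
    fi
    cp "$new_xsa" "$hw_dir/hw.xsa"
}

# Example usage (as root):
#   install_modified_xsa hw.xsa /opt/xilinx/platforms/xilinx_u250_gen3x16_xdma_3_1_202020_1/hw
```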
LightRW uses the Rule-Based Accelerator Building System (RABS) on top of Vitis to build accelerators. To install the required packages:
sudo apt install graphviz libgraphviz-dev faketime opencl-headers
pip install numpy
pip install graphviz
To clone the repository along with its submodules, use the following command:
git clone --recurse-submodules git@github.com:Xtra-Computing/LightRW.git
The repository is structured as follows:
├── app # Contains the configurations for the accelerator project. Each subfolder corresponds to a hardware accelerator.
│ ├── test # Contains the configurations for unit tests or benchmark projects.
├── host # Contains the CPU code for preparing data and controlling the FPGA accelerator.
├── src # Contains the FPGA source code, grouped in modules.
├── test # Contains the CPU/FPGA test code.
├── lib # Contains libraries of rules to use as a subgroup of kernels.
├── misc # Contains scripts for automating tests.
└── mk # Contains the RABS submodule for building the accelerator.
To set the necessary environment variables for build and execution, run the following commands (the installation path is specific to your own environment):
# Set XRT env
source /opt/xilinx/xrt/setup.sh
# Set Vitis env
source /tools/Xilinx/Vitis/2020.2/settings64.sh
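After sourcing both scripts, a quick check that the expected tools are on PATH can catch a wrong installation path early. This is a minimal sketch; require_tools is a hypothetical helper name, and the tool list in the usage comment (v++ from Vitis, xbutil from XRT, make) is an assumption about what the build needs rather than a list taken from the LightRW build system.

```shell
# Sketch only: report every named tool that cannot be found on PATH.
require_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "not found on PATH: $tool" >&2
            missing=1
        fi
    done
    # Non-zero if at least one tool is missing.
    return "$missing"
}

# Example usage after sourcing the XRT and Vitis setup scripts:
#   require_tools v++ xbutil make || echo "environment is incomplete"
```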
To build LightRW, use the make command as shown below. Replace ${app_name} with the name of the application in the app directory:
make app=${app_name} TARGET=hw all
Here's a quick guide to the arguments:
- app: Specifies the target accelerator to be built.
- TARGET:
  - hw: Builds an accelerator that can run on real FPGA hardware.
  - hw_emu: Builds a project that can run waveform-based simulation.
  - The default value can be found in the corresponding kernel.mk files.
- all: Builds the entire project.
For example, to build a full implementation of the MetaPath random walk accelerator, use:
make app=metapath_x4 TARGET=hw all
To build a full implementation of the Node2Vec random walk accelerator, use:
make app=node2vec_x4 TARGET=hw all
Please note that the build of each accelerator may take around 15 hours, depending on the server's performance.
To build tests and benchmarks (projects in app/test/), replace app=${app_name} with test=${test_name}, where ${test_name} is the name of the test project. For example:
make test=vcache_test TARGET=hw all
The above make command generates the accelerator for the vcache_test unit test of the DAC cache.
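The choice between app= and test= can be sketched as a small helper that looks at where the project folder lives. lightrw_make_cmd is a hypothetical name, not part of LightRW. It assumes it is run from the repository root, that every project is a folder directly under app/ or app/test/, and it defaults TARGET to hw purely to mirror the examples above. It only prints the command so that it can be reviewed before starting a long build.

```shell
# Sketch only: print the make command line for a project name.
lightrw_make_cmd() {
    local name="$1"
    local target="${2:-hw}"   # assumption: default to hw, as in the examples
    if [ -d "app/test/$name" ]; then
        echo "make test=$name TARGET=$target all"
    elif [ -d "app/$name" ]; then
        echo "make app=$name TARGET=$target all"
    else
        echo "unknown project: $name" >&2
        return 1
    fi
}

# Example usage from the repository root:
#   lightrw_make_cmd vcache_test hw_emu
#   eval "$(lightrw_make_cmd metapath_x4)"
```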
A successful build will generate the program and FPGA bitstream. Here is an example of the metapath_x4 build:
├── metapath_x4.app # CPU program.
├── build_dir_metapath_metapath_x4
│ ├── clock.log # Accelerator clock.
│ ├── git
│ ├── kernel.cfg.json.pdf # Accelerator kernel topology.
│ ├── kernel.link.ltx
│ ├── kernel.link.xclbin.info
│ ├── kernel.link.xclbin.link_summary
│ ├── kernel.xclbin # FPGA bitstream.
│ ├── kernel.xclbin.package_summary
│ ├── link # FPGA build log.
│ ├── report # FPGA report.
└── xrt.ini
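Since a build can take many hours, it may be worth checking that the two files needed for execution exist before moving on. The sketch below uses a hypothetical helper, check_build_outputs, which is not part of LightRW. The build directory is matched with a glob because its naming rule is only inferred from the metapath_x4 example above.

```shell
# Sketch only: verify that the CPU program and the bitstream were produced.
check_build_outputs() {
    local app="$1"
    local status=0

    [ -f "$app.app" ] || { echo "missing $app.app" >&2; status=1; }

    # Assumption: the build directory looks like build_dir_<something>_<app>.
    # If the glob matches nothing, $1 stays a literal pattern and the test fails.
    set -- build_dir_*_"$app"/kernel.xclbin
    [ -f "$1" ] || { echo "missing kernel.xclbin for $app" >&2; status=1; }

    return "$status"
}

# Example usage from the directory that holds the build results:
#   check_build_outputs metapath_x4 && echo "build looks complete"
```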
RABS will generate the kernel topology for debugging. You can view the example topologies of single channel MetaPath and Node2Vec here: MetaPath Topo and Node2Vec Topo.
LightRW accepts graphs in the same format as ThunderRW. You can follow the prepare_data.sh script here to prepare the input graphs, or you can download them in the next section.
You can download the formatted graphs here. These can be used directly as the input for our accelerator.
The U250 requires programming a secondary shell:
sudo /opt/xilinx/xrt/bin/xbmgmt partition --program --name xilinx_u250_gen3x16_xdma_shell_3_1 --card ${PCIE_ID}
Here, ${PCIE_ID} is the PCIe BDF ID of the FPGA board.
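One way to look up candidate BDF values is to list the PCI devices with the Xilinx vendor ID (10ee) and keep the first column. This is a sketch under assumptions: list_bdfs is a hypothetical helper name, it only extracts the first field of each line it reads, and deciding which of the listed functions is the right one to pass to xbmgmt is left to the reader.

```shell
# Sketch only: print the BDF (first column) of every lspci line read from stdin.
list_bdfs() {
    awk 'NF > 0 { print $1 }'
}

# Example usage (-D prints the PCI domain, -d 10ee: filters by the Xilinx vendor ID):
#   lspci -D -d 10ee: | list_bdfs
```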
To execute the program, use the following command (metapath_x4 is used as the example):
./metapath_x4.app -fpga build_dir_metapath_metapath_x4/kernel.xclbin -graph ${path/to/formatted/graph}
The arguments are as follows:
- fpga: The kernel.xclbin file used to configure the FPGAs.
- graph: The path that stores the graph dataset.
The output will look something like this:
...
[END] fpga init: 0.010874
[END] pcie memory copy: 1.721860
[END] acc execution: 4.039499
[END] overall thr (M Step/s): 36.721491
This output displays the elapsed time of accelerator initialization, data transfer between CPU and FPGA through PCIE, and accelerator execution. The accelerator execution time is the median value of ten measurements.
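To collect a single number from a run, the "[END]" lines can be filtered by their label. The following is a minimal sketch that assumes the output format shown above; end_metric is a hypothetical helper name, and run.log in the usage comment is an arbitrary file name.

```shell
# Sketch only: print the value on the "[END] <label>:" line of a log read from stdin.
end_metric() {
    # Fields are split on ": ", so $2 is the number that follows the label.
    awk -v label="$1" -F': ' 'index($0, "[END] " label ":") { print $2 }'
}

# Example usage:
#   ./metapath_x4.app -fpga build_dir_metapath_metapath_x4/kernel.xclbin \
#       -graph ${path/to/formatted/graph} | tee run.log
#   end_metric "overall thr (M Step/s)" < run.log
```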
We provide a script in the misc directory that runs the throughput test on all graphs:
./misc/rw_test.sh ${app_name} ${path_to_all_graphs}
Ensure all graphs have been downloaded in ${path_to_all_graphs}. The ${app_name} is the same as the input used during the build process. For example:
./misc/rw_test.sh metapath_x4 /data/graphs/
The script will generate a log directory with a timestamp (e.g., log_2023_08_02_11_40_59), and all execution logs will be stored in this directory.
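After several runs, the most recent log directory can be found by sorting the names, since the zero-padded timestamp sorts chronologically as plain text. This is a sketch under assumptions: latest_log_dir is a hypothetical helper name, the log_YYYY_MM_DD_HH_MM_SS naming is inferred from the example above, and the directory in which the script creates its logs is passed as an argument because it is not stated here.

```shell
# Sketch only: print the newest log_* directory under the given base directory.
latest_log_dir() {
    local base="${1:-.}"
    # Prints nothing if no log_* directory exists under $base.
    ls -d "$base"/log_*/ 2>/dev/null | sort | tail -n 1
}

# Example usage (assumes the logs contain the "[END] overall thr" lines):
#   grep -h "overall thr" "$(latest_log_dir .)"*
```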
We are delighted that you have found our repository beneficial for your research. We kindly request that you acknowledge our work by citing our research paper. Below is the citation in BibTeX format for your convenience:
@article{10.1145/3588944,
author = {Tan, Hongshi and Chen, Xinyu and Chen, Yao and He, Bingsheng and Wong, Weng-Fai},
title = {LightRW: FPGA Accelerated Graph Dynamic Random Walks},
year = {2023},
issue_date = {May 2023},
publisher = {Association for Computing Machinery},
address = {New York, NY, USA},
volume = {1},
number = {1},
url = {https://doi.org/10.1145/3588944},
doi = {10.1145/3588944},
journal = {Proc. ACM Manag. Data},
month = {may},
articleno = {90},
numpages = {27},
}
This repository is a prototype for an accelerator, and we are actively working on extending its support to more FPGA platforms and large graph workloads. We are committed to keeping this repository updated in sync with our development repositories.
We appreciate your interest and encourage you to share your experiences and issues while using this repository. Please feel free to submit issues directly on this platform. For more specific concerns or queries, you can reach out to Hongshi Tan via email at [email protected]. We look forward to hearing from you and improving our work based on your valuable feedback.