Vitis AI Integration

Vitis AI is Xilinx's development stack for hardware-accelerated AI inference on Xilinx platforms, including both edge devices and Alveo cards. It consists of optimized IP, tools, libraries, models, and example designs. It is designed with high efficiency and ease of use in mind, unleashing the full potential of AI acceleration on Xilinx FPGA and ACAP.

The current Vitis AI flow inside TVM enables acceleration of Neural Network model inference on edge and cloud with the Zynq UltraScale+ MPSoC, Alveo and Versal platforms. The identifiers for the supported edge and cloud Deep Learning Processor Units (DPUs) are:

| Target Board | DPU ID | TVM Target ID |
| ------------ | ------ | ------------- |
| ZCU104 | DPUCZDX8G | DPUCZDX8G-zcu104 |
| ZCU102 | DPUCZDX8G | DPUCZDX8G-zcu102 |
| Kria KV260 | DPUCZDX8G | DPUCZDX8G-kv260 |
| VCK190 | DPUCVDX8G | DPUCVDX8G |
| VCK5000 | DPUCVDX8H | DPUCVDX8H |
| U200 | DPUCADF8H | DPUCADF8H |
| U250 | DPUCADF8H | DPUCADF8H |
| U50 | DPUCAHX8H / DPUCAHX8L | DPUCAHX8H-u50 / DPUCAHX8L |
| U280 | DPUCAHX8H / DPUCAHX8L | DPUCAHX8H-u280 / DPUCAHX8L |

For more information about the DPU identifiers, see the following table, which explains the DPU naming convention:

| DPU Name Field | Meaning |
| -------------- | ------- |
| DPU | Deep Learning Processing Unit |
| Application | C: CNN, R: RNN |
| HW Platform | AD: Alveo DDR, AH: Alveo HBM, VD: Versal DDR with AIE & PL, ZD: Zynq DDR |
| Quantization Method | X: DECENT, I: Integer threshold, F: Float threshold, R: RNN |
| Quantization Bitwidth | 4: 4-bit, 8: 8-bit, 16: 16-bit, M: Mixed Precision |
| Design Target | G: General purpose, H: High throughput, L: Low latency, C: Cost optimized |

On this page you will find information on how to set up TVM with Vitis AI on different platforms (Zynq, Alveo, Versal), as well as how to get started with compiling a model (Compiling a Model) and executing it on different platforms (Inference).

System Requirements#

The Vitis AI System Requirements page lists the system requirements for running docker containers as well as for executing on Alveo cards. For edge devices (e.g. Zynq), deploying models requires a host machine for compiling models using the TVM with Vitis AI flow, and an edge device for running the compiled models. The host system requirements are the same as specified in the link above.

Setup instructions#

This section provides the instructions for setting up the TVM with Vitis AI flow for both cloud and edge. TVM with Vitis AI support is provided through a docker container. The provided scripts and Dockerfile compile TVM and Vitis AI into a single image.

  1. Clone TVM repo

    git clone --recursive https://github.com/apache/tvm.git
    cd tvm
  2. Build and start the TVM - Vitis AI docker container.

    ./docker/build.sh demo_vitis_ai bash
    ./docker/bash.sh tvm.demo_vitis_ai
    # Setup inside container
    conda activate vitis-ai-tensorflow
  3. Build TVM inside the container with Vitis AI (inside tvm directory)

    mkdir build
    cp cmake/config.cmake build
    cd build
    echo set\(USE_LLVM ON\) >> config.cmake
    echo set\(USE_VITIS_AI ON\) >> config.cmake
    cmake ..
    make -j$(nproc)
  4. Install TVM

    cd ../python
    pip3 install -e . --user
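As a quick sanity check (an illustrative snippet rather than part of the official setup steps), you can verify inside the container that the Python packages used later in this guide import cleanly:

# Illustrative check, assuming the build and install steps above succeeded.
import pyxir
import tvm
from tvm.relay.op.contrib.vitis_ai import partition_for_vitis_ai

print(tvm.__version__)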

Inside this docker container you can now compile models for both cloud and edge targets. To run on cloud Alveo or Versal VCK5000 cards inside the docker container, please follow the Alveo and Versal VCK5000 setup instructions, respectively. To set up your Zynq or Versal VCK190 evaluation board for inference, please follow the Zynq and Versal VCK190 instructions, respectively.

Alveo Setup#

Check out the following page for setup information: Alveo Setup.

After setup, you can select the right DPU inside the docker container in the following way:

cd /workspace
git clone --branch v1.4 --single-branch --recursive https://github.com/Xilinx/Vitis-AI.git
cd Vitis-AI/setup/alveo
source setup.sh [DPU-IDENTIFIER]

The DPU identifier for this can be found in the second column of the DPU Targets table at the top of this page.

Versal VCK5000 Setup#

Check out the following page for setup information: VCK5000 Setup.

After setup, you can select the right DPU inside the docker container in the following way:

cd /workspace
git clone --branch v1.4 --single-branch --recursive https://github.com/Xilinx/Vitis-AI.git
cd Vitis-AI/setup/vck5000
source setup.sh

Zynq Setup#

For the Zynq target (DPUCZDX8G) the compilation stage will run inside the docker on a host machine. This doesn't require any specific setup except for building the TVM - Vitis AI docker. For executing the model, the Zynq board will first have to be set up; more information on that can be found here.

  1. Download the Petalinux image for your target:
  2. Use Etcher software to burn the image file onto the SD card.

  3. Insert the SD card with the image into the destination board.

  4. Plug in the power and boot the board using the serial port to operate on the system.

  5. Set up the IP information of the board using the serial port. For more details on steps 1 to 5, please refer to Setting Up The Evaluation Board.

  6. Create 4GB of swap space on the board

fallocate -l 4G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo "/swapfile swap swap defaults 0 0" >> /etc/fstab
  7. Install hdf5 dependency (will take between 30 min and 1 hour to finish)

cd /tmp && \
  wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.7/src/hdf5-1.10.7.tar.gz && \
  tar -zxvf hdf5-1.10.7.tar.gz && \
  cd hdf5-1.10.7 && \
  ./configure --prefix=/usr && \
  make -j$(nproc) && \
  make install && \
  cd /tmp && rm -rf hdf5-1.10.7*
  8. Install Python dependencies

pip3 install Cython==0.29.23 h5py==2.10.0 pillow
  9. Install PyXIR

git clone --recursive --branch rel-v0.3.1 --single-branch https://github.com/Xilinx/pyxir.git
cd pyxir
sudo python3 setup.py install --use_vart_edge_dpu
  10. Build and install TVM with Vitis AI

git clone --recursive https://github.com/apache/tvm
cd tvm
mkdir build
cp cmake/config.cmake build
cd build
echo set\(USE_LLVM OFF\) >> config.cmake
echo set\(USE_VITIS_AI ON\) >> config.cmake
cmake ..
make tvm_runtime -j$(nproc)
cd ../python
pip3 install --no-deps -e .
  11. Check whether the setup was successful in the Python shell:

python3 -c 'import pyxir; import tvm'

Note

You might see a warning about the 'cpu-tf' runtime not being found. This warning is expected on the board and can be ignored.

Versal VCK190 Setup#

For the Versal VCK190 setup, please follow the instructions for Zynq Setup, but use the VCK190 image in step 1. The other steps are the same.

Compiling a Model#

The TVM with Vitis AI flow contains two stages: Compilation and Inference. During the compilation stage a user can choose a model to compile for the cloud or edge target devices that are currently supported. Once a model is compiled, the generated files can be used to run the model on the specified target device during the Inference stage. Currently, the TVM with Vitis AI flow supports a selected number of Xilinx data center and edge devices.

In this section we walk through the typical flow for compiling models with Vitis AI inside TVM.

Imports

Make sure to import PyXIR and the DPU target (import pyxir.contrib.target.DPUCADF8H for DPUCADF8H):

# os and tvm.contrib.cc are used further below for the runtime module path and cross compilation
import os

import pyxir
import pyxir.contrib.target.DPUCADF8H

import tvm
import tvm.relay as relay
from tvm import contrib
import tvm.contrib.cc
from tvm.contrib.target import vitis_ai
from tvm.contrib import utils, graph_executor
from tvm.relay.op.contrib.vitis_ai import partition_for_vitis_ai
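When compiling for an edge DPU such as the DPUCZDX8G, the corresponding PyXIR target module is imported instead; this mirrors the DPUCADF8H import above, and the same module is referenced in the Running on Zynq and VCK190 section below:

# For Zynq edge targets, import the DPUCZDX8G target instead of DPUCADF8H.
import pyxir.contrib.target.DPUCZDX8G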

Declare the Target

tvm_target = 'llvm'
dpu_target = 'DPUCADF8H' # options: 'DPUCADF8H', 'DPUCAHX8H-u50', 'DPUCAHX8H-u280', 'DPUCAHX8L', 'DPUCVDX8H', 'DPUCZDX8G-zcu104', 'DPUCZDX8G-zcu102', 'DPUCZDX8G-kv260'

The TVM with Vitis AI flow currently supports the DPU targets listed in the table at the top of this page. Once the appropriate targets are defined, we invoke the TVM compiler to build the graph for the specified target.

Import the Model

Example code to import an MXNet model:

mod, params = relay.frontend.from_mxnet(block, input_shape)
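For illustration, block could for example be a pretrained network from the MXNet Gluon model zoo; the resnet18_v1 model, the input name and the input shape below are assumptions made for this sketch and are not part of the original flow:

# Hypothetical example: fetch a pretrained ResNet-18 from the MXNet Gluon model zoo
# (assumes the mxnet package is available in the container). These values feed the
# relay.frontend.from_mxnet call shown above.
from mxnet.gluon.model_zoo import vision

block = vision.resnet18_v1(pretrained=True)
input_name = "data"
input_shape = {input_name: (1, 3, 224, 224)}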

Partition the Model

After importing the model, we utilize the Relay API to annotate the Relay expression for the provided DPU target and partition the graph.

mod = partition_for_vitis_ai(mod, params, dpu=dpu_target)
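To see what will be offloaded, you can print the partitioned module. This is just a quick inspection aid, assuming the standard BYOC convention that offloaded subgraphs become separate Relay functions marked with a Compiler="vitis_ai" attribute:

# Optional: inspect the partitioned module. Functions carrying the
# Compiler="vitis_ai" attribute will be handled by the Vitis AI codegen at build time.
print(mod)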

Build the Model

The partitioned model is passed to the TVM compiler to generate the runtime libraries for the TVM Runtime.

export_rt_mod_file = os.path.join(os.getcwd(), 'vitis_ai.rtmod')
build_options = {
    'dpu': dpu_target,
    'export_runtime_module': export_rt_mod_file
}
with tvm.transform.PassContext(opt_level=3, config={'relay.ext.vitis_ai.options': build_options}):
    lib = relay.build(mod, tvm_target, params=params)

Quantize the Model

Usually, to be able to accelerate inference of Neural Network models with Vitis AI DPU accelerators, those models need to be quantized upfront. In the TVM - Vitis AI flow, we make use of on-the-fly quantization to remove this additional preprocessing step. In this flow, one doesn't need to quantize the model upfront but can make use of the typical inference execution calls (module.run) to quantize the model on-the-fly using the first N inputs that are provided (see more information below). This will set up and calibrate the Vitis AI DPU, and from that point onwards inference will be accelerated for all subsequent inputs. Note that the edge flow deviates slightly from the explained flow in that inference won't be accelerated after the first N inputs; instead, the model will have been quantized and compiled and can be moved to the edge device for deployment. Please check out the Running on Zynq and VCK190 section below for more information.

module = graph_executor.GraphModule(lib["default"](tvm.cpu()))

# First N (default = 128) inputs are used for quantization calibration and will
# be executed on the CPU
# This config can be changed by setting the 'PX_QUANT_SIZE' (e.g. export PX_QUANT_SIZE=64)
for i in range(128):
    module.set_input(input_name, inputs[i])
    module.run()

By default, the number of images used for quantization is set to 128. You can change the number of images used for on-the-fly quantization with the PX_QUANT_SIZE environment variable. For example, execute the following line in the terminal before calling the compilation script to reduce the quantization calibration dataset to eight images. This can be used for quick testing.

export PX_QUANT_SIZE=8
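If you prefer to keep everything in the compilation script, the same setting can be applied from Python. This is a sketch and assumes the variable takes effect as long as it is set before the first module.run() call triggers on-the-fly quantization:

# Alternative to the shell export above (assumption: PX_QUANT_SIZE is read when
# on-the-fly quantization starts, so set it before the calibration loop).
import os
os.environ["PX_QUANT_SIZE"] = "8"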

Lastly, we store the compiled output from the TVM compiler on disk for running the model on the target device. This happens as follows for cloud DPUs (Alveo, VCK5000):

lib_path = "deploy_lib.so"
lib.export_library(lib_path)

For edge targets (Zynq, VCK190) we have to rebuild for aarch64. To do this we first have to normally export the module to also serialize the Vitis AI runtime module (vitis_ai.rtmod). We will load this runtime module again afterwards to rebuild and export for aarch64.

temp = utils.tempdir()
lib.export_library(temp.relpath("tvm_lib.so"))

# Build and export lib for aarch64 target
tvm_target = tvm.target.arm_cpu('ultra96')
lib_kwargs = {
    'fcompile': contrib.cc.create_shared,
    'cc': "/usr/aarch64-linux-gnu/bin/ld"
}

build_options = {
    'load_runtime_module': export_rt_mod_file
}
with tvm.transform.PassContext(opt_level=3, config={'relay.ext.vitis_ai.options': build_options}):
    lib_edge = relay.build(mod, tvm_target, params=params)

lib_edge.export_library('deploy_lib_edge.so', **lib_kwargs)

This concludes the tutorial to compile a model using TVM with Vitis AI. For instructions on how to run a compiled model please refer to the next section.

Inference#

The TVM with Vitis AI flow contains two stages: Compilation and Inference.During the compilation a user can choose to compile a model for any of thetarget devices that are currently supported. Once a model is compiled, thegenerated files can be used to run the model on a target device during theInference stage.

Check out the Running on Alveo and VCK5000 and Running on Zynq and VCK190 sections for doing inference on cloud accelerator cards and edge boards, respectively.

Running on Alveo and VCK5000#

After having followed the steps in the Compiling a Model section, you can continue running on new inputs inside the docker for accelerated inference:

module.set_input(input_name, inputs[i])
module.run()

Alternatively, you can load the exported runtime module (the deploy_lib.so exported in Compiling a Model):

import pyxir
import tvm
from tvm.contrib import graph_executor

dev = tvm.cpu()

# input_name = ...
# input_data = ...

# load the module into memory
lib = tvm.runtime.load_module("deploy_lib.so")
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input(input_name, input_data)
module.run()
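After module.run() returns, the result can be read back from the graph executor. The output index and the numpy conversion below are illustrative and assume a single-output model; on older TVM versions the conversion method is asnumpy() instead of numpy():

# Fetch the first output as a numpy array (single-output model assumed).
output = module.get_output(0).numpy()
print(output.shape)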

Running on Zynq and VCK190#

Before proceeding, please follow the Zynq or Versal VCK190 setup instructions.

Prior to running a model on the board, you need to compile the model for your target evaluation board and transfer the compiled model onto the board. Please refer to the Compiling a Model section for information on how to compile a model.

Afterwards, you will have to transfer the compiled model (deploy_lib_edge.so) to the evaluation board. Then, on the board you can use the typical "load_module" and "module.run" APIs to execute. For this, please make sure to run the script as root (execute su in terminal to log into root).

Note

Note also that you shouldn't import the PyXIR DPU targets in the run script (import pyxir.contrib.target.DPUCZDX8G).

import pyxir
import tvm
from tvm.contrib import graph_executor

dev = tvm.cpu()

# input_name = ...
# input_data = ...

# load the module into memory
lib = tvm.runtime.load_module("deploy_lib_edge.so")
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input(input_name, input_data)
module.run()