Vitis AI Integration

Vitis AI is Xilinx's development stack for hardware-accelerated AI inference on Xilinx platforms, including both edge devices and Alveo cards. It consists of optimized IP, tools, libraries, models, and example designs. It is designed with high efficiency and ease of use in mind, unleashing the full potential of AI acceleration on Xilinx FPGA and ACAP.

The current Vitis AI flow inside TVM enables acceleration of Neural Network model inference on edge and cloud with the Zynq UltraScale+ MPSoC, Alveo and Versal platforms. The identifiers for the supported edge and cloud Deep Learning Processor Units (DPUs) are:

| Target Board | DPU ID | TVM Target ID |
| ------------ | ------ | ------------- |
| ZCU104 | DPUCZDX8G | DPUCZDX8G-zcu104 |
| ZCU102 | DPUCZDX8G | DPUCZDX8G-zcu102 |
| Kria KV260 | DPUCZDX8G | DPUCZDX8G-kv260 |
| VCK190 | DPUCVDX8G | DPUCVDX8G |
| VCK5000 | DPUCVDX8H | DPUCVDX8H |
| U200 | DPUCADF8H | DPUCADF8H |
| U250 | DPUCADF8H | DPUCADF8H |
| U50 | DPUCAHX8H / DPUCAHX8L | DPUCAHX8H-u50 / DPUCAHX8L |
| U280 | DPUCAHX8H / DPUCAHX8L | DPUCAHX8H-u280 / DPUCAHX8L |

For more information about the DPU identifiers, see the following table, which explains the DPU naming convention:

| DPU Name Field | Meaning |
| -------------- | ------- |
| DPU | Deep Learning Processing Unit |
| Application | C: CNN, R: RNN |
| HW Platform | AD: Alveo DDR, AH: Alveo HBM, VD: Versal DDR with AIE & PL, ZD: Zynq DDR |
| Quantization Method | X: DECENT, I: Integer threshold, F: Float threshold, R: RNN |
| Quantization Bitwidth | 4: 4-bit, 8: 8-bit, 16: 16-bit, M: Mixed Precision |
| Design Target | G: General purpose, H: High throughput, L: Low latency, C: Cost optimized |

On this page you will find information on how to set up TVM with Vitis AI on different platforms (Zynq, Alveo, Versal), as well as how to get started with compiling a model (Compiling a Model) and executing it on different platforms (Inference).

System Requirements#

The Vitis AI System Requirements page lists the system requirements for running docker containers as well as for executing on Alveo cards. For edge devices (e.g. Zynq), deploying models requires a host machine for compiling models using the TVM with Vitis AI flow, and an edge device for running the compiled models. The host system requirements are the same as specified in the link above.

Setup instructions#

This section provides the instructions for setting up the TVM with Vitis AI flow for both cloud and edge. TVM with Vitis AI support is provided through a docker container. The provided scripts and Dockerfile compile TVM and Vitis AI into a single image.

  1. Clone TVM repo

    git clone --recursive https://github.com/apache/tvm.git
    cd tvm
  2. Build and start the TVM - Vitis AI docker container.

    ./docker/build.sh demo_vitis_ai bash
    ./docker/bash.sh tvm.demo_vitis_ai
    # Setup inside container
    conda activate vitis-ai-tensorflow
  3. Build TVM inside the container with Vitis AI (inside tvm directory)

    mkdir build
    cp cmake/config.cmake build
    cd build
    echo set\(USE_LLVM ON\) >> config.cmake
    echo set\(USE_VITIS_AI ON\) >> config.cmake
    cmake ..
    make -j$(nproc)
  4. Install TVM

    cd ../python
    pip3 install -e . --user
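As a quick sanity check (an illustrative snippet rather than part of the official setup steps), you can verify inside the container that the Python packages used later in this guide import cleanly:

# Illustrative check, assuming the build and install steps above succeeded.
import pyxir
import tvm
from tvm.relay.op.contrib.vitis_ai import partition_for_vitis_ai

print(tvm.__version__)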

Inside this docker container you can now compile models for both cloud and edge targets. To run on cloud Alveo or Versal VCK5000 cards inside the docker container, please follow the Alveo and Versal VCK5000 setup instructions, respectively. To set up your Zynq or Versal VCK190 evaluation board for inference, please follow the Zynq and Versal VCK190 instructions, respectively.

Alveo Setup#

Check out the following page for setup information: Alveo Setup.

After setup, you can select the right DPU inside the docker container in the following way:

cd /workspace
git clone --branch v1.4 --single-branch --recursive https://github.com/Xilinx/Vitis-AI.git
cd Vitis-AI/setup/alveo
source setup.sh [DPU-IDENTIFIER]

The DPU identifier for this can be found in the second column of the DPU Targets table at the top of this page.

Versal VCK5000 Setup#

Check out the following page for setup information: VCK5000 Setup.

After setup, you can select the right DPU inside the docker container in the following way:

cd /workspace
git clone --branch v1.4 --single-branch --recursive https://github.com/Xilinx/Vitis-AI.git
cd Vitis-AI/setup/vck5000
source setup.sh

Zynq Setup#

For the Zynq target (DPUCZDX8G) the compilation stage will run inside the docker on a host machine. This doesn't require any specific setup except for building the TVM - Vitis AI docker. For executing the model, the Zynq board will first have to be set up; more information on that can be found here.

  1. Download the Petalinux image for your target:
  2. Use Etcher software to burn the image file onto the SD card.

  3. Insert the SD card with the image into the destination board.

  4. Plug in the power and boot the board using the serial port to operate on the system.

  5. Set up the IP information of the board using the serial port. For more details on steps 1 to 5, please refer to Setting Up The Evaluation Board.

  6. Create 4GB of swap space on the board

fallocate -l 4G /swapfile
chmod 600 /swapfile
mkswap /swapfile
swapon /swapfile
echo "/swapfile swap swap defaults 0 0" >> /etc/fstab
  7. Install hdf5 dependency (will take between 30 min and 1 hour to finish)

cd /tmp && \
  wget https://support.hdfgroup.org/ftp/HDF5/releases/hdf5-1.10/hdf5-1.10.7/src/hdf5-1.10.7.tar.gz && \
  tar -zxvf hdf5-1.10.7.tar.gz && \
  cd hdf5-1.10.7 && \
  ./configure --prefix=/usr && \
  make -j$(nproc) && \
  make install && \
  cd /tmp && rm -rf hdf5-1.10.7*
  8. Install Python dependencies

pip3 install Cython==0.29.23 h5py==2.10.0 pillow
  9. Install PyXIR

git clone --recursive --branch rel-v0.3.1 --single-branch https://github.com/Xilinx/pyxir.git
cd pyxir
sudo python3 setup.py install --use_vart_edge_dpu
  10. Build and install TVM with Vitis AI

git clone --recursive https://github.com/apache/tvm
cd tvm
mkdir build
cp cmake/config.cmake build
cd build
echo set\(USE_LLVM OFF\) >> config.cmake
echo set\(USE_VITIS_AI ON\) >> config.cmake
cmake ..
make tvm_runtime -j$(nproc)
cd ../python
pip3 install --no-deps -e .
  11. Check whether the setup was successful in the Python shell:

python3 -c 'import pyxir; import tvm'

Note

You might see a warning about the 'cpu-tf' runtime not being found. This warning is expected on the board and can be ignored.

Versal VCK190 Setup#

For the Versal VCK190 setup, please follow the instructions for Zynq Setup, but use the VCK190 image in step 1. The other steps are the same.

Compiling a Model#

The TVM with Vitis AI flow contains two stages: Compilation and Inference. During the compilation stage a user can choose a model to compile for the cloud or edge target devices that are currently supported. Once a model is compiled, the generated files can be used to run the model on the specified target device during the Inference stage. Currently, the TVM with Vitis AI flow supports a selected number of Xilinx data center and edge devices.

In this section we walk through the typical flow for compiling models with Vitis AI inside TVM.

Imports

Make sure to import PyXIR and the DPU target (import pyxir.contrib.target.DPUCADF8H for DPUCADF8H):

# os and tvm.contrib.cc are used further below for the runtime module path and cross compilation
import os

import pyxir
import pyxir.contrib.target.DPUCADF8H

import tvm
import tvm.relay as relay
from tvm import contrib
import tvm.contrib.cc
from tvm.contrib.target import vitis_ai
from tvm.contrib import utils, graph_executor
from tvm.relay.op.contrib.vitis_ai import partition_for_vitis_ai
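When compiling for an edge DPU such as the DPUCZDX8G, the corresponding PyXIR target module is imported instead; this mirrors the DPUCADF8H import above, and the same module is referenced in the Running on Zynq and VCK190 section below:

# For Zynq edge targets, import the DPUCZDX8G target instead of DPUCADF8H.
import pyxir.contrib.target.DPUCZDX8G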

Declare the Target

tvm_target = 'llvm'
dpu_target = 'DPUCADF8H' # options: 'DPUCADF8H', 'DPUCAHX8H-u50', 'DPUCAHX8H-u280', 'DPUCAHX8L', 'DPUCVDX8H', 'DPUCZDX8G-zcu104', 'DPUCZDX8G-zcu102', 'DPUCZDX8G-kv260'

The TVM with Vitis AI flow currently supports the DPU targets listed in the table at the top of this page. Once the appropriate targets are defined, we invoke the TVM compiler to build the graph for the specified target.

Import the Model

Example code to import an MXNet model:

mod, params = relay.frontend.from_mxnet(block, input_shape)
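For illustration, block could for example be a pretrained network from the MXNet Gluon model zoo; the resnet18_v1 model, the input name and the input shape below are assumptions made for this sketch and are not part of the original flow:

# Hypothetical example: fetch a pretrained ResNet-18 from the MXNet Gluon model zoo
# (assumes the mxnet package is available in the container). These values feed the
# relay.frontend.from_mxnet call shown above.
from mxnet.gluon.model_zoo import vision

block = vision.resnet18_v1(pretrained=True)
input_name = "data"
input_shape = {input_name: (1, 3, 224, 224)}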

Partition the Model

After importing the model, we utilize the Relay API to annotate the Relay expression for the provided DPU target and partition the graph.

mod = partition_for_vitis_ai(mod, params, dpu=dpu_target)
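To see what will be offloaded, you can print the partitioned module. This is just a quick inspection aid, assuming the standard BYOC convention that offloaded subgraphs become separate Relay functions marked with a Compiler="vitis_ai" attribute:

# Optional: inspect the partitioned module. Functions carrying the
# Compiler="vitis_ai" attribute will be handled by the Vitis AI codegen at build time.
print(mod)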

Build the Model

The partitioned model is passed to the TVM compiler to generate the runtime libraries for the TVM Runtime.

export_rt_mod_file = os.path.join(os.getcwd(), 'vitis_ai.rtmod')
build_options = {
    'dpu': dpu_target,
    'export_runtime_module': export_rt_mod_file
}
with tvm.transform.PassContext(opt_level=3, config={'relay.ext.vitis_ai.options': build_options}):
    lib = relay.build(mod, tvm_target, params=params)

Quantize the Model

Usually, to be able to accelerate inference of Neural Network models with Vitis AI DPU accelerators, those models need to be quantized upfront. In the TVM - Vitis AI flow, we make use of on-the-fly quantization to remove this additional preprocessing step. In this flow, one doesn't need to quantize the model upfront but can make use of the typical inference execution calls (module.run) to quantize the model on-the-fly using the first N inputs that are provided (see more information below). This will set up and calibrate the Vitis AI DPU, and from that point onwards inference will be accelerated for all subsequent inputs. Note that the edge flow deviates slightly from the explained flow in that inference won't be accelerated after the first N inputs; instead, the model will have been quantized and compiled and can be moved to the edge device for deployment. Please check out the Running on Zynq and VCK190 section below for more information.

module = graph_executor.GraphModule(lib["default"](tvm.cpu()))

# First N (default = 128) inputs are used for quantization calibration and will
# be executed on the CPU
# This config can be changed by setting the 'PX_QUANT_SIZE' (e.g. export PX_QUANT_SIZE=64)
for i in range(128):
    module.set_input(input_name, inputs[i])
    module.run()

By default, the number of images used for quantization is set to 128. You can change the number of images used for on-the-fly quantization with the PX_QUANT_SIZE environment variable. For example, execute the following line in the terminal before calling the compilation script to reduce the quantization calibration dataset to eight images. This can be used for quick testing.

export PX_QUANT_SIZE=8
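If you prefer to keep everything in the compilation script, the same setting can be applied from Python. This is a sketch and assumes the variable takes effect as long as it is set before the first module.run() call triggers on-the-fly quantization:

# Alternative to the shell export above (assumption: PX_QUANT_SIZE is read when
# on-the-fly quantization starts, so set it before the calibration loop).
import os
os.environ["PX_QUANT_SIZE"] = "8"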

Lastly, we store the compiled output from the TVM compiler on disk for running the model on the target device. This happens as follows for cloud DPUs (Alveo, VCK5000):

lib_path = "deploy_lib.so"
lib.export_library(lib_path)

For edge targets (Zynq, VCK190) we have to rebuild for aarch64. To do this we first have to normally export the module to also serialize the Vitis AI runtime module (vitis_ai.rtmod). We will load this runtime module again afterwards to rebuild and export for aarch64.

temp = utils.tempdir()
lib.export_library(temp.relpath("tvm_lib.so"))

# Build and export lib for aarch64 target
tvm_target = tvm.target.arm_cpu('ultra96')
lib_kwargs = {
    'fcompile': contrib.cc.create_shared,
    'cc': "/usr/aarch64-linux-gnu/bin/ld"
}

build_options = {
    'load_runtime_module': export_rt_mod_file
}
with tvm.transform.PassContext(opt_level=3, config={'relay.ext.vitis_ai.options': build_options}):
    lib_edge = relay.build(mod, tvm_target, params=params)

lib_edge.export_library('deploy_lib_edge.so', **lib_kwargs)

This concludes the tutorial to compile a model using TVM with Vitis AI. For instructions on how to run a compiled model please refer to the next section.

Inference#

The TVM with Vitis AI flow contains two stages: Compilation and Inference.During the compilation a user can choose to compile a model for any of thetarget devices that are currently supported. Once a model is compiled, thegenerated files can be used to run the model on a target device during theInference stage.

Check out the Running on Alveo and VCK5000 and Running on Zynq and VCK190 sections for doing inference on cloud accelerator cards and edge boards, respectively.

Running on Alveo and VCK5000#

After having followed the steps in the Compiling a Model section, you can continue running on new inputs inside the docker for accelerated inference:

module.set_input(input_name, inputs[i])
module.run()

Alternatively, you can load the exported runtime module (the deploy_lib.so exported in Compiling a Model):

import pyxir
import tvm
from tvm.contrib import graph_executor

dev = tvm.cpu()

# input_name = ...
# input_data = ...

# load the module into memory
lib = tvm.runtime.load_module("deploy_lib.so")
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input(input_name, input_data)
module.run()
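After module.run() returns, the result can be read back from the graph executor. The output index and the numpy conversion below are illustrative and assume a single-output model; on older TVM versions the conversion method is asnumpy() instead of numpy():

# Fetch the first output as a numpy array (single-output model assumed).
output = module.get_output(0).numpy()
print(output.shape)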

Running on Zynq and VCK190#

Before proceeding, please follow the Zynq or Versal VCK190 setup instructions.

Prior to running a model on the board, you need to compile the model for your target evaluation board and transfer the compiled model onto the board. Please refer to the Compiling a Model section for information on how to compile a model.

Afterwards, you will have to transfer the compiled model (deploy_lib_edge.so) to the evaluation board. Then, on the board you can use the typical "load_module" and "module.run" APIs to execute. For this, please make sure to run the script as root (execute su in terminal to log into root).

Note

Note also that you shouldn't import the PyXIR DPU targets in the run script (import pyxir.contrib.target.DPUCZDX8G).

import pyxir
import tvm
from tvm.contrib import graph_executor

dev = tvm.cpu()

# input_name = ...
# input_data = ...

# load the module into memory
lib = tvm.runtime.load_module("deploy_lib_edge.so")
module = graph_executor.GraphModule(lib["default"](dev))
module.set_input(input_name, input_data)
module.run()