DLICV is a Python library built on PyTorch for deep learning inference in computer vision tasks. It provides a unified inference interface across different hardware platforms and inference backends, hiding backend-specific details such as resource allocation and release or data transfer. DLICV abstracts the deep learning inference process of common computer vision tasks into four stages: data preprocessing, backend model inference, prediction postprocessing, and result visualization. It wraps this workflow in base predictors that deliver end-to-end inference, so you no longer need to write tedious inference scripts over and over. These features let DLICV provide a consistent and convenient deep learning inference experience for different tasks on different platforms.
The supported device / inference backend matrix is shown below:
| Device / Inference Backend | ONNX Runtime | TensorRT | OpenVINO | ncnn | CANN | CoreML |
| -------------------------- | ------------ | -------- | -------- | ---- | ---- | ------ |
| X86_64 CPU                 | ✅           |          | ✅       |      |      |        |
| ARM CPU                    | ✅           |          |          | ✅   |      |        |
| RISC-V                     |              |          |          | ✅   |      |        |
| NVIDIA GPU                 | ✅           | ✅       |          |      |      |        |
| NVIDIA Jetson              |              | ✅       |          |      |      |        |
| Huawei Ascend              |              |          |          |      | ✅   |        |
| Apple M1                   |              |          |          | ✅   |      | ✅     |
The `BasePredictor` implemented in DLICV provides an end-to-end inference experience. It decomposes the deep learning inference process of common computer vision tasks into four core stages: data preprocessing, backend model inference, prediction postprocessing, and result visualization. By integrating these four stages into a single base predictor, DLICV saves developers from repeatedly writing complex and tedious inference scripts, improving development efficiency.
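The four-stage flow described above can be sketched as a small class. This is purely illustrative control flow, not the actual `BasePredictor` API:

```python
from typing import Any, Callable, Sequence

class SimplePredictor:
    """Illustrative predictor: preprocess -> model -> postprocess.

    A hypothetical sketch of the control flow only; dlicv's BasePredictor
    additionally handles visualization and task-specific result parsing.
    """

    def __init__(self, model: Callable, pipeline: Sequence[Callable]):
        self.model = model
        self.pipeline = pipeline

    def preprocess(self, inputs: Any) -> Any:
        # Run the input through each transform in order.
        for transform in self.pipeline:
            inputs = transform(inputs)
        return inputs

    def postprocess(self, preds: Any) -> Any:
        # Subclasses parse raw model outputs into task results.
        return preds

    def __call__(self, inputs: Any) -> Any:
        return self.postprocess(self.model(self.preprocess(inputs)))

# Toy usage: the "pipeline" adds one, the "model" doubles the result.
predictor = SimplePredictor(lambda x: x * 2, [lambda x: x + 1])
print(predictor(3))  # (3 + 1) * 2 = 8
```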
- Image processing: `imresize`, `impad`, `imcrop`, `imrotate`
- Image transforms: `LoadImage`, `Resize`, `Pad`, `ImgToTensor`
- Bounding-box processing: `clip_boxes`, `resize_boxes`, `box_iou`, `batched_nms`
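To illustrate what a pairwise IoU helper such as `box_iou` computes, here is an independent sketch in plain PyTorch (not the dlicv implementation; dlicv's exact signature may differ):

```python
import torch

def pairwise_iou(boxes1: torch.Tensor, boxes2: torch.Tensor) -> torch.Tensor:
    """IoU between every box in boxes1 (N, 4) and boxes2 (M, 4), xyxy format."""
    area1 = (boxes1[:, 2] - boxes1[:, 0]) * (boxes1[:, 3] - boxes1[:, 1])
    area2 = (boxes2[:, 2] - boxes2[:, 0]) * (boxes2[:, 3] - boxes2[:, 1])
    lt = torch.max(boxes1[:, None, :2], boxes2[None, :, :2])  # intersection top-left
    rb = torch.min(boxes1[:, None, 2:], boxes2[None, :, 2:])  # intersection bottom-right
    wh = (rb - lt).clamp(min=0)                               # zero if no overlap
    inter = wh[..., 0] * wh[..., 1]
    return inter / (area1[:, None] + area2[None, :] - inter)

a = torch.tensor([[0., 0., 10., 10.]])
b = torch.tensor([[0., 0., 10., 10.], [5., 5., 15., 15.]])
print(pairwise_iou(a, b))  # identical boxes -> 1.0; 25 / 175 overlap -> ~0.143
```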
Install DLICV and its basic dependencies:

```shell
pip install git+https://github.com/xueqing888/dlicv.git
```
To run inference on multiple platforms, install the corresponding inference backend and the Python SDK it provides:
| Backend | Installation notes |
| ------- | ------------------ |
| ONNX Runtime | The official ONNX Runtime documentation provides Python packages in a GPU and a CPU flavor; only one of the two can be installed in a given environment at a time. If your platform has a CUDA-capable GPU, the GPU package is recommended, as it also covers most of the CPU package's functionality. |
| TensorRT | First make sure your platform has a GPU driver with a suitable CUDA version installed (check with `nvidia-smi`). Then install TensorRT via the official precompiled Python package. |
| OpenVINO | Install OpenVINO. |
| ncnn | 1. Build ncnn following the ncnn wiki, with `-DNCNN_PYTHON=ON` enabled. 2. Add the ncnn root directory to your environment variables. 3. Install pyncnn. |
| Ascend | 1. Install the CANN toolkit following the official guide. 2. Configure the environment. |
## Backend model inference

The `BackendModel` implemented in DLICV supports inference with models from multiple backends, and it is simple to use: pass the backend model file, the device type (optional), and other parameters to build a callable backend model object, then feed it `torch.Tensor` data to run inference and obtain the results.
```python
import torch
from dlicv import BackendModel

X = torch.randn(1, 3, 224, 224)

# Inference with an ONNX model.
onnx_file = '/path/to/onnx_model.onnx'
onnx_model = BackendModel(onnx_file)
onnx_preds = onnx_model(X, force_cast=True)

# Inference with a TensorRT model.
trt_file = '/path/to/tensorrt_model.trt'
trt_model = BackendModel(trt_file)
trt_pred = trt_model(X, force_cast=True)
```
## End-to-end image classification with `BaseClassifier`

Taking ResNet18 inference as an example, here is how to use `BaseClassifier`:
```python
import urllib.request

from torchvision.models.resnet import resnet18, ResNet18_Weights

from dlicv import BaseClassifier
from dlicv.transforms import *

# Download an example image from the pytorch website.
url, filename = ("https://github.com/pytorch/hub/raw/master/images/dog.jpg", "dog.jpg")
urllib.request.urlretrieve(url, filename)

# Build resnet18 with ImageNet-1k pretrained weights from torchvision.
model = resnet18(weights=ResNet18_Weights.IMAGENET1K_V1)
model.eval().cuda()

# Build the data pipeline for image preprocessing with `dlicv.transforms`.
MEAN = [123.675, 116.28, 103.53]
STD = [58.395, 57.12, 57.375]
data_pipeline = Compose([
    LoadImage(channel_order='rgb', to_tensor=True, device='cuda'),
    Resize(224),
    Pad(to_square=True, pad_val=114),
    Normalize(mean=MEAN, std=STD),
])

# Build the classifier and run end-to-end inference.
classifier = BaseClassifier(model, data_pipeline, classes='imagenet')
res = classifier(filename, show_dir='./')
```
After the code above runs successfully, a `vis` directory is created in the current working directory, containing a visualization image named `dog.jpg` as shown below.
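For reference, classification postprocessing of this kind usually amounts to a softmax plus top-k over the model's logits. A minimal sketch in plain PyTorch (an illustration, not dlicv's actual implementation):

```python
import torch

# Fake ResNet-style logits for a batch of 1 image over 1000 ImageNet classes.
logits = torch.randn(1, 1000)
probs = logits.softmax(dim=-1)     # per-image scores that sum to 1
score, label = probs.max(dim=-1)   # top-1 confidence and class index
print(score.shape, label.shape)    # torch.Size([1]) torch.Size([1])
```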
## End-to-end object detection with `BaseDetector`

Taking the object detector YOLOv8 as an example, here is how to use `BaseDetector`. You can follow the official YOLOv8 model export tutorial to obtain the backend model you want; here we run inference with the ONNX model of yolov8n.
```python
import urllib.request

import torch

from dlicv import BackendModel, BaseDetector
from dlicv.transforms import *

# Download an example image from the ultralytics website.
url, filename = ("https://ultralytics.com/images/bus.jpg", "bus.jpg")
urllib.request.urlretrieve(url, filename)

# Build the backend model.
backend_model_file = '/path/to/onnx-model/yolov8n.onnx'
backend_model = BackendModel(backend_model_file)

# Build the data pipeline for image preprocessing with `dlicv.transforms`.
data_pipeline = Compose([
    LoadImage(channel_order='rgb'),
    Resize((640, 640)),
    Normalize(mean=0, std=255),
    ImgToTensor(),
])

# Build the detector by subclassing `BaseDetector`, and implement the abstract
# method `_parse_preds` to parse the backend model's raw predictions into
# bbox results.
class YOLOv8(BaseDetector):
    def _parse_preds(self, preds: torch.Tensor, *args, **kwargs) -> tuple:
        scores, boxes, labels = [], [], []
        outputs = preds.permute(0, 2, 1)
        for output in outputs:
            classes_scores = output[:, 4:]
            cls_scores, cls_labels = classes_scores.max(-1)
            scores.append(cls_scores)
            labels.append(cls_labels)
            # Decode (cx, cy, w, h) into (x1, y1, x2, y2).
            x, y, w, h = output[:, 0], output[:, 1], output[:, 2], output[:, 3]
            x1, y1 = x - w / 2, y - h / 2
            x2, y2 = x + w / 2, y + h / 2
            boxes.append(torch.stack([x1, y1, x2, y2], 1))
        return boxes, scores, labels

# Init the detector and run end-to-end inference.
detector = YOLOv8(backend_model,
                  data_pipeline,
                  conf=0.5,
                  nms_cfg=dict(iou_thres=0.5, class_agnostic=True),
                  classes='coco')
res = detector(filename, show_dir='.')
```
After the code above runs successfully, a `vis` directory is created in the current working directory, containing a visualization image named `bus.jpg` as shown below.
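As a quick numeric check of the (cx, cy, w, h) to (x1, y1, x2, y2) decode performed in `_parse_preds`, here is the same arithmetic on a single hand-made box:

```python
import torch

# One box centered at (50, 40) with width 20 and height 10.
output = torch.tensor([[50., 40., 20., 10.]])
x, y, w, h = output[:, 0], output[:, 1], output[:, 2], output[:, 3]
x1, y1 = x - w / 2, y - h / 2
x2, y2 = x + w / 2, y + h / 2
boxes = torch.stack([x1, y1, x2, y2], 1)
print(boxes)  # tensor([[40., 35., 60., 45.]])
```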
## End-to-end semantic segmentation with `BaseSegmentor`

Taking the semantic segmentation model DeepLabV3 as an example, here is how to use `BaseSegmentor`:
```python
import urllib.request

from torchvision.models.segmentation import deeplabv3_resnet101, DeepLabV3_ResNet101_Weights

from dlicv.predictor import BaseSegmentor
from dlicv.transforms import *

# Download an example image from the pytorch website.
url, filename = ("https://github.com/pytorch/hub/raw/master/images/deeplab1.png", "deeplab1.png")
urllib.request.urlretrieve(url, filename)

# Build DeepLabV3 with pretrained weights from torchvision.
model = deeplabv3_resnet101(weights=DeepLabV3_ResNet101_Weights.DEFAULT)
model.eval().cuda()

# Build the data pipeline for image preprocessing with `dlicv.transforms`.
MEAN = [123.675, 116.28, 103.53]
STD = [58.395, 57.12, 57.375]
data_pipeline = Compose([
    LoadImage(channel_order='rgb', to_tensor=True, device='cuda'),
    Normalize(mean=MEAN, std=STD),
])

# Build the segmentor by subclassing `BaseSegmentor`, and override the
# method `postprocess` to pick the score maps out of the model's output dict.
class DeepLabv3(BaseSegmentor):
    def postprocess(self, preds, *args, **kwargs):
        pred_seg_maps = preds['out']
        return super().postprocess(pred_seg_maps, *args, **kwargs)

segmentor = DeepLabv3(model, data_pipeline, classes='voc_seg')
res = segmentor(filename, show_dir='./')
```
After the code above runs successfully, a `vis` directory is created in the current working directory, containing a visualization image named `deeplab1.png` as shown below.
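Under the hood, a segmentor's postprocessing typically reduces the per-class score maps to a single label map by taking an argmax over the class channel. A minimal sketch in plain PyTorch (an illustration, not dlicv's actual implementation):

```python
import torch

# Fake per-class logits: batch of 1 image, 21 VOC classes, 4 x 4 pixels.
logits = torch.randn(1, 21, 4, 4)
seg_map = logits.argmax(dim=1)  # (1, 4, 4) map of predicted class indices
print(seg_map.shape)            # torch.Size([1, 4, 4])
```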
This project is released under the Apache 2.0 license.
- MMEngine: OpenMMLab foundational library for training deep learning models.
- MMCV: OpenMMLab foundational library for computer vision.
- MMDeploy: OpenMMLab model deployment framework.
If you use this project's code in your research, please cite DLICV with the following BibTeX entry:
```bibtex
@misc{dlicv,
    title = {Deep Learning Inference kit tool for Computer Vision},
    author = {Wang, Xueqing},
    howpublished = {\url{https://github.com/xueqing888/dlicv.git}},
    year = {2024}
}
```