Benchmark

Backends

CPU: ncnn, ONNX Runtime, OpenVINO

GPU: TensorRT, PPLNN

Latency benchmark

Platform

Ubuntu 18.04
CUDA 11.3
TensorRT 7.2.3.4
Docker 20.10.8
NVIDIA Tesla T4 GPU

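Other settings

Static graph export
Batch size 1
Synchronize after each inference
For the latency benchmark, we report the average latency over 100 images from each dataset.
Warm-up: 1010 iterations for classification tasks and 10 iterations for all other tasks.
Input resolution varies with the dataset used by each codebase. All codebases except MMEditing use real images as input.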
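
The settings above can be sketched as a small timing harness. This is a minimal illustration, not MMDeploy's own profiler: `measure_latency`, `infer`, and `sync` are hypothetical names, and `sync` stands in for whatever device synchronization the backend needs (for example `torch.cuda.synchronize` on a GPU).

```python
import time
from typing import Callable, Iterable, Tuple


def measure_latency(
    infer: Callable[[object], object],
    inputs: Iterable[object],
    warmup: int = 10,
    num_timed: int = 100,
    sync: Callable[[], None] = lambda: None,
) -> Tuple[float, float]:
    """Return (mean latency in ms, FPS) for batch-size-1 inference.

    The first `warmup` inputs are run but not timed. The next `num_timed`
    inputs are timed one by one, calling `sync` after every inference.
    """
    elapsed = []
    for i, x in enumerate(inputs):
        if i >= warmup + num_timed:
            break
        start = time.perf_counter()
        infer(x)
        sync()  # wait for the device to finish before stopping the clock
        if i >= warmup:
            elapsed.append(time.perf_counter() - start)
    if not elapsed:
        raise ValueError("not enough inputs to time after warm-up")
    mean_ms = 1000.0 * sum(elapsed) / len(elapsed)
    return mean_ms, 1000.0 / mean_ms
```

Under the settings listed above, a classification model would be measured with `warmup=1010` and the other tasks with `warmup=10`, both with `num_timed=100`.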

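Users can reproduce these speed measurements by following the "how to test latency" guide. The results from our environment are listed below: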

MMCls

MMCls			TensorRT						PPLNN		NCNN
Model	Dataset	Input	fp32		fp16		int8		fp16		SnapDragon888-fp32		Adreno660-fp32		model config file
Model	Dataset	Input	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	model config file
ResNet	ImageNet	1x3x224x224	2.97	336.90	1.26	791.89	1.21	829.66	1.30	768.28	33.91	29.49	25.93	38.57	$MMCLS_DIR/configs/resnet/resnet50_b32x8_imagenet.py
ResNeXt	ImageNet	1x3x224x224	4.31	231.93	1.42	703.42	1.37	727.42	1.36	737.67	133.44	7.49	69.38	14.41	$MMCLS_DIR/configs/resnext/resnext50_32x4d_b32x8_imagenet.py
SE-ResNet	ImageNet	1x3x224x224	3.41	293.64	1.66	600.73	1.51	662.90	1.91	524.07	107.84	9.27	80.85	12.37	$MMCLS_DIR/configs/seresnet/seresnet50_b32x8_imagenet.py
ShuffleNetV2	ImageNet	1x3x224x224	1.37	727.94	1.19	841.36	1.13	883.47	4.69	213.33	9.55	104.71	10.66	93.81	$MMCLS_DIR/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py

MMDet

MMDet			TensorRT						PPLNN
Model	Dataset	Input	fp32		fp16		int8		fp16		model config file
Model	Dataset	Input	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	model config file
YOLOv3	COCO	1x3x320x320	14.76	67.76	24.92	40.13	24.92	40.13	18.07	55.35	$MMDET_DIR/configs/yolo/yolov3_d53_320_273e_coco.py
SSD-Lite	COCO	1x3x320x320	8.84	113.12	9.21	108.56	8.04	124.38	19.72	50.71	$MMDET_DIR/configs/ssd/ssdlite_mobilenetv2_scratch_600e_coco.py
RetinaNet	COCO	1x3x800x1344	97.09	10.30	25.79	38.78	16.88	59.23	38.34	26.08	$MMDET_DIR/configs/retinanet/retinanet_r50_fpn_1x_coco.py
FCOS	COCO	1x3x800x1344	84.06	11.90	23.15	43.20	17.68	56.57	-	-	$MMDET_DIR/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py
FSAF	COCO	1x3x800x1344	82.96	12.05	21.02	47.58	13.50	74.08	30.41	32.89	$MMDET_DIR/configs/fsaf/fsaf_r50_fpn_1x_coco.py
Faster-RCNN	COCO	1x3x800x1344	88.08	11.35	26.52	37.70	19.14	52.23	65.40	15.29	$MMDET_DIR/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
Mask-RCNN	COCO	1x3x800x1344	320.86	3.12	241.32	4.14	-	-	86.80	11.52	$MMDET_DIR/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py

MMDet			NCNN
Model	Dataset	Input	SnapDragon888-fp32		Adreno660-fp32		model config file
Model	Dataset	Input	latency (ms)	FPS	latency (ms)	FPS	model config file
MobileNetv2-YOLOv3	COCO	1x3x320x320	48.57	20.59	66.55	15.03	$MMDET_DIR/configs/yolo/yolov3_mobilenetv2_mstrain-416_300e_coco.py
SSD-Lite	COCO	1x3x320x320	44.91	22.27	66.19	15.11	$MMDET_DIR/configs/ssd/ssdlite_mobilenetv2_scratch_600e_coco.py

MMEdit

MMEdit		TensorRT						PPLNN
Model	Input	fp32		fp16		int8		fp16		model config file
Model	Input	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	model config file
ESRGAN	1x3x32x32	12.64	79.14	12.42	80.50	12.45	80.35	7.67	130.39	$MMEDIT_DIR/configs/restorers/esrgan/esrgan_psnr_x4c64b23g32_g1_1000k_div2k.py
SRCNN	1x3x32x32	0.70	1436.47	0.35	2836.62	0.26	3850.45	0.56	1775.11	$MMEDIT_DIR/configs/restorers/srcnn/srcnn_x4k915_g1_1000k_div2k.py

MMOCR

MMOCR			TensorRT						PPLNN		NCNN
Model	Dataset	Input	fp32		fp16		int8		fp16		SnapDragon888-fp32		Adreno660-fp32		model config file
Model	Dataset	Input	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	model config file
DBNet	ICDAR2015	1x3x640x640	10.70	93.43	5.62	177.78	5.00	199.85	34.84	28.70	-	-	-	-	$MMOCR_DIR/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py
CRNN	IIIT5K	1x1x32x32	1.93	518.28	1.40	713.88	1.36	736.79	-	-	10.57	94.64	20.00	50.00	$MMOCR_DIR/configs/textrecog/crnn/crnn_academic_dataset.py

MMSeg

MMSeg			TensorRT						PPLNN
Model	Dataset	Input	fp32		fp16		int8		fp16		model config file
Model	Dataset	Input	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	latency (ms)	FPS	model config file
FCN	Cityscapes	1x3x512x1024	128.42	7.79	23.97	41.72	18.13	55.15	27.00	37.04	$MMSEG_DIR/configs/fcn/fcn_r50-d8_512x1024_40k_cityscapes.py
PSPNet	Cityscapes	1x3x512x1024	119.77	8.35	24.10	41.49	16.33	61.23	27.26	36.69	$MMSEG_DIR/configs/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes.py
DeepLabV3	Cityscapes	1x3x512x1024	226.75	4.41	31.80	31.45	19.85	50.38	36.01	27.77	$MMSEG_DIR/configs/deeplabv3/deeplabv3_r50-d8_512x1024_80k_cityscapes.py
DeepLabV3+	Cityscapes	1x3x512x1024	151.25	6.61	47.03	21.26	50.38	26.67	34.80	28.74	$MMSEG_DIR/configs/deeplabv3plus/deeplabv3plus_r50-d8_512x1024_80k_cityscapes.py
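
The latency tables report latency and FPS side by side. At batch size 1 the two are roughly reciprocal (FPS ≈ 1000 / latency in ms), and the ratio of two latencies gives the speedup from a lower precision. A minimal sketch using the TensorRT numbers from the FCN row above; the helper names are hypothetical:

```python
def fps_from_latency(latency_ms: float) -> float:
    """Frames per second implied by a per-image latency at batch size 1."""
    return 1000.0 / latency_ms


def speedup(baseline_ms: float, candidate_ms: float) -> float:
    """How many times faster the candidate is than the baseline."""
    return baseline_ms / candidate_ms


# FCN on Cityscapes, TensorRT latencies (ms) copied from the MMSeg table.
fcn = {"fp32": 128.42, "fp16": 23.97, "int8": 18.13}

for precision, latency in fcn.items():
    print(
        f"{precision}: {fps_from_latency(latency):.2f} FPS, "
        f"{speedup(fcn['fp32'], latency):.2f}x vs fp32"
    )
```

Because the published latencies are rounded to two decimals, FPS recomputed this way can differ from the table in the last digit.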

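Performance benchmark

Users can reproduce these accuracy results by following the "how to test performance" guide. The results from our environment are listed below: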

MMCls

MMCls			PyTorch	ONNX Runtime	TensorRT			PPLNN
Model	Task	Metrics	fp32	fp32	fp32	fp16	int8	fp16	model config file
ResNet-18	Classification	top-1	69.90	69.88	69.88	69.86	69.86	69.86	$MMCLS_DIR/configs/resnet/resnet18_b32x8_imagenet.py
ResNet-18	Classification	top-5	89.43	89.34	89.34	89.33	89.38	89.34	$MMCLS_DIR/configs/resnet/resnet18_b32x8_imagenet.py
ResNeXt-50	Classification	top-1	77.90	77.90	77.90	-	77.78	77.89	$MMCLS_DIR/configs/resnext/resnext50_32x4d_b32x8_imagenet.py
ResNeXt-50	Classification	top-5	93.66	93.66	93.66	-	93.64	93.65
SE-ResNet-50	Classification	top-1	77.74	77.74	77.74	77.75	77.63	77.73	$MMCLS_DIR/configs/seresnet/seresnet50_b32x8_imagenet.py
SE-ResNet-50	Classification	top-5	93.84	93.84	93.84	93.83	93.72	93.84
ShuffleNetV1 1.0x	Classification	top-1	68.13	68.13	68.13	68.13	67.71	68.11	$MMCLS_DIR/configs/shufflenet_v1/shufflenet_v1_1x_b64x16_linearlr_bn_nowd_imagenet.py
ShuffleNetV1 1.0x	Classification	top-5	87.81	87.81	87.81	87.81	87.58	87.80
ShuffleNetV2 1.0x	Classification	top-1	69.55	69.55	69.55	69.54	69.10	69.54	$MMCLS_DIR/configs/shufflenet_v2/shufflenet_v2_1x_b64x16_linearlr_bn_nowd_imagenet.py
ShuffleNetV2 1.0x	Classification	top-5	88.92	88.92	88.92	88.91	88.58	88.92
MobileNet V2	Classification	top-1	71.86	71.86	71.86	71.87	70.91	71.84	$MMCLS_DIR/configs/mobilenet_v2/mobilenet_v2_b32x8_imagenet.py
MobileNet V2	Classification	top-5	90.42	90.42	90.42	90.40	89.85	90.41
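
One way to read the accuracy table is as a drop relative to the PyTorch baseline. The sketch below computes that drop for a few top-1 rows copied from the MMCls table above. The helper names and the 0.5-point tolerance are arbitrary illustrations, not a threshold used by MMDeploy.

```python
# top-1 accuracy (%) copied from the MMCls table above.
top1 = {
    "ResNet-18": {"pytorch": 69.90, "trt_fp16": 69.86, "trt_int8": 69.86},
    "ShuffleNetV2 1.0x": {"pytorch": 69.55, "trt_fp16": 69.54, "trt_int8": 69.10},
    "MobileNet V2": {"pytorch": 71.86, "trt_fp16": 71.87, "trt_int8": 70.91},
}

TOLERANCE = 0.5  # percentage points; assumed value for illustration only


def accuracy_drops(row: dict) -> dict:
    """Drop of each backend relative to the PyTorch baseline, in points."""
    base = row["pytorch"]
    return {k: round(base - v, 2) for k, v in row.items() if k != "pytorch"}


def flagged(table: dict, tolerance: float) -> list:
    """(model, backend) pairs whose drop exceeds the tolerance."""
    return [
        (model, backend)
        for model, row in table.items()
        for backend, drop in accuracy_drops(row).items()
        if drop > tolerance
    ]


print(flagged(top1, TOLERANCE))
```

With these rows, only the int8 result for MobileNet V2 (a 0.95-point drop) exceeds the assumed tolerance.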

MMDet

MMDet				PyTorch	ONNX Runtime	TensorRT			PPLNN	OpenVINO
Model	Task	Dataset	Metrics	fp32	fp32	fp32	fp16	int8	fp16	fp32	model config file
YOLOv3	Object Detection	COCO2017	box AP	33.7	-	33.5	33.5	33.5	-	-	$MMDET_DIR/configs/yolo/yolov3_d53_320_273e_coco.py
SSD	Object Detection	COCO2017	box AP	25.5	-	25.5	25.5	-	-	-	$MMDET_DIR/configs/ssd/ssd300_coco.py
RetinaNet	Object Detection	COCO2017	box AP	36.5	-	36.4	36.4	36.3	36.5	-	$MMDET_DIR/configs/retinanet/retinanet_r50_fpn_1x_coco.py
FCOS	Object Detection	COCO2017	box AP	36.6	-	36.6	36.5	-	-	-	$MMDET_DIR/configs/fcos/fcos_r50_caffe_fpn_gn-head_1x_coco.py
FSAF	Object Detection	COCO2017	box AP	37.4	-	37.4	37.4	37.2	37.4	-	$MMDET_DIR/configs/fsaf/fsaf_r50_fpn_1x_coco.py
YOLOX	Object Detection	COCO2017	box AP	40.5	-	40.3	40.3	29.3	-	-	$MMDET_DIR/configs/yolox/yolox_s_8x8_300e_coco.py
Faster R-CNN	Object Detection	COCO2017	box AP	37.4	-	37.3	37.3	37.1	37.3	-	$MMDET_DIR/configs/faster_rcnn/faster_rcnn_r50_fpn_1x_coco.py
ATSS	Object Detection	COCO2017	box AP	39.4	-	39.4	39.4	-	-	-	$MMDET_DIR/configs/atss/atss_r50_fpn_1x_coco.py
Cascade R-CNN	Object Detection	COCO2017	box AP	40.4	-	40.4	40.4	-	40.4	-	$MMDET_DIR/configs/cascade_rcnn/cascade_rcnn_r50_caffe_fpn_1x_coco.py
Mask R-CNN	Instance Segmentation	COCO2017	box AP	38.2	-	38.1	38.1	-	38.0	-	$MMDET_DIR/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py
Mask R-CNN	Instance Segmentation	COCO2017	mask AP	34.7	-	33.7	33.7	-	-	-	$MMDET_DIR/configs/mask_rcnn/mask_rcnn_r50_fpn_1x_coco.py

MMEdit

MMEdit				PyTorch	ONNX Runtime	TensorRT			PPLNN
Model	Task	Dataset	Metrics	fp32	fp32	fp32	fp16	int8	fp16	model config file
SRCNN	Super Resolution	Set5	PSNR	28.4316	28.4323	28.4323	28.4286	28.1995	28.4311	$MMEDIT_DIR/configs/restorers/srcnn/srcnn_x4k915_g1_1000k_div2k.py
SRCNN	Super Resolution	Set5	SSIM	0.8099	0.8097	0.8097	0.8096	0.7934	0.8096
ESRGAN	Super Resolution	Set5	PSNR	28.2700	28.2592	28.2592	-	-	28.2624	$MMEDIT_DIR/configs/restorers/esrgan/esrgan_x4c64b23g32_g1_400k_div2k.py
ESRGAN	Super Resolution	Set5	SSIM	0.7778	0.7764	0.7774	-	-	0.7765
ESRGAN-PSNR	Super Resolution	Set5	PSNR	30.6428	30.6444	30.6430	-	-	27.0426	$MMEDIT_DIR/configs/restorers/esrgan/esrgan_psnr_x4c64b23g32_g1_1000k_div2k.py
ESRGAN-PSNR	Super Resolution	Set5	SSIM	0.8559	0.8558	0.8558	-	-	0.8557
SRGAN	Super Resolution	Set5	PSNR	27.9499	27.9408	27.9408	-	-	27.9388	$MMEDIT_DIR/configs/restorers/srresnet_srgan/srgan_x4c64b16_g1_1000k_div2k.py
SRGAN	Super Resolution	Set5	SSIM	0.7846	0.7839	0.7839	-	-	0.7839
SRResNet	Super Resolution	Set5	PSNR	30.2252	30.2300	30.2300	-	-	30.2294	$MMEDIT_DIR/configs/restorers/srresnet_srgan/msrresnet_x4c64b16_g1_1000k_div2k.py
SRResNet	Super Resolution	Set5	SSIM	0.8491	0.8488	0.8488	-	-	0.8488
Real-ESRNet	Super Resolution	Set5	PSNR	28.0297	27.7016	27.7016	-	-	27.7049	$MMEDIT_DIR/configs/restorers/real_esrgan/realesrnet_c64b23g32_12x4_lr2e-4_1000k_df2k_ost.py
Real-ESRNet	Super Resolution	Set5	SSIM	0.8236	0.8122	0.8122	-	-	0.8123
EDSR	Super Resolution	Set5	PSNR	30.2223	30.2214	30.2214	30.2211	30.1383	-	$MMEDIT_DIR/configs/restorers/edsr/edsr_x4c64b16_g1_300k_div2k.py
EDSR	Super Resolution	Set5	SSIM	0.8500	0.8497	0.8497	0.8497	0.8469	-

MMOCR

MMOCR				PyTorch	ONNX Runtime	TensorRT			PPLNN	OpenVINO
Model	Task	Dataset	Metrics	fp32	fp32	fp32	fp16	int8	fp16	fp32	model config file
DBNet*	TextDetection	ICDAR2015	recall	0.7310	0.7304	0.7198	0.7179	0.7111	0.7304	0.7309	$MMOCR_DIR/configs/textdet/dbnet/dbnet_r18_fpnc_1200e_icdar2015.py
			precision	0.8714	0.8718	0.8677	0.8674	0.8688	0.8718	0.8714
			hmean	0.7950	0.7949	0.7868	0.7856	0.7821	0.7949	0.7950
CRNN	TextRecognition	IIIT5K	acc	0.8067	0.8067	0.8067	0.8063	0.8067	0.8067	-	$MMOCR_DIR/configs/textrecog/crnn/crnn_academic_dataset.py
SAR	TextRecognition	IIIT5K	acc	0.9517	0.9287	-	-	-	-	-	$MMOCR_DIR/configs/textrecog/sar/sar_r31_parallel_decoder_academic.py

MMSeg

MMSeg			PyTorch	ONNX Runtime	TensorRT			PPLNN
Model	Dataset	Metrics	fp32	fp32	fp32	fp16	int8	fp16	model config file
FCN	Cityscapes	mIoU	72.25	-	72.36	72.35	74.19	-	$MMSEG_DIR/configs/fcn/fcn_r50-d8_512x1024_40k_cityscapes.py
PSPNet	Cityscapes	mIoU	78.55	-	78.26	78.24	77.97	-	$MMSEG_DIR/configs/pspnet/pspnet_r50-d8_512x1024_80k_cityscapes.py
DeepLabV3	Cityscapes	mIoU	79.09	-	79.12	79.12	78.96	-	$MMSEG_DIR/configs/deeplabv3/deeplabv3_r50-d8_512x1024_40k_cityscapes.py
DeepLabV3+	Cityscapes	mIoU	79.61	-	79.6	79.6	79.43	-	$MMSEG_DIR/configs/deeplabv3plus/deeplabv3plus_r50-d8_512x1024_40k_cityscapes.py
Fast-SCNN	Cityscapes	mIoU	70.96	-	70.93	70.92	66.0	-	$MMSEG_DIR/configs/fastscnn/fast_scnn_lr0.12_8x4_160k_cityscapes.py

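Notes

Because some datasets in a codebase contain images of various resolutions (MMDet, for example), the speed benchmark is obtained with static configs in MMDeploy, while the performance benchmark is obtained with dynamic configs.
Some TensorRT int8 performance benchmarks require an NVIDIA card with Tensor Cores. Without them, performance drops sharply.
DBNet uses nearest interpolation in the neck of the model, and TensorRT-7 handles this mode quite differently from PyTorch. To stay compatible with TensorRT-7, we rewrote the neck to use bilinear interpolation, which improves the final detection performance. To obtain results that match PyTorch, we recommend TensorRT-8+, whose interpolation methods are the same as PyTorch's.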
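
The first note distinguishes static from dynamic deploy configs. As a rough sketch of the difference, the dictionaries below are modeled loosely on the min/opt/max shape ranges that TensorRT uses for dynamic inputs. The helper names, the input name `input`, and the shape values are assumptions for illustration and should be checked against the configs shipped with your MMDeploy version.

```python
def trt_input_shapes(min_shape, opt_shape, max_shape):
    """Build a shape range for a single input tensor named 'input' (assumed name)."""
    return dict(
        input=dict(min_shape=min_shape, opt_shape=opt_shape, max_shape=max_shape)
    )


def is_static(shapes: dict) -> bool:
    """A config is static when every input has min == opt == max."""
    return all(
        s["min_shape"] == s["opt_shape"] == s["max_shape"] for s in shapes.values()
    )


# Static: one fixed resolution, as used for the latency tables.
static_shapes = trt_input_shapes(
    [1, 3, 320, 320], [1, 3, 320, 320], [1, 3, 320, 320]
)

# Dynamic: a range of resolutions, as used for the accuracy tables.
# The bounds here are illustrative, not values taken from MMDeploy.
dynamic_shapes = trt_input_shapes(
    [1, 3, 320, 320], [1, 3, 800, 1344], [1, 3, 1344, 1344]
)

print(is_static(static_shapes), is_static(dynamic_shapes))
```

A static range lets the engine be tuned for exactly one input size, which suits latency measurement. A dynamic range accepts the varied image sizes found in an evaluation set.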
