Validated Models

Validated MLPerf Models

Model	Framework	Support	Example
ResNet50 V1.5	TensorFlow	Yes	Link
ResNet50 V1.5	PyTorch	Yes	Link
DLRM	PyTorch	Yes	Link
BERT large	TensorFlow	Yes	Link
BERT large	PyTorch	Yes	Link
SSD ResNet34	TensorFlow	Yes	Link
SSD ResNet34	PyTorch	Yes	Link
RNN-T	PyTorch	Yes	Link
3D-UNet	TensorFlow	WIP
3D-UNet	PyTorch	Yes	Link

Validated Quantization Examples

Performance results were tested on 06/07/2022 on an Intel Xeon Platinum 8380 Scalable processor, using 1 socket, 4 cores per instance, 10 instances, and batch size 1.

Performance varies by use, configuration and other factors. See platform configuration for configuration details. For more complete information about performance and benchmark results, visit www.intel.com/benchmarks

TensorFlow models with Intel TensorFlow 2.9.1

Model	INT8 Accuracy	FP32 Accuracy	Accuracy Ratio [(INT8-FP32)/FP32]	INT8 Throughput (samples/sec)	FP32 Throughput (samples/sec)	Performance Ratio [INT8/FP32]	Example
BERT large SQuAD	92.39	92.99	-0.64%	25.32	12.53	2.02x	pb
DenseNet121	73.57%	72.89%	0.93%	370.52	329.74	1.12x	pb
DenseNet161	76.24%	76.29%	-0.07%	219.46	180.75	1.21x	pb
DenseNet169	74.40%	74.65%	-0.33%	301.33	259.88	1.16x	pb
Faster R-CNN Inception ResNet V2	37.98%	38.33%	-0.91%	3.96	2.34	1.69x	pb
Faster R-CNN Inception ResNet V2	37.84%	38.33%	-1.28%	3.98	2.31	1.72x	SavedModel
Faster R-CNN ResNet101	30.28%	30.39%	-0.36%	70	19.98	3.50x	pb
Faster R-CNN ResNet101	30.37%	30.39%	-0.07%	70.26	16.98	4.14x	SavedModel
Inception ResNet V2	80.44%	80.40%	0.05%	281.79	137.91	2.04x	pb
Inception V1	70.48%	69.74%	1.06%	2193.17	975.6	2.25x	pb
Inception V2	74.36%	73.97%	0.53%	1835.35	838.82	2.19x	pb
Inception V3	77.28%	76.75%	0.69%	973.42	376.3	2.59x	pb
Inception V4	80.40%	80.27%	0.16%	575.9	200.55	2.87x	pb
Mask R-CNN Inception V2	28.53%	28.73%	-0.70%	132.51	50.3	2.63x	pb
Mask R-CNN Inception V2	28.53%	28.73%	-0.70%	132.89	50.97	2.61x	ckpt
MobileNet V1	71.79%	70.96%	1.17%	3545.79	1191.94	2.97x	pb
MobileNet V2	71.89%	71.76%	0.18%	2431.66	1420.11	1.71x	pb
ResNet101	77.50%	76.45%	1.37%	877.91	355.49	2.47x	pb
ResNet50 Fashion	77.80%	78.12%	-0.41%	3977.5	2150.68	1.85x	pb
ResNet50 V1.0	74.11%	74.27%	-0.22%	1509.64	472.66	3.19x	pb
ResNet50 V1.5	76.82%	76.46%	0.47%	1260.01	415.83	3.03x	pb
ResNet V2 101	72.67%	71.87%	1.11%	436.52	318.3	1.37x	pb
ResNet V2 152	73.03%	72.37%	0.91%	306.82	221.4	1.39x	pb
ResNet V2 50	70.33%	69.64%	0.99%	749.85	574.19	1.31x	pb
SSD MobileNet V1	22.97%	23.13%	-0.69%	952.9	582.87	1.63x	pb
SSD MobileNet V1	22.99%	23.13%	-0.61%	954.92	413.24	2.31x	ckpt
SSD ResNet34	21.69%	22.09%	-1.81%	44.46	11.81	3.76x	pb
SSD ResNet50 V1	37.86%	38.00%	-0.37%	69.5	26.04	2.67x	pb
SSD ResNet50 V1	37.81%	38.00%	-0.50%	69.27	21.17	3.27x	ckpt
VGG16	72.66%	70.89%	2.50%	660.46	177.85	3.71x	pb
VGG19	72.72%	71.01%	2.41%	562.04	147.61	3.81x	pb
Wide & Deep	77.62%	77.67%	-0.07%	21332.47	19714.08	1.08x	pb
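
The two ratio columns in the quantization tables follow directly from the column headers: the accuracy ratio is `(INT8-FP32)/FP32` and the performance ratio is `INT8/FP32`. The snippet below is a minimal sketch that recomputes both for the ResNet50 V1.5 (pb) row above. The function names are illustrative only and are not part of any library. Some published ratios may differ in the last digit, presumably because they were computed from unrounded measurements.

```python
def accuracy_ratio(int8_acc: float, fp32_acc: float) -> float:
    """Relative accuracy change, (INT8 - FP32) / FP32, as a fraction."""
    return (int8_acc - fp32_acc) / fp32_acc


def performance_ratio(int8_throughput: float, fp32_throughput: float) -> float:
    """Throughput speedup of INT8 over FP32, INT8 / FP32."""
    return int8_throughput / fp32_throughput


# Values copied from the ResNet50 V1.5 (pb) row of the TensorFlow table.
acc = accuracy_ratio(76.82, 76.46)
perf = performance_ratio(1260.01, 415.83)

print(f"Accuracy ratio: {acc:.2%}")       # 0.47%
print(f"Performance ratio: {perf:.2f}x")  # 3.03x
```

A positive accuracy ratio means the INT8 model scored higher than the FP32 baseline; a performance ratio above 1.00x means the INT8 model had higher throughput.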

PyTorch models with Torch 1.11.0+cpu in PTQ mode

Model	INT8 Accuracy	FP32 Accuracy	Accuracy Ratio [(INT8-FP32)/FP32]	INT8 Throughput (samples/sec)	FP32 Throughput (samples/sec)	Performance Ratio [INT8/FP32]	Example
ALBERT base MRPC	88.06%	88.50%	-0.50%	34.28	29.54	1.16x	eager
Barthez MRPC	82.99%	83.81%	-0.97%	166.84	89.56	1.86x	eager
BERT base COLA	58.80%	58.84%	-0.07%	260	126.47	2.06x	fx
BERT base MRPC	90.28%	90.69%	-0.45%	251.79	126.46	1.99x	fx
BERT base RTE	69.31%	69.68%	-0.52%	252.14	126.45	1.99x	fx
BERT base SST2	91.97%	91.86%	0.12%	258.98	126.42	2.05x	fx
BERT base STSB	89.13%	89.75%	-0.68%	249.57	126.39	1.97x	fx
BERT large COLA	62.88%	62.57%	0.49%	88.75	36.7	2.42x	fx
BERT large MRPC	89.93%	90.38%	-0.49%	89.43	36.62	2.44x	fx
BERT large QNLI	90.96%	91.82%	-0.94%	91.27	37	2.47x	fx
BERT large RTE	71.84%	72.56%	-1.00%	77.62	36.01	2.16x	fx
CamemBERT base MRPC	86.56%	86.82%	-0.30%	241.39	124.77	1.93x	eager
DeBERTa MRPC	91.17%	90.91%	0.28%	152.09	85.13	1.79x	eager
DistilBERT base MRPC	88.66%	89.16%	-0.56%	415.09	246.9	1.68x	eager
DistilBERT base MRPC	88.74%	89.16%	-0.47%	459.93	245.33	1.87x	fx
FlauBERT MRPC	81.01%	80.19%	1.01%	644.05	457.32	1.41x	eager
Inception V3	69.43%	69.52%	-0.13%	454.3	213.7	2.13x	eager
Longformer MRPC	90.59%	91.46%	-0.95%	21.51	17.45	1.23x	eager
Mask R-CNN	37.70%	37.80%	-0.26%	17.61	5.76	3.06x	eager
mBart WNLI	56.34%	56.34%	0.00%	65.05	31.26	2.08x	eager
MobileNet V2	70.54%	71.84%	-1.81%	740.97	535.54	1.38x	eager
lvwerra/pegasus-samsum	42.21	42.67	-1.09%	3.89	1.14	3.41x	eager
PeleeNet	71.64%	72.10%	-0.64%	502.01	391.31	1.28x	eager
ResNet18	69.57%	69.76%	-0.27%	800.43	381.27	2.10x	eager
ResNet18	69.57%	69.76%	-0.28%	811.09	389.36	2.08x	fx
ResNet50	75.98%	76.15%	-0.21%	507.55	200.52	2.53x	eager
ResNeXt101_32x8d	79.08%	79.31%	-0.29%	203.54	73.85	2.76x	eager
RNN-T	92.45	92.55	-0.10%	79.21	20.47	3.87x	eager
RoBERTa base MRPC	87.88%	88.18%	-0.34%	250.21	124.92	2.00x	eager
Se_ResNeXt50_32x4d	78.98%	79.08%	-0.13%	358.63	173.03	2.07x	eager
SqueezeBERT MRPC	87.77%	87.65%	0.14%	249.89	207.43	1.20x	eager
Transfo-xl MRPC	81.97%	81.20%	0.94%	11.25	8.34	1.35x	eager
YOLOv3	24.60%	24.54%	0.21%	108.09	40.02	2.70x	eager

PyTorch models with Torch 1.11.0+cpu in QAT mode

Model	INT8 Accuracy	FP32 Accuracy	Accuracy Ratio [(INT8-FP32)/FP32]	INT8 Throughput (samples/sec)	FP32 Throughput (samples/sec)	Performance Ratio [INT8/FP32]	Example
ResNet18	69.74%	69.76%	-0.03%	804.76	388.67	2.07x	eager
ResNet18	69.73%	69.76%	-0.04%	806.44	386.59	2.09x	fx
BERT base MRPC QAT	89.60%	89.50%	0.11%	258.89	125.79	2.06x	fx
ResNet50	76.04%	76.15%	-0.14%	490.64	203.49	2.41x	eager

PyTorch models with IPEX 1.11.0

Model	INT8 Accuracy	FP32 Accuracy	Accuracy Ratio [(INT8-FP32)/FP32]	INT8 Throughput (samples/sec)	FP32 Throughput (samples/sec)	Performance Ratio [INT8/FP32]	Example
bert-large-uncased-whole-word-masking-finetuned-squad	92.9	93.16	-0.28%	37.13	11.45	3.24x	ipex
ResNeXt101_32x16d_wsl	84.02%	84.17%	-0.18%	163.45	28.9	5.66x	ipex
ResNet50	76.00%	76.15%	-0.20%	707.86	202.02	3.51x	ipex
SSD ResNet34	19.97%	20.00%	-0.15%	30.84	8.55	3.61x	ipex

ONNX models with ONNX Runtime 1.11.0

Model	INT8 Accuracy	FP32 Accuracy	Accuracy Ratio [(INT8-FP32)/FP32]	INT8 Throughput (samples/sec)	FP32 Throughput (samples/sec)	Performance Ratio [INT8/FP32]	Example
AlexNet	54.74%	54.79%	-0.09%	1518.97	676.74	2.24x	qlinearops
AlexNet	54.74%	54.79%	-0.09%	1411.3	652.6	2.16x	qdq
BERT base MRPC DYNAMIC	85.54%	86.03%	-0.57%	379.71	156.16	2.43x	qlinearops
BERT base MRPC STATIC	85.29%	86.03%	-0.86%	756.33	316.36	2.39x	qlinearops
BERT SQuAD	80.44	80.67	-0.29%	115.58	64.71	1.79x	qlinearops
BERT SQuAD	80.44	80.67	-0.29%	115.4	64.68	1.78x	qdq
CaffeNet	56.19%	56.30%	-0.20%	2786.79	802.7	3.47x	qlinearops
CaffeNet	56.19%	56.30%	-0.20%	2726.86	819.41	3.33x	qdq
DenseNet	60.20%	60.96%	-1.25%	404.83	340.63	1.19x	qlinearops
DistilBERT base MRPC	84.56%	84.56%	0.00%	1630.41	596.68	2.73x	qlinearops
EfficientNet	77.58%	77.70%	-0.15%	1985.35	1097.33	1.81x	qlinearops
Faster R-CNN	33.99%	34.37%	-1.11%	10.02	4.32	2.32x	qlinearops
Faster R-CNN	33.94%	34.37%	-1.25%	10.41	4.28	2.43x	qdq
FCN	64.66%	64.98%	-0.49%	44.31	14.2	3.12x	qlinearops
FCN	64.66%	64.98%	-0.49%	18.11	14.19	1.28x	qdq
GoogleNet	67.61%	67.79%	-0.27%	1165.84	810.65	1.44x	qlinearops
GoogleNet	67.61%	67.79%	-0.27%	1165.73	809.98	1.44x	qdq
Inception V1	67.23%	67.24%	-0.01%	1205.89	838.71	1.44x	qlinearops
Inception V1	67.23%	67.24%	-0.01%	1204.93	843.16	1.43x	qdq
Mask R-CNN	33.40%	33.72%	-0.95%	8.56	3.76	2.27x	qlinearops
Mask R-CNN	33.33%	33.72%	-1.16%	8.4	3.81	2.20x	qdq
MobileBERT MRPC	86.03%	86.27%	-0.28%	790.11	686.35	1.15x	qlinearops
MobileBERT SQuAD MLPerf	89.84	90.03	-0.20%	102.92	95.19	1.08x	qlinearops
MobileNet V2	65.47%	66.89%	-2.12%	5133.84	3394.73	1.51x	qlinearops
MobileNet V2	65.47%	66.89%	-2.12%	5066.31	3386.3	1.50x	qdq
MobileNet V3 MLPerf	75.59%	75.74%	-0.20%	4133.22	2132.92	1.94x	qlinearops
MobileNetV2 (ONNX Model Zoo)	68.30%	69.48%	-1.70%	5349.42	3373.29	1.59x	qlinearops
ResNet50 V1.5 MLPerf	76.13%	76.46%	-0.43%	1139.56	549.88	2.07x	qlinearops
ResNet50 V1.5	72.28%	72.29%	-0.01%	1165.35	556.02	2.10x	qlinearops
ResNet50 V1.5	72.28%	72.29%	-0.01%	1319.32	543.44	2.43x	qdq
ResNet50 V1.5 (ONNX Model Zoo)	74.76%	74.99%	-0.31%	1363.39	573.1	2.38x	qlinearops
RoBERTa base MRPC	90.44%	89.95%	0.54%	811.05	312.71	2.59x	qlinearops
ShuffleNet V2	66.13%	66.36%	-0.35%	4948.77	2847.66	1.74x	qlinearops
SqueezeNet	56.55%	56.87%	-0.56%	6296.79	4340.51	1.45x	qlinearops
SqueezeNet	56.55%	56.87%	-0.56%	6227.76	4383.8	1.42x	qdq
SSD MobileNet V1	22.20%	23.10%	-3.90%	917.64	709.48	1.29x	qlinearops
SSD MobileNet V1	22.20%	23.10%	-3.90%	840.99	655.99	1.28x	qdq
SSD MobileNet V1 (ONNX Model Zoo)	22.88%	23.03%	-0.65%	845.17	666.25	1.27x	qlinearops
SSD MobileNet V1 (ONNX Model Zoo)	22.88%	23.03%	-0.65%	790.06	624.2	1.27x	qdq
SSD MobileNet V2	23.83%	24.68%	-3.44%	703.55	506.6	1.39x	qlinearops
SSD	18.68%	18.98%	-1.58%	41.99	11.12	3.78x	qdq
Tiny YOLOv3	12.08%	12.43%	-2.82%	836.21	659.69	1.27x	qlinearops
VGG16	66.60%	66.69%	-0.13%	312.48	128.98	2.42x	qlinearops
VGG16 (ONNX Model Zoo)	72.28%	72.40%	-0.17%	446.13	131.04	3.40x	qlinearops
YOLOv3	26.88%	28.74%	-6.47%	157.39	66.72	2.36x	qlinearops
YOLOv4	33.18%	33.71%	-1.57%	58.55	38.09	1.54x	qlinearops
ZFNet	55.89%	55.96%	-0.13%	664.37	358.62	1.85x	qlinearops
ZFNet	55.89%	55.96%	-0.13%	666.99	354.38	1.88x	qdq

MXNet models with MXNet 1.7.0

Model	INT8 Accuracy	FP32 Accuracy	Accuracy Ratio [(INT8-FP32)/FP32]	INT8 Throughput (samples/sec)	FP32 Throughput (samples/sec)	Performance Ratio [INT8/FP32]
Inception V3	77.80%	77.65%	0.20%	920.74	276.73	3.33x
MobileNet V1	71.60%	72.23%	-0.86%	6585.19	2529.21	2.60x
MobileNet V2	70.80%	70.87%	-0.10%	5230.32	1996.47	2.62x
ResNet V1 152	78.28%	78.54%	-0.33%	574.85	156.2	3.68x
ResNet50 V1.0	75.91%	76.33%	-0.55%	1567.9	427.99	3.66x
SqueezeNet	56.80%	56.97%	-0.28%	4704.51	1332.29	3.53x
SSD MobileNet V1	74.94%	75.54%	-0.79%	769.26	193.03	3.99x

Validated Pruning Examples

Tasks	Framework	Model	FP32 Baseline	Accuracy% (Gradient Sensitivity with 20% Sparsity)	Drop	Perf Gain (sample/s)	Accuracy% (+ONNX Dynamic Quantization on Pruned Model)	Drop	Perf Gain (sample/s)
SST-2	PyTorch	BERT base	accuracy = 92.32	accuracy = 91.97	-0.38	1.30x	accuracy = 92.20	-0.13	1.86x
QQP	PyTorch	BERT base	[accuracy, f1] = [91.10, 88.05]	[accuracy, f1] = [89.97, 86.54]	[-1.24, -1.71]	1.32x	[accuracy, f1] = [89.75, 86.60]	[-1.48, -1.65]	1.81x
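
The Drop values in the pruning tables appear consistent with a relative change against the FP32 baseline, expressed in percent and computed per metric. This reading is an inference from the published numbers, not something the table states. The sketch below recomputes the QQP row above under that assumption; `relative_drop` is a hypothetical helper name, not a library function.

```python
def relative_drop(baseline: list[float], pruned: list[float]) -> list[float]:
    """Per-metric relative change in percent: (pruned - baseline) / baseline * 100."""
    return [round((p - b) / b * 100, 2) for b, p in zip(baseline, pruned)]


# QQP, BERT base: [accuracy, f1] for the FP32 baseline and the 20%-sparsity model.
drops = relative_drop([91.10, 88.05], [89.97, 86.54])
print(drops)  # [-1.24, -1.71]
```

A few entries in these tables differ from this calculation by 0.01, which would be expected if the published drops were derived from unrounded accuracies.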

Tasks	Framework	Model	FP32 Baseline	Accuracy% (Pattern Lock on 70% Unstructured Sparsity)	Drop	Accuracy% (Pattern Lock on 50% 1:2 Structured Sparsity)	Drop
MNLI	PyTorch	BERT base	[m, mm] = [84.57, 84.79]	[m, mm] = [82.45, 83.27]	[-2.51, -1.80]	[m, mm] = [83.20, 84.11]	[-1.62, -0.80]
SST-2	PyTorch	BERT base	accuracy = 92.32	accuracy = 91.51	-0.88	accuracy = 92.20	-0.13
QQP	PyTorch	BERT base	[accuracy, f1] = [91.10, 88.05]	[accuracy, f1] = [90.48, 87.06]	[-0.68, -1.12]	[accuracy, f1] = [90.92, 87.78]	[-0.20, -0.31]
QNLI	PyTorch	BERT base	accuracy = 91.54	accuracy = 90.39	-1.26	accuracy = 90.87	-0.73
QnA	PyTorch	BERT base	[em, f1] = [79.34, 87.10]	[em, f1] = [77.27, 85.75]	[-2.61, -1.54]	[em, f1] = [78.03, 86.50]	[-1.65, -0.69]

Framework	Model	FP32 Baseline	Compression	Dataset	Accuracy% (Drop)
PyTorch	ResNet18	69.76	30% Sparsity on Magnitude	ImageNet	69.47(-0.42)
PyTorch	ResNet18	69.76	30% Sparsity on Gradient Sensitivity	ImageNet	68.85(-1.30)
PyTorch	ResNet50	76.13	30% Sparsity on Magnitude	ImageNet	76.11(-0.03)
PyTorch	ResNet50	76.13	30% Sparsity on Magnitude and Post Training Quantization	ImageNet	76.01(-0.16)
PyTorch	ResNet50	76.13	30% Sparsity on Magnitude and Quantization Aware Training	ImageNet	75.90(-0.30)

Validated Knowledge Distillation Examples

Example Name	Dataset	Student (Accuracy)	Teacher (Accuracy)	Student With Distillation (Accuracy Improvement)
ResNet example	ImageNet	ResNet18 (0.6739)	ResNet50 (0.7399)	0.6845 (0.0106)
BlendCNN example	MRPC	BlendCNN (0.7034)	BERT-Base (0.8382)	0.7034 (0)
BiLSTM example	SST-2	BiLSTM (0.7913)	RoBERTa-Base (0.9404)	0.8085 (0.0172)
