AMD-SHARK-Studio/tank/torch_model_list.csv
2023-06-16 13:24:44 -07:00


model_name,use_tracing,model_type,dynamic,mlir_type,decompose,param_count,tags,notes
efficientnet_b0,True,vision,False,linalg,False,5.3M,image-classification;cnn;conv2d;depthwise-conv,Smallest EfficientNet variant with 224x224 input
efficientnet_b7,True,vision,False,linalg,False,66M,image-classification;cnn;conv2d;depthwise-conv,Largest EfficientNet variant with 600x600 input
microsoft/MiniLM-L12-H384-uncased,True,hf,True,linalg,False,66M,nlp;bert-variant;transformer-encoder,Large version has 12 layers; 384 hidden size; smaller than BERT-base (66M params vs 109M params)
bert-base-uncased,True,hf,True,linalg,False,109M,nlp;bert-variant;transformer-encoder,12 layers; 768 hidden; 12 attention heads
bert-base-cased,True,hf,True,linalg,False,109M,nlp;bert-variant;transformer-encoder,12 layers; 768 hidden; 12 attention heads
google/mobilebert-uncased,True,hf,True,linalg,False,25M,"nlp,bert-variant,transformer-encoder,mobile","24 layers, 512 hidden size, 128 embedding"
alexnet,False,vision,True,linalg,False,61M,"cnn,parallel-layers","The CNN that revolutionized computer vision (the move away from hand-crafted features to neural networks); over a decade old now and probably no longer used in production."
resnet18,False,vision,True,linalg,False,11M,"cnn,image-classification,residuals,resnet-variant",One 7x7 conv2d and the rest are 3x3 conv2d
resnet50,False,vision,True,linalg,False,23M,"cnn,image-classification,residuals,resnet-variant",Bottlenecks with only conv2d (1x1 conv -> 3x3 conv -> 1x1 conv blocks)
resnet101,False,vision,True,linalg,False,29M,"cnn,image-classification,residuals,resnet-variant",Bottlenecks with only conv2d (1x1 conv -> 3x3 conv -> 1x1 conv blocks)
squeezenet1_0,False,vision,True,linalg,False,1.25M,"cnn,image-classification,mobile,parallel-layers",Parallel conv2d (1x1 conv to compress -> (3x3 expand | 1x1 expand) -> concat)
wide_resnet50_2,False,vision,True,linalg,False,69M,"cnn,image-classification,residuals,resnet-variant",ResNet variant where model depth is decreased and width is increased.
mobilenet_v3_small,False,vision,True,linalg,False,2.5M,"image-classification,cnn,mobile",N/A
google/vit-base-patch16-224,True,hf_img_cls,False,linalg,False,86M,"image-classification,vision-transformer,transformer-encoder",N/A
microsoft/resnet-50,True,hf_img_cls,False,linalg,False,23M,"image-classification,cnn,residuals,resnet-variant",Bottlenecks with only conv2d (1x1 conv -> 3x3 conv -> 1x1 conv blocks)
facebook/deit-small-distilled-patch16-224,True,hf_img_cls,False,linalg,False,22M,"image-classification,vision-transformer,cnn",N/A
microsoft/beit-base-patch16-224-pt22k-ft22k,True,hf_img_cls,False,linalg,False,86M,"image-classification,transformer-encoder,bert-variant,vision-transformer",N/A
nvidia/mit-b0,True,hf_img_cls,False,linalg,False,3.7M,"image-classification,transformer-encoder",SegFormer
mnasnet1_0,False,vision,True,linalg,False,-,"cnn,torchvision,mobile,architecture-search",Outperforms other mobile CNNs on accuracy vs. latency
resnet50_fp16,False,vision,True,linalg,False,23M,"cnn,image-classification,residuals,resnet-variant",Bottlenecks with only conv2d (1x1 conv -> 3x3 conv -> 1x1 conv blocks)
bert-base-uncased_fp16,True,fp16,False,linalg,False,109M,nlp;bert-variant;transformer-encoder,12 layers; 768 hidden; 12 attention heads
bert-large-uncased,True,hf,True,linalg,False,330M,nlp;bert-variant;transformer-encoder,24 layers; 1024 hidden units; 16 attention heads
bert-base-uncased,True,hf,False,stablehlo,False,109M,nlp;bert-variant;transformer-encoder,12 layers; 768 hidden; 12 attention heads
gpt2,True,hf_causallm,False,stablehlo,True,125M,nlp;transformer-decoder,-
facebook/opt-125m,True,hf,False,stablehlo,True,125M,nlp;transformer-decoder,-
distilgpt2,True,hf,False,stablehlo,True,88M,nlp;transformer-decoder,-
microsoft/deberta-v3-base,True,hf,False,stablehlo,True,88M,nlp;transformer-encoder,-