**Describe the bug**

My code:
```python
import torch
from PIL import Image
from thop import profile, clever_format
from torchinfo import summary
import cn_clip.clip as clip
from cn_clip.clip import load_from_name, available_models

print("Available models:", available_models())
# Available models: ['ViT-B-16', 'ViT-L-14', 'ViT-L-14-336', 'ViT-H-14', 'RN50']

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = load_from_name("RN50", device=device, download_root='./')
model.eval()

image = preprocess(Image.open("examples/pokemon.jpeg")).unsqueeze(0).to(device)
text = clip.tokenize(["杰尼龟", "妙蛙种子", "小火龙", "皮卡丘"]).to(device)
print(f"device:{device}")

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize the features; use the normalized image/text features for downstream tasks
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)

    logits_per_image, logits_per_text = model.get_similarity(image, text)
    probs = logits_per_image.softmax(dim=-1).cpu().numpy()

print("Label probs:", probs)  # [[1.268734e-03 5.436878e-02 6.795761e-04 9.436829e-01]]

# Compute FLOPs over the full model (image + text towers)
print(f"image:{image.shape}, text:{text.shape}")
flops, params = profile(model, inputs=(image, text), verbose=False)
flops, params = clever_format([flops, params], "%.3f")
print(f"Model FLOPs: {flops}, Model Params:{params}")

# FLOPs of the visual tower alone
print("========separate==========")
flops, params = profile(model.visual, inputs=(image,), verbose=False)
flops, params = clever_format([flops, params], "%.3f")
print(f"Model FLOPs: {flops}, Model Params:{params}")

# Cross-check with torchinfo
print("=========another method=============")
summary(model, input_data=[image, text])
```
My results:
```
Model FLOPs: 9.839G, Model Params:44.792M
========separate==========
Model FLOPs: 5.418G, Model Params:23.527M
=========another method=============
=========================================================================================================
Layer (type:depth-idx)                                  Output Shape              Param #
=========================================================================================================
CLIP                                                    [1, 1024]                 786,433
├─ModifiedResNet: 1-1                                   [1, 1024]                 --
│    └─Conv2d: 2-1                                      [1, 32, 112, 112]         864
│    └─BatchNorm2d: 2-2                                 [1, 32, 112, 112]         64
│    └─ReLU: 2-3                                        [1, 32, 112, 112]         --
│    └─Conv2d: 2-4                                      [1, 32, 112, 112]         9,216
│    └─BatchNorm2d: 2-5                                 [1, 32, 112, 112]         64
│    └─ReLU: 2-6                                        [1, 32, 112, 112]         --
│    └─Conv2d: 2-7                                      [1, 64, 112, 112]         18,432
│    └─BatchNorm2d: 2-8                                 [1, 64, 112, 112]         128
│    └─ReLU: 2-9                                        [1, 64, 112, 112]         --
│    └─AvgPool2d: 2-10                                  [1, 64, 56, 56]           --
│    └─Sequential: 2-11                                 [1, 256, 56, 56]          --
│    │    └─Bottleneck: 3-1                             [1, 256, 56, 56]          75,008
│    │    └─Bottleneck: 3-2                             [1, 256, 56, 56]          70,400
│    │    └─Bottleneck: 3-3                             [1, 256, 56, 56]          70,400
│    └─Sequential: 2-12                                 [1, 512, 28, 28]          --
│    │    └─Bottleneck: 3-4                             [1, 512, 28, 28]          379,392
│    │    └─Bottleneck: 3-5                             [1, 512, 28, 28]          280,064
│    │    └─Bottleneck: 3-6                             [1, 512, 28, 28]          280,064
│    │    └─Bottleneck: 3-7                             [1, 512, 28, 28]          280,064
│    └─Sequential: 2-13                                 [1, 1024, 14, 14]         --
│    │    └─Bottleneck: 3-8                             [1, 1024, 14, 14]         1,512,448
│    │    └─Bottleneck: 3-9                             [1, 1024, 14, 14]         1,117,184
│    │    └─Bottleneck: 3-10                            [1, 1024, 14, 14]         1,117,184
│    │    └─Bottleneck: 3-11                            [1, 1024, 14, 14]         1,117,184
│    │    └─Bottleneck: 3-12                            [1, 1024, 14, 14]         1,117,184
│    │    └─Bottleneck: 3-13                            [1, 1024, 14, 14]         1,117,184
│    └─Sequential: 2-14                                 [1, 2048, 7, 7]           --
│    │    └─Bottleneck: 3-14                            [1, 2048, 7, 7]           6,039,552
│    │    └─Bottleneck: 3-15                            [1, 2048, 7, 7]           4,462,592
│    │    └─Bottleneck: 3-16                            [1, 2048, 7, 7]           4,462,592
│    └─AttentionPool2d: 2-15                            [1, 1024]                 14,789,632
├─BertModel: 1-2                                        [4, 52, 768]              --
│    └─BertEmbeddings: 2-16                             [4, 52, 768]              --
│    │    └─Embedding: 3-17                             [4, 52, 768]              16,226,304
│    │    └─Embedding: 3-18                             [4, 52, 768]              393,216
│    │    └─Embedding: 3-19                             [4, 52, 768]              1,536
│    │    └─LayerNorm: 3-20                             [4, 52, 768]              1,536
│    │    └─Dropout: 3-21                               [4, 52, 768]              --
│    └─BertEncoder: 2-17                                [4, 52, 768]              --
│    │    └─ModuleList: 3-22                            --                        21,263,616
=========================================================================================================
Total params: 76,989,537
Trainable params: 76,989,537
Non-trainable params: 0
Total mult-adds (G): 5.52
=========================================================================================================
Input size (MB): 0.60
Forward/backward pass size (MB): 123.19
Params size (MB): 122.93
Estimated Total Size (MB): 246.73
=========================================================================================================
```
Why does the number of parameters vary so much (44.792M vs. 77M)? Why are the mult-adds so different (9.839G vs. 5.52G)?
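For reference, a plain `numel` sum over `model.parameters()` gives a profiler-independent baseline to compare both tools against. Here is a minimal sketch, assuming the `model` loaded in the script above; the attribute names `model.visual` and `model.bert` are what Chinese-CLIP's RN50 appears to expose, so treat them as an assumption:

```python
# Profiler-independent baseline: count every registered parameter exactly once.
total = sum(p.numel() for p in model.parameters())
print(f"total params:  {total / 1e6:.3f}M")

# Per-tower breakdown (assumption: Chinese-CLIP exposes the image tower as
# `model.visual` and the text tower as `model.bert`).
visual = sum(p.numel() for p in model.visual.parameters())
text = sum(p.numel() for p in model.bert.parameters())
print(f"visual params: {visual / 1e6:.3f}M")
print(f"text params:   {text / 1e6:.3f}M")
```

Whichever tool matches this baseline is counting every parameter; a lower total usually means the profiler only saw the modules whose forward hooks were actually triggered.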
Same here! Have you found the cause yet?
I have found that torchinfo, ultralytics-thop, and the common approach

```python
pytorch_total_params = sum(p.numel() for p in model.parameters())
```

are all likely to yield different results for newer models.
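For what it's worth, the three can be compared side by side on any model. A standalone sketch, using a torchvision ResNet-50 (not Chinese-CLIP) purely so it runs on its own:

```python
import torch
import torchvision
from thop import profile
from torchinfo import summary

model = torchvision.models.resnet50().eval()
x = torch.randn(1, 3, 224, 224)

# 1) Plain numel sum: counts every registered parameter exactly once.
numel_params = sum(p.numel() for p in model.parameters())

# 2) thop: hook-based, counts what is visited during this forward pass.
thop_macs, thop_params = profile(model, inputs=(x,), verbose=False)

# 3) torchinfo: also forward-based, reports per-module and total figures.
info = summary(model, input_data=x, verbose=0)

print(f"numel:     {numel_params / 1e6:.3f}M")
print(f"thop:      {thop_params / 1e6:.3f}M params, {thop_macs / 1e9:.3f}G MACs")
print(f"torchinfo: {info.total_params / 1e6:.3f}M params, {info.total_mult_adds / 1e9:.3f}G mult-adds")
```

On a plain CNN like this the three totals normally agree; in my experience the gap tends to open on models that keep weights in bare `nn.Parameter` attributes or contain submodules that a single forward pass never calls.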