How to get the feature map by using pretrained model. #239

Open
JiuqingDong opened this issue Apr 28, 2023 · 2 comments

Comments

@JiuqingDong

Hi, I have recently been trying to use the DINO and DINOv2 pretrained models to extract feature maps. When I run a pretrained model on an image, I only get a single vector, which looks like the output of the last average pooling layer.
Could you suggest how to get pyramid feature maps?
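
A plain ViT backbone works at a single resolution, so one possible way to approximate a pyramid is to take the tokens from several blocks and resample each resulting map to a different scale. The sketch below assumes the per-layer tokens have already been collected (for the DINO ViT, `get_intermediate_layers(x, n=3)` returns such a list) and uses dummy tensors in their place. The helper `build_feature_pyramid`, the grid size, and the scale factors are hypothetical choices for illustration, and bilinear resampling is a swapped-in technique, not something the DINO repository provides.

```python
import torch
import torch.nn.functional as F


def build_feature_pyramid(layer_tokens, grid_size=(14, 14), scales=(2.0, 1.0, 0.5)):
    """Turn per-layer ViT tokens [B, 1 + H*W, C] into maps [B, C, H*s, W*s].

    Hypothetical helper: assumes one leading [CLS] token per sequence and
    one scale factor per layer.
    """
    if len(layer_tokens) != len(scales):
        raise ValueError("need exactly one scale per layer")
    grid_h, grid_w = grid_size
    pyramid = []
    for tokens, scale in zip(layer_tokens, scales):
        batch, _, channels = tokens.shape
        # Drop the [CLS] token, restore the 2D patch grid, go channels-first.
        fmap = tokens[:, 1:, :].reshape(batch, grid_h, grid_w, channels)
        fmap = fmap.permute(0, 3, 1, 2)
        if scale != 1.0:
            # Bilinear resampling stands in for a learned up/downsampling layer.
            fmap = F.interpolate(fmap, scale_factor=scale, mode="bilinear",
                                 align_corners=False)
        pyramid.append(fmap)
    return pyramid


# Dummy tokens standing in for the output of three transformer blocks.
layer_tokens = [torch.randn(1, 197, 384) for _ in range(3)]
for fmap in build_feature_pyramid(layer_tokens):
    print(tuple(fmap.shape))
```

With a 224x224 input and 16x16 patches this yields maps of 28x28, 14x14, and 7x7, all with 384 channels. Whether resampled maps are good enough depends on the downstream task.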


@Catchip

Catchip commented Nov 26, 2024

I tried the following code to extract 2D image features. I expected a feature map of shape [1, 14*14, 384], but I actually got [1, 197, 384], so I can't figure out how to reshape it into [1, 14, 14, 384].

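import torch
import torchvision.transforms as T
from PIL import Image

# Standard ImageNet preprocessing at 224x224
preprocess = T.Compose([
    T.Resize((224, 224)),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

img = Image.open("sample.jpg").convert("RGB")  # force 3 channels
img_tensor = preprocess(img).unsqueeze(0)  # [1, 3, 224, 224]

vits16 = torch.hub.load('facebookresearch/dino:main', 'dino_vits16')
vits16.eval()
with torch.no_grad():
    # get_intermediate_layers returns a list with one tensor per requested block (default n=1)
    feature = vits16.get_intermediate_layers(img_tensor)[0]  # [1, 197, 384]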
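
The extra token is most likely the [CLS] token that the ViT prepends to the patch sequence: 197 = 1 + 14*14. A minimal sketch of the reshape, assuming a single leading [CLS] token and row-major patch order, is shown below. The helper name `tokens_to_feature_map` and its parameters are hypothetical, and a dummy tensor stands in for the model output so the snippet runs without downloading weights.

```python
import torch


def tokens_to_feature_map(tokens, patch_size=16, image_size=(224, 224),
                          num_prefix_tokens=1):
    """Drop prefix tokens and reshape [B, P + H*W, C] into [B, H, W, C].

    Hypothetical helper: assumes the first `num_prefix_tokens` entries are
    non-patch tokens (the [CLS] token) and that patches are stored row by row.
    """
    batch, num_tokens, channels = tokens.shape
    grid_h = image_size[0] // patch_size
    grid_w = image_size[1] // patch_size
    expected = num_prefix_tokens + grid_h * grid_w
    if num_tokens != expected:
        raise ValueError(f"expected {expected} tokens, got {num_tokens}")
    patch_tokens = tokens[:, num_prefix_tokens:, :]  # [B, H*W, C]
    return patch_tokens.reshape(batch, grid_h, grid_w, channels)


# Dummy tensor with the same shape as the reported output.
tokens = torch.randn(1, 197, 384)
feature_map = tokens_to_feature_map(tokens)
print(feature_map.shape)  # torch.Size([1, 14, 14, 384])
```

For convolution-style consumers, `feature_map.permute(0, 3, 1, 2)` gives the channels-first layout [1, 384, 14, 14].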

@Catchip

Catchip commented Nov 28, 2024

#252 (comment)
