Thank you for your amazing work on this repository.
I am experimenting with unsupervised learning: I would like to use the G / D of a trained GAN as encoders (there are ways to use G via knowledge distillation into a simpler CNN, or approaches from papers like this one). I would like to build a model that can recognize two images of the same room.
The problem is that standard techniques (hashing, CNN => dimensionality reduction => distance, Siamese networks) find only very similar or clustered images, or exact replicas; if the image is shot from a slightly different angle, these methods fail. Ideally I need to index images by attributes such as couches / TVs / carpets, and it looks like DCGAN / WGAN learns something like this as latent variables. If 100+ latent variables capture such attributes, they may generalize across different images.
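For what it's worth, the discriminator-as-encoder idea can be sketched roughly as below. This is a minimal PyTorch sketch, not this repository's code: the `DCGANDiscriminatorEncoder` class, its layer widths, and the `room_similarity` helper are my own illustrative names, and a real setup would load the trained discriminator's weights instead of random initialization.

```python
import torch
import torch.nn as nn

class DCGANDiscriminatorEncoder(nn.Module):
    """DCGAN-style discriminator truncated before its final
    classification layer, so forward() returns a feature vector
    usable as an image embedding (layer widths are illustrative)."""
    def __init__(self, nc=3, ndf=64):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(nc, ndf, 4, 2, 1, bias=False),           # 64 -> 32
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf, ndf * 2, 4, 2, 1, bias=False),      # 32 -> 16
            nn.BatchNorm2d(ndf * 2),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 2, ndf * 4, 4, 2, 1, bias=False),  # 16 -> 8
            nn.BatchNorm2d(ndf * 4),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(ndf * 4, ndf * 8, 4, 2, 1, bias=False),  # 8 -> 4
            nn.BatchNorm2d(ndf * 8),
            nn.LeakyReLU(0.2, inplace=True),
        )

    def forward(self, x):
        h = self.features(x)   # (N, ndf*8, 4, 4)
        return h.flatten(1)    # (N, ndf*8*16) embedding vector

def room_similarity(encoder, img_a, img_b):
    """Cosine similarity between discriminator embeddings of two images."""
    with torch.no_grad():
        za, zb = encoder(img_a), encoder(img_b)
    return nn.functional.cosine_similarity(za, zb).item()

encoder = DCGANDiscriminatorEncoder().eval()
a = torch.randn(1, 3, 64, 64)
sim = room_similarity(encoder, a, a)  # identical inputs -> similarity 1.0
```

After training, you would copy the trained discriminator's convolutional weights into this truncated network and index the resulting embeddings with a nearest-neighbour search to retrieve candidate matches for the same room.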
I have a dataset of ~300-500k similar images that mostly resemble the LSUN living-room set, but were shot in Russia. So far I have tried variations of hyperparameters similar to these (please refer to the PR to see what the new parameters mean):
Also, I managed to replicate training at 64x64 resolution (both on LSUN and on my dataset), but 256x256 still fails to converge meaningfully.
@snakers4 I assume that in order to generate higher-resolution images you have to increase the model's complexity in terms of parameter count - maybe you will have some success with extra layers?
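Concretely, "extra layers" could look like the sketch below: a standard DCGAN generator extended from its usual 64x64 depth to 256x256 by stacking additional stride-2 transposed-conv blocks. This is a hedged sketch under my own assumptions; `Generator256`, `up_block`, and the channel widths are illustrative, not this repository's actual configuration.

```python
import torch
import torch.nn as nn

def up_block(c_in, c_out):
    """One DCGAN upsampling block: a stride-2 transposed conv
    doubles the spatial resolution."""
    return nn.Sequential(
        nn.ConvTranspose2d(c_in, c_out, 4, 2, 1, bias=False),
        nn.BatchNorm2d(c_out),
        nn.ReLU(inplace=True),
    )

class Generator256(nn.Module):
    """DCGAN-style generator: the first five stages mirror the
    standard 64x64 design; the extra block and final conv extend
    it to 256x256 (channel widths are illustrative)."""
    def __init__(self, nz=100, ngf=64, nc=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.ConvTranspose2d(nz, ngf * 16, 4, 1, 0, bias=False),  # z -> 4x4
            nn.BatchNorm2d(ngf * 16),
            nn.ReLU(inplace=True),
            up_block(ngf * 16, ngf * 8),   # 4 -> 8
            up_block(ngf * 8, ngf * 4),    # 8 -> 16
            up_block(ngf * 4, ngf * 2),    # 16 -> 32
            up_block(ngf * 2, ngf),        # 32 -> 64
            up_block(ngf, ngf),            # 64 -> 128 (extra layer)
            nn.ConvTranspose2d(ngf, nc, 4, 2, 1, bias=False),  # 128 -> 256
            nn.Tanh(),
        )

    def forward(self, z):
        return self.net(z)

g = Generator256()
z = torch.randn(2, 100, 1, 1)
out = g(z)  # (2, 3, 256, 256)
```

Note that depth alone may not be enough: in my experience 256x256 GAN training is also sensitive to batch size and learning rates, so the extra capacity is a necessary but possibly not sufficient change.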
Hi!
So far I have not achieved any success. Could you please add some guidelines to the README on training at larger resolutions / batch sizes?