Avoid Resizing & migrating to TF 2.x #21

Open · Chakri-V-V opened this issue Oct 27, 2020 · 2 comments

@Chakri-V-V
Hi Team,

Thanks for the awesome solution.

I went through your previous issues on resolving OOM errors (by resizing images) and on the TensorFlow version used (< 2.x).

Python version I used for tweaking: 3.6
TensorFlow version I used for tweaking: 1.15.2

I would like to know your perspective on the queries below.

  1. I find it difficult to use the resized, deshadowed output images from your solution, as I lose features that I need for the later steps of my custom pipeline.

Consider that I have images with dimensions of 4000 x 3000 (height x width), which might increase further. I am trying the solution on a standard Nvidia Tesla K80 on Google Colab. I tried to find the maximum image dimensions (maximum resize) this setup can take at a 4:3 aspect ratio: it worked well at 1000 x 750 (H x W), but immediately ran into the OOM issue when I changed the size to 1500 x 1125.

By using config_pb2.RunOptions(report_tensor_allocations_upon_oom=True) in my session run, I obtained the error message in the attached screenshot.

[screenshot of the OOM error message]
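(A minimal sketch of how that option is passed to a TF 1.x session.run call; the toy graph below just stands in for the model's actual fetches and feeds.)

import tensorflow as tf  # TF 1.15.x

# Report per-tensor allocations when an OOM occurs, instead of only
# raising a ResourceExhaustedError. tf.RunOptions is the same proto the
# issue text references via its generated module, config_pb2.
run_options = tf.RunOptions(report_tensor_allocations_upon_oom=True)

x = tf.placeholder(tf.float32, [None, None, None, 3])  # toy input
y = x * 2.0                                            # toy op

with tf.Session() as sess:
    out = sess.run(y, feed_dict={x: [[[[1., 2., 3.]]]]}, options=run_options)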

Can you please point me to the code snippet in your .py files (demo.py / deshadower.py / networks.py) where I can handle the tensor inputs to the hypercolumn features without resizing?

Would you suggest trying a different GPU setup, if that avoids the resizing step?


  2. Considering that many TensorFlow attributes will be deprecated in the near future, I am planning to port the code to TF 2.x manually. Do you suggest a better way than the manual route I am looking into? (See the sketch after this list.)
    It would be super awesome if you could help me migrate to TF 2.x.

  3. Can you help me with a function that captures evaluation metrics for the model's performance? I see the perceptual and aggregated losses being computed in deshadower.py (in the setup_model function), but I am not quite clear on how to get a baseline evaluation for the outputs generated on my custom data.
    Please bear with me if I have overlooked an existing implementation of this.
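For reference, a minimal sketch of the least-invasive route I am considering (assuming TF 2.x is installed): run the existing graph code through the v1 compatibility shims. TensorFlow also ships a tf_upgrade_v2 script that rewrites most 1.x call sites automatically.

import tensorflow.compat.v1 as tf

# Disable eager execution and other 2.x behaviors so the existing
# graph-mode code keeps working unchanged.
tf.disable_v2_behavior()

# After these two lines, tf.placeholder, tf.Session, tf.variable_scope,
# etc. behave as in 1.x, so demo.py / deshadower.py / networks.py can be
# ported incrementally.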
@vinthony (Owner)

Hi, thanks for your interest in our work.

  1. You can try to revise the code that builds the VGG hypercolumn features here:

# Upsample each VGG feature map to the input resolution and concatenate
# it onto the input channels to form the hypercolumn.
for layer_id in range(1, 6):
    vgg19_f = vgg19_features['conv%d_2' % layer_id]
    input = tf.concat([tf.image.resize_bilinear(vgg19_f, (tf.shape(input)[1], tf.shape(input)[2])) / 255.0, input], axis=3)

Since you are working at a larger size, you might use only a few of the VGG feature maps as the hypercolumn features, for example reducing the original 1400+ channels to a smaller set; see the sketch below.
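A sketch of that suggestion (my illustration, not code from the repo): keep only the shallower VGG blocks, so the concatenated hypercolumn carries far fewer channels at full resolution.

# Hypothetical variant: build the hypercolumn from only the first two
# VGG blocks instead of all five, which cuts the channel count and the
# memory needed to upsample deep feature maps to full resolution.
for layer_id in range(1, 3):  # was range(1, 6)
    vgg19_f = vgg19_features['conv%d_2' % layer_id]
    input = tf.concat([tf.image.resize_bilinear(vgg19_f, (tf.shape(input)[1], tf.shape(input)[2])) / 255.0, input], axis=3)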

  2. I am very willing to port it to TF v2 or PyTorch. However, I don't have much time before December.
    I hope I can try to reimplement it, or help with your implementation, around December.

  3. I have rewritten the evaluation code from the original MATLAB into Python. You can also evaluate the model using PSNR or SSIM as performance metrics; please refer to the code below.

import os

import numpy as np
import cv2
from skimage.measure import compare_psnr as psnr
from skimage.measure import compare_ssim as ssim


def rmse_lab(imtarget, imoutput, immask):
    """Sum of absolute errors in CIELAB space over the masked region.

    Returns the accumulated error and the number of masked pixels so the
    caller can average over a whole dataset. (Despite the conventional
    name "RMSE", this is the mean-absolute-error-in-Lab protocol used in
    the shadow-removal literature.)
    """
    imtarget = np.float32(cv2.cvtColor(imtarget, cv2.COLOR_BGR2Lab))
    imoutput = np.float32(cv2.cvtColor(imoutput, cv2.COLOR_BGR2Lab))

    # Map OpenCV's 8-bit Lab encoding back to the standard Lab ranges:
    # L in [0, 100], a and b in [-128, 127].
    imtarget[:, :, 0] = imtarget[:, :, 0] * 100 / 255.
    imtarget[:, :, 1] = imtarget[:, :, 1] - 128
    imtarget[:, :, 2] = imtarget[:, :, 2] - 128

    imoutput[:, :, 0] = imoutput[:, :, 0] * 100 / 255.
    imoutput[:, :, 1] = imoutput[:, :, 1] - 128
    imoutput[:, :, 2] = imoutput[:, :, 2] - 128

    mask_binary = immask / 255.0

    err_masked = np.sum(abs(imtarget * mask_binary - imoutput * mask_binary))
    num_of_mask = np.sum(mask_binary)

    return err_masked, num_of_mask


err_m, err_nm, err_a = 0., 0., 0.
total_mask, total_nonmask, total_all, cntx = 0., 0., 0., 0.
psnrs = []
ssims = []

for val_path in os.listdir('images/'):

    if '.png' in val_path or '.jpg' in val_path:
        imoutput = cv2.imread('imoutputs/' + val_path)
        imtarget = cv2.imread('shadow_free/' + val_path)
        immask = cv2.imread('shadow_mask/' + val_path)

        immask = immask[:, :, 0:1]

        # Binarize the mask at 128 so soft mask edges do not skew the pixel counts.
        bin_mask = np.where(immask > 128, np.ones_like(immask), np.zeros_like(immask))

        err_masked, num_of_mask = rmse_lab(imtarget, imoutput, bin_mask * 255)
        err_non_masked, num_of_non_mask = rmse_lab(imtarget, imoutput, (1 - bin_mask) * 255)
        err_all, all_mask = rmse_lab(imtarget, imoutput, np.ones_like(imoutput[:, :, 0:1]) * 255)

        err_m = err_m + err_masked
        err_nm = err_nm + err_non_masked
        err_a = err_a + err_all

        total_mask = total_mask + num_of_mask
        total_nonmask = total_nonmask + num_of_non_mask
        total_all = total_all + all_mask
        cntx = cntx + 1

        # Full-image PSNR/SSIM against the shadow-free ground truth.
        psnrs.append(psnr(imtarget, imoutput))
        ssims.append(ssim(imtarget, imoutput, multichannel=True))


RMSE_NS = err_nm / total_nonmask
RMSE_S = err_m / total_mask
RMSE_ALL = err_a / total_all

print("TEST:  RMSE: (%.4f,%.4f,%.4f), PSNR: (%.4f), SSIM: (%.4f)"
      % (RMSE_S, RMSE_NS, RMSE_ALL, np.mean(psnrs), np.mean(ssims)))

@xuhangc commented Sep 29, 2021

Import change for newer versions of scikit-image:

from skimage.metrics import peak_signal_noise_ratio as psnr
from skimage.metrics import structural_similarity as ssim
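For completeness, a small usage sketch with the renamed functions (toy arrays stand in for imtarget / imoutput from the script above; on scikit-image >= 0.19 the multichannel flag is replaced by channel_axis):

import numpy as np
from skimage.metrics import peak_signal_noise_ratio as psnr
from skimage.metrics import structural_similarity as ssim

# Toy images standing in for imtarget / imoutput.
a = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
b = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)

psnr_value = psnr(a, b)
ssim_value = ssim(a, b, channel_axis=-1)  # use multichannel=True on older releases
print(psnr_value, ssim_value)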
