Color Loss using normalized input #7
Hi there, As far as I know, most Python libraries read out Lab images in a 0-255 range (8-bit images). After looking at numerous reference papers, I could not find any concrete reference for normalizing the deltaE value to a 0-100 range. Also, the ground-truth and input images are normalized between 0 and 1. Therefore, the deltaE value has been normalized by dividing by 255.0. Can you please elaborate on the logic behind normalizing to 0-100? Then we can work out how to solve the problem.
Hi, sure, no problem. I'll do it in a few parts. We have the tensor in [C,H,W] format and want to go to [H,W,C] for skimage. If we create a 2x2 image with 3 channels and fill it with one value per channel (R=1, G=2, B=3), it becomes clear that permute also moves the offsets accordingly, while reshape only changes the dimensions:
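(A minimal sketch in PyTorch; the tensor values are just illustrative.)

```python
import torch

# 2x2 image with 3 channels in [C, H, W] order, one constant per channel:
# R = 1, G = 2, B = 3.
img_chw = torch.stack([
    torch.full((2, 2), 1.0),   # R
    torch.full((2, 2), 2.0),   # G
    torch.full((2, 2), 3.0),   # B
])                              # shape [3, 2, 2]

# reshape only reinterprets the same flat memory, so the "pixel" at (0, 0)
# becomes three R values instead of one [R, G, B] triple:
print(img_chw.reshape(2, 2, 3)[0, 0])   # tensor([1., 1., 1.])

# permute actually moves the channel axis, keeping each pixel's channels together:
print(img_chw.permute(1, 2, 0)[0, 0])   # tensor([1., 2., 3.])
```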
Permute actually gives us 2 x 2 pixels with [R,G,B] values in the last dimension.
You were correct that skimage expects arrays containing floats to be in the 0.0-1.0 range. That was my mistake, as I tested with an array containing uint8 values. However, the L*a*b* color space is defined with L ranging from 0 to 100; a and b are technically unbounded, but centered around 0 and usually in the range -128 to 127.
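For instance, a rough check with random RGB values (just a sketch, using skimage's rgb2lab):

```python
import numpy as np
from skimage.color import rgb2lab

rgb = np.random.rand(256, 256, 3)   # float RGB in [0, 1], as skimage expects
lab = rgb2lab(rgb)

for name, channel in zip(["L", "a", "b"], np.moveaxis(lab, -1, 0)):
    print(name, channel.min(), channel.max())
# L stays roughly within 0..100, a and b come out centered around 0
```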
(Random RGB is of course biased and does not cover the entire Lab space.) The CIEDE2000 color difference is, I believe, also technically unbounded, although sources say it goes from 0 to 100. I did a quick check, though, and it seems like the difference between a fully saturated color and black is around 100, with opposite colors slightly higher:
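Roughly the check I did (a sketch with skimage; exact numbers may differ a little):

```python
import numpy as np
from skimage.color import rgb2lab, deltaE_ciede2000

def delta_e(rgb_a, rgb_b):
    # skimage wants float RGB in [0, 1]; a (1, 1, 3) array acts as a one-pixel image
    lab_a = rgb2lab(np.array(rgb_a, dtype=float).reshape(1, 1, 3))
    lab_b = rgb2lab(np.array(rgb_b, dtype=float).reshape(1, 1, 3))
    return deltaE_ciede2000(lab_a, lab_b).item()

print(delta_e([1, 1, 1], [0, 0, 0]))   # white vs black: ~100
print(delta_e([1, 0, 0], [0, 0, 0]))   # saturated red vs black: around 100
print(delta_e([1, 0, 0], [0, 1, 1]))   # opposite colors: slightly higher
```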
Delta E goes up even more when we select opposing colors within the Lab color space (which can't be represented in RGB). In reality, I guess it doesn't really matter much for a loss function whether it gets divided by 100 or 255; it would probably just take a little longer to converge, as it favors the other objectives a bit more (?)
Finally, you mention that the images are normalized between 0 and 1. But in the dataloader there is a Normalize transform that takes mean (0.5, 0.5, 0.5) and stddev (0.5, 0.5, 0.5) from dataTools/dataNormalization.py, which gives a range from -1 to 1.
This then gets fed into rgb2lab, which expects a 0-1 range (not a 0-255 range; that was my mistake, so the * 255 in my suggestion should be omitted). If I made a mistake somewhere, please let me know :)
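For completeness, undoing that Normalize before rgb2lab is just the affine inverse (a sketch assuming the 0.5 mean/std constants from dataTools/dataNormalization.py; variable names are placeholders):

```python
from skimage.color import rgb2lab

# img_normalized: [C, H, W] tensor coming out of the dataloader's Normalize.
# Normalize(mean=0.5, std=0.5) maps [0, 1] -> [-1, 1] via (x - 0.5) / 0.5,
# so undo it before converting, since rgb2lab expects floats in [0, 1]:
rgb_01 = (img_normalized * 0.5 + 0.5).clamp(0, 1)      # [C, H, W] back in [0, 1]
lab = rgb2lab(rgb_01.permute(1, 2, 0).cpu().numpy())   # [H, W, C] for skimage
```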
Thanks for your insightful analysis. You're absolutely right. During training, we read all images with PIL. Also, we leveraged tanh in the final layer of our model. Thus, generated and ground-truth images are normalized between 0 and 1, and the normalization by 255 can be omitted. You may have noticed that the normalization by 255 is actually optional; we found that it helps us converge faster. Here, it works as an additional parameter for tuning the loss function. However, depending on the application and data, it can be ignored.
Hey, I believe that the color loss might be a bit off. The input is a normalized image, and the ciede_2000 color difference is in the range 0 to 100, I think? Something like:
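(a rough sketch of what I mean; the function and variable names are placeholders, and skimage's conversion runs on NumPy arrays, so this mainly illustrates the scaling)

```python
from skimage.color import rgb2lab, deltaE_ciede2000

def color_difference(generated, ground_truth):
    # Both are [C, H, W] tensors that went through Normalize(0.5, 0.5), i.e. in [-1, 1]
    def to_lab(t):
        rgb = (t * 0.5 + 0.5).clamp(0, 1)                             # back to [0, 1]
        return rgb2lab(rgb.permute(1, 2, 0).detach().cpu().numpy())   # [H, W, C] for skimage
    return deltaE_ciede2000(to_lab(generated), to_lab(ground_truth))  # per-pixel delta E
```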
Or use the UnNormalize in dataTools... And to normalize the color difference:
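(again just a sketch, reusing the placeholder color_difference from above)

```python
# CIEDE2000 roughly spans 0..100, so scale it into 0..1 by dividing by 100 instead of 255
color_loss = color_difference(generated, ground_truth).mean() / 100.0
```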
Normally I would do a PR, but my fork diverges quite a bit at the moment. Side note: your model also works great on 12-channel mosaicked images (with some trivial alterations)!
[edit: reshape becomes permute, going from C,H,W to H,W,C]