inconsistent rgb/rgba behavior #4
Comments
Internally all 444 formats are 12-bit, not 16-bit. So when 0xffff is input using CFHD_PIXEL_FORMAT_B64A, it is shifted to 0x0fff, encoded as 0x0fff, and shifted back to 0xfff0 for the pixel container precision (not 0xff00 as you have suggested; I think that was a typo). For the alpha channel the codec will clip to 0xffff as a special case, but I'm not sure that is the correct behavior for RGB data. The issue is round-trip losses if 0x0fff is scaled to 0xffff with a wavelet encode in between. Round-trip quality is a big concern for codecs. An optional scaled input and output pixel format would be straightforward to add, but the PSNR would drop slightly over round-trip encodes. For CFHD_PIXEL_FORMAT_R210 I got 0x3FF for 0x3FF input (using a white frame).
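A minimal sketch, assuming plain shifts with no rescaling, of the round trip described above (illustration only, not the SDK's internal code):

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t in       = 0xFFFF;        /* 16-bit white fed to the encoder via B64A     */
    uint16_t internal = in >> 4;       /* 0x0FFF: the 12-bit value actually encoded    */
    uint16_t out      = internal << 4; /* 0xFFF0: decoder refills the 16-bit container */

    printf("in=0x%04X internal=0x%03X out=0x%04X\n", in, internal, out);
    return 0;
}
```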
Maybe it depends on how the sample was encoded? I used the VFW codec, feeding RGBA32.
If you used VFW you only fed 0xff into the codec. I'm not convinced that 0x3ff (10-bit) or 0xffff (16-bit) should be on the output. 0x3fc is 0xff shifted up by 2 bits, so this seems to be working correctly. The ambiguity over the value of white when switching bit depths has long been known. After Effects used to use a 15+1-bit representation where white was 0x8000, maybe to help with this. As a codec's performance is measured by its ability to match its input, I think CineForm is correct here. Currently it matches the input, but if you want 0xff00 to represent white (or want the codec to know that 8-bit input 0xff should be 0xffff on 16-bit output), then metadata would be the solution. Fortunately CineForm has an existing metadata engine that could be used; we just need to define a suitable FourCC.
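For illustration (my own arithmetic, not SDK code), here is the difference between the shift the decoder performs and a full-scale 10-bit expansion:

```c
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint16_t v8      = 0xFF;
    uint16_t shifted = v8 << 2;               /* 0x3FC: 8-bit white shifted up 2 bits    */
    uint16_t scaled  = (v8 << 2) | (v8 >> 6); /* 0x3FF: bit-replicated full-scale white  */

    printf("shifted=0x%03X scaled=0x%03X\n", shifted, scaled);
    return 0;
}
```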
It seems SAMPLE_HEADER.input_format can be used as a clue.
Yes, it might work; however, I'm still not convinced that gaining up the image should be the automatic behavior for 8-bit source encodes -- particularly if it impacts performance or round-trip quality.
It is not an uncommon approach to scale RGB values when changing bit depth.
Interesting, so for 444 decodes to 12-bit there would be an upscale to 16-bit, and for 8-bit input to 12-bit encodes a corresponding upscale on the way in (a sketch of both is below). Changing inputs before encoding is really weird, but I understand the logic. There would need to be new tuning for DC offsets, but it is worth the experiment. I still expect 8-bit in to 8-bit out will decrease in PSNR; can't know until trying.
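A hedged sketch of what those two mappings could look like, assuming the bit-replication style of expansion discussed in the next comment (these expressions are my illustration, not taken from the SDK):

```c
#include <stdint.h>

/* Expand by replicating the top bits into the low bits, so full scale
 * maps to full scale and zero stays zero. */
static inline uint16_t expand_12_to_16(uint16_t v12)
{
    return (uint16_t)((v12 << 4) | (v12 >> 8));   /* 0x0FFF -> 0xFFFF */
}

static inline uint16_t expand_8_to_12(uint16_t v8)
{
    return (uint16_t)((v8 << 4) | (v8 >> 4));     /* 0xFF -> 0xFFF */
}
```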
If Active Metadata were used, that would be more compute-expensive, as it involves additional buffer copies and the use of multiplies rather than shifts. The CDL value for RGB Gain would be 1.00391 to take 0xff00 to 0xffff.
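The gain figure follows from the two full-scale values (my arithmetic, not anything in the SDK):

```c
#include <stdio.h>

int main(void)
{
    /* 0xFFFF / 0xFF00: the gain needed to take 0xff00 up to full-scale 0xffff. */
    printf("%f\n", 65535.0 / 65280.0);   /* prints 1.003906 */
    return 0;
}
```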
Multipliers are a bad idea, IMHO, since they would introduce rounding errors and you would no longer be able to recover the "original" data. Filling the lower bits with the upper bits from the same value is a common thing to do in image applications. It makes sure you can retrieve the original data, and it makes sure you get full-scale output when the input was also full-scale (or zero when the input was zero). But I'd vote against changing anything. Just add a comment to the header that defines the CFHD_PixelFormats (i.e. CFHDTypes.h) saying that on decode only the upper 12 or 10 bits contain valid data. That way the user can decide how to expand them to true 16-bit. Some people might want to convert the output to float anyway, so any extra ORing-in of lower bits is wasted processing time if you can instead just use the proper multiplier to turn it into float directly.
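A small caller-side sketch of the two options mentioned here, assuming the decoder returns its 12 valid bits left-aligned in a 16-bit word (my own illustration, not SDK code):

```c
#include <stdint.h>

/* Option 1: expand in integer by ORing the upper bits into the lower bits. */
static inline uint16_t expand_left_aligned_12(uint16_t v)
{
    return (uint16_t)(v | (v >> 12));      /* 0xFFF0 -> 0xFFFF, 0x0000 -> 0x0000 */
}

/* Option 2: skip the integer expansion and normalize straight to float
 * using the true 12-bit full-scale value. */
static inline float left_aligned_12_to_float(uint16_t v)
{
    return (float)(v >> 4) / 4095.0f;      /* 0xFFF0 -> 1.0f */
}
```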
I am trying to verify something.
I cannot fully confirm this, but I do also see some strange stuff going on. Edit: Spoke too soon... But since we're dealing with lossy compression, I would somewhat understand it when not all values come out at exactly the same numerical value they went in. But seeing something higher than 12 bits looks somehow wrong indeed. I have a feeling that creating a new decoder will not give you a fully initialized decoder. Some months ago (before this was open-sourced) I was writing a plugin for a commercial application where we reuse a limited set of decoders for whatever clip the user might be playing. So sometimes a decoder that was previously used on an RGB-encoded clip would then be used for a YUV clip and so on. This often gave YUV frames that were too light. The only solution we found was to destroy the decoder and create a new one when switching between RGB(A) and YUV clips.
CFHD_PrepareToEncode() must be called between any format change, so that is not surprising. shekh: once you enable Active Metadata with any change, it does go through more processing than just a vertical offset, but that error seems more pronounced than I would have expected. That will take some effort to track down.
There seems to be one more thing wrong with alpha channel encoding. It seems the alpha channel gets a curve applied on encode that maps the 0-4095 range into 256-3823. It then excludes 0 and "very high" values from being run through that curve, presumably in order to make sure that even with rounding errors (i.e. lossy compression) fully transparent and fully opaque always come out of the decoder undamaged. However, the condition for applying the curve is this (in ConvertBGRA64ToFrame_4444_16s() and similar functions in frame.c):
I'd say this should be changed to:
Otherwise all alpha values from 4080 on will come out of the decoder as 4095. This curve applied to the alpha channel also explains why identical values for R, G, B and A show different "rounding errors" for RGB and A when being encoded and decoded again. Also in frame.c there are cases where decoding from YUV or RGB into an RGBA output format causes the alpha channel to be filled with 0xffff (instead of taking the max value for the respective bit depth). Edit: There are also several cases in convert.c where the max value gets limited to USHRT_MAX.
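A rough, hypothetical sketch of the effect being described, using a linear approximation of the 0-4095 to 256-3823 curve; this is not the actual frame.c code, and the 4080 cutoff is taken from the comment above:

```c
#include <stdint.h>
#include <stdio.h>

/* Apply the curve unless the value is 0 or "very high" (>= cutoff). */
static uint16_t alpha_encode(uint16_t a, uint16_t cutoff)
{
    if (a == 0 || a >= cutoff)
        return a;                               /* passed through unchanged */
    return (uint16_t)(256 + (uint32_t)a * (3823 - 256) / 4095);
}

/* Inverse of the curve, clamped to the 12-bit range. */
static uint16_t alpha_decode(uint16_t a)
{
    if (a <= 256) return 0;
    uint32_t v = (uint32_t)(a - 256) * 4095 / (3823 - 256);
    return (uint16_t)(v > 4095 ? 4095 : v);
}

int main(void)
{
    /* With a 4080 cutoff, alpha 4080..4094 skips the curve on encode but is
     * still expanded on decode, so it clamps to 4095 as described above. */
    printf("%u -> %u\n", 4080, alpha_decode(alpha_encode(4080, 4080)));  /* 4095  */
    printf("%u -> %u\n", 4080, alpha_decode(alpha_encode(4080, 4095)));  /* ~4080 */
    return 0;
}
```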
I agree ConvertBGRA64ToFrame_4444_16s() and ConvertRGBA64ToFrame16s() should be using the suggested condition. Fixing that.
I see different maximum values depending on the requested decoding format.
decode to CFHD_PIXEL_FORMAT_BGRA:
result: r/g/b/a = 0xFF as expected
decode to CFHD_PIXEL_FORMAT_R210:
result: r/g/b = 0x3FC (expected 0x3FF)
decode to CFHD_PIXEL_FORMAT_B64A:
sample: CFHD_ENCODED_FORMAT_RGB_4444
result: r/g/b = 0xFF00, a = 0xFFFF (expected r/g/b 0xFFFF)
sample: CFHD_ENCODED_FORMAT_RGB_444
result: r/g/b = 0xFF00, a = 0xFFF0
The main issue is underscaled RGB; alpha seems OK where it is present.