-
Notifications
You must be signed in to change notification settings - Fork 303
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add config attribute to advertise HEVC encoder features #385
Conversation
See intel/media-driver#866 and the set at https://lists.ffmpeg.org/pipermail/ffmpeg-devel/2020-March/257991.html for use of this. |
@xuguangxin @fulinjie please check interface whether it meet your requirement |
It makes sense, and got several questions raised in: |
va/va.h
Outdated
|
||
/** Minimum supported block size containing a QP delta. */ | ||
uint32_t min_cu_qp_delta_block_size_minus3 : 2; | ||
} bits; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
27 bit used, should we add "uint32_t reserved :5" ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fhvwy hi Mark , could you help to add reserved bits and rebase the PR? thanks
(Only back to this recently, apologies for the long delay.) |
For FFmpeg implementing the user-side of this, see https://github.com/fhvwy/FFmpeg/tree/h265query |
* | ||
* CtbLog2SizeY must not be larger than this. | ||
*/ | ||
uint32_t log2_max_coding_tree_block_size_minus3 : 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ask a question: if some device support LTU = 64, but not support LTU=32. how to report this attributes?
how to distinguish it from : support LTU=64 , also support LTU=32. both ok.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only 64x64 luma CTBs supported:
log2_max_coding_tree_block_size_minus3 = 3
log2_min_coding_tree_block_size_minus3 = 3
Both 64x64 and 32x32 luma CTBs supported:
log2_max_coding_tree_block_size_minus3 = 3
log2_min_coding_tree_block_size_minus3 = 2
LGTM, will prepare a correlated PR with https://github.com/intel/media-driver to use it, then will merge it. |
* | ||
* Allows setting amp_enabled_flag in the SPS. | ||
*/ | ||
uint32_t amp : 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If amp is always supported by driver and cannot be disabled, how to set this field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the driver requires amp to always be enabled then it should return VA_FEATURE_REQUIRED in this field.
* | ||
* Allows setting separate_colour_plane_flag in the SPS. | ||
*/ | ||
uint32_t separate_colour_planes : 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If from driver side, only packed is supported, planar are not supported. The meaning of 0 and 1 are not identical to Recommendation ITU-T H.265, the value returned from driver is VA_FEATURE_NOT_SUPPORTED. Are my understanding correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. VA_FEATURE_NOT_SUPPORTED
from the driver indicates that the separate colour planes are not usable, so separate_colour_planes
must then always be set to zero by the user.
* Allows setting slice_sao_luma_flag and slice_sao_chroma_flag | ||
* in slice headers. | ||
*/ | ||
uint32_t sao : 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Driver allows these two slice_sao_luma_flag and slice_sao_chroma_flag to be disabled/enabled respectively. How to set this field, which bit represent luma/chroma?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you thinking of hardware which supports SAO only on one type of plane but not the other? I was thinking that no such hardware would exist, but if you think it is possible then we could add separate sao_luma
and sao_chroma
fields to distinguish the cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rereading that, I'm now not sure that that was your question. To clarify my intended meaning:
If VAConfigAttribValEncHEVCFeatures.sao == VA_FEATURE_NOT_SUPPORTED
, then the user must set both slice_sao_luma_flag
and slice_sao_chroma_flag
to zero.
If VAConfigAttribValEncHEVCFeatures.sao == VA_FEATURE_SUPPORTED
, then the user can pick whatever values they want for slice_sao_luma_flag
and slice_sao_chroma_flag
, and they need not be the same.
If VAConfigAttribValEncHEVCFeatures.sao == VA_FEATURE_REQUIRED
, then the user must set both slice_sao_luma_flag
and slice_sao_chroma_flag
to one.
Do you have/expect hardware for which that would be a problem?
* the PPS. The pred_weight_table() data must be supplied with | ||
* every slice header when weighted prediction is enabled. | ||
*/ | ||
uint32_t weighted_prediction : 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same with SAO. Driver also allows weighted_pred_flag and weighted_bipred_flag to be disabled/enabled respectively. How to set this field?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So the same case but for hardware which only has partial weighted prediction support? Again, I was imagining that no such hardware would exist, but we could separate them if you do have such hardware. (Hardware which lacks support for encoding P or B frames wouldn't care about the value of the other flag, because it would never be used.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a question about weighted_prediction, media-driver support two kind of weighted-prediction: 1. explicit weighted prediction, application should provide pred_weight_table, and media-driver do weighted prediction basing on the pred_weight_table from applicaiton, so a pre-process needed (such as, fade in/out detection) before encoding 2. media driver based weighted prediction (or implicit weight prediction), media-driver will analyze the statistic datas and will provide a pred_weigt_table internally, to help the video quality. it happens inside media-driver during encoding process.
my question is: how to report this kind of capability?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@XinfengZhang I think the setting for driver-produced prediction weights should be signalled by its own attribute rather than embedded here, since it's not actually related to the codec implementation - it would exist in H.264/H.266 as well.
(There are other similar cases where either a driver-internal process or some kind of preprocessing step could be used to produce values for codec processes - e.g. the segmentation map in VP8/VP9/AV1, scaling lists in H.264/H.265/H.266, global motion in AV1. It would be sensible for these to all be treated in a consistent way.)
va/va_enc_hevc.h
Outdated
* | ||
* Log2MinCuQpDeltaSize must not be smaller than this. | ||
*/ | ||
uint32_t log2_min_cu_qp_delta_size_minus3 : 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From HEVC SPEC, Log2MinCuQpDeltaSize = CtbLog2SizeY − diff_cu_qp_delta_depth, if diff_cu_qp_delta_depth can be assigned to 0 or 3 for different LCU, how to set log2_min_cu_qp_delta_size_minus3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Tianhaol The intent here is that the limit of a single hardware implementation is going to be the level of the smallest coding unit for which it can usefully vary qp. If that is, for example, at the 16x16 block size then we set log2_min_cu_qp_delta_size_minus3 = 1
, and that then implies that the max value of diff_cu_qp_delta_depth
would be 2 for 64x64 CTBs or 1 for 32x32 CTBs.
Does that fit with how you use it? If you have hardware which has significantly different implementations at different CTU sizes and therefore doesn't match that setup then we may need to come up with something else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no such concern with current HW. but , if some HW support both 64x64 CTBs and 32x32CTBs, and with different CTBs, there are different values , for 64x64CTBs, diff_cu_qp_delta_depth = 3, log2_min_cu_qp_delta_size_minus3 = 0,
for 32x32CTBs , log2_min_cu_qp_delta_size_minus3 = 2. of course, we are discussing potential risk. is it extendable ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If diff_cu_qp_delta_depth only can be assigned to two values, 3 or 0, not in the range of 0 to 3, how to set this field log2_min_cu_qp_delta_size_minus3?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fhvwy hi Mark, another question: if log2_min_cu_qp_delta_size_minus3 = 0, it will mean diff_cu_qp_delta_depth = 3, does application could set a value < 3? if HW just support one setting, such like diff_cu_qp_delta_depth must equal to 3. how to report it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@fhvwy , in theory, "cu_qp_delta_enabled_flag == VA_FEATURE_NOT_SUPPORTED" should has same effect with diff_cu_qp_delta_depth = 0, only difference should be about the present of this bit.
"diff_cu_qp_delta_depth specifies the difference between the luma coding tree block size and the minimum luma coding block size of coding units that convey cu_qp_delta_abs and cu_qp_delta_sign_flag. The value of diff_cu_qp_delta_depth shall be in the range of 0 to log2_diff_max_min_luma_coding_block_size, inclusive. When not present, the value of diff_cu_qp_delta_depth is inferred to be equal to 0"
TBH, I m not sure whether diff_cu_qp_delta_depth = 0 works with G12 VDEnc, @Tianhaol , does our test case cover this?
please help to confirm it.
@fhvwy , whats the concern of adding two fields: max_diff_cu_qp_delta_depth & min_diff_cu_qp_delta_depth or
log2_min_cu_qp_delta_size_minus3 & adding log2_max_cu_qp_delta_size_minus3 , of course , it just for one dedicate CTB setting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@XinfengZhang No it doesn't. If cu_qp_delta_enabled_flag
is not set then delta_qp()
never codes anything regardless of the level it happens at and the QP is fixed - see 7.3.8.14. The inference you quote from 7.4.3.3.1 only exists because it is convenient in 8.6.1 to have the Log2MinCuQpDeltaSize
for quantisation groups which isn't the whole slice.
The reason not to add partial fields now is that they would be deprecated when we add a proper attribute later, and I don't think it is a good idea to be maintaining an extra incomplete source of the same information. If you want to include that information in this PR, what do you think of the suggestion above of a VAConfigAttribValEncHEVCQPDeltaDepthSizes
attribute?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, thanks, if cu_qp_delta_enabled_flag == 1 and diff_cu_qp_delta_depth == 0, it means there are only one syntax unit delta_qp() for maximum TU when maximum TU is selected. if cu_qp_delta_enabled_flag = 0 , no delta_qp() present.
add new VAConfigAttribValEncHEVCQPDeltaDepthSizes , there will be enough bits to describe this capability. but as I mentioned, I am not very sure whether diff_cu_qp_delta_depth = 0 works on Gen12 VDEnc, even there are similar description with G11. need some test to verify it. @Tianhaol , please help to check.
about "but that is broken anyway with the current definition of the API (where people have generally hardcoded Gen9 values" , it is true for GST & FFmpeg. because we have no such capability report, GST & FFmpeg may set it to 0 by default , but I suppose MSDK set it to 3 by default for G12 VDEnc, if we add new attributes, MSDK also need to query and handle it. so it will be set to 0 instead. suppose PSNR /SSIM will be impact , and impact validation cases.
about "proper attribute" , you mean that "min/max" is not enough for some cases, so after adding these, still need "proper attribute" then deprecate "min/max". what's the case? want to support different setting for different CTB size?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If software written for one driver rather than against generic VAAPI is already using special knowledge about the behaviour of the driver it runs on then I don't see why it couldn't continue to do that. (Correct me if I'm wrong, but I assume there is no intent that the Intel Media SDK would run with, say, the Mesa VAAPI driver in future.)
Min/max is not sufficient for the 0/3 case described by @Tianhaol and in the Ice Lake documentation, so if we want to support that we need an attribute with more fine-grained information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hi @fhvwy, @Tianhaol implemented the interface with intel/media-driver#1121, could you help to take a look and review it? |
Signed-off-by: Mark Thompson <[email protected]>
@fhvwy, if no other concern, I will merge the PR firstly, then I will file another PR to change the comments from " if there are new attribute to describe the restriction of delta_depth. it could be a supplementary. is it ok? |
it is a supplement to intel#385 Signed-off-by: Carl Zhang <[email protected]>
it is a supplement to #385 Signed-off-by: Carl Zhang <[email protected]>
it is a supplement to intel#385 Signed-off-by: Carl Zhang <[email protected]>
I see there is an incomplete version of pretty much the same thing already at #379 . To compare with the most obvious change between them, I don't think there is any point in trying to make the config attribute look codec-generic because they have to be different for each codec anyway (like the existing JPEG one) and we aren't going to run out of index values any time soon. There will likely need to be further attributes for HEVC RExt and SCC if they are added, since the additional flags there will not fit in the current attribute.