Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add config attribute to advertise HEVC encoder features #385

Merged
merged 1 commit into from
Jun 2, 2021

Conversation

fhvwy
Copy link
Contributor

@fhvwy fhvwy commented Mar 5, 2020

I see there is an incomplete version of pretty much the same thing already at #379 . To compare with the most obvious change between them, I don't think there is any point in trying to make the config attribute look codec-generic because they have to be different for each codec anyway (like the existing JPEG one) and we aren't going to run out of index values any time soon. There will likely need to be further attributes for HEVC RExt and SCC if they are added, since the additional flags there will not fit in the current attribute.

@fhvwy
Copy link
Contributor Author

fhvwy commented Mar 5, 2020

@XinfengZhang
Copy link
Contributor

@xuguangxin @fulinjie please check interface whether it meet your requirement

@fulinjie
Copy link
Contributor

fulinjie commented Mar 6, 2020

It makes sense, and got several questions raised in:
https://patchwork.ffmpeg.org/project/ffmpeg/patch/[email protected]/

va/va.h Outdated

/** Minimum supported block size containing a QP delta. */
uint32_t min_cu_qp_delta_block_size_minus3 : 2;
} bits;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

27 bit used, should we add "uint32_t reserved :5" ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fhvwy hi Mark , could you help to add reserved bits and rebase the PR? thanks

@fhvwy
Copy link
Contributor Author

fhvwy commented Nov 8, 2020

(Only back to this recently, apologies for the long delay.)
Here is a new version.
The main change is that feature support is now tristate - the driver can indicate that a feature is either not supported, optionally supported, or required. This was prompted by some Intel hardware requiring cu_qp_delta_enabled_flag to be set, even in CQP mode when that isn't wanted, and I think it does make sense in general as other features may have similar constraints in other hardware.
Since all the feature fields are now two bits, that change necessitated splitting the attribute in two - the block sizes are now all in a separate attribute (though I've documented that a driver must support both if it supports either).
Documentation is more precise this time, too - it would be helpful if someone familiar with the block sizes could check that the bounds it offers are correctly defined and do the right thing, since that's the trickiest part of this.

@fhvwy
Copy link
Contributor Author

fhvwy commented Nov 8, 2020

For FFmpeg implementing the user-side of this, see https://github.com/fhvwy/FFmpeg/tree/h265query

*
* CtbLog2SizeY must not be larger than this.
*/
uint32_t log2_max_coding_tree_block_size_minus3 : 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ask a question: if some device support LTU = 64, but not support LTU=32. how to report this attributes?
how to distinguish it from : support LTU=64 , also support LTU=32. both ok.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only 64x64 luma CTBs supported:
log2_max_coding_tree_block_size_minus3 = 3
log2_min_coding_tree_block_size_minus3 = 3
Both 64x64 and 32x32 luma CTBs supported:
log2_max_coding_tree_block_size_minus3 = 3
log2_min_coding_tree_block_size_minus3 = 2

@XinfengZhang
Copy link
Contributor

LGTM, will prepare a correlated PR with https://github.com/intel/media-driver to use it, then will merge it.

*
* Allows setting amp_enabled_flag in the SPS.
*/
uint32_t amp : 2;
Copy link
Contributor

@Tianhaol Tianhaol Jan 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If amp is always supported by driver and cannot be disabled, how to set this field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the driver requires amp to always be enabled then it should return VA_FEATURE_REQUIRED in this field.

*
* Allows setting separate_colour_plane_flag in the SPS.
*/
uint32_t separate_colour_planes : 2;
Copy link
Contributor

@Tianhaol Tianhaol Jan 4, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If from driver side, only packed is supported, planar are not supported. The meaning of 0 and 1 are not identical to Recommendation ITU-T H.265, the value returned from driver is VA_FEATURE_NOT_SUPPORTED. Are my understanding correct?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes. VA_FEATURE_NOT_SUPPORTED from the driver indicates that the separate colour planes are not usable, so separate_colour_planes must then always be set to zero by the user.

* Allows setting slice_sao_luma_flag and slice_sao_chroma_flag
* in slice headers.
*/
uint32_t sao : 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Driver allows these two slice_sao_luma_flag and slice_sao_chroma_flag to be disabled/enabled respectively. How to set this field, which bit represent luma/chroma?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you thinking of hardware which supports SAO only on one type of plane but not the other? I was thinking that no such hardware would exist, but if you think it is possible then we could add separate sao_luma and sao_chroma fields to distinguish the cases.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rereading that, I'm now not sure that that was your question. To clarify my intended meaning:

If VAConfigAttribValEncHEVCFeatures.sao == VA_FEATURE_NOT_SUPPORTED, then the user must set both slice_sao_luma_flag and slice_sao_chroma_flag to zero.

If VAConfigAttribValEncHEVCFeatures.sao == VA_FEATURE_SUPPORTED, then the user can pick whatever values they want for slice_sao_luma_flag and slice_sao_chroma_flag, and they need not be the same.

If VAConfigAttribValEncHEVCFeatures.sao == VA_FEATURE_REQUIRED, then the user must set both slice_sao_luma_flag and slice_sao_chroma_flag to one.

Do you have/expect hardware for which that would be a problem?

* the PPS. The pred_weight_table() data must be supplied with
* every slice header when weighted prediction is enabled.
*/
uint32_t weighted_prediction : 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same with SAO. Driver also allows weighted_pred_flag and weighted_bipred_flag to be disabled/enabled respectively. How to set this field?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the same case but for hardware which only has partial weighted prediction support? Again, I was imagining that no such hardware would exist, but we could separate them if you do have such hardware. (Hardware which lacks support for encoding P or B frames wouldn't care about the value of the other flag, because it would never be used.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a question about weighted_prediction, media-driver support two kind of weighted-prediction: 1. explicit weighted prediction, application should provide pred_weight_table, and media-driver do weighted prediction basing on the pred_weight_table from applicaiton, so a pre-process needed (such as, fade in/out detection) before encoding 2. media driver based weighted prediction (or implicit weight prediction), media-driver will analyze the statistic datas and will provide a pred_weigt_table internally, to help the video quality. it happens inside media-driver during encoding process.
my question is: how to report this kind of capability?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XinfengZhang I think the setting for driver-produced prediction weights should be signalled by its own attribute rather than embedded here, since it's not actually related to the codec implementation - it would exist in H.264/H.266 as well.

(There are other similar cases where either a driver-internal process or some kind of preprocessing step could be used to produce values for codec processes - e.g. the segmentation map in VP8/VP9/AV1, scaling lists in H.264/H.265/H.266, global motion in AV1. It would be sensible for these to all be treated in a consistent way.)

va/va_enc_hevc.h Outdated
*
* Log2MinCuQpDeltaSize must not be smaller than this.
*/
uint32_t log2_min_cu_qp_delta_size_minus3 : 2;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From HEVC SPEC, Log2MinCuQpDeltaSize = CtbLog2SizeY − diff_cu_qp_delta_depth, if diff_cu_qp_delta_depth can be assigned to 0 or 3 for different LCU, how to set log2_min_cu_qp_delta_size_minus3?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Tianhaol The intent here is that the limit of a single hardware implementation is going to be the level of the smallest coding unit for which it can usefully vary qp. If that is, for example, at the 16x16 block size then we set log2_min_cu_qp_delta_size_minus3 = 1, and that then implies that the max value of diff_cu_qp_delta_depth would be 2 for 64x64 CTBs or 1 for 32x32 CTBs.

Does that fit with how you use it? If you have hardware which has significantly different implementations at different CTU sizes and therefore doesn't match that setup then we may need to come up with something else.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no such concern with current HW. but , if some HW support both 64x64 CTBs and 32x32CTBs, and with different CTBs, there are different values , for 64x64CTBs, diff_cu_qp_delta_depth = 3, log2_min_cu_qp_delta_size_minus3 = 0,
for 32x32CTBs , log2_min_cu_qp_delta_size_minus3 = 2. of course, we are discussing potential risk. is it extendable ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If diff_cu_qp_delta_depth only can be assigned to two values, 3 or 0, not in the range of 0 to 3, how to set this field log2_min_cu_qp_delta_size_minus3?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fhvwy hi Mark, another question: if log2_min_cu_qp_delta_size_minus3 = 0, it will mean diff_cu_qp_delta_depth = 3, does application could set a value < 3? if HW just support one setting, such like diff_cu_qp_delta_depth must equal to 3. how to report it?

Copy link
Contributor

@XinfengZhang XinfengZhang Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fhvwy , in theory, "cu_qp_delta_enabled_flag == VA_FEATURE_NOT_SUPPORTED" should has same effect with diff_cu_qp_delta_depth = 0, only difference should be about the present of this bit.

"diff_cu_qp_delta_depth specifies the difference between the luma coding tree block size and the minimum luma coding block size of coding units that convey cu_qp_delta_abs and cu_qp_delta_sign_flag. The value of diff_cu_qp_delta_depth shall be in the range of 0 to log2_diff_max_min_luma_coding_block_size, inclusive. When not present, the value of diff_cu_qp_delta_depth is inferred to be equal to 0"

TBH, I m not sure whether diff_cu_qp_delta_depth = 0 works with G12 VDEnc, @Tianhaol , does our test case cover this?
please help to confirm it.

@fhvwy , whats the concern of adding two fields: max_diff_cu_qp_delta_depth & min_diff_cu_qp_delta_depth or
log2_min_cu_qp_delta_size_minus3 & adding log2_max_cu_qp_delta_size_minus3 , of course , it just for one dedicate CTB setting.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@XinfengZhang No it doesn't. If cu_qp_delta_enabled_flag is not set then delta_qp() never codes anything regardless of the level it happens at and the QP is fixed - see 7.3.8.14. The inference you quote from 7.4.3.3.1 only exists because it is convenient in 8.6.1 to have the Log2MinCuQpDeltaSize for quantisation groups which isn't the whole slice.

The reason not to add partial fields now is that they would be deprecated when we add a proper attribute later, and I don't think it is a good idea to be maintaining an extra incomplete source of the same information. If you want to include that information in this PR, what do you think of the suggestion above of a VAConfigAttribValEncHEVCQPDeltaDepthSizes attribute?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, thanks, if cu_qp_delta_enabled_flag == 1 and diff_cu_qp_delta_depth == 0, it means there are only one syntax unit delta_qp() for maximum TU when maximum TU is selected. if cu_qp_delta_enabled_flag = 0 , no delta_qp() present.

add new VAConfigAttribValEncHEVCQPDeltaDepthSizes , there will be enough bits to describe this capability. but as I mentioned, I am not very sure whether diff_cu_qp_delta_depth = 0 works on Gen12 VDEnc, even there are similar description with G11. need some test to verify it. @Tianhaol , please help to check.

about "but that is broken anyway with the current definition of the API (where people have generally hardcoded Gen9 values" , it is true for GST & FFmpeg. because we have no such capability report, GST & FFmpeg may set it to 0 by default , but I suppose MSDK set it to 3 by default for G12 VDEnc, if we add new attributes, MSDK also need to query and handle it. so it will be set to 0 instead. suppose PSNR /SSIM will be impact , and impact validation cases.

about "proper attribute" , you mean that "min/max" is not enough for some cases, so after adding these, still need "proper attribute" then deprecate "min/max". what's the case? want to support different setting for different CTB size?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If software written for one driver rather than against generic VAAPI is already using special knowledge about the behaviour of the driver it runs on then I don't see why it couldn't continue to do that. (Correct me if I'm wrong, but I assume there is no intent that the Intel Media SDK would run with, say, the Mesa VAAPI driver in future.)

Min/max is not sufficient for the 0/3 case described by @Tianhaol and in the Ice Lake documentation, so if we want to support that we need an attribute with more fine-grained information.

Copy link
Contributor

@XinfengZhang XinfengZhang Mar 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hi @fhvwy , @Tianhaol did some tests, we try to use 0, 1, 2, 3 . it works. so maybe we could still keep previous design?log2_min_cu_qp_delta_size_minus3

@XinfengZhang
Copy link
Contributor

XinfengZhang commented Jan 11, 2021

hi @fhvwy, @Tianhaol implemented the interface with intel/media-driver#1121, could you help to take a look and review it?

@XinfengZhang
Copy link
Contributor

XinfengZhang commented May 25, 2021

@fhvwy, if no other concern, I will merge the PR firstly, then I will file another PR to change the comments from "
" * Allows setting cu_qp_delta_enabled_flag in the PPS. When
enabled, diff_cu_qp_delta_depth must be set to zero."
to
"* Allows setting cu_qp_delta_enabled_flag in the PPS"

if there are new attribute to describe the restriction of delta_depth. it could be a supplementary.

is it ok?

@XinfengZhang XinfengZhang merged commit 1f895aa into intel:master Jun 2, 2021
XinfengZhang added a commit to XinfengZhang/libva that referenced this pull request Jun 17, 2021
it is a supplement to intel#385

Signed-off-by: Carl Zhang <[email protected]>
XinfengZhang added a commit that referenced this pull request Jun 17, 2021
it is a supplement to #385

Signed-off-by: Carl Zhang <[email protected]>
FurongZhang pushed a commit to FurongZhang/libva that referenced this pull request Apr 7, 2023
it is a supplement to intel#385

Signed-off-by: Carl Zhang <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants