FP8 support? #788
Closed
wuchaooooo started this conversation in Ideas
Replies: 1 comment
-
Not yet. GPU resources are tight at the moment, so I cannot test FP8 inference on an H100. I don't have much advice to offer on this yet, because I haven't done any in-depth testing in that scenario.
-
Hi! Is adding FP8 Transformer Engine (H100) speedup to inference planned?
If not, could you please give me an outline of what would need to be done so I could work on it?
Thank you!
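For context, FP8 inference on the H100 usually means NVIDIA's e4m3 format (4 exponent bits, 3 mantissa bits, saturating at ±448 with no infinity encoding). As a rough illustration of the precision involved, and not part of this project's code, here is a minimal sketch of e4m3 round-to-nearest in plain Python; all names are hypothetical:

```python
import math

E4M3_MAX = 448.0          # largest finite e4m3 value (no inf encoding)
E4M3_MIN_NORMAL_EXP = -6  # smallest normal exponent; below this, subnormals
MANTISSA_BITS = 3

def quantize_e4m3(x: float) -> float:
    """Round x to the nearest FP8 e4m3 value, saturating at +/-448.

    Illustrative only: real FP8 kernels also carry a per-tensor scale
    so values are mapped into e4m3's dynamic range before rounding.
    """
    if x == 0.0:
        return 0.0
    sign = -1.0 if x < 0 else 1.0
    mag = abs(x)
    if mag >= E4M3_MAX:
        return sign * E4M3_MAX  # saturate instead of overflowing
    # Exponent of the value, clamped so subnormals share the smallest step.
    exp = max(math.floor(math.log2(mag)), E4M3_MIN_NORMAL_EXP)
    step = 2.0 ** (exp - MANTISSA_BITS)  # spacing of representable values
    return sign * round(mag / step) * step
```

For example, `quantize_e4m3(0.3)` lands on 0.3125, the nearest representable value, while `quantize_e4m3(1000.0)` saturates to 448.0; this coarse spacing is why FP8 paths rely on per-tensor scaling factors, which is the main machinery a Transformer Engine integration would have to manage.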