nv2a: Handle NaN values to be similar to HW #1780

antoniodesousa · 2024-10-05T13:23:09Z

This PR is based on @abaire's work in #913, so big thanks to him for his research 🙏

Because his implementation did not work on my PC, I did my own research and came up with a solution that should hopefully work on any hardware. The solution was tested on two PCs: one with an AMD Radeon RX 6600M GPU and the other with an NVIDIA GeForce GTX 960 GPU. Both run Windows 11 23H2 with 64 GB RAM and the latest drivers installed as of today.

HW test on the left side - PC test on the right side

Otogi: Myth of Demons

Otogi 2: Immortal Warriors

Pro Cast: Sports Fishing Game

Trigger Man

Thousand Land

antangelo

I ran the pgraph tests on my Linux workstation with a 3060 Ti and the NaN columns are black across all tests.

hw/xbox/nv2a/shaders.c

antoniodesousa · 2024-10-06T15:24:36Z

After giving it some thought and testing, I concluded that HW handles -NaN and +NaN the same way: it just defaults to 1.0. I changed the implementation and retested everything based on this assumption. I got the same results as before, which I was expecting, but I wanted to double-check.
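As a rough illustration of the assumption described in this comment, the behaviour can be modelled on the CPU side as follows. This is only a sketch: `nan_to_one` is a hypothetical name, and the actual change lives in the generated shader code, which is not shown in this thread.

```c
#include <math.h>

/* Hypothetical helper: models the claim that the hardware treats any
 * NaN input, regardless of its sign bit, as 1.0. Non-NaN values pass
 * through unchanged. This is not the PR's actual shader code. */
static float nan_to_one(float v)
{
    return isnan(v) ? 1.0f : v;
}
```

Under this model both `nan_to_one(NAN)` and `nan_to_one(-NAN)` evaluate to `1.0f`.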

@antangelo can you please retest the build on your PC? Thanks.

medievil1 · 2024-10-06T16:59:13Z

Works on NVIDIA too (Windows 11, my own build of the PR).

Ernegien

I agree with the .gitignore additions but they should probably be included in a separate PR.

antoniodesousa · 2024-10-07T15:38:51Z

> I agree with the .gitignore additions but they should probably be included in a separate PR.

Yeah, I was thinking about it. I'm going to wait for @mborgerson to review this PR and create another PR for it if needed.

Ernegien · 2024-10-07T16:58:58Z

For those following along willing to test and share their results here, you must first be signed into GitHub and can then download the xemu build for your specific system. For example, Windows users can directly use this.

You'll also need the pgraph test ISO to run within xemu. After you've loaded the pgraph test ISO within xemu, it will auto-run all tests (it may assert/crash, which is likely fine). The results are stored on your Xbox HDD at E:\nxdk_pgraph_tests under the test name and case. You can retrieve these via FTP by running the LithiumX dashboard ISO.

Alternatively, you can manually run the ATTRIB FLOAT tests and screenshot those that differ from previous user reports. You can take screenshots within xemu via F12; they're stored in the directory specified under xemu->Machine->Settings->General->Screenshot output directory.

When reporting test results within xemu, be sure to grab a copy of your build and system info via Help->About which should look similar to the following (my specific hardware config for the tests below).

Version:      0.7.132-3-g1caaf64c84
Branch:       
Commit:       1caaf64c8466b6e8f777f84c28edeb7dbcad6e11
Date:         Sun Oct  6 20:10:34 UTC 2024

CPU:          AMD Ryzen 7 4800H with Radeon Graphics         
OS Platform:  Windows
OS Version:    23H2
Manufacturer: NVIDIA Corporation
GPU Model:    NVIDIA GeForce RTX 2060/PCIe/SSE2
Driver:       4.0.0 NVIDIA 551.23
Shader:       4.00 NVIDIA via Cg compiler

My xemu test results (internal resolution scale set to 1) which do not match real hardware are as follows:

antoniodesousa · 2024-10-07T17:09:54Z

That's interesting. I wonder why it's black on your PC, @Ernegien, while it seems to be working fine for @medievil1, considering you are both on Windows 11 and using NVIDIA GPUs.

medievil1 · 2024-10-08T03:26:02Z

Mine do not match hardware either, but Otogi definitely does work with this PR.

mborgerson · 2024-10-08T21:22:22Z

Moved to draft state because although it may fix issues in some games, I'm not convinced yet that this is ready for merge.

I'd like to know specifically, for starters:

What input/computation results in NaN values in games and why? (e.g. shader code and actual values please)
Are NaNs actually coming from attribute inputs, or are they the result of an undefined operation, or indicative of faulty emulation elsewhere (e.g. FPU), etc.?
Which of these unit test results differ among different hardware vendors and why?

NZJenkins · 2024-10-09T06:50:30Z

> After giving it some thought and testing I concluded that HW handles -NaN/+NaN the same. It just defaults to 1.0.

Based on the tests HW maps -NaN to 0, +NaN to 1.
Note if you multiply -NaN by 1 in the VS, it ends up as 1 (instead of 0).
The first column in the test IIRC is the raw value passed into the vertex shader, converted to a pixel colour.

I am not sure GLSL provides any way to distinguish between +NaN and -NaN.
I found this in the GLSL 4.6 spec (no idea if xemu uses that version) which suggests built-in functions can return anything they want when given a NaN?

> built-in functions that operate on a NaN are not required to return a NaN as the result.
> However if NaNs are generated, isnan() must return the correct value.

antoniodesousa · 2024-10-09T12:45:13Z

> Based on the tests HW maps -NaN to 0, +NaN to 1.

I don't know which HW tests you are referring to. My HW tests clearly show the same result for both values; whether the value is -NaN or +NaN doesn't matter. If what you are saying were true, they wouldn't match in most of the tests.

> I am not sure GLSL provides any way to distinguish between +NaN and -NaN.
> I found this in the GLSL 4.6 spec (no idea if xemu uses that version) which suggests built-in functions can return anything they want when given a NaN?

Xemu uses GLSL 4.0. And yes, there's a way to distinguish between +NaN and -NaN using the sign function.

NZJenkins · 2024-10-09T20:15:38Z

> I don't know which HW tests you are referring to. My HW tests clearly show the same result for both values.

I'm talking about the pgraph test, running on Xbox hardware.
I didn't realise you were talking about testing on your HW. Never mind then.

> And yes, there's a way to distinguish between +NaN and -NaN using the sign function.

I don't think this is part of the GLSL spec.
The "correct" result of sign(-NaN) would be NaN, I suppose.

Ideally there is a way to get the correct behaviour programming to the spec instead of relying on GPU quirks and vendor specific behaviour.
As far as I can see, sign(floatBitsToInt(x)) should allow getting the sign bit reliably?
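The idea of reading the sign from the raw bit pattern, rather than from a floating-point operation on a NaN, can be sketched in C as below. This is a CPU-side analogue only, not GLSL and not xemu code. The helper names are hypothetical, and `map_nan_by_sign` merely models the "-NaN to 0, +NaN to 1" reading of the hardware results given earlier in this comment thread, which is itself disputed.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Raw IEEE 754 bit pattern of a float, the C analogue of GLSL
 * floatBitsToInt/floatBitsToUint. memcpy avoids aliasing issues. */
static uint32_t raw_bits(float v)
{
    uint32_t bits;
    memcpy(&bits, &v, sizeof bits);
    return bits;
}

/* Build a float from a raw bit pattern (used to create NaNs with a
 * chosen sign bit). */
static float from_raw_bits(uint32_t bits)
{
    float v;
    memcpy(&v, &bits, sizeof v);
    return v;
}

/* 1 if the sign bit is set, 0 otherwise. No floating-point comparison
 * is involved, so the result is well defined for NaN inputs too. */
static int sign_bit_set(float v)
{
    return (int)(raw_bits(v) >> 31);
}

/* Hypothetical mapping: -NaN -> 0.0, +NaN -> 1.0, anything else is
 * returned unchanged. Models one reading of the test results only. */
static float map_nan_by_sign(float v)
{
    if (!isnan(v)) {
        return v;
    }
    return sign_bit_set(v) ? 0.0f : 1.0f;
}
```

For example, `map_nan_by_sign(from_raw_bits(0xFFC00000u))` yields `0.0f` and `map_nan_by_sign(from_raw_bits(0x7FC00000u))` yields `1.0f`.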

medievil1 · 2024-10-10T03:41:06Z

> Based on the tests HW maps -NaN to 0, +NaN to 1. Note if you multiply -NaN by 1 in the VS, it ends up as 1 (instead of 0).

As I understand it, the issue seems to be that PC GPUs don't all behave the same. For example, on AMD it matches hardware

but on NVIDIA it does not.

I need to look at the test source and see what it is sending, to see if I can help identify the solution.

medievil1 · 2024-10-10T03:59:13Z

I wonder about this in the pgraph tests code
static float f(uint32_t v) { return *((float *)&v); }
and
{"-NaNq_NaNq", "-NaN to +NaN (quiet)", {f(negNanQ), f(posNanQ)}},
as example...
NaN is already float by nature

EDIT
yes I am aware NaN can be non-float...lol
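For context on the `f()` helper quoted above: C has no literal for a NaN with a chosen sign and payload, so tests typically start from an integer bit pattern and reinterpret it as a float. The pointer cast in the quoted helper does that but technically breaks strict aliasing rules; a `memcpy`-based variant is sketched below. The constants `POS_NAN_Q` and `NEG_NAN_Q` are assumed values for illustration, since the test suite's own `posNanQ`/`negNanQ` definitions are not shown in this thread.

```c
#include <stdint.h>
#include <string.h>

/* Assumed quiet-NaN bit patterns; the real posNanQ/negNanQ values
 * used by the pgraph tests are not shown in this thread. */
#define POS_NAN_Q 0x7FC00000u
#define NEG_NAN_Q 0xFFC00000u

/* Reinterpret a 32-bit pattern as a float without pointer casts. */
static float bits_to_float(uint32_t v)
{
    float out;
    memcpy(&out, &v, sizeof out);
    return out;
}

/* Inverse direction, handy for checking that a value survived intact. */
static uint32_t float_to_bits(float v)
{
    uint32_t out;
    memcpy(&out, &v, sizeof out);
    return out;
}
```

Round-tripping a quiet NaN through these helpers, e.g. `float_to_bits(bits_to_float(NEG_NAN_Q))`, should give back the same pattern, sign bit included.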

coldhex · 2024-11-16T15:38:27Z

@mborgerson I used the Evox dashboard debugger on my Xbox to inspect Pro Cast: Sports Fishing Game. The NaN values (0xFFC00000) arise because the game divides 0 by 0 (using the FPU instructions fild and fidiv).

The pushbuffer from the game with NV097_SET_LIGHT_AMBIENT_COLOR, NV097_SET_LIGHT_DIFFUSE_COLOR and NV097_SET_LIGHT_SPECULAR_COLOR is:

.Break
BP 0 @ 0012e490
EAX : 83f5b040
EBX : 00000000
ECX : 00269860
EDX : 00000000
ESI : 0013ddb0
EDI : 0013d2d0
EBP : 00241000
.db 83f5b040 30
83f5b040 : 00 10 24 00 00 00 c0 ff 00 00 c0 ff 00 00 c0 ff  | ..$...@..@..@
83f5b050 : 00 00 c0 ff 00 00 c0 ff 00 00 c0 ff 00 00 00 00  | ..@..@..@....
83f5b060 : 00 00 00 00 00 00 00 00 6c 10 36 08 30 f8 70 20  | ........l.6.0xp

Xemu calculates the same values (according to gdb).
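To make the dump above easier to read: the byte sequence `00 00 c0 ff` is the little-endian encoding of 0xFFC00000, which is the default quiet NaN with the sign bit set that x87 produces for invalid operations such as 0/0. The sketch below decodes such a word and classifies it. The function and struct names are hypothetical and are not part of xemu or the game.

```c
#include <math.h>
#include <stdint.h>
#include <string.h>

/* Assemble a little-endian 32-bit word from four bytes. */
static uint32_t load_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Classification of one 32-bit pushbuffer parameter word. */
struct word_info {
    uint32_t bits;  /* decoded word                          */
    int is_nan;     /* 1 if the word is a NaN as a float     */
    int sign;       /* sign bit (bit 31)                     */
    int quiet;      /* top mantissa bit (bit 22), set = qNaN */
};

static struct word_info classify_word(const uint8_t *p)
{
    struct word_info info;
    float f;

    info.bits = load_le32(p);
    memcpy(&f, &info.bits, sizeof f);
    info.is_nan = isnan(f) ? 1 : 0;
    info.sign = (int)(info.bits >> 31);
    info.quiet = (int)((info.bits >> 22) & 1u);
    return info;
}
```

Applied to the four bytes starting at 83f5b044 in the dump, this reports a negative quiet NaN with `bits == 0xFFC00000`.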

git: Ignore Visual Studio and compiler cache folders

cc001ce

antangelo reviewed Oct 5, 2024

antoniodesousa force-pushed the fix_nan_handling branch 2 times, most recently from 2ab43f1 to cc001ce Compare October 6, 2024 15:04

nv2a: Handle NaN values to be similar to HW

8c1ff60

Ernegien reviewed Oct 7, 2024

mborgerson mentioned this pull request Oct 8, 2024

nv2a: Adjust NaN handling to be similar to HW #913

Closed

mborgerson marked this pull request as draft October 8, 2024 21:16
