Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

nv2a: Handle NaN values to be similar to HW #1780

Draft
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

antoniodesousa
Copy link

This PR is based on @abaire work at #913. So big thanks to him for his research 🙏

Due to his implementation not working on my PC, I decided to do my own research and hopefully came up with a solution that works on any hardware. The solution was tested on 2 PCs: one with an AMD Radeon RX 6600M GPU, and the other with a NVIDIA GeForce GTX 960 GPU. Both running on Windows 11 23H2, 64 GB RAM with the latest drivers installed as of today.

HW test on the left side - PC test on the right side

Attrib_float_0_1
Attrib_float_0_8
Attrib_float_-1_1
Attrib_float_-8_1
Attrib_float_-INF_INF
Attrib_float_-Max_Max
Attrib_float_-MaxSN_MaxSN
Attrib_float_-Min_Min
Attrib_float_-MinN_MinN
Attrib_float_-NaNq_NaNq
Attrib_float_-NaNs_NaNs

Otogi: Myth of Demons

xemu-2024-10-04-13-45-03
xemu-2024-10-04-13-56-02

Otogi 2: Immortal Warriors

xemu-2024-10-04-15-30-00
xemu-2024-10-04-15-31-12
xemu-2024-10-04-15-31-37

Pro Cast: Sports Fishing Game

xemu-2024-10-04-13-14-10

Trigger Man

xemu-2024-10-04-16-21-17
xemu-2024-10-04-16-24-41
xemu-2024-10-04-16-33-33

Thousand Land

xemu-2024-10-05-12-00-28
xemu-2024-10-05-12-00-58

Copy link
Contributor

@antangelo antangelo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I ran the pgraph tests on my Linux workstation with a 3060 Ti and the NaN columns are black across all tests.

0_1

hw/xbox/nv2a/shaders.c Outdated Show resolved Hide resolved
@antoniodesousa antoniodesousa force-pushed the fix_nan_handling branch 2 times, most recently from 2ab43f1 to cc001ce Compare October 6, 2024 15:04
@antoniodesousa
Copy link
Author

After giving it some thought and testing I concluded that HW handles -NaN/+NaN the same. It just defaults to 1.0. I changed the implementation and retest everything again based on this assumption. I got the same results as before, but I was already expecting that. I just wanted to double check.

@antangelo can you please retest the build on your PC? Thanks.

@medievil1
Copy link

medievil1 commented Oct 6, 2024

works on nvidia too (windows 11, my own build of the PR)
image

Copy link
Contributor

@Ernegien Ernegien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the .gitignore additions but they should probably be included in a separate PR.

@antoniodesousa
Copy link
Author

I agree with the .gitignore additions but they should probably be included in a separate PR.

Yeah, I was thinking about it. I gonna wait for @mborgerson to review this PR and create another PR for it if needed.

@Ernegien
Copy link
Contributor

Ernegien commented Oct 7, 2024

For those following along willing to test and share their results here, you must first be signed into GitHub and can then download the xemu build for your specific system. For example, Windows users can directly use this.

You'll also need the pgragh test ISO to run within xemu. After you've loaded the pgraph test ISO within xemu it will auto-run (it may assert/crash which is likely fine) all tests. The results are stored on your Xbox HDD at E:\nxdk_pgraph_tests under the test name and case. You can retrieve these via FTP by running the LithiumX dashboard ISO.

Alternatively, you can easily manually run the ATTRIB FLOAT tests and screenshot those that differ from previous user reports. You can do this within xemu via F12; they're stored in the directory specified underneath xemu->Machine->Settings->General->Screenshot output directory.

When reporting test results within xemu, be sure to grab a copy of your build and system info via Help->About which should look similar to the following (my specific hardware config for the tests below).

Version:      0.7.132-3-g1caaf64c84
Branch:       
Commit:       1caaf64c8466b6e8f777f84c28edeb7dbcad6e11
Date:         Sun Oct  6 20:10:34 UTC 2024

CPU:          AMD Ryzen 7 4800H with Radeon Graphics         
OS Platform:  Windows
OS Version:    23H2
Manufacturer: NVIDIA Corporation
GPU Model:    NVIDIA GeForce RTX 2060/PCIe/SSE2
Driver:       4.0.0 NVIDIA 551.23
Shader:       4.00 NVIDIA via Cg compiler

My xemu test results (internal resolution scale set to 1) which do not match real hardware are as follows:

xemu-2024-10-07-11-51-36
xemu-2024-10-07-11-54-35
xemu-2024-10-07-11-54-51
xemu-2024-10-07-11-55-12
xemu-2024-10-07-11-55-34
xemu-2024-10-07-11-55-54
xemu-2024-10-07-11-56-01
xemu-2024-10-07-11-56-16
xemu-2024-10-07-11-56-21

@antoniodesousa
Copy link
Author

antoniodesousa commented Oct 7, 2024

That's interesting, I wonder why in your PC is black @Ernegien and for @medievil1 seems to be working fine. Considering you both are on Windows 11 and using nvidia GPUs.

@medievil1
Copy link

mine do not match hardware either, but otagi definitely does work with this pr
image

@mborgerson mborgerson marked this pull request as draft October 8, 2024 21:16
@mborgerson
Copy link
Member

mborgerson commented Oct 8, 2024

Moved to draft state because although it may fix issues in some games, I'm not convinced yet that this is ready for merge.

I'd like to know specifically, for starters:

  • What input/computation results in NaN values in games and why (e.g. shader code and actual values please)
  • Are NaNs actually coming from attribute inputs or are they the result of an undefined operation, or indicative of faulty emulation elsewhere (e.g FPU), etc
  • Which of these unit test results differ among different hardware vendors and why

@NZJenkins
Copy link

After giving it some thought and testing I concluded that HW handles -NaN/+NaN the same. It just defaults to 1.0.

Based on the tests HW maps -NaN to 0, +NaN to 1.
Note if you multiply -NaN by 1 in the VS, it ends up as 1 (instead of 0).
The first column in the test IIRC is the raw value passed into the vertex shader, converted to a pixel colour.

I am not sure GLSL provides any way to distinguish between +NaN and -NaN.
I found this in the GLSL 4.6 spec (no idea if xemu uses that version) which suggests built-in functions can return anything they want when given a NaN?

built-in functions that operate on a NaN are not required to return a NaN as the result.
However if NaNs are generated, isnan() must return the correct value.

@antoniodesousa
Copy link
Author

Based on the tests HW maps -NaN to 0, +NaN to 1.

I don't know what HW tests are you referring to. My HW tests clearly shows the same result for both values, whether the value is -NaN or +NaN doesn't matter. If what you are saying was true, they wouldn't match in most of the tests.

I am not sure GLSL provides any way to distinguish between +NaN and -NaN.
I found this in the GLSL 4.6 spec (no idea if xemu uses that version) which suggests built-in functions can return anything they want when given a NaN?

Xemu uses GLSL 4.0. And yes, there's a way to distinguish between +NaN and -NaN using the sign function.

@NZJenkins
Copy link

I don't know what HW tests are you referring to. My HW tests clearly shows the same result for both values.

I'm talking about the pgraph test, running on Xbox hardware.
I didn't realise you were talking about testing on your HW. Nevermind then.

And yes, there's a way to distinguish between +NaN and -NaN using the sign function.

I don't think this is part of the GLSL spec.
The "correct" result of sign(-NaN) would be NaN, I suppose.

Ideally there is a way to get the correct behaviour programming to the spec instead of relying on GPU quirks and vendor specific behaviour.
As far as I can see, sign(floatBitsToInt(x)) should allow getting the sign bit reliably?

@medievil1
Copy link

medievil1 commented Oct 10, 2024

Based on the tests HW maps -NaN to 0, +NaN to 1. Note if you multiply -NaN by 1 in the VS, it ends up as 1 (instead of 0).

as I understand it, the issue seems to be that PC gpu hardware don't behave the same...for example on AMD it matches hardware
373339983-a1d73d86-69db-4753-88fe-1ca1b9860e8c

but on nvidia it does not
xemu-2024-10-07-13-04-29

I need to look at the tests source and see what it is sending to see if I can help identify the solution

@medievil1
Copy link

medievil1 commented Oct 10, 2024

I wonder about this in the pgraph tests code
static float f(uint32_t v) { return *((float *)&v); }
and
"-NaNq_NaNq", "-NaN to +NaN (quiet)", {f(negNanQ), f(posNanQ)}},
as example...
NaN is already float by nature

EDIT
yes I am aware NaN can be non-float...lol

@coldhex
Copy link

coldhex commented Nov 16, 2024

@mborgerson I used Evox dashboard debugger on my Xbox to inspect Pro Cast: Sports Fishing Game. The NaN values (0xFFC00000) arise, because the game divides 0 by 0 (using FPU fild and fidiv).

The pushbuffer from the game with NV097_SET_LIGHT_AMBIENT_COLOR, NV097_SET_LIGHT_DIFFUSE_COLOR and NV097_SET_LIGHT_SPECULAR_COLOR is:

.Break
BP 0 @ 0012e490
EAX : 83f5b040
EBX : 00000000
ECX : 00269860
EDX : 00000000
ESI : 0013ddb0
EDI : 0013d2d0
EBP : 00241000
.db 83f5b040 30
83f5b040 : 00 10 24 00 00 00 c0 ff 00 00 c0 ff 00 00 c0 ff  | ..$...@..@..@
83f5b050 : 00 00 c0 ff 00 00 c0 ff 00 00 c0 ff 00 00 00 00  | ..@..@..@....
83f5b060 : 00 00 00 00 00 00 00 00 6c 10 36 08 30 f8 70 20  | ........l.6.0xp 

Xemu calculates the same values (according to gdb.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants