Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to use 64 bits for matrix computations #8536

Merged
merged 20 commits into from
Jul 15, 2020
Merged

Add option to use 64 bits for matrix computations #8536

merged 20 commits into from
Jul 15, 2020

Conversation

Popov72
Copy link
Contributor

@Popov72 Popov72 commented Jul 14, 2020

See https://forum.babylonjs.com/t/problem-with-matrix-multiplication/12420/5 for eg, or https://forum.babylonjs.com/t/how-to-do-high-precision-transformations/2497/4.

This PR adds the option to use 64 bits for matrix computations.

There is a problem with static matrices: they are created according to the default value of Matrix.Use64Bits in effect in the Babylon package...

For eg, BABYLON.TmpVectors.Matrix[0]._m is a Float32Array if Matrix.Use64Bits = false and should be converted to a regular array if the user set Matrix.Use64Bits = true later on.

I tried to deal with the problem by keeping track of the matrix instances and recreating the _m property when the Use64Bits value change. However, it would be too much to keep track of all instances when only static instances matter, so there's a flag in the constructor for that. However, it relies on the Babylon developper to create static matrices with the right flag, else it won't be tracked... I have done it for the TmpVectors.Matrix array, but we would need to review all the source code to locate where static matrices are created and do the change.

We could simplify this by doing something like:

  • by default, the Matrix class keeps track of all instance creations
  • we add a method like Matrix.StopTrackingInstances
  • this method is called at the very beginning of the Engine constructor

That way, all matrices created before creating the engine are tracked. As creating the engine is normally the very first thing you do in a Babylon app, it's a crude way to collect static matrix instances... That would simplify the problem a lot as we don't need to review all the source code anymore, and future code is foolproof too.

[...] Even easier: use the Use64Bits setter to stop tracking the instances! As above, matrix instances are tracked by default and when Use64Bits is assigned a different value from the previous one, the tracking is disabled. It avoids adding a method to be called by the Engine constructor. That means we can change Use64Bits a single time, but I think it's ok: we should not allow the user to change it multiple times, else we would need to track all matrix creations. We should make clear in the doc that changing Use64Bits should be done early and can be done only a single time [...] Ah, that does not work if we never change the precision to 64 bits... So we still need a StopTrackingInstances method to be called early in the process => I have updated the PR accordingly, let me know what you think. To make clear we can change the precision a single time, I have renamed the method switchTo64Bits.

Note: I have set Use64Bits = true in the PR to verify that the unit tests are ok when the value is true. I will revert it to false when/if the PR is merged.

@deltakosh
Copy link
Contributor

I think we should do something simpler. The matrices created before the switch will stay the way they are
It is a setup you need to run at the very beginning of your code. We should not support an intermediate state

My recommendation is to override the functions. By default the matrix will call a function to create the content with a floatarray but if we call a static function (like SwitchToHighPrecision) then we will overwrite the creation function AT run time at tprototype level

src/Maths/math.vector.ts Outdated Show resolved Hide resolved
src/Maths/math.vector.ts Outdated Show resolved Hide resolved
@Popov72
Copy link
Contributor Author

Popov72 commented Jul 14, 2020

I'm ok to do that but that means some computation will automatically switch to low precision when the code is using matrices declared as static (which can happen in quite a place to avoid garbage collection - notably all the code using the Math.TmpVectors.Matrix array), because those matrices will stay low precision. The user won't know that it happens under the hood and won't be able to do anything to prevent it.

If that's ok for you, I will perform the changes.

@deltakosh
Copy link
Contributor

What if the flag was on the engine? This way the first time we are going to hit Matrix the flag will be already set

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 14, 2020

I don't understand...

Do you mean the Use64Bits flag? If yes, how putting it in the Engine will help regarding the static matrices problem?

@deltakosh
Copy link
Contributor

by setting it on the engine it should be ready before we go to the Matrix static creation. The static properties are defined and executed the first time you reference the matrix class

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 14, 2020

In fact overwriting the constructor does not work, because the generated js code does not use prototype.constructor to create matrix instances.

The typescript code for the Matrix class is expanded to:

var Matrix = /** @class */ (function () {
    function Matrix() {
        this._isIdentity = false;
        this._isIdentityDirty = true;
        this._isIdentity3x2 = true;
        this._isIdentity3x2Dirty = true;
        this.updateFlag = -1;
        this._m = new Float32Array(16);
        this._updateIdentityStatus(false);
    }
    Matrix.prototype._markAsUpdated = function () {
        [...]
    };
    Matrix.prototype._updateIdentityStatus = function (isIdentity, isIdentityDirty, isIdentity3x2, isIdentity3x2Dirty) {
        [...]
    };
    [...]
    return Matrix;
}());

So, I don't see a way to replace the constructor by another one. The only thing I could do is move the this._m = new Float32Array(16); initialization in a function and overwrite this function (but it's an additional function call compared to the existing code): I think this._m = Matrix._Use64Bits ? new Array(16) : new Float32Array(16); is faster.

Also, the TmpVectors static is expanded like this in js:

var TmpVectors = /** @class */ (function () {
    function TmpVectors() {
    }
    [...]
    TmpVectors.Matrix = ArrayTools.BuildArray(8, Matrix.Identity); // 8 temp Matrices at once should be enough
    return TmpVectors;
}());

So, TmpVectors.Matrix is created as soon as the code is parsed, meaning as soon as the babylon package is referenced by the script tag: it's before any user code could be executed, so before we run the Engine constructor.

@deltakosh
Copy link
Contributor

Ok so let's keep the tracking and add a call to a function to construct the content

We should then create a ummy pg wth 20000 spheres where we call comptueWorldMatrix(true) on each frame so we can see if there is an impact (or not :))

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

I have simplified a bit, let me know what you think.

I did not create a separate "init" function (called by the constructor) because I think the current if (Matrix._TrackPrecisionChange) { should execute faster than a function call (Matrix._TrackPrecisionChange = false after the engine is constructed).

Also, I think having this as an option at engine construction is a good idea, that way we don't have to explain in docs that the user should call the SetPrecision method before creating the engine (and so I made it hidden, the user has only to deal with the engine options).

I will make the PG test quickly.

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

So, regarding performances, I used:

Of course, both are the same currently on the playground because the code is not deployed yet.

It is doing 200 x 100000 = 20M computeWorldMatrix(true).

So, on my computer with Chrome it is taking:

  • ~12.5s on the playground (so with the existing code) and on my computer with the new code using 32 bits precision (so same perf for the same settings)
  • ~11.5s on my computer when enabling 64 bits precision

I get roughly the same results with Firefox.

@deltakosh
Copy link
Contributor

wait is it faster with 64bits????

Because then it will be even easier :)

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

I had to change:

    public setMatrix3x3(uniformName: string, matrix: Float32Array): Effect {
        this._valueCache[uniformName] = null;
        this._engine.setMatrix3x3(this._uniforms[uniformName], matrix);

        return this;
    }

to:

public setMatrix3x3(uniformName: string, matrix: Float32Array | Array<number>): Effect {
        this._valueCache[uniformName] = null;
        this._engine.setMatrix3x3(this._uniforms[uniformName], Array.isArray(matrix) ? new Float32Array(matrix) : matrix);

        return this;
    }

in Effect and ShaderMaterial.

However, there is a creation of a new Float32Array each time those functions are called, and this can be avoided.

Indeed, the final function called is uniformMatrix4fv which can take a regular array. So, we could do:

    public setMatrix3x3(uniformName: string, matrix: Float32Array | Array<number>): Effect {
        this._valueCache[uniformName] = null;
        this._engine.setMatrix3x3(this._uniforms[uniformName], matrix as Float32Array);

        return this;
    }

to make TS happy and still have the code working.

Note that it's already what happens:

    public setMatrix(uniformName: string, matrix: IMatrixLike): Effect {
        if (this._cacheMatrix(uniformName, matrix)) {
            this._engine.setMatrices(this._uniforms[uniformName], matrix.toArray() as Float32Array);
        }
        return this;
    }

toArray() now returns Float32Array | Array<number>, so in all strictness matrix.toArray() as Float32Array is wrong and we should check the type returned by toArray(). But as explained, those calls are used to update the matrix uniform, and the gl function also accepts a regular array, so the output of toArray() works in both cases.

Should I revert my changes and simply do a cast?

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

wait is it faster with 64bits????
Because then it will be even easier :)

Indeed.

See this interesting thread in the gl-matrix repo: toji/gl-matrix#359

Note that it could be different on mobiles...

@deltakosh
Copy link
Contributor

Should I revert my changes and simply do a cast?
Well apparently yes ;D

@Popov72 Popov72 marked this pull request as ready for review July 15, 2020 17:20
@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

I have also updated the what's new file.

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

I have also updated the NullEngine class, I think I had to?

@deltakosh
Copy link
Contributor

yep!

@deltakosh
Copy link
Contributor

Wait..I thought we were not even using FLoat32 at all right? if number is faster why not simply using number all the time?

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

You mean I should remove the Float32Array type and only use Array all the time, so no option in the Engine?

The tests were faster on my computer, now I don't know if it would be faster in all circumstances and on all devices (mobiles)...

@deltakosh
Copy link
Contributor

It is a fair point.. then let's keep as is. it is great but jsut remove the change on thinengine. I do not want to incresae its size with matrix.

thinengine will only use float32 this is fine

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

thinengine will only use float32 this is fine

There's a problem because the matrix instances are tracked, and it's when calling Matrix.SetPrecision that the tracking is disabled...

So, if I remove the call from the ThinEngine constructor to move it to Engine, all matrix instances will be tracked if one use ThinEngine and not Engine, which we don't want... But is it even leggit to create a ThinEngine instance?

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

I guess the error I get in the Module Tests step is linked to ThinEngine now importing vector math:

[18:10:47] New size: 167005 bytes is bigger than baseline size : 116308 bytes on thinEngineOnly.

?

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

If you don't want ThinEngine to link to math, then I don't see any other solution than switchin the matrix type to Array unconditionnally and removing Float32Array support as you suggested above...

[EDIT]
Or maybe making a MatrixConfigurator in its own file that would only deal with the new static code I have added? ThinEngine would still have to link to it, but it would be order of magnitude smaller than the actual math.vector.ts code.
[/EDIT]

@deltakosh
Copy link
Contributor

Yes even broader like PerformanceConfigurator (pinging @sebavan for other ideas :)) so we can in the future define there other options

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

Something like that?

/** @hiddden */
export class PerformanceConfigurator {
    /** @hidden */
    public static MatrixUse64Bits = false;
    /** @hidden */
    public static MatrixTrackPrecisionChange = true;
    /** @hidden */
    public static MatrixCurrentType: any = Float32Array;
    /** @hidden */
    public static MatrixTrackedMatrices: Array<any> | null = [];

    /** @hidden */
    public static SetMatrixPrecision(use64bits: boolean) {
        PerformanceConfigurator.MatrixTrackPrecisionChange = false;

        if (use64bits && !PerformanceConfigurator.MatrixUse64Bits) {
            if (PerformanceConfigurator.MatrixTrackedMatrices) {
                for (let m = 0; m < PerformanceConfigurator.MatrixTrackedMatrices.length; ++m) {
                    const matrix = PerformanceConfigurator.MatrixTrackedMatrices[m];
                    const values = matrix._m;

                    matrix._m = new Array(16);

                    for (let i = 0; i < 16; ++i) {
                        matrix._m[i] = values[i];
                    }
                }
            }
        }

        PerformanceConfigurator.MatrixUse64Bits = use64bits;
        PerformanceConfigurator.MatrixCurrentType = PerformanceConfigurator.MatrixUse64Bits ? Array : Float32Array;
        PerformanceConfigurator.MatrixTrackedMatrices = null; // reclaim some memory, as we don't need _TrackedMatrices anymore
    }
}

Should I add the file to the Engines directory?

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

Easier to ask forgiveness than permission, so I performed the changes, let me know if it's ok :)

Copy link
Contributor

@deltakosh deltakosh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it! how are the perf with this change (against no change at all?)

@Popov72
Copy link
Contributor Author

Popov72 commented Jul 15, 2020

There's no measurable difference in perf between applying this latest change or not, so the measuring given a number of posts above still hold.

@deltakosh
Copy link
Contributor

Good job as always buddy! thanks!
waiting for the tests to end and I'll merge

@deltakosh deltakosh added the automerge Label to flag PR that can be automerged label Jul 15, 2020
@mergify mergify bot merged commit e2eba7e into BabylonJS:master Jul 15, 2020
@Popov72 Popov72 deleted the matrix-64-bits branch July 15, 2020 20:05
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
automerge Label to flag PR that can be automerged
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants