Add script to generate val consts #2900
Conversation
LGTM
import torch
from datetime import datetime

sizes = [2**i for i in range(2, 22)]  # {4, 2097152}
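For context, a minimal sketch of how such per-size tolerances could be measured: compare a reduced-precision sum against a float64 reference over random inputs and keep the worst absolute error for each size. This is an assumption about the approach, not the actual contents of the script added in this PR; the sample count, dtype, and device handling below are made up.

# Sketch only: worst-case sum error per reduction size against a float64
# reference. Not the script from this PR.
import torch

sizes = [2**i for i in range(2, 22)]  # 4 .. 2097152
n_samples = 100  # assumption: random trials per size
device = "cuda" if torch.cuda.is_available() else "cpu"

for size in sizes:
    max_err = 0.0
    for _ in range(n_samples):
        x = torch.randn(size, device=device)      # float32 input
        ref = x.double().sum().item()             # float64 reference sum
        val = x.sum().item()                      # float32 sum under test
        max_err = max(max_err, abs(val - ref))
    print(f"size={size} max_abs_err={max_err:.3e}")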
Question: IIRC, the reduction size is computed with respect to fusion inputs, so it tends to grow very fast. For a two-layer MLP (#2905), the reduction size of the second linear output is already 4 * hidden_size * hidden_size, close to 2M. I'd imagine the whole transformer block will generate an even larger reduction size, much larger than the max size specified here. What's the implication of getTolerance querying a size larger than the max?
It would use twice the max error in the list:

Fuser/csrc/validator_utils.cpp, lines 134 to 138 in 1158543:

} else {
  // If we hit the end of the list, return twice the max error we
  // measured
  abs_tol = sum_tolerance_entry[sum_tolerance_entry.size() - 1][1] * 2.;
}
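In other words, the lookup walks the (reduction size, measured error) table and, once the queried size is past the last entry, falls back to twice the largest measured error. A rough Python sketch of that behavior follows; the table values are placeholders and the in-range lookup is simplified relative to the real code in validator_utils.cpp:

# Placeholder values, not the real constants; illustrates the end-of-list
# fallback shown in the C++ snippet above.
sum_tolerance_entry = [(4, 1e-6), (1024, 1e-5), (2097152, 1e-3)]

def get_sum_tolerance(reduction_size):
    for size, err in sum_tolerance_entry:
        if reduction_size <= size:
            return err
    # Hit the end of the list: return twice the max error that was measured
    return sum_tolerance_entry[-1][1] * 2.0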
If we have very few examples with larger (~2M) reduction sizes, it would be simpler to set a threshold manually. If we have several such examples, we may benefit from having more cases.
> If we have several such examples, we may benefit from having more cases.

In case you are looking for concrete use cases, I believe https://github.com/NVIDIA/Fuser/blob/main/tests/cpp/test_multidevice_transformer.cpp#L569-L578 will give you reduction sizes much larger than 2M when you change the validation to use testValidate.
The other issue with the MLP is that we have compounding error from running consecutive ops.
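As a standalone illustration of that compounding (not code from this PR), comparing one float32 matmul and two chained float32 matmuls against a float64 reference shows how the error of the first op feeds into the second; the sizes and dtypes here are arbitrary:

# Standalone sketch: error after one op vs. two consecutive ops, measured
# against a float64 reference. Sizes and dtypes are arbitrary choices.
import torch

torch.manual_seed(0)
h = 256
x = torch.randn(h, h, dtype=torch.float64)
w1 = torch.randn(h, h, dtype=torch.float64)
w2 = torch.randn(h, h, dtype=torch.float64)

one_val = (x.float() @ w1.float()).double()
one_ref = x @ w1
print("one matmul, max abs error:", (one_val - one_ref).abs().max().item())

two_val = ((x.float() @ w1.float()) @ w2.float()).double()
two_ref = (x @ w1) @ w2
print("two matmuls, max abs error:", (two_val - two_ref).abs().max().item())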
Adds a script to reproduce the computation that was used when generating the validation tolerances:

Fuser/csrc/validator_utils.h, lines 25 to 46 in 892b7ac