[SYSTEMDS-3551] Extended Performance TestSuite #1850

Sheypex · 2023-06-23T18:10:41Z

[SYSTEMDS-3551] Extended performance testsuite

This contains new .dml and .sh scripts to include perftests for some of the components in scripts/nn.
As a start, 2 perftests for a simple SGD trained regression neural network and a neural network classifier trained while using Nesterov momentum as described in the scripts/nn/README.md were added.
(Incidentally also resolved a bug with the batching of training samples in both of these examples in the readme.)

Additionally, a semi-broken perftest for staging/NCF.dml is included. As far as I am aware, the perftest scripts are fine (though currently untested), but the implementation of NCF in staging crashes on launch due to an as yet undetermined cause.

The general structure of the new tests follows observed standards of presently implemented perftests:

- scripts/datagen houses individual .dml scripts for data generation
- scripts/perftest/datagen contains .sh scripts that wrap these .dml scripts and designate parameters for the generation based on required file sizes
- scripts/perftest/scripts houses .dml scripts that implement the workload that is to be tested
- scripts/perftest contains .sh scripts to run individual or all perftests pertaining to a designated type of component

Currently, the parameters in the datagen scripts do not yet produce the expected output sizes of 80MB, 800MB, etc.
The generated files are too small.
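
To make the size mismatch concrete, here is a rough sketch of the arithmetic for picking a row count that reaches a target size. It assumes dense double-precision data at 8 bytes per cell and decimal megabytes. The function name `rows_for_size` and the column count of 1000 are illustrative and not taken from the scripts in this PR; actual on-disk sizes will differ for CSV or sparse output.

```shell
#!/usr/bin/env bash
# Sketch only: estimate how many rows a dense matrix needs to reach a target size.
# Assumptions (not from the PR): 8 bytes per cell, 1 MB = 1,000,000 bytes.
rows_for_size() {
  local target_mb=$1   # desired output size in MB, e.g. 80
  local cols=$2        # number of columns in the generated matrix
  echo $(( target_mb * 1000000 / (8 * cols) ))
}

# Print the estimated row counts for the two sizes mentioned above.
for size in 80 800; do
  echo "${size}MB with 1000 columns -> $(rows_for_size "$size" 1000) rows"
done
```

Comparing such an estimate against the row counts currently passed to the datagen scripts should show how far off the parameters are.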

The .sh scripts for the neural network components test various input sizes as given by the MAXMEM variable in runAll.sh, like the other perftests. In addition, they run each test both with a base number of epochs and with ten times that many, e.g. 5 and 50 epochs. This seems appropriate since the number of epochs in neural network training is effectively a shorthand for the maximum number of individual training iterations.

Finally, runAll.sh contains a flag to enable/disable the execution of neural network perftests as well as a flag to toggle the use of the GPU in these tests.

phaniarnab · 2023-06-23T19:54:35Z

Thanks, @Sheypex for the commit. I will have a look into the changes in a day or two.

phaniarnab · 2023-06-26T13:51:48Z

It is a good start @Sheypex.

A few of the new files are missing license headers. Please add them.
The data generation scripts might produce invalid data, which cannot produce valid weights. Is it possible to reuse the existing scripts, genRandData4LogisticRegression, genRandData4MultiClassSVM etc.?
Is the new simple SGD script running? Can you execute the first two dataset sizes with -stats and paste the statistics here?

Sheypex · 2023-06-26T14:54:41Z

Just fixed my implementation of the -gpu flag.
Perftests are running/working again.
Incidentally, I found out that there is some error with SystemDS utilizing my GPU: apparently some shared libraries can't be found. So I'm not sure what exactly the problem is there.
Fixed/added missing licenses.

Stats are as follows:
For training simple sgd on smallest data:

SystemDS Statistics:
Total elapsed time: 0.553 sec.
Total compilation time: 0.236 sec.
Total execution time: 0.317 sec.
Number of compiled Spark inst: 2.
Number of executed Spark inst: 0.
Cache hits (Mem/Li/WB/FS/HDFS): 6082/0/0/0/2.
Cache writes (Li/WB/FS/HDFS): 0/1/0/4.
Cache times (ACQr/m, RLS, EXP): 0.164/0.001/0.003/0.039 sec.
HOP DAGs recompiled (PRED, SB): 0/0.
HOP DAGs recompile time: 0.000 sec.
Spark ctx create time (lazy): 0.000 sec.
Spark trans counts (par,bc,col):0/0/0.
Spark trans times (par,bc,col): 0.000/0.000/0.000 secs.
Spark async. count (pf,bc,op): 0/0/0.
Total JIT compile time: 1.604 sec.
Total JVM GC count: 0.
Total JVM GC time: 0.0 sec.
Heavy hitter instructions:
# Instruction Time(s) Count
1 sp_csvrblk 0.163 2
2 write 0.039 4
3 ba+* 0.038 800
4 -* 0.010 640
5 + 0.009 645
6 * 0.006 647
7 / 0.006 323
8 r' 0.006 480
9 createvar 0.005 3850
10 rand 0.005 4

For training simple sgd on next larger data:

SystemDS Statistics:
Total elapsed time: 0.861 sec.
Total compilation time: 0.252 sec.
Total execution time: 0.610 sec.
Number of compiled Spark inst: 2.
Number of executed Spark inst: 0.
Cache hits (Mem/Li/WB/FS/HDFS): 18242/0/0/0/2.
Cache writes (Li/WB/FS/HDFS): 1/962/0/4.
Cache times (ACQr/m, RLS, EXP): 0.232/0.001/0.012/0.045 sec.
HOP DAGs recompiled (PRED, SB): 0/0.
HOP DAGs recompile time: 0.000 sec.
Spark ctx create time (lazy): 0.000 sec.
Spark trans counts (par,bc,col):0/0/0.
Spark trans times (par,bc,col): 0.000/0.000/0.000 secs.
Spark async. count (pf,bc,op): 0/0/0.
Total JIT compile time: 2.509 sec.
Total JVM GC count: 0.
Total JVM GC time: 0.0 sec.
Heavy hitter instructions:
# Instruction Time(s) Count
1 sp_csvrblk 0.231 2
2 ba+* 0.143 2400
3 -* 0.053 1920
4 write 0.045 4
5 r' 0.018 1440
6 + 0.015 1925
7 rightIndex 0.014 960
8 * 0.012 1927
9 uak+ 0.010 960
10 rmvar 0.010 13450

Sheypex · 2023-06-26T15:06:56Z

With respect to the datagen:
Looking it over, genRandData4LogisticRegression and genRandData4MultiClassSVM should work fine as datagen scripts for the SGD test case, where only a vector of target data is required.
I suppose genRandData4Kmeans may work for the classification use case, but I'm less sure about that.

I've been looking into adding a convolution/deep learning perftest along the lines of the MNIST examples and since we're on the topic of datagen: Using MNIST (or a subset of corresponding size) is probably preferable to generating random data, correct?

phaniarnab · 2023-06-26T15:23:30Z

Thanks for the stats. Do not worry about the GPU-related issue. It is fine if you cannot manage to make the GPU work.
I agree about using MNIST for the NN scripts. Otherwise, try to stick to the existing datagen scripts.

phaniarnab · 2023-07-03T20:10:29Z

scripts/perftest/datagen/genMNISTData.sh

+  target_num_train=$(python -c "from math import floor; print( ${min_num_examples_train} + floor(${span_num_examples_train} * ${percent_size}))")  # todo couldn't work out how to do this using bc so using slower python calls instead
+  target_num_test=$(python -c "from math import floor; print( ${min_num_examples_test} + floor(${span_num_examples_test} * ${percent_size}))")


I recommend not inlining Python calls here. You can find another way or push some of the logic inside the dml script.
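
One possible way to avoid the Python call is to do the same computation with awk. This is a minimal sketch and not necessarily what the PR ended up using. The helper name `scale_examples` and the numeric values are hypothetical; only the variable names are taken from the snippet above.

```shell
#!/usr/bin/env bash
# Sketch only: compute min + floor(span * fraction) without calling Python.
# awk's int() truncates toward zero, which equals floor() for non-negative values.
scale_examples() {
  local min=$1 span=$2 fraction=$3
  awk -v min="$min" -v span="$span" -v f="$fraction" \
    'BEGIN { printf "%d\n", min + int(span * f) }'
}

# Hypothetical values; the real ones come from genMNISTData.sh.
min_num_examples_train=1000
span_num_examples_train=59000
percent_size=0.5

target_num_train=$(scale_examples "$min_num_examples_train" "$span_num_examples_train" "$percent_size")
echo "target_num_train=${target_num_train}"
```

awk is part of POSIX, so this adds no dependency beyond what a typical shell environment already provides.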

Sheypex · 2023-07-06T17:29:13Z

As far as I can tell, the perftest for conv2d (which in turn uses the MNIST LeNet implementation) is done now. However, I'm getting an error in the LeNet implementation:

An Error Occurred : 
        HopsException -- ERROR: ./../../nn/examples/mnist_lenet.dml line 282, column 4 -- In LeftIndexingOp Hop, error in constructing Lops 
        HopsException -- ERROR: ./nn/layers/softmax.dml line 53, column 2 -- error constructing Lops for UnaryOp Hop -- 
IllegalCallerException -- java.nio is not open to unnamed module @6a7c0ffd

I've been staring at the source of the lenet implementation in nn/examples for some time now, but I can't pinpoint the actual problem..

I'm guessing the sizes of the softmax output and the probs buffer may be mismatched? (lines 279-282 in nn/examples/mnist_lenet.dml)
But I feel like an error like that would produce a different error message

probs_batch = softmax::forward(outa4)
# Store predictions
probs[beg:end,] = probs_batch

Any idea perhaps on how to fix this?

Baunsgaard · 2023-07-07T09:55:57Z

Could you run it again with the '-debug' argument?
Also, this kind of error is typically related to operating system or JDK issues. Which are you using?
Please run 'java --version' in your terminal and reply with the output.

Sheypex · 2023-07-07T10:01:52Z

I'm on JDK 17:

openjdk version "17.0.7" 2023-04-18
OpenJDK Runtime Environment (build 17.0.7+7-Ubuntu-0ubuntu123.04)
OpenJDK 64-Bit Server VM (build 17.0.7+7-Ubuntu-0ubuntu123.04, mixed mode, sharing)

-debug yields this

An Error Occurred : 
        HopsException -- ERROR: ./../../nn/examples/mnist_lenet.dml line 282, column 4 -- In LeftIndexingOp Hop, error in constructing Lops 
        HopsException -- ERROR: ./nn/layers/softmax.dml line 53, column 2 -- error constructing Lops for UnaryOp Hop -- 

IllegalCallerException -- java.nio is not open to unnamed module @6a7c0ffd

org.apache.sysds.hops.HopsException: ERROR: ./../../nn/examples/mnist_lenet.dml line 282, column 4 -- In LeftIndexingOp Hop, error in constructing Lops 
at org.apache.sysds.hops.LeftIndexingOp.constructLops(LeftIndexingOp.java:155)
at org.apache.sysds.hops.DataOp.constructLops(DataOp.java:311)
at org.apache.sysds.parser.DMLTranslator.constructLops(DMLTranslator.java:435)
at org.apache.sysds.parser.DMLTranslator.constructLops(DMLTranslator.java:400)
at org.apache.sysds.parser.DMLTranslator.constructLops(DMLTranslator.java:424)
at org.apache.sysds.parser.DMLTranslator.constructLops(DMLTranslator.java:339)
at org.apache.sysds.api.DMLScript.execute(DMLScript.java:457)
at org.apache.sysds.api.DMLScript.executeScript(DMLScript.java:320)
at org.apache.sysds.api.DMLScript.main(DMLScript.java:208)
Caused by: org.apache.sysds.hops.HopsException: ERROR: ./nn/layers/softmax.dml line 53, column 2 -- error constructing Lops for UnaryOp Hop -- 

at org.apache.sysds.hops.UnaryOp.constructLops(UnaryOp.java:180)
at org.apache.sysds.hops.BinaryOp.constructLopsBinaryDefault(BinaryOp.java:503)
at org.apache.sysds.hops.BinaryOp.constructLops(BinaryOp.java:237)
at org.apache.sysds.hops.LeftIndexingOp.constructLops(LeftIndexingOp.java:145)
... 8 more
Caused by: java.lang.IllegalCallerException: java.nio is not open to unnamed module @6a7c0ffd
at java.base/java.lang.Module.addOpens(Module.java:836)
at org.apache.sysds.runtime.controlprogram.context.SparkExecutionContext.handleIllegalReflectiveAccessSpark(SparkExecutionContext.java:209)
at org.apache.sysds.runtime.controlprogram.context.SparkExecutionContext$SparkClusterConfig.<init>(SparkExecutionContext.java:1831)
at org.apache.sysds.runtime.controlprogram.context.SparkExecutionContext.getSparkClusterConfig(SparkExecutionContext.java:1753)
at org.apache.sysds.runtime.controlprogram.context.SparkExecutionContext.getBroadcastMemoryBudget(SparkExecutionContext.java:1763)
at org.apache.sysds.hops.AggBinaryOp.optFindMMultMethodSpark(AggBinaryOp.java:1093)
at org.apache.sysds.hops.AggBinaryOp.constructLops(AggBinaryOp.java:217)
at org.apache.sysds.hops.BinaryOp.constructLopsBinaryDefault(BinaryOp.java:514)
at org.apache.sysds.hops.BinaryOp.constructLops(BinaryOp.java:237)
at org.apache.sysds.hops.BinaryOp.constructLopsBinaryDefault(BinaryOp.java:503)
at org.apache.sysds.hops.BinaryOp.constructLops(BinaryOp.java:237)
at org.apache.sysds.hops.UnaryOp.constructLops(UnaryOp.java:171)
... 11 more

phaniarnab · 2023-07-08T22:15:50Z

SystemDS is not tested with JDK 17. The officially supported version is 11.
Can you please downgrade to JDK 11 and try again?
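
A perftest wrapper could catch this kind of mismatch early by checking the Java major version before launching. The sketch below is an illustration only: the function name `java_major_version` is made up, and the check against 11 simply reflects the supported version stated in this thread.

```shell
#!/usr/bin/env bash
# Sketch only: extract the major version from a Java version string.
# Handles both the legacy "1.8.0_292" scheme and the modern "11.0.19" / "17.0.7" scheme.
java_major_version() {
  local version=$1
  local major=${version%%.*}          # text before the first dot
  if [ "$major" = "1" ]; then         # legacy scheme: 1.8.x means Java 8
    major=${version#1.}
    major=${major%%.*}
  fi
  echo "$major"
}

# Only query the JVM if one is installed; `java -version` prints to stderr.
if command -v java >/dev/null 2>&1; then
  raw=$(java -version 2>&1 | awk -F '"' '/version/ { print $2; exit }')
  major=$(java_major_version "$raw")
  if [ "$major" != "11" ]; then
    echo "Warning: found Java ${major}; this thread reports JDK 11 as the supported version." >&2
  fi
fi
```

The parsing is kept in its own function so it can be exercised with plain strings, independent of which JDK happens to be installed.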

Sheypex · 2023-07-10T15:00:22Z

Ok, tested on Java 11.
Can confirm, it was apparently just the Java version.
Also just adjusted the number of epochs in the MNIST test because it was just taking too long.
Might want to consider reducing the number of epochs from 5 and 50 down to 5 and 25.

phaniarnab · 2023-07-10T15:35:31Z

Glad that worked.
Are both the NN tests working? I understand NCF is untested and may have bugs. If so, comment out the calls to the NCF files for now, so that the perf tests don't fail in the middle.
Also, please summarize the changes and additions.
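
As an alternative to commenting calls out, each test group could be guarded by a 0/1 flag so that a failing component is skipped without editing the call sites. This is a sketch under assumptions: the flag names `RUN_NN` and `RUN_NCF` and the helper `run_if_enabled` are hypothetical and do not come from runAll.sh, and the `echo` commands stand in for the real perftest invocations.

```shell
#!/usr/bin/env bash
# Sketch only: guard each test group behind a 0/1 flag instead of deleting the call.
# All names below are hypothetical and not taken from runAll.sh.
RUN_NN=1
RUN_NCF=0   # disabled until the NCF implementation is fixed

run_if_enabled() {
  local flag=$1; shift
  if [ "$flag" -eq 1 ]; then
    "$@"                      # run the remaining arguments as a command
  else
    echo "skipping: $*"       # report what was skipped so the log stays readable
  fi
}

# Placeholder commands; a real script would call the perftest .sh files here.
run_if_enabled "$RUN_NN"  echo "running NN perftests"
run_if_enabled "$RUN_NCF" echo "running NCF perftests"
```

Logging the skipped command makes it obvious in the perftest output which components were left out of a run.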

Sheypex · 2023-07-10T16:12:06Z

NN classifier and regression tests are working.
The conv2d/MNIST test has been found to work on Java 11 but not on Java 17.
NCF has the same error on Java 11 as on Java 17.
Reduced the number of epochs for the MNIST test again, since the runs were still taking far too long. With 5 and 25 epochs they now take about 15 minutes in total for MAXMEM=800.

Overall summary:

Datagen

Added datagen scripts for:
- NN regression and classification: only a new shell script; the backend uses the existing .dml implementations for regression and classification datagen
- NCF: data is generated as demonstrated in scripts/nn/examples/ncf-dummy-data.dml
- MNIST (conv2d): added a separate script that downloads the MNIST dataset in .csv format from a GitHub repo, plus additional scripts that trim this data down into separate datasets according to the MAXMEM flag in runAll.sh

Generally:
- scripts/datagen has the respective .dml implementations
- scripts/perftest/datagen has the .sh parts of this datagen

Perftests

Added perftest scripts for:
- NN regression and classification: the NN tests run on a sparse and a dense input dataset
- NCF: the perftest is structurally complete and should work but is untested, as the .dml NCF implementation fails
- MNIST (conv2d)

Generally:
- These perftests adhere to the common structure of other tests: scripts/perftest/scripts houses the .dml implementations that are to be tested, while scripts/perftest/run[xyz].sh handle staging the data and collecting timing data of the test runs
- Each test runs two rounds per dataset, each with a different number of training epochs
- Tests are split into a training test and a simple prediction test: the prediction test only runs a single prediction to check the accuracy/loss of the trained model

Miscellaneous

runAll.sh now has a flag to enable the use of the GPU for NN, NCF, and MNIST (conv2d) tests and a flag to enable all of these
- NN tests are currently on by default, i.e. the flag to run them is set in runAll.sh, while the use of the GPU is disabled
NCF test and datagen have been disabled in runAll.sh, since the .dml NCF implementation fails
example algorithms in scripts/nn/examples/README.md have been altered to correctly pick and batch training data

phaniarnab · 2023-07-10T21:53:45Z

Great. Thanks @Sheypex, for your contribution. 👍🏽

Sheypex added 9 commits June 20, 2023 20:07

set up file structure for nn perf tests

6858bd4

got datagen for nn tests running

361b340

got datagen for nn tests running

b9f59d9

nn simple sgd running

0bb5a48

nn simple sgd running

c218449

nn simple nesterov running

c32bfc4

fixed batching in nesterov and simpleSGD training .. also updated REA…

4da031d

…DME.md in nn as this is the source for simpleSGD and nesterov

weirdness with execution NCF.dml in staging .. otherwise NCF perftest…

580caf6

… should be ready

added flags in runAll.sh to toggle execution of nn tests and use of g…

f9da87c

…pu for nn tests

Baunsgaard changed the title ~~SYSTEMDS-3551~~ [SYSTEMDS-3551] Extended Performance TestSuite Jun 26, 2023

fixed -gpu flag

cd83bd4

Sheypex added 4 commits June 26, 2023 18:29

moved to using existing datagen scripts genRandData4LogisticRegressio…

68e7184

…n and genRandData4Multinomial. now also running tests for sparse and dense data. not yet utilizing generated test data sets

fixed NN datagen parameters

6a7f7a3

downloader for MNIST Dataset

58eb33f

MNIST "datagen" done by producing smaller version of the whole MNIST …

a116a46

…dataset based on MAXMEM setting, and using whole MNIST only for biggest MAXMEM

phaniarnab reviewed Jul 3, 2023

Sheypex added 3 commits July 4, 2023 21:16

replaced inline python calls in genMNISTData.sh

fb6b15e

debugged mnist "datagen" pipeline .. works now :)

0e3e754

mnist perftest done .. but lenet implementation faulty?

472e3c1

lowered number of epochs in MNIST perftest

9ebe54b

cleanup and reduced epochs for MNIST tests

53d1353

j143 added this to the systemds-3.2.0 milestone Dec 4, 2023
