# Testing
This section describes testing for FIMS. FIMS uses Google Test for C++
unit testing and testthat for R unit testing.
## Introduction
The FIMS testing framework will include different types of testing to make
sure that changes to the FIMS code work as expected. Unit and
functional tests will be developed during the initial development stage,
when writing individual functions or modules. After completing
development of multiple modules, integration testing will be developed
to verify that different modules work well together. Checks will be
added to the software to catch user input errors during
run-time testing. Regression testing and platform compatibility testing
will be executed before pre-releasing FIMS. Beta testing will be used to
gather feedback from users (i.e., members of the FIMS implementation team
and other users) during the pre-release stage. After releasing the first
version of FIMS, the development team will return to the beginning of
the testing cycle and write unit tests when a new feature needs to be
implemented. One-off testing will be used for testing new features and
fixing user-reported bugs when maintaining FIMS. More details on each
type of test can be found in the Glossary section.
FIMS will use GoogleTest to build a C++ unit testing framework and R
testthat to build an R testing framework. FIMS will use Google Benchmark
to measure the real time and CPU time used for running the produced
binaries.
## C++ unit testing and benchmarking
### Requirements
To use GoogleTest, you will need:
- A compatible operating system (e.g., Windows, macOS, or Linux).
- A C++ compiler that supports at least the C++11 standard (e.g.,
gcc 5.0+, clang 5.0+, or MSVC 2015+). For macOS users, Xcode 9.3+
provides clang 5.0. For R users,
[rtools4](https://cran.r-project.org/bin/windows/Rtools/rtools40.html)
includes gcc.
- A build system for building the testing project.
[[CMake]{.ul}](https://cmake.org/) and a compatible build tool such as
[[Ninja]{.ul}](https://ninja-build.org/) are approved by NMFS HQ.
### Setup for Windows users
1. Download [[CMake 3.22.1
(cmake-3.22.1-windows-x86_64.zip)]{.ul}](https://github.com/Kitware/CMake/releases)
and extract the folder to `Documents\Apps` or another preferred location.
2. Download [[ninja v1.10.2
(ninja-win.zip)]{.ul}](https://github.com/ninja-build/ninja/releases)
and put the executable in `Documents\Apps` or another preferred location.
3. Open your Command Prompt and type `cmake`. If you see details of
usage, cmake is already in your PATH. If not, follow the
instructions below to add cmake to your PATH.
4. In the same command prompt, type `ninja`. If you see a message that
starts with `ninja:`, even if it is an error about not finding
build.ninja, this means that ninja is already in your PATH. If ninja is
not found, follow the instructions below to add ninja to your path.
### Adding cmake and ninja to your PATH on Windows
1. In the Windows search bar next to the start menu, search for `Edit
environment variables for your account` and open the `Environment
Variables` window.
2. Click `Edit...` under the `User variables for firstname.lastname`
section.
3. Click `New` and add the path to cmake, if needed (e.g.,
`cmake-3.22.1-windows-x86_64\bin` or `C:\Program Files\CMake\bin` are
common paths), and click `OK`. Click `New` again and add the path to the
location of the Ninja executable, if needed (e.g., `Documents\Apps\ninja-win`
or `C:\Program Files\ninja-win`), and click `OK`.
4. You may need to restart your computer to update the environment
variables. You can check that the path is working by running `where cmake` or `where ninja` in a command terminal.
5. Note that at certain Fisheries Science Centers, NOAA employees do not have
the administrative privileges needed to edit the local environment PATH.
In this situation it is necessary to create a ticket with IT to add
cmake and ninja to your PATH on Windows.
### Setup for Linux and Mac users
1. See [[CMake installation
instructions]{.ul}](https://cmake.org/install) for installing CMake on
other platforms. Add cmake to your PATH. You can check that the path is working by running `which cmake` in a command window.
2. Download [[ninja v1.10.2]{.ul}](https://github.com/ninja-build/ninja/releases)
for your platform (e.g., ninja-mac.zip or ninja-linux.zip)
and put the binary in your preferred location. Add Ninja to your PATH. You can check that the path is working by running `which ninja` in a command window.
3. Open a command window and type `cmake`. If you see usage, cmake is
found. If not, cmake may still need to be added to your PATH.
4. Open a command window and type `ninja`. If you see a message starting
with `ninja:`, ninja is found. Otherwise, try changing the permissions
or adding to your path.
### How to edit your PATH and change file permissions for Linux and Mac
To check if the binary is in your path, assuming the binary is named
ninja: open a Terminal window and type `which ninja` and hit enter. If
you get nothing returned, then ninja is not in your path. The easiest
way to fix this is to move the ninja binary to a folder that's already
in your path. To find existing path folders type `echo $PATH` in the
terminal and hit enter. Now move the ninja binary to one of these
folders. For example, in a Terminal window type:
```bash
sudo cp ~/Downloads/ninja /usr/bin/
```
This moves ninja from the Downloads folder to `/usr/bin`. You will need to
use `sudo` and enter your password in order to have permission to move a
file into a folder like `/usr/bin/`.
Also note that you may need to add executable permissions to the ninja
binary after downloading it. You can do that by switching to the folder
where you placed the binary (`cd /usr/bin/` if you followed the
instructions above), and running the command:
```bash
sudo chmod +x ninja
```
Check that ninja is now executable and in your path:
```bash
which ninja
```
If you followed the instructions above, you will see the following line
returned:
```bash
/usr/bin/ninja
```
### Set up FIMS testing project
Clone the [FIMS repository](https://github.com/NOAA-FIMS/FIMS) on the
command line using:
```bash
git clone https://github.com/NOAA-FIMS/FIMS.git
cd FIMS
```
There is a file called CMakeLists.txt in the top level of the directory.
This file instructs CMake on how to create the build files, including
setting up Google Test.
The Google Test testing code is in the tests/gtest subdirectory. Within
this subdirectory is a file called CMakeLists.txt. This file contains
additional specifications for CMake, in particular instructions on how
to register the individual tests.
### Build and run the tests
Three commands on the command line are needed to build and run the tests:
```bash
cmake -S . -B build -G Ninja
```
This generates the build system using Ninja as the generator. Note there
is now a subfolder called build.
Next, in the same command window, use cmake to build in the build
subfolder:
```bash
cmake --build build
```
Finally, run the C++ tests:
```bash
ctest --test-dir build
```
The output from running the tests should look something like:
```bash
Internal ctest changing into directory: C:/github_repos/NOAA-FIMS_org/FIMS/build
Test project C:/github_repos/NOAA-FIMS_org/FIMS/build
Start 1: dlognorm.use_double_inputs
1/5 Test #1: dlognorm.use_double_inputs ....... Passed 0.04 sec
Start 2: dlognorm.use_int_inputs
2/5 Test #2: dlognorm.use_int_inputs .......... Passed 0.04 sec
Start 3: modelTest.eta
3/5 Test #3: modelTest.eta .................... Passed 0.04 sec
Start 4: modelTest.nll
4/5 Test #4: modelTest.nll .................... Passed 0.04 sec
Start 5: modelTest.evaluate
5/5 Test #5: modelTest.evaluate ............... Passed 0.04 sec
100% tests passed, 0 tests failed out of 5
```
### Adding a C++ test
Create a file dlognorm.hpp within the src subfolder that contains a simple function:
```c
#include <cmath>

template <class Type>
Type dlognorm(Type x, Type meanlog, Type sdlog) {
  Type resid = (log(x) - meanlog) / sdlog;
  Type logres = -log(sqrt(2 * M_PI)) - log(sdlog) - Type(0.5) * resid * resid - log(x);
  return logres;
}
```
Then, create a test file dlognorm-unit.cpp in the tests/gtest subfolder
that has a test suite for the dlognorm function:
```c
#include "gtest/gtest.h"
#include "../../src/dlognorm.hpp"

// # R code that generates true values for the test
// dlnorm(1.0, 0.0, 1.0, TRUE) = -0.9189385
// dlnorm(5.0, 10.0, 2.5, TRUE) = -9.07679

namespace {

// TestSuiteName: dlognormTest; TestName: DoubleInput and IntInput

// Test dlognorm with double input values
TEST(dlognormTest, DoubleInput) {
  EXPECT_NEAR(dlognorm(1.0, 0.0, 1.0), -0.9189385, 0.0001);
  EXPECT_NEAR(dlognorm(5.0, 10.0, 2.5), -9.07679, 0.0001);
}

// Test dlognorm with integer input values
TEST(dlognormTest, IntInput) {
  EXPECT_NEAR(dlognorm(1, 0, 1), -0.9189385, 0.0001);
}

}  // namespace
```
`EXPECT_NEAR(val1, val2, absolute_error)` verifies that the difference
between `val1` and `val2` does not exceed the absolute error bound
`absolute_error`. `EXPECT_NE(val1, val2)` verifies that `val1` is not
equal to `val2`. Please see the GoogleTest [assertions
reference](https://google.github.io/googletest/reference/assertions.html)
for more `EXPECT_` macros.
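For example, a small hypothetical test combining both macros (not part of the FIMS test suite; it reuses the dlognorm values from the example above) might look like:
```c
#include "gtest/gtest.h"
#include "../../src/dlognorm.hpp"

// Hypothetical example combining EXPECT_NEAR and EXPECT_NE;
// the expected values come from the dlognorm example above.
TEST(dlognormTest, AssertionExamples) {
  // Passes if the two values differ by no more than 0.0001
  EXPECT_NEAR(dlognorm(1.0, 0.0, 1.0), -0.9189385, 0.0001);
  // Passes because the two log densities are not equal
  EXPECT_NE(dlognorm(1.0, 0.0, 1.0), dlognorm(5.0, 10.0, 2.5));
}
```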
### Add tests to `tests/gtest/CMakeLists.txt` and run a binary
To build the code, add the following contents to the end of the `tests/gtest/CMakeLists.txt` file:
```cmake
add_executable(dlognorm_test
  dlognorm-unit.cpp
)

target_include_directories(dlognorm_test
  PUBLIC
    ${CMAKE_SOURCE_DIR}/../
)

target_link_libraries(dlognorm_test
  gtest_main
)

include(GoogleTest)
gtest_discover_tests(dlognorm_test)
```
The above configuration enables testing in CMake, declares the C++ test
binary you want to build (dlognorm_test), and links it to GoogleTest
(gtest_main). Now you can build and run your test. Open a command window
in the FIMS repo (if not already opened) and type:
```bash
cmake -S . -B build -G Ninja
```
This generates the build system using Ninja as the generator.
Next, in the same command window, use cmake to build:
```bash
cmake --build build
```
Finally, run the tests in the same command window:
```bash
ctest --test-dir build
```
The output when running `ctest` might look like this. Note there is a
failing test:
```bash
Internal ctest changing into directory: C:/Users/Kathryn.Doering/Documents/testing/FIMS/build
Test project C:/Users/Kathryn.Doering/Documents/testing/FIMS/build
Start 1: dlognorm.use_double_inputs
1/7 Test #1: dlognorm.use_double_inputs ....... Passed 0.04 sec
Start 2: dlognorm.use_int_inputs
2/7 Test #2: dlognorm.use_int_inputs .......... Passed 0.04 sec
Start 3: modelTest.eta
3/7 Test #3: modelTest.eta .................... Passed 0.04 sec
Start 4: modelTest.nll
4/7 Test #4: modelTest.nll .................... Passed 0.04 sec
Start 5: modelTest.evaluate
5/7 Test #5: modelTest.evaluate ............... Passed 0.04 sec
Start 6: dlognormTest.DoubleInput
6/7 Test #6: dlognormTest.DoubleInput ......... Passed 0.04 sec
Start 7: dlognormTest.IntInput
7/7 Test #7: dlognormTest.IntInput ............***Failed 0.04 sec
86% tests passed, 1 tests failed out of 7
Total Test time (real) = 0.28 sec
The following tests FAILED:
7 - dlognormTest.IntInput (Failed)
Errors while running CTest
Output from these tests are in: C:/Users/Kathryn.Doering/Documents/testing/FIMS/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
```
### Debugging a C++ test
There are two ways to debug a C++ test: interactively using `gdb`, or via print statements. To use `gdb`, make sure it is installed and on your path.
To debug C++ code (e.g., a segmentation fault or memory corruption) using gdb:
```bash
cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Debug
cmake --build build --parallel 16
ctest --test-dir build --parallel 16
gdb ./build/tests/gtest/population_dynamics_population.exe
# The commands below are typed at the gdb prompt:
c                    # continue without paging
run                  # run the program to see which line of code is broken
print this->log_naa  # for example, print this->log_naa to see the value of log_naa
print i              # for example, print i from the broken for loop
bt                   # backtrace
q                    # quit
```
To debug C++ code without using gdb, add print statements to the relevant .hpp file: create an output stream with `std::ofstream out("file_name.txt")` (which requires `#include <fstream>`), then use `out << variable;` to print values of the variable. For example:
```c
#include <fstream>  // add at the top of the .hpp file

nfleets = fleets.size();
std::ofstream out("debug.txt");
out << nfleets;
```
More complex examples can include text identifying the quantities:
```c
out << " fleet_index: " << fleet_index << " index_yaf: " << index_yaf << " index_yf: " << index_yf << "\n";
out << " population.Fmort[index_yf]: " << population.Fmort[index_yf] << "\n";
```
Then rebuild and run the tests from Git Bash (or another command window):
```bash
cmake -S . -B build -G Ninja
cmake --build build --parallel 16
ctest --test-dir build --parallel 16
```
The output of the print statements will be in this test file: `FIMS/build/tests/gtest/debug.txt`
### Benchmark example
Google Benchmark measures the real time and CPU time used for running
the produced binary. We will continue using the dlognorm.hpp example.
Create a benchmark file dlognorm_benchmark.cpp and put it in the
tests/gtest subfolder:
```c
#include "benchmark/benchmark.h"
#include "../../src/dlognorm.hpp"

void BM_dlgnorm(benchmark::State& state)
{
  for (auto _ : state)
    dlognorm(5.0, 10.0, 2.5);
}

BENCHMARK(BM_dlgnorm);
```
This file runs the dlognorm function and uses BENCHMARK to see how long
it takes.
A more comprehensive feature overview of benchmarking is available in
the [Google Benchmark GitHub repository](https://github.com/google/benchmark).
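For instance, a benchmark can be parameterized over a range of inputs by reading arguments from `state.range()`. The sketch below is a hypothetical extension of the dlognorm example (not part of the FIMS test suite) that times the function for several values of `x`:
```c
#include "benchmark/benchmark.h"
#include "../../src/dlognorm.hpp"

// Hypothetical benchmark that times dlognorm for several x values
// supplied through state.range(0).
static void BM_dlgnorm_range(benchmark::State& state) {
  double x = static_cast<double>(state.range(0));
  for (auto _ : state) {
    // DoNotOptimize keeps the compiler from discarding the result.
    benchmark::DoNotOptimize(dlognorm(x, 10.0, 2.5));
  }
}
// Run the benchmark with x = 1, 8, 64, and 512.
BENCHMARK(BM_dlgnorm_range)->Range(1, 512);
```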
### Add benchmarks to `tests/gtest/CMakeLists.txt` and run the benchmark
To build the code, add the following contents to the end of your
`tests/gtest/CMakeLists.txt` file:
```cmake
FetchContent_Declare(
  googlebenchmark
  URL https://github.com/google/benchmark/archive/refs/tags/v1.6.0.zip
)
FetchContent_MakeAvailable(googlebenchmark)

add_executable(dlognorm_benchmark
  dlognorm_benchmark.cpp
)

target_include_directories(dlognorm_benchmark
  PUBLIC
    ${CMAKE_SOURCE_DIR}/../
)

target_link_libraries(dlognorm_benchmark
  benchmark_main
)
```
To run the benchmark, open the command line in the FIMS repo (if
not already open) and run cmake, sending output to the build subfolder:
```bash
cmake --build build
```
Then run the dlognorm_benchmark executable that was created (on Linux and macOS, omit the `.exe` extension):
```bash
build/tests/gtest/dlognorm_benchmark.exe
```
The output from `dlognorm_benchmark.exe` might look like this:
```bash
Run on (8 X 2112 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x4)
L1 Instruction 32 KiB (x4)
L2 Unified 256 KiB (x4)
L3 Unified 8192 KiB (x1)
***WARNING*** Library was built as DEBUG. Timings may be affected.
-----------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------
BM_dlgnorm 153 ns 153 ns 4480000
```
#### Remove files produced by this example
If you don't want to keep any of the files produced by this example and
want to completely clear any uncommitted changes and files from the git
repo, use
```bash
git restore .
```
to get rid of uncommitted changes in git-tracked files. To get rid of
all untracked files in the repo, use:
```bash
git clean -fd
```
### Clean up after running C++ tests
#### Clean up CMake-generated files and re-run tests
After running the examples above, the build generates files (i.e., the
source code, libraries, and executables) and saves the files in the
`build` subfolder. The example above demonstrates an "out-of-source"
build which puts generated files in a completely separate directory, so
that the source tree is unchanged after running tests. Using a separate
source and build tree reduces the need to delete files that differ
between builds. If you still would like to delete the CMake-generated files,
just delete the `build` folder, and then build and run the tests again by
repeating the commands shown above. The files in the `build` folder are
listed in the FIMS repository's .gitignore file, so they should *not* be
pushed to the FIMS repository.
<!-- KD: I didn't think this was necessary, but left just in case others disagreed. -->
<!-- #### Example of rebuilding without deleting the build subfolder -->
<!-- To build and run tests without navigating to the `build` folder, you can use the commands below: -->
<!-- ```bash -->
<!-- cmake -S . -B build -G Ninja -->
<!-- cmake --build build --parallel 16 -->
<!-- ctest --test-dir build -->
<!-- ``` -->
#### Clean up individual tests
For simple C++ functions like the examples above, we do not need to
clean up the tests. Clean up is only necessary in a few situations:

- If memory was allocated for an object during testing and not
deallocated, the object needs to be deleted (e.g., `delete object`).
- If you used a test fixture from GoogleTest to share the same data
configuration across multiple tests, `TearDown()` can be used to clean up
after each test before the fixture is destroyed. Please see more
details in the [GoogleTest user's
guide](https://google.github.io/googletest/primer.html#same-data-multiple-tests) and the sketch after this list.
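The following is a minimal sketch of such a fixture (the class name and data are hypothetical, not part of the FIMS test suite) that allocates shared data in `SetUp()` and releases it in `TearDown()`:
```c
#include "gtest/gtest.h"
#include <vector>

// Hypothetical fixture: the class name and data are illustrative only.
class PopulationTest : public ::testing::Test {
 protected:
  void SetUp() override {
    // Allocate the shared test data before each test.
    naa = new std::vector<double>(10, 100.0);
  }
  void TearDown() override {
    // Deallocate so nothing leaks between tests.
    delete naa;
    naa = nullptr;
  }
  std::vector<double>* naa = nullptr;
};

// Tests that use the fixture are declared with TEST_F instead of TEST.
TEST_F(PopulationTest, HasExpectedSize) {
  EXPECT_EQ(naa->size(), 10U);
}
```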
## Templates for GoogleTest testing
This section includes templates for creating unit tests and benchmarks.
This is the code that would go into the .cpp files in tests/gtest.
### Unit test template
```c
#include "gtest/gtest.h"
#include "../../src/code.hpp"

// # R code that generates true values for the test

namespace {

// Description of Test 1
TEST(TestSuiteName, Test1Name) {
  ... test body ...
}

// Description of Test 2
TEST(TestSuiteName, Test2Name) {
  ... test body ...
}

}  // namespace
```
### Benchmark template
```c
#include "benchmark/benchmark.h"
#include "../../src/code.hpp"

void BM_FunctionName(benchmark::State& state)
{
  for (auto _ : state) {
    // This code gets timed
    Function();
  }
}

// Register the function as a benchmark
BENCHMARK(BM_FunctionName);
```
### tests/gtest/CMakeLists.txt template
These lines are added each time a new test suite (all tests in a file)
is added:
```cmake
# Add test suite 1
add_executable(TestSuiteName1
  test1.cpp
)

target_link_libraries(TestSuiteName1
  gtest_main
)

gtest_discover_tests(TestSuiteName1)
```
These lines are added each time a new benchmark file is added:
```cmake
# Add benchmark 1
add_executable(benchmark1
  benchmark1.cpp
)

target_link_libraries(benchmark1
  benchmark_main
)
```
## R testing
FIMS uses {testthat} for writing R tests. You can install the package
following the instructions on the [testthat website](https://testthat.r-lib.org/).
If you are not familiar with testthat, the
[testing chapter](https://r-pkgs.org/tests.html) in the R Packages book
gives a good overview of the testing workflow, along with an explanation
of test structure and concrete examples.
### Testing FIMS locally
To test FIMS R functions interactively and locally, use `devtools::install()` rather than `devtools::load_all()`. This is because using `load_all()` will turn on the debugger, bloating the .o file, and may lead to a compilation error (e.g., `Fatal error: can't write 326 bytes to section .text of FIMS.o: 'file too big'
as: FIMS.o: too many sections (35851)`). Note that useful interactive tests should be converted into {testthat} or googletest tests.
### R testthat template
The format for an individual testthat test is:
```r
test_that("TestName", {
...test body...
})
```
Multiple testthat tests can be put in the same file.
## Test case documentation template and examples
A testing plan must be developed while designing (i.e., before coding)
new FIMS features or Rcpp modules. Please update the test cases in the [FIMS/tests/milestoneX_test_cases.md file](https://github.com/NOAA-FIMS/FIMS/tree/main/tests) (e.g., FIMS/tests/milestone1_test_cases.md).
This testing plan is documented using the test case documentation
template below.
### Test case documentation template
Individual functional or integration test cases will be designed
following the template below.
- *Test ID*. Create a meaningful name for the test case.
- *Features to be tested*. Provide a brief statement of test
objectives and description of the features to be tested. (Identify the
test items following the [[FIMS software design specification
document]{.ul}](https://docs.google.com/document/d/1iSEhJqcpSD269QdABeDE4aBZGqGcBrIrLnS7eMkSYv0/edit?usp=sharing)
and identify all features that will not be tested and the rationale for
exclusion)
- *Approach*. Specify the approach that will ensure that the features
are adequately tested and specify which type of test is used in this
case.
- *Evaluation criteria*. Provide a list of expected results and
acceptance criteria.
  - Pass/fail criteria. Specify the criteria used to determine
whether each feature has passed or failed testing.
  - In addition to setting pass/fail criteria with specific
tolerance values, documentation that simply presents the outputs of
some tests may be useful if the tests require additional
computations, simulations, or comparisons.
- *Test deliverables*. Identify all information that is to be
delivered by the test activity.
  - Test logs and automated status reports
### Test case documentation examples
#### General test case documentation
The test case documentation below is a general case to apply to many
functions/modules. For individual functions/modules, please make
detailed test cases for specific options, noting "same as the general
test case" where appropriate.
+-----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Test ID | General test case |
+=======================+=====================================================================================================================================================+
| Features to be tested | - The function/module returns correct output values given different input values |
| | |
| | - The function/module returns error messages when users give wrong types of inputs |
| | |
| | - The function/module notifies an error if the input value is outside the bound of the input parameter |
+-----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Approach | - Prepare expected true values using R |
| | |
| | - Run tests in R using testthat and compare output values with expected values |
| | |
| | - Push tests to the working repository and run tests using GitHub Actions |
| | |
| | - Run tests in different OS environments (windows latest, macOS latest, and ubuntu latest) using GitHub Actions |
| | |
| | - Submit pull request for code review |
+-----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Evaluation Criteria | - The tests pass if the output values equal to the expected true values |
| | |
| | - The tests pass if the function/module returns error messages when users give wrong types of inputs |
| | |
| | - The tests pass if the function/module returns error messages when user provides an input value that is outside the bound of the input parameter |
+-----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
| Test deliverables | - Test logs on GitHub Actions. Document results of logs in the feature pull request. |
+-----------------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+
#### Functional test example: TMB probability mass function of the multinomial distribution
+-----------------------+----------------------------------------------------------------------------------+
| Test ID | Probability mass function of the multinomial distribution |
+=======================+==================================================================================+
| Features to be tested | - Same as the general test case |
+-----------------------+----------------------------------------------------------------------------------+
| Approach | Functional test |
| | |
| | - Prepare expected true values using R function dmultinom from package 'stats' |
+-----------------------+----------------------------------------------------------------------------------+
| Evaluation Criteria | - Same as the general test case |
+-----------------------+----------------------------------------------------------------------------------+
| Test deliverables | - Same as the general test case |
+-----------------------+----------------------------------------------------------------------------------+
#### Integration test example: [[Li et al. 2021 age-structured stock assessment model comparison]{.ul}](https://doi.org/10.7755/FB.119.2-3.5)
+-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Test ID | Age-structured stock assessment comparison ([[Li et al. 2021]{.ul}](https://doi.org/10.7755/FB.119.2-3.5)) |
+=======================+================================================================================================================================================================================================================================================+
| Features to be tested | - Null case (update standard deviation of the log of recruitment from 0.2 to 0.5 based on [[Siegfried et al. 2016]{.ul}](http://dx.doi.org/10.1139/cjfas-2015-0398) snapper-grouper complex) |
| | |
| | - Recruitment variability |
| | |
| | - Stochastic Fishing mortality (F) |
| | |
| | - F patterns (e.g., roller coaster: up then down and down then up; constant F~low~, F~MSY~, and F~high~) |
| | |
| | - Selectivity patterns |
| | |
| | - Recruitment bias adjustment |
| | |
| | - Initial condition |
| | |
| | - (unit of catch: number or weight) |
| | |
| | - Model misspecification (e.g., growth, natural mortality, and steepness, catchability etc) |
+-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Approach | Integration test |
| | |
| | - Prepare expected true values from an operating model using R functions from [[Age_Structured_Stock_Assessment_Model_Comparison]{.ul}](https://github.com/Bai-Li-NOAA/Age_Structured_Stock_Assessment_Model_Comparison) GitHub repository |
+-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Evaluation Criteria | - Summarize median absolute relative error (MARE) between true values from the operating model and the FIMS estimation model |
| | |
| | - If all MAREs from the null case are less than 10% and all MARES are less than 15%, the tests pass. If the MAREs are greater than 15%, a closer examination is needed. |
+-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| Test deliverables | - In addition to the test logs on GitHub Actions, a document that includes comparison figures from various cases (e.g., Fig 5 and 6 from Li et al. 2021) will be automatically generated |
| | |
| | - A table that shows median absolute relative errors in unfished recruitment, catchability, spawning stock biomass, recruitment, fishing mortality, and reference points (e.g., Table 6 from Li et al. 2021) will be automatically generated |
+-----------------------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
#### Simulation testing: challenges and solutions
One challenge for comparing simulation results is
that changes to the order of calls to simulate will change the simulated
values. Tests may then fail simply because different random
numbers were used or because the order of the simulation changed during model
development, not because of a coding error. Several solutions could be used to address this
issue; please see the
[discussions](https://github.com/NOAA-FIMS/FIMS-planning/issues/25) on
the FIMS-planning issue page for details.
- Once we start developing simulation modules, we can use these two approaches
to compare simulated data from FIMS with a test:
  - Add a TRUE/FALSE parameter in each FIMS simulation module for
setting up a testing seed. When testing the module, set the parameter to
TRUE to fix the seed number in R and conduct tests.
  - If adding a TRUE/FALSE parameter does not work as expected, then
carefully check the simulated data from each component and make sure the
difference is not caused by a model coding error.
- FIMS will use `set.seed()` from R to set the seed. The {rstream}
package will be investigated if one requirement of the FIMS
simulation module is to generate multiple streams of random numbers, so
that distinct streams can be associated with different sources of
randomness. {rstream} was specifically designed to address the issue of
needing very long streams of pseudo-random numbers for parallel
computations. Please see [rstream
paper](https://www.iro.umontreal.ca/~lecuyer/myftp/papers/rstream.pdf)
and
[RngStreams](http://www-labs.iro.umontreal.ca/~lecuyer/myftp/streams00/)
for more details.