# Testing
## Introduction
Our testing framework includes several types of tests. Unit tests and functional tests are created for individual functions and modules, respectively: unit tests exercise each of the smallest components, while functional tests ensure that collections of small components function within the requirements. Integration tests verify that modules work well together. Run-time tests are checks integrated into the code to catch errors related to user input. Regression testing and platform compatibility testing are executed before pre-releasing FIMS to ensure that previously tested pieces still perform, after the code has been changed, on all platforms of interest. Beta testing is used to gather feedback from users (i.e., members of the FIMS Implementation Team and other users) during the pre-release stage. Finally, one-off testing is used for replicating and fixing user-reported bugs; some of these one-off tests may eventually be integrated into the permanent testing framework. Likewise, if an interactive test helped you build something while coding, it should be converted into a {testthat} or GoogleTest test.
FIMS uses GoogleTest to build a C++ unit testing framework and {testthat} to build an R testing framework. FIMS uses Google Benchmark to measure the real time and CPU time used for running the produced binaries. All required software for testing can be installed using the instructions in the [Software to install section](#software-to-install).
## C++ unit testing and benchmarking
Inside your cloned version of FIMS, the top-level `CMakeLists.txt` instructs CMake how to create the build files, including setting up Google Test. The Google Test code is in `tests/gtest`. Within this subdirectory is another `CMakeLists.txt` that contains additional specifications on how to register the individual tests.
### Build and run the tests
The following commands, run on the command line (note that Windows users cannot use Git Bash and must use a native Windows shell), execute the outlined steps and are needed to build and run the tests:
1. Generate the build system using `Ninja` as the generator, which creates the `build` directory.
1. Use CMake to build in the `build` directory, in parallel with 16 jobs (`--parallel 16` can be omitted or adjusted).
1. Run the tests using `ctest`, again in parallel with 16 jobs (`--parallel 16` can be omitted or adjusted).
```bash
cmake -S . -B build -G Ninja
cmake --build build --parallel 16
ctest --test-dir build --parallel 16
```
The output from running the tests should look something like the following:
```bash
Internal ctest changing into directory: C:/github_repos/NOAA-FIMS_org/FIMS/build
Test project C:/github_repos/NOAA-FIMS_org/FIMS/build
Start 1: dlognorm.use_double_inputs
1/5 Test #1: dlognorm.use_double_inputs ....... Passed 0.04 sec
Start 2: dlognorm.use_int_inputs
2/5 Test #2: dlognorm.use_int_inputs .......... Passed 0.04 sec
Start 3: modelTest.eta
3/5 Test #3: modelTest.eta .................... Passed 0.04 sec
Start 4: modelTest.nll
4/5 Test #4: modelTest.nll .................... Passed 0.04 sec
Start 5: modelTest.evaluate
5/5 Test #5: modelTest.evaluate ............... Passed 0.04 sec
100% tests passed, 0 tests failed out of 5
```
### Adding a C++ test
Create a file `dlognorm.hpp` within the `src` subfolder that contains a simple function:
```c
#include <cmath>

template<class Type>
Type dlognorm(Type x, Type meanlog, Type sdlog){
  Type resid = (log(x) - meanlog) / sdlog;
  Type logres = -log(sqrt(2 * M_PI)) - log(sdlog) - Type(0.5) * resid * resid - log(x);
  return logres;
}
```
Then, create `dlognorm-unit.cpp` in `tests/gtest` that has a test suite for the `dlognorm` function:
```c
#include "gtest/gtest.h"
#include "../../src/dlognorm.hpp"

// # R code that generates true values for the test
// dlnorm(1.0, 0.0, 1.0, TRUE) = -0.9189385
// dlnorm(5.0, 10.0, 2.5, TRUE) = -9.07679

namespace {

  // TestSuiteName: dlognormTest; TestName: DoubleInput and IntInput

  // Test dlognorm with double input values
  TEST(dlognormTest, DoubleInput) {
    EXPECT_NEAR(dlognorm(1.0, 0.0, 1.0), -0.9189385, 0.0001);
    EXPECT_NEAR(dlognorm(5.0, 10.0, 2.5), -9.07679, 0.0001);
  }

  // Test dlognorm with integer input values
  TEST(dlognormTest, IntInput) {
    EXPECT_NEAR(dlognorm(1, 0, 1), -0.9189385, 0.0001);
  }

}
```
`EXPECT_NEAR(val1, val2, absolute_error)` verifies that the difference between `val1` and `val2` does not exceed the absolute error bound `absolute_error`. `EXPECT_NE(val1, val2)` verifies that `val1` is not equal to `val2`. See GoogleTest [assertions reference](https://google.github.io/googletest/reference/assertions.html) for more `EXPECT_` macros.
Finally, the test must be added to `tests/gtest/CMakeLists.txt` before running the tests. Add the following contents to the end of `tests/gtest/CMakeLists.txt` to enable testing in CMake, declare the C++ test binary you want to build (`dlognorm_test`), and link it to GoogleTest (`gtest_main`):
```cmake
add_executable(dlognorm_test
  dlognorm-unit.cpp
)

target_include_directories(dlognorm_test
  PUBLIC
    ${CMAKE_SOURCE_DIR}/../
)

target_link_libraries(dlognorm_test
  gtest_main
)

include(GoogleTest)
gtest_discover_tests(dlognorm_test)
```
Now you can build and run your test using the [instructions in Build and run the tests](#build-and-run-the-tests). The output when running `ctest` will look something like the following; note that there is a failing test:
```bash
Internal ctest changing into directory: C:/Users/Kathryn.Doering/Documents/testing/FIMS/build
Test project C:/Users/Kathryn.Doering/Documents/testing/FIMS/build
Start 1: dlognorm.use_double_inputs
1/7 Test #1: dlognorm.use_double_inputs ....... Passed 0.04 sec
Start 2: dlognorm.use_int_inputs
2/7 Test #2: dlognorm.use_int_inputs .......... Passed 0.04 sec
Start 3: modelTest.eta
3/7 Test #3: modelTest.eta .................... Passed 0.04 sec
Start 4: modelTest.nll
4/7 Test #4: modelTest.nll .................... Passed 0.04 sec
Start 5: modelTest.evaluate
5/7 Test #5: modelTest.evaluate ............... Passed 0.04 sec
Start 6: dlognormTest.DoubleInput
6/7 Test #6: dlognormTest.DoubleInput ......... Passed 0.04 sec
Start 7: dlognormTest.IntInput
7/7 Test #7: dlognormTest.IntInput ............***Failed 0.04 sec
86% tests passed, 1 tests failed out of 7
Total Test time (real) = 0.28 sec
The following tests FAILED:
7 - dlognormTest.IntInput (Failed)
Errors while running CTest
Output from these tests are in: C:/Users/Kathryn.Doering/Documents/testing/FIMS/build/Testing/Temporary/LastTest.log
Use "--rerun-failed --output-on-failure" to re-run the failed cases verbosely.
```
### Debugging a C++ test
There are two ways to debug a C++ test: interactively using `gdb` or via print statements.
To debug C++ code (e.g., a segmentation fault or memory corruption) using `gdb`:
```bash
cmake -S . -B build -G Ninja -DCMAKE_BUILD_TYPE=Debug
cmake --build build --parallel 16
ctest --test-dir build --parallel 16
gdb ./build/tests/gtest/population_dynamics_population.exe
c                    # continue without paging
run                  # run to see which line of code is broken
print this->log_naa  # for example, print this->log_naa to see the value of log_naa
print i              # for example, print i from the broken for loop
bt                   # backtrace
q                    # quit
```
To debug C++ code with print statements, update the C++ code in the desired `.hpp` file by adding `std::ofstream out("file_name.txt")` and using `out << variable;` to print out values of the variable. The output of the print statements will be in `FIMS/build/tests/gtest/debug.txt` (using the `debug.txt` file name from the example below) after you run the `cmake` and `ctest` calls from the [instructions in Build and run the tests](#build-and-run-the-tests).
```c
nfleets = fleets.size();
std::ofstream out("debug.txt");
out << nfleets;
```
More complex examples include text identifying the quantities:
```c
out << " fleet_index: " << fleet_index << " index_yaf: " << index_yaf << " index_yf: " << index_yf << "\n";
out << " population.Fmort[index_yf]: " << population.Fmort[index_yf] << "\n";
```
### Benchmark example
[Google Benchmark](https://github.com/google/benchmark) measures the real time and CPU time used for running the produced binary. We will continue using the `dlognorm.hpp` example. Create `dlognorm_benchmark.cpp` in `tests/gtest` with the following code, which runs the `dlognorm` function and uses `BENCHMARK` to see how long it takes.
```c
#include "benchmark/benchmark.h"
#include "../../src/dlognorm.hpp"

void BM_dlgnorm(benchmark::State& state)
{
  for (auto _ : state)
    dlognorm(5.0, 10.0, 2.5);
}

BENCHMARK(BM_dlgnorm);
```
Next, the following needs to be added to the end of `tests/gtest/CMakeLists.txt`:
```cmake
FetchContent_Declare(
  googlebenchmark
  URL https://github.com/google/benchmark/archive/refs/tags/v1.6.0.zip
)
FetchContent_MakeAvailable(googlebenchmark)

add_executable(dlognorm_benchmark
  dlognorm_benchmark.cpp
)

target_include_directories(dlognorm_benchmark
  PUBLIC
    ${CMAKE_SOURCE_DIR}/../
)

target_link_libraries(dlognorm_benchmark
  benchmark_main
)
```
To run the benchmark, build with CMake, sending output to the `build` subfolder, and then run the created executable:
```bash
cmake --build build
build/tests/gtest/dlognorm_benchmark.exe
```
The output from `dlognorm_benchmark.exe` might look like this:
```bash
Run on (8 X 2112 MHz CPU s)
CPU Caches:
L1 Data 32 KiB (x4)
L1 Instruction 32 KiB (x4)
L2 Unified 256 KiB (x4)
L3 Unified 8192 KiB (x1)
***WARNING*** Library was built as DEBUG. Timings may be affected.
-----------------------------------------------------
Benchmark Time CPU Iterations
-----------------------------------------------------
BM_dlgnorm 153 ns 153 ns 4480000
```
### Clean up after running C++ tests
After running the examples above, the build generates files (i.e., source code, libraries, and executables) and saves them in the `build` subfolder. The example above demonstrates an "out-of-source" build, which puts generated files in a completely separate directory so that the source tree is unchanged after running tests. Using separate source and build trees reduces the need to delete files that differ between builds. If you would still like to delete the CMake-generated files, just delete the `build` folder, then build and run the tests again by repeating the commands in [Build and run the tests](#build-and-run-the-tests). The files in the `build` folder are covered by the FIMS repository's `.gitignore` file, so they should *not* be pushed to the FIMS repository.
For simple C++ functions like the examples above, we do not need to clean up the tests. Clean-up is only necessary in a few situations:
- If memory for an object was allocated during testing and not deallocated, the object needs to be deleted (e.g., `delete object`).
- If you used a GoogleTest test fixture to share the same data configuration across multiple tests, `TearDown()` can be used to clean up after each test before the fixture is deleted. See the [GoogleTest user's guide](https://google.github.io/googletest/primer.html#same-data-multiple-tests) for more details.
- If you do not want to keep any of the files produced by the example and want to completely clear uncommitted changes and files from the git repository, run `git restore .` to discard uncommitted changes to files tracked by git and `git clean -fd` to remove all untracked files and directories.
## Templates for GoogleTest testing
This section includes templates for creating unit tests and benchmarks.
These templates show the code that would go into the `.cpp` files in `tests/gtest`.
### C++ test templates
#### Unit test template
```c
#include "gtest/gtest.h"
#include "../../src/code.hpp"

// # R code that generates true values for the test

namespace {

  // Description of Test 1
  TEST(TestSuiteName, Test1Name) {
    ... test body ...
  }

  // Description of Test 2
  TEST(TestSuiteName, Test2Name) {
    ... test body ...
  }

}
```
#### Benchmark template
```c
#include "benchmark/benchmark.h"
#include "../../src/code.hpp"

void BM_FunctionName(benchmark::State& state)
{
  for (auto _ : state)
    // This code gets timed
    Function();
}

// Register the function as a benchmark
BENCHMARK(BM_FunctionName);
```
#### `tests/gtest/CMakeLists.txt` template
These lines are added each time a new test, e.g., `TestSuiteName1`, is added:
```cmake
# Add test suite 1
add_executable(TestSuiteName1
  test1.cpp
)

target_link_libraries(TestSuiteName1
  gtest_main
)

gtest_discover_tests(TestSuiteName1)
```
These lines are added each time a new benchmark, e.g., `benchmark1`, is added:
```cmake
# Add benchmark 1
add_executable(benchmark1
  benchmark1.cpp
)

target_link_libraries(benchmark1
  benchmark_main
)
```
## R testing
R tests are written using {testthat}, which can be [installed as an R package](https://testthat.r-lib.org/). More details on {testthat} can be found in the [testing chapter of R packages](https://r-pkgs.org/tests.html).
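If {testthat} (and {devtools}, which is used in the next subsection) is not already available, a one-time setup could look like the following sketch:
```r
# One-time installation of the R testing tooling from CRAN.
install.packages(c("testthat", "devtools"))
```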
### Testing FIMS locally
To test FIMS R functions interactively and locally, use `devtools::install()` rather than `devtools::load_all()`, because `load_all()` will turn on the debugger, bloating the `.o` file, and may lead to a compilation error (see [Fixing Fatal Error](#fixing-fatal-error) for more information).
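For example, a minimal sketch of that local workflow, run from the FIMS repository root (the test file name below is a placeholder, not an actual FIMS test file):
```r
# A sketch of a local R testing workflow; the test file name is hypothetical.
devtools::install()   # install FIMS rather than using devtools::load_all()
library(FIMS)         # attach the installed package
testthat::test_file(
  file.path("tests", "testthat", "test-rcpp-example.R")
)
```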
### Testing using `gdbsource`
You can interactively debug C++ code using `TMB::gdbsource()` in RStudio.
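A hedged sketch of how `gdbsource()` might be invoked, where `debug_fims.R` is a hypothetical script you write that loads FIMS and reproduces the crash:
```r
# Run an R script under gdb to obtain a C++ backtrace after a crash.
# "debug_fims.R" is a hypothetical script that loads FIMS and triggers the error.
TMB::gdbsource("debug_fims.R")                      # non-interactive: prints a backtrace
TMB::gdbsource("debug_fims.R", interactive = TRUE)  # step through the session interactively
```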
### Naming and file organization
- Group functions and their helpers together, i.e., the "main function plus helpers" approach.
- {testthat} tests that are a test of Rcpp code should be called `test-rcpp-[description].R`.
- Integration tests that do not have a corresponding `.R` file should use the following convention: `test-integration-[description].R`.
### R template
Naming conventions for {testthat} files follow the [tidyverse test convention](https://style.tidyverse.org/tests.html), in addition to using `test-rcpp-[description].R` for tests of Rcpp code and `test-integration-[description].R` for integration tests without a corresponding R function.
The format for an individual {testthat} test is:
```r
test_that("TestName", {
...test body...
})
```
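As a hypothetical, filled-in example, where `dlognorm()` stands in for an R-facing FIMS function rather than the package's actual interface:
```r
# Hypothetical example following the template above; dlognorm() is a stand-in,
# not the real FIMS API.
test_that("dlognorm matches stats::dlnorm on the log scale", {
  expect_equal(
    dlognorm(1.0, 0.0, 1.0),
    dlnorm(1.0, 0.0, 1.0, log = TRUE),
    tolerance = 1e-4
  )
})
```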
### Random numbers
Simulation results might depend on the order of calls, leading to tests that fail only because different random numbers were used or because the order of the simulation changed during model development (see [FIMS-planning discussion 25](https://github.com/NOAA-FIMS/FIMS-planning/issues/25) for details). Below are some potential solutions.
- Add a TRUE/FALSE parameter in each FIMS simulation module for setting up a testing seed. When testing the module, set the parameter to TRUE to fix the seed number in R and conduct the tests. If adding a TRUE/FALSE parameter does not work as expected, carefully check the simulated data from each component and make sure the discrepancy is not caused by a model coding error.
- Use `set.seed()` from R to set the seed, and investigate using {rstream} to generate multiple streams of random numbers so that distinct streams can be associated with different sources of randomness. {rstream} was specifically designed to address the need for very long streams of pseudo-random numbers in parallel computations. See [the rstream paper](https://www.iro.umontreal.ca/~lecuyer/myftp/papers/rstream.pdf) and [RngStreams](http://www-labs.iro.umontreal.ca/~lecuyer/myftp/streams00/) for more details.
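For example, a sketch of fixing the seed inside a test so that simulated data are reproducible, where `simulate_data()` is a hypothetical simulation function rather than an actual FIMS module:
```r
# A sketch of seeding within a {testthat} test; simulate_data() is hypothetical.
test_that("simulated data are reproducible with a fixed seed", {
  set.seed(123)
  first <- simulate_data(n_years = 10)
  set.seed(123)
  second <- simulate_data(n_years = 10)
  expect_identical(first, second)
})
```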