Add tests for splitting with equal sized windows, restructure tests #222

joba00002 · 2022-03-14T14:05:03Z

Addresses issue #141
Additionally, test-split.R has been restructured

Signed-off-by: Jonathan Baumann [email protected]

Prerequisites

I adhere to the coding conventions (described here) in my code.
I have updated the copyright headers of the files I have modified.
I have written appropriate commit messages, i.e., I have recorded the goal, the need, the needed changes, and the location of my code modifications for each commit. This includes also, e.g., referencing to relevant issues.
I have put signed-off tags in all commits.
I have updated the changelog file NEWS.md appropriately.
I have checked whether I need to adjust the showcase file showcase.R with respect to my changes.
The pull request is opened against the branch dev.

Description

Changelog

joba00002 · 2022-04-14T15:24:44Z

The length in the splitting info used to be set incorrectly when splitting based on a number of windows, I fixed this a few weeks ago.

@bockthom We said that stating the length in seconds, as is the default (and current) behaviour, is not ideal and we would rather have an approximation that is easier to read.
However, I have not been able to find any function that would achieve this.

joba00002 · 2022-04-14T15:27:15Z

I also stumbled across this in test-split.R:

##
## TODO
##

## - net.conf$update.values(list(pasta = TRUE, synchronicity = TRUE))

It seems to me that this indicates something that still needs to be tested. Is there a corresponding issue?

bockthom · 2022-04-22T10:20:09Z

@bockthom We said that stating the length in seconds, as is the default (and current) behaviour, is not ideal and we would rather have an approximation that is easier to read. However, I have not been able to find any function that would achieve this.

Just a short answer for documentation reasons:
We would like to have a duration format which is both easily human readable and also reusable (e.g., passing the splitting length to another function which takes a duration as input parameter). This can be achieved by calling as.period on a duration object (since the string representation of the duration object itself cannot be reused as treating it as a duration ends up in NA).

bockthom · 2022-04-22T10:21:58Z

I also stumbled across this in test-split.R:
##
## TODO
##

## - net.conf$update.values(list(pasta = TRUE, synchronicity = TRUE))
It seems to me that this indicates something that still needs to be tested. Is there a corresponding issue?

No, there is no corresponding issue. But, this indeed still needs to be tested. Should be easily testable via parameterized test for tests which already check for pasta and synchronicity data (hopefully...).

bockthom

The new structure of the tests looks good, it is indeed more clear than it was before, thanks for restructuring @joba00002!

However, I found a few inconsistencies, please have a look at my individual comments below.

In addition, I am wondering if we have lost some of the tests?
If I look at the current dev branch and grep for test_that(, we have 18 tests in test-split and 11 tests in test-split-sliding-window, in sum 29 tests. However, if I grep on the four new test-split- files, I get 33 tests. However, there are 5 new tests on "equal sized windows" and 3 new tests due to separating "number window" and "activity amount" in 'split-network-activity-based', if I have counted correctly. So, in sum, we get 33 - 8 = 25 < 29. So, I am wondering whether we miss 4 tests or whether there is something wrong in my counting method. We definitively should make sure that we still have all the tests we had before.

tests/test-split-data-activity-based.R

tests/test-split-data-time-based.R

tests/test-split-network-time-based.R

tests/test-split-data-activity-based.R

tests/test-split-data-time-based.R

joba00002 · 2022-04-22T17:28:49Z

So, I am wondering whether we miss 4 tests

No, I don't think so. There are 4 tests still in test-split.R, because they did not make sense to move to any other files (although we could consider renaming this file).
If I understand you correctly, you did not count these 4.

bockthom · 2022-04-22T17:42:50Z

There are 4 tests still in test-split.R, because they did not make sense to move to any other files (although we could consider renaming this file). If I understand you correctly, you did not count these 4.

You are right. I assumed that test-split.R is empty now, but it actually is not. I would appreciate it if we could find another name for this file. At least, 3 of the 4 tests deal with data and networks (if have identified that correctly), so we could think about something like test-split-data-and-networks.R or something similar. However, I am not sure if the 4th test also is in line with such a name. In addition, the comment at the beginning of test-split about "splitting raw data" also seems to be outdated or somehow misleading, maybe we can also rephrase that one.

bockthom · 2022-04-28T23:41:12Z

@joba00002 I did not have a look at your most recent commits yet, but I just saw that your pull request seems to be based on the wrong branch, as there are 14 merge commits appearing in this PR which originate from version releases on the master branch but do not appear on the dev branch. So, please base your changes onto the dev branch, not onto the master branch. Thanks!

Add tests for split.data.time.based, split.network.time.based and split.networks.time.based with the number.windows parameter. Signed-off-by: Jonathan Baumann <[email protected]>

Move time period calculation to new function get.time.period.by.amount, which is called by generate.date.sequence for splitting and by split.data.time.based to set the splitting information correctly. Adjust tests accordingly. Signed-off-by: Jonathan Baumann <[email protected]>

In test-split-equal-sized-windows.R, add explicit checks of the splitting info in the test for split.networks.time.based. Signed-off-by: Jonathan Baumann <[email protected]>

In test-split-equal-sized-windows.R, use get.date.from.string from util-misc instead of calling as.POSIXct directly for consistency. Signed-off-by: Jonathan Baumann <[email protected]>

Move tests for split.data.time.based, split.data.activity.based, split.network.time.based and split.network.activity.based to new, respective files for each. In the process, move all tests from test-split-sliding-windows.R and test-split-equal-sized-windows.R to the proper files also, to have a consistent structure throughout. Signed-off-by: Jonathan Baumann <[email protected]>

Splitting by activity amount and splitting by number of windows are now seperate tests instead of one large one. Signed-off-by: Jonathan Baumann <[email protected]>

Converting from duration to time period yields a format that is human readable, exact and can be converted back to a duration. Signed-off-by: Jonathan Baummann <[email protected]>

Fix inconsistencies and improve comments and test names in test-split.R, test-split-data-activity-based.R, test-split-data-time-based.R, test-split-network-activity-based.R and test-split-network-time-based.R. Signed-off-by: Jonathan Baumann <[email protected]>

Affects tests for splitting data activity based with number.windows. Previously, tests for ignoring sliding.window in this case were included in the tests for sliding.window, instead, they are now parameterized tests in the number.windows category. Signed-off-by: Jonathan Baumann <[email protected]>

Now that many tests have been moved to new files, update the copyright information so that each developer is listed in the files with functions they worked on. Signed-off-by: Jonathan Baumann <[email protected]>

joba00002 · 2022-04-29T07:32:43Z

@bockthom I must have picked the wrong branch to rebase onto. Should be fixed now.

bockthom · 2022-04-29T12:47:36Z

Should be fixed now.

Yes, thanks!

Is this PR already complete or are you planning to add further commits?
If it is complete (or you think that it may be complete 😉), then we will review it. But as it is still marked as a "Draft PR", we will not automatically review it unless you ask us to do so or convert to a non-draft PR.

joba00002 · 2022-04-29T13:53:37Z

I know that the changelog file still needs to be updated, but I'll do after a review.

I originally also wanted to add tests for the TODO item I found (pasta and synchronicity data), but I think it makes more sense to do that in another PR. I will mark this one as ready now.

hechtlC

Looks good to me. Thank you for the nice additions and cleanups @joba00002!

As for renaming the test-split.R file: I would suggest something like test-split-general.R or test-split-misc.R. These are not the best but the tests in here can not sensibly be grouped, except for the miscellaneous group. What do you think @bockthom and @joba00002?

Other than that the pull request can be merged once @bockthom approves.

Since most tests are now in other files, this file being called test-split.R was somewhat confusing. Signed-off-by: Jonathan Baumann <[email protected]>

List changes to util-split.R under "Fix". Signed-off-by: Jonathan Baumann <[email protected]>

bockthom

Thanks for renaming, I agree with the the new name of the former test-split.R 😄

I started reviewing this PR, but unfortunately, I was not able to finish my review yet.
But I already stumbled over two issue sI would already like to mention:

(1) The test pipeline fails in two of your new tests on equally sized windows (for R version 3.4) as you can see in the drone CI pipeline. According to the logs, there is used a expect_identical where the compared objects are not identical but equal. I was not able to locate which tests and particular checks are meant, but could you please check in the drone logs and check whether replacing expect_identical by expect_equal at the two particular lines would make sense and potentially fix that problem? Maybe we can also discuss this tomorrow. If it does not make sense to replace these two statements, we might just ignore it and maybe we will stop testing R 3.4... but let's see...

(2) I spotted one code snippet where the indentation looks a bit weird. Please have a look at the comment below.

I try to continue my review tomorrow before our meeting, but I don't expect to spot additional issues. But we'll see...

util-split.R

Fix indentation in util-split.R Signed-off-by: Jonathan Baumann <[email protected]>

Fix copyright headers in test-split-*.R Signed-off-by: Jonathan Baumann <[email protected]>

"Split a list of networks time-based with equal-sized windows" fails on R3.4 when using expect_identical. Therefore, use expect_equal instead. Signed-off-by: Jonathan Baumann <[email protected]>

joba00002 · 2022-05-06T21:07:07Z

expect_equal seems to work.
I suspect the issue is something with comparing POSIXct values, but I don't understand what the difference is, since the documentation of test_that does not go into detail.

bockthom · 2022-05-07T12:15:14Z

expect_equal seems to work. I suspect the issue is something with comparing POSIXct values, but I don't understand what the difference is, since the documentation of test_that does not go into detail.

Thanks for investigating this problem. I also had a look at the test logs (where you printed the structure of the objects to be compared) and further search for potential causes for the test failure, but I also couldn't make out anything. I also checked the R release notes for changes between R 3.4 and R 3.5 to find changes related to our problem. Between R 3.4 and R 3.5, they have indeed changed something regarding POSIXct and also regarding functions of S4 classes which are potentially used by testthat. So, it could be the case that expect_identical works on more recent R versions due to these changes.

Anyway, we are fine with expect_equal here. So I will merge right away (and if we will decide to not support R 3.4 anymore somewhere in the future, we can easily find and adjust the one statement due to your comment mentioning R 3.4 😄 )

joba00002 changed the title ~~Add test-split-equal-sized-windows.R~~ Add tests for splitting with equal sized windows, restructure tests Apr 14, 2022

bockthom reviewed Apr 22, 2022

View reviewed changes

joba00002 force-pushed the eq_windows_tests branch from 56fb9bb to ad1eb54 Compare April 22, 2022 16:52

joba00002 added 10 commits April 29, 2022 09:27

Add test-split-equal-sized-windows.R

6447695

Add tests for split.data.time.based, split.network.time.based and split.networks.time.based with the number.windows parameter. Signed-off-by: Jonathan Baumann <[email protected]>

Add explicit check of splitting information when splitting networks

e46cb8c

In test-split-equal-sized-windows.R, add explicit checks of the splitting info in the test for split.networks.time.based. Signed-off-by: Jonathan Baumann <[email protected]>

Use get.date.from.string instead of as.POSIXct

55f9b4b

In test-split-equal-sized-windows.R, use get.date.from.string from util-misc instead of calling as.POSIXct directly for consistency. Signed-off-by: Jonathan Baumann <[email protected]>

Split tests in test-split-network-activity-based.R

e9159c3

Splitting by activity amount and splitting by number of windows are now seperate tests instead of one large one. Signed-off-by: Jonathan Baumann <[email protected]>

Fix format of split.length when determined by number.windows

b467a01

Converting from duration to time period yields a format that is human readable, exact and can be converted back to a duration. Signed-off-by: Jonathan Baummann <[email protected]>

Update copyright information to reflect moved tests

6b639cd

Now that many tests have been moved to new files, update the copyright information so that each developer is listed in the files with functions they worked on. Signed-off-by: Jonathan Baumann <[email protected]>

joba00002 force-pushed the eq_windows_tests branch from fdb686b to 6b639cd Compare April 29, 2022 07:28

joba00002 marked this pull request as ready for review April 29, 2022 13:53

hechtlC previously approved these changes May 2, 2022

View reviewed changes

Rename test-split.R to test-split-misc.R

27cad5f

Since most tests are now in other files, this file being called test-split.R was somewhat confusing. Signed-off-by: Jonathan Baumann <[email protected]>

joba00002 dismissed hechtlC’s stale review via 27cad5f May 3, 2022 11:24

Update NEWS.md

390fb53

List changes to util-split.R under "Fix". Signed-off-by: Jonathan Baumann <[email protected]>

bockthom reviewed May 3, 2022

View reviewed changes

util-split.R Outdated Show resolved Hide resolved

joba00002 added 2 commits May 3, 2022 21:01

Fix indentation

aca1aac

Fix indentation in util-split.R Signed-off-by: Jonathan Baumann <[email protected]>

Fix copyright headers and Typo in NEWS.MD

0070c10

Fix copyright headers in test-split-*.R Signed-off-by: Jonathan Baumann <[email protected]>

joba00002 force-pushed the eq_windows_tests branch from efabf57 to 0070c10 Compare May 6, 2022 20:30

Fix test-split-network-time-based for R 3.4

b9cc5e9

"Split a list of networks time-based with equal-sized windows" fails on R3.4 when using expect_identical. Therefore, use expect_equal instead. Signed-off-by: Jonathan Baumann <[email protected]>

bockthom approved these changes May 7, 2022

View reviewed changes

bockthom merged commit 38e815e into se-sic:dev May 7, 2022

joba00002 deleted the eq_windows_tests branch May 7, 2022 13:48

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add tests for splitting with equal sized windows, restructure tests #222

Add tests for splitting with equal sized windows, restructure tests #222

joba00002 commented Mar 14, 2022 •

edited

Loading

joba00002 commented Apr 14, 2022

joba00002 commented Apr 14, 2022

bockthom commented Apr 22, 2022

bockthom commented Apr 22, 2022

bockthom left a comment

joba00002 commented Apr 22, 2022

bockthom commented Apr 22, 2022

bockthom commented Apr 28, 2022

joba00002 commented Apr 29, 2022

bockthom commented Apr 29, 2022 •

edited

Loading

joba00002 commented Apr 29, 2022

hechtlC left a comment

bockthom left a comment

joba00002 commented May 6, 2022

bockthom commented May 7, 2022

Add tests for splitting with equal sized windows, restructure tests #222

Add tests for splitting with equal sized windows, restructure tests #222

Conversation

joba00002 commented Mar 14, 2022 • edited Loading

Prerequisites

Description

Changelog

joba00002 commented Apr 14, 2022

joba00002 commented Apr 14, 2022

bockthom commented Apr 22, 2022

bockthom commented Apr 22, 2022

bockthom left a comment

Choose a reason for hiding this comment

joba00002 commented Apr 22, 2022

bockthom commented Apr 22, 2022

bockthom commented Apr 28, 2022

joba00002 commented Apr 29, 2022

bockthom commented Apr 29, 2022 • edited Loading

joba00002 commented Apr 29, 2022

hechtlC left a comment

Choose a reason for hiding this comment

bockthom left a comment

Choose a reason for hiding this comment

joba00002 commented May 6, 2022

bockthom commented May 7, 2022

joba00002 commented Mar 14, 2022 •

edited

Loading

bockthom commented Apr 29, 2022 •

edited

Loading