Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix run now errors and interrupts #192

Merged
merged 12 commits into from
Nov 25, 2024
Merged

Conversation

jcheng5
Copy link
Member

@jcheng5 jcheng5 commented Oct 25, 2024

Fixes #191

The introduction of unwind protection in Rcpp required us to stop
working so hard to catch and re-propagate different kinds of errors
across the R->C++->R boundary. This went unnoticed for so long
because the interrupt-related tests were already misbehaving on
CI and on CRAN. I've now refactored the tests so that the test
skipping can be narrowly scoped to just the interrupts, and even
those, I'd like to try again on CI.
@jcheng5 jcheng5 force-pushed the fix-run-now-errors-and-interrupts branch from f873492 to 2922839 Compare October 25, 2024 05:21
@jcheng5
Copy link
Member Author

jcheng5 commented Oct 25, 2024

Re: remaining failures for R 4.1 and 3.6 on Windows: On these builds, having Rcpp code throw an exception that's caught by the calling R code, seems to be escaping the try/catch somehow. But only for i386, not x64!?

@jcheng5
Copy link
Member Author

jcheng5 commented Oct 25, 2024

Minimal repro looks like the below. It actually works when called from RStudio, but fails when called from R at a terminal or if you put it in a file and call R -f test.R.

library(later)

Rcpp::cppFunction(
  depends = "later",
  includes = '
      #include <later_api.h>

      void oof(void* data) {
        throw std::runtime_error("This is a C++ exception.");
      }
    ',
  code = '
      void cpp_error(int value) {
        later::later(oof, 0, 0);
      }
    '
)

tryCatch(
  {cpp_error(1); later::run_now()},
  error = function(e) {message("got here")}
)

@shikokuchuo
Copy link
Collaborator

I notice you're not using rlang::interrupt(), so just in case this behaviour is relevant:

mirai::mirai(rlang::interrupt())[]
#> 'miraiInterrupt' chr ""

mirai::mirai(tools::pskill(Sys.getpid(), tools::SIGINT))[]
#> [1] TRUE

Created on 2024-10-29 with reprex v2.1.1

FYI mirai handles errors entirely on the R side with a combination of with withRestarts() and withCallingHandlers() here: https://github.com/shikokuchuo/mirai/blob/main/R/daemon.R#L204.

@shikokuchuo
Copy link
Collaborator

shikokuchuo commented Nov 14, 2024

UPDATE: R CMD check does work on this branch for me. It was failing with devtools::test() 😓 . It looks like relying on Rcpp Unwind Protect does give the desired result.

Using rlang::interrupt() did result in different behaviour when using the previous invoke wrapper.

@jcheng5
Copy link
Member Author

jcheng5 commented Nov 19, 2024

@shikokuchuo Thank you for the suggestion to use rlang::interrupt(). The pskill approach was definitely not doing the right thing on Windows, neither was sending Ctrl+C the way I was doing it.

After I removed those, I was able to remove almost all of the testthat skips. The only problem that still remains is that on Windows i386 only, native later::later callbacks that raise C++ exceptions that are raised via later::run_now blow through my tryCatch calls. (The reason why this only breaks on the R 4.1 and 3.6 CI runs is that R 4.2 removed i386!) If we have to ship with testthat skipping those tests for Windows i386, I can probably live with that.

src/later.cpp Show resolved Hide resolved
@jcheng5
Copy link
Member Author

jcheng5 commented Nov 20, 2024

@shikokuchuo Due to a record-setting windstorm here, I have no access to Windows at the moment; my only Windows PC is a desktop and I have to leave the house with my laptop to have internet. I also will be in meetings for most of today that will require my undivided attention.

Given that you already understand what's happening here much better than I do, and have also talked to @lionel-, can I leave it to you to decide the path forward for this release? As long as it works for all supported platforms including Windows 32-bit. And ideally passing the tests that are currently on this branch (rather than on main, which are using pskill and also skipping all the hard tests on ci and cran)

@shikokuchuo
Copy link
Collaborator

Oh wow - well stay safe! We've had our first snow here in London and it's bitterly cold... Have made some further progress with Lionel. The odd behaviour on Win32 was a bit of a red flag which is why I sought out Lionel to make sure we're doing the right thing! Will provide updates, which you can pickup whenever - we'll carry on trying to get to the best solution.

@shikokuchuo shikokuchuo mentioned this pull request Nov 20, 2024
* use Rcpp::uwindProtect()

* remove unwindProtect for Rcpp::function; update std::function<void(void)> calls to closely mirror their Rcpp std::function<SEXP(void)> counterparts

* use BEGIN_RCCP/END_RCPP macros inside Rcpp::unwindProtect

* keep const qualifier
src/later.cpp Show resolved Hide resolved
@shikokuchuo
Copy link
Collaborator

shikokuchuo commented Nov 21, 2024

I'm fine to merge this PR if you're ok with my replies @jcheng5. Do we need Winston (as maintainer) to kick off the revdepchecks / CRAN submission after that?

@shikokuchuo
Copy link
Collaborator

Finally just want to confirm if we're keeping the DetailedSummaryReporter for testthat. Probably doesn't hurt to do so, but does add R6 to 'suggests'.

@shikokuchuo
Copy link
Collaborator

shikokuchuo commented Nov 22, 2024

As a result of testing on a selection of rhub platforms, have made a couple of further changes to get this into shape:

  1. d0c8c0c fixes compiler warning seen on rchk machine here. The if () {} trick also silences the original "ignoring return value" warning without producing a new one. I've used it in code that's shipped to and tested on CRAN.
  2. cfa562b skips the final block of C++ error and exception handling on CRAN as this was always skipped before and adding it in is just asking for trouble. We gain nothing from doing so, and we still test on CI.

    I saw some strangeness on the very last test on one of the platforms here where the interrupt was returning an error. EDIT: I think this is a concrete manifestation of throwing exceptions across module boundaries being imperfect - likely related to some mismatch in the toolchains used. I tested that this goes away if I wrap the 'oof' function with BEGIN_RCPP/VOID_END_RCPP. I've not changed the test though so we can still see the effect of what we do on the later side.

    There is also a ubsan skip for this block, which is redundant, but just for clarity as there is also some noise here.

Copy link
Collaborator

@shikokuchuo shikokuchuo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm re-approving as I did make this commit a16ac2c. This never caused valgrind to actually fail, but it bugged me that the results weren’t clean. I managed to track it down to something as simple as finalizers not being run on exit (no change in actual behaviour) rather than anything more involved!

@jcheng5 jcheng5 merged commit e7bcc1b into main Nov 25, 2024
16 checks passed
@jcheng5
Copy link
Member Author

jcheng5 commented Nov 25, 2024

Sorry for the delay, finally back home with power and internet restored.

@jcheng5 jcheng5 deleted the fix-run-now-errors-and-interrupts branch November 25, 2024 08:02
@shikokuchuo
Copy link
Collaborator

Thanks! Hope everything is fine and you weren't otherwise affected.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Error handling is no longer correct
3 participants