Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle exceptions raised during log message formatting #12

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ruck94301
Copy link

Motivation -- see https://stackoverflow.com/questions/66587941/what-happens-if-a-python-logging-handler-raises-an-exception
Offering a way to de-risk the lazy eval could increase acceptance.

If the package attribute raiseExceptions is False, exceptions get silently ignored. This is what is mostly wanted for a logging system - most users will not care about errors in the logging system, they are more interested in application errors. ... (The default value of raiseExceptions is True, as that is more useful during development).

This is implemented by

  1. new package function raiseExceptions which keeps a persistent flag value (default true)
  2. new package function throwExceptions which provides an alias for raiseExceptions
  3. in five +logger/Logger methods, I wrapped the lazy eval in a try/catch.

If the package attribute
raiseExceptions is False, exceptions get silently ignored. This is what
is mostly wanted for a logging system - most users will not care about
errors in the logging system, they are more interested in application
errors.  ... (The default value of raiseExceptions is True, as that is
more useful during development).
@apjanke
Copy link
Member

apjanke commented Aug 31, 2022

Ah, interesting! I had actually overlooked this possibility of formatMessage and the actual jLogger calls raising errors during the lazy message evaluation. Doing something to avoid that sounds like a good idea.

Wrapping all these sections in try/catches that can quash the errors like you're doing here sounds like a good way to handle it.

Though maybe that should be the only behavior SLF4M supports, and we shouldn't offer a raiseExceptions control at all. My concern is that would be a package-global setting that affects control flow. And as a library, SLF4M could be used by multiple client packages in the same contexts, which might have different expectations about the library behavior. Let's say you've got libraries Foo and Bar, each of which takes SLF4M as a dependency, and there's an application Banana which uses both Foo and Bar in the same Matlab session. Library Foo enables raiseExceptions, but Bar has some code like this:

    try
        % ... do some work that writes data in a transaction ...
    catch err
        logging.error('Oops: %s %s %s', x, y, z);
        rollback_transaction();  % required for data integrity!
        rethrow(err);
    end

Now if Foo or Banana has enabled logging.raiseExceptions, Bar is going to corrupt some of its data.

The raiseExceptions control would be useful for debugging, but not necessary I think: A developer could get similar behavior to dbstop if error in combination with raiseExceptions by putting breakpoints inside the catch blocks of all the logging functions. A bit less convenient, but same functionality.

But if raiseExceptions is something the Python logging library offers, they probably know what they're doing, and can hopefully rely on client libraries to just not do that. Let me do a bit of reading on how this works in the Python and Java logging frameworks.

If we do provide raiseExceptions, its value should probably be false, since that's what users will generally want, especially in a production context. This is a behavior change from how SLF4M works now, but I think that's fine. We probably have a major-version release coming up soon anyway.

There's a bit of concern about performance: try/catch introduces overhead, and the logging framework tries to be low overhead. But that's probably fine here: this try/catch happens after the jLogger.isenabled() short-circuit test, so it's only incurred when a message is actually emitted, and the try/catch cost is probably low compared to the cost of generating the message string itself. I'll do a little benchmarking, but this is likely fine.

If we do provide a raiseExceptions control, its state should probably be stored in a Matlab appdata(0) item instead of a persistent variable, so that its setting survives a clear classes or something else that clears the Logger class definition. Those are things which can happen in development frequently.

Regardless, this lazy-eval error suppression sounds to me like something SLF4M should do. Thanks!

@apjanke apjanke self-assigned this Aug 31, 2022
@apjanke apjanke added the enhancement New feature or request label Aug 31, 2022
@apjanke apjanke added this to the 2.0.0 milestone Aug 31, 2022
@apjanke
Copy link
Member

apjanke commented Aug 31, 2022

(Even though raiseExceptions defaults to true in Python because that's useful for development, IMHO Matlab users tend to be less sophisticated at the software-development side of things, so I think a Matlab logging library should default to, or only provide, the "safe/best for production use" behavior.)

@apjanke apjanke changed the title implement raiseExceptions aka throwExceptions Handle exceptions raised during log message formatting Aug 31, 2022
@apjanke
Copy link
Member

apjanke commented Aug 31, 2022

Oh, and conversely: if raiseExceptions defaults to true, but is settable to false, some library or code using SLF4M might accidentally rely on that behavior for its control flow, and not realize that it might be turned off in a production scenario. Hmmm.

@apjanke
Copy link
Member

apjanke commented Aug 31, 2022

Hmm. On the other other hand, maybe some client application will be using logging to output important information that they'll use in their business processes, and if a log message output fails in production code execution, they want that to raise an error so that job errors out and it sends an alert to their ops team. That's not something you can do with debugger breakpoints, so maybe SLF4M should provide a raiseExceptions control to support that scenario. And just put a big warning message on it that library code should not enable raiseExceptions itself; only end applications, developers during development, and deployers should enable raiseExceptions, and rely on developers to do the right thing.

@apjanke
Copy link
Member

apjanke commented Aug 31, 2022

Also, I dunno about silently ignoring errors during message formatting. I think that error is something users would want to know about, so they know why log messages they were expecting didn't appear, or why something is failing mysteriously but there's no diagnostic log output. Especially because this isn't providing a generic error-hander hook like Python's logging module does.

I need to read more about this before making a decision. But in the mean time, what do you think about this as the general form of the try/catch code in the logging methods?

      try
        msgStr = formatMessage(msg, varargin{:});
        this.jLogger.info(msgStr);
      catch err
        errLogMsg = sprintf('ERROR: Caught an error during log message formatting. Error: %s', err.message);
        this.jLogger.error(errLogMsg);
        if logger.raiseExceptions
          rethrow(err)
        end
      end

That message formatting is pretty unlikely to error, since err.message is a simple charvec string. The "ERROR" is included in the message string itself so that regardless of what log message format is in place (that is, whether the log level is included), it still pops up as an obvious error in the log output.

This ignores the case of the jLogger.error(...) call itself raising an exception, like if the log destination was unreachable due to an IO or permissions error. I don't even know if that's a thing that happens.

And I think using rethrow instead of throw will result in better stack trace info in the raised messages.

Maybe this should instead throw a fresh exception, with a message that explicitly indicates that the error happened during SLF4M log message construction, and includes the original exception as a chained "cause" in the new exception. But I don't think we can actually do that in SLF4M, or at least would need to be clever about it, because the exception cause feature was introduced in a newer version of Matlab, and SLF4M wants to support Matlab releases going back to like R2015b.

(I know "ME" is the name that MathWorks likes to use for exception variables, but to my ears that sounds like the word "me", which is like "this" or "self" or "obj", which are very different roles for a variable. I prefer "err" here.)

more useful during development)."
%}

persistent flag
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's back the flag value with appdata, so that its state survives a clear classes or other action that clears the raiseException function definition. Something like this:

function out = raiseExceptions(newval)

persistent flag
if isempty(flag)
  flag = getappdata(0, 'SLF4M_raiseExceptions');
  if isempty(flag)
    flag = true;
  end
end

if nargin == 0
  out = flag;
else
  if ~(islogical(newval) && isscalar(newval))
    error('The raiseExceptions value must be a scalar logical; got a %s %s', mat2str(size(newval)), class(newval));
  end
  flag = newval;
  setappdata(0, 'SLF4M_raiseExceptions', flag);
end

end

(Oops: I was going to say that SLF4M's code base uses 2-space indents, but it looks like my code actually has a mix of 2-space or 4-space indents, on a per-file basis. And the .editorconfig file says it's 4 spaces. I need to pick one and make it all uniform. #13)

@ruck94301
Copy link
Author

4 responses :-)
(1)

Though maybe that should be the only behavior SLF4M supports, and we shouldn't offer a raiseExceptions control at all. My concern is that would be a package-global setting that affects control flow. And as a library, SLF4M could be used by multiple client packages in the same contexts, which might have different expectations about the library behavior.

  • right, I agree that there could be an entirely different design approach where logrecord formatting never throws exceptions.
  • But the package-global setting is not actually worrying me. In your example, if you leave the flag at the default (true), then the logic is unchanged from existing. If you elect to protect production code by setting flag to false, then you don't risk skipping the rollback -- on the contrary you ensure it. So in this example at least, Bar's behavior is improved!

(2)

The raiseExceptions control would be useful for debugging, but not necessary I think...
If we do provide raiseExceptions, its value should probably be false, since that's what users will generally want, especially in a production context.

  • As you point out, raiseExceptions isn't the only way to protect production code from unexpected logrecord formatting errors while noticing them during dev.
    If choosing betw (a) raiseExceptions is implemented, but default is false instead of true, or (b) logrecord formatting never throws exceptions, I would favor (b).

(3)

use appdata(0) instead of persistent variable.

  • ah, I was unfamiliar with this technique

(4)

about silently ignoring errors during message formatting. I think that error is something users would want to know about, so they know why log messages they were expecting didn't appear, or why something is failing mysteriously but there's no diagnostic log output.

  • I also worried about silently ignoring those errors. My instinct is just like yours, issue a replacement logrecord.
    BUT the logging frameworks and python logging framework are so evolved that I'm willing to disregard my instinct and just mimic an existing behavior that is well-considered.

@apjanke
Copy link
Member

apjanke commented Sep 6, 2022

But the package-global setting is not actually worrying me. In your example, if you leave the flag at the default (true), then the logic is unchanged from existing. If you elect to protect production code by setting flag to false, then you don't risk skipping the rollback -- on the contrary you ensure it. So in this example at least, Bar's behavior is improved!

But what if Bar has code like this:

   write_some_data_to_db(a);
   if data_is_invalid_or_incorrect(b)
       logger.error('Bad thing happened; data in b is bogus! Aborting!');
   end
   write_some_data_to_db(b);

This code relies on the logging framework's raiseExceptions being true for control flow, to ensure correctness. If Foo or Banana turns off raiseExceptions to "protect production", now this code sequence is going to end up writing incorrect data to the prod database.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants