Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Switch to logback, SLF4J and logstash as logging framework #965

Open
roamingthings opened this issue Oct 29, 2022 · 18 comments
Open

RFC: Switch to logback, SLF4J and logstash as logging framework #965

roamingthings opened this issue Oct 29, 2022 · 18 comments
Assignees
Labels
priority:2 High - core feature or affects 60% of the users RFC status/staged-major-release This change will go with the next major release v2 Version 2
Milestone

Comments

@roamingthings
Copy link
Contributor

roamingthings commented Oct 29, 2022

Key information

  • RFC PR:
  • Related issue(s), if known:
  • Area: Logger
  • Meet tenets: Yes

Summary

Switch to logback GitHub repository, SLF4J and logstash as logging framework used by the Logging module.

Motivation

Currently the Lambda Powertools are using log4j2 as logging framework. Log4j2 is a very popular and widely used logging library in the Java ecosystem. It offers a lot of features that are most likely not required by most Lambda functions. One of these features (JNDI support) has gained a lot of popularity through CVE-2021-44228.

Another popular logging framework that derived from log4j1 is [logback]((https://logback.qos.ch). When using logbook it's also required to use SLF4J as an abstraction layer. For example logback is the default logging framework used by Spring Boot. It has a smaller jar-artefact footprint than log4j2.

logstash is another library that adds JSON formatting and structured logging to logback.

This would allow applications to add structured logging in a clean and convenient way without adding and removing additional keys explicitly (see proposal for an example). This is important because it reduces boilerplate code and the business logic gets more visibility.

Proposal

Replace the current usage of log4j2 with logback, SLF4J and also add logstash.

The interface of the Logging module would not have to change much. However, the configuration of the layout has to be changed to use logback and logstash.

As stated in the motivation section the usage of logstash would allow for clean implementation when using structured logging.

For example

LoggingUtils.appendKey("foo", "bar");
log.info("Hello World!")
LoggingUtils.removeKey("foo");

would become

log.info("Hello World! {}", kv("foo", "bar"));

Beside this common use-case logstash offers other convenient methods to generate these structured arguments. Structured Logging with Structured Arguments for more details

Dependency analysis

License

Logback

Dual license model:
Eclipse Public License v1.0
GNU Lesser General Public License version 2.1

SLF4J

MIT license

Logstash

Apache 2

Releases

New releases on a regular basis

Community

PR activity on the project is moderate on both projects
Project maintainer take part in active community discussions for logback on GitHub
Jira for issue tracking and logstash on GitHub

Size of library and dependencies (current versions as of 2022-10-29)

Jackson dependency is already part of Lambda Powertools for Java and required for both approaches.

logback + SLF4J + Logstash
  • slf4j-api-2.0.3.jar, 61.41 kB
  • logback-classic-1.4.4.jar, 266.09 kB
  • logback-core-1.4.4.jar, 576.91 kB
  • logstash-logback-encoder-7.2.jar, 408 kB
    Sum: ~1.3 MB

Log4J2

  • log4j-api-2.19.0.jar, 317.57 kB
  • log4j-core-2.19.0.jar, 1.86 MB
    Sum: ~2.2 MB
    Please note that Lambda Powertools requires more Dependencies like Jackson for JSON layout serialization.

Drawbacks

Changing the logging framework would mean a breaking change for existing applications and also effect other modules of the Lambda Powertools for Java.

It may also be possible that organizations don't want/allow using logback or prefer log4j over logback for other reasons. This could be mitigated by SLF4J which offers a unified API to use either logback or log4j2. This would still be a breaking change for existing applications.

Rationale and alternatives

Unresolved questions

  • Does logstash allow to define a layout that is similar to LambdaJsonLayout.json
@jeromevdl
Copy link
Contributor

Thanks for your RFC @roamingthings and the deep analysis. It definitely makes sense to get rid of log4j. The question I have and that I'd like to investigate is: do we actually need to have another implementation or could we provide the same level of service with just slf4j and let the users choose their implementation. Because some developers are using log4j anyway so adding logback (and potentially logstash) would increase the package size for them. A library (like Lambda Powertools) should not bring a log implementation and force its users to inherit from this...

@roamingthings
Copy link
Contributor Author

@jeromevdl I like your idea of only providing slf4j support and not forcing any framework to be used.
While writing this RFC this conclusion also came to my mind. So I would also prefer this option if it slf4j provides enough functionality required by the Logging Powertools. I could think of having documentation about how to integrate log4j, logback and/or logstash to support users.

@jeromevdl
Copy link
Contributor

@pankajagrawal16 Hi Pankaj, I'm currently looking at this RFC and browsing our current logging module, I was not sure what PowertoolsResolverFactory and PowertoolsResolver are used for. Is it still useful, even with the JsonTemplateLayout?

@pankajagrawal16
Copy link
Contributor

pankajagrawal16 commented Nov 10, 2022

Hey @jeromevdl

Yeah its used for the layout config. Basically if you look at https://github.com/awslabs/aws-lambda-powertools-java/blob/master/powertools-logging/src/main/java/software/amazon/lambda/powertools/logging/internal/PowertoolsResolver.java

getname method in there with powertools is mapped as a resolver in json layouts under resources folder. That how log4j new way of configuring custom layout is. It also helped us making sure that when we moved to new way of layout, the log format and keys remained same as before to be backwards compatibility for customers and having to do minimal changes.

Let me know if it makes sense?

@pankajagrawal16
Copy link
Contributor

Also, regarding this rfc, just my two cents. Supporting slf4j was heavily discussed during initial implementation and was decided to start with log4j2 as it practically support everything that slf4j has to offer and more.

That being said, it was also considered if there is community interest in future, we could easily plugin slf4j support without need to rewrite and get rid of log4j rather having another option for customer to pick one or the other.

This has benefits that the customers who are already heavily invested in log4j2 (which there are many) won't need to deal with any breaking change. Also if slf4j support is plugged inn, it reaches even wider set of customers without requiring them do alot of changes.

@driverpt
Copy link

driverpt commented Mar 21, 2023

IMHO, PowerTools should be as agnostic as possible, JUL should be used for Logging as it's the Standard in the Java Language.
The Implementer should have the ability to choose the underlying Logging Backend via SLF4J. E.g.: If the Implementer wants to use Logback, he can just bridge all calls using jul-to-slf4j.jar.

@garretwilson
Copy link

garretwilson commented Jun 5, 2023

Please see related #1180 for updating the outdated Log4J SLF4J implementation which the library brings in automatically.

@garretwilson
Copy link

garretwilson commented Jun 5, 2023

Here's another reason to switch from Log4j to Logback, or at least switch to pure SLF4J and allow the developer to choose the logging implementation. Log4j has some complicated under-the-hood stuff using sun.reflect.Reflection.getCallerClass. (You know it can't end well when a library is using something in a sun.* package.) See LOG4J2-2537: Log4j 2 Performance issue with Java 11. If you are using Maven Shade Plugin on your JAR (as the AWS docs say to do), then supposedly you can tiptoe around this using <Multi-Release>true</Multi-Release> with a ManifestResourceTransformer; see Is log4j2 compatible with Java 11?.

Having fun yet? Well it gets funner. Unfortunately AWS Lambda just unzips the JAR file, paying no attention to the multi-release part. Read AWS Lambda using incorrect classfiles from a Multi-Release JAR?.

So we still wind up with this warning in our AWS Lambda logs when using Log4J:

WARNING: sun.reflect.Reflection.getCallerClass is not supported. This will impact performance.

And as mentioned, LOG4J2-2537, this will actually affect performance. So basically AWS Lambda logging is forcing us to reduce performance if we use Log4J (and AWS Lambda PowerTools is forcing us to use Log4J, thereby forcing us to reduce performance.)

Supposedly one workaround is to use the Maven Assembly Plugin, but the person who proposed this hasn't tested it.

There was even a question about this on aws/aws-lambda-java-libs#204 but the ticket was closed with no solution.

I'm more than ready to get away from Log4j.

@garretwilson
Copy link

garretwilson commented Jun 5, 2023

I've found an ugly-but-it-works way to get the latest Log4j 9+ classes included in a shaded JAR and avoid the sun.reflect.Reflection.getCallerClass warning. First remove any <Multi-Release>true</Multi-Release> configuration you've had for your ManifestResourceTransformer. Then add this to your Maven Shade Plugin <configuration>:

<relocations>
  <relocation>
    <pattern>META-INF/versions/9/org/apache/logging/log4j/</pattern>
    <shadedPattern>org/apache/logging/log4j/</shadedPattern>
  </relocation>
</relocations>

More discussion here. Basically this copies over the Java 9+ classes that are in the Java 9 section of the multi-release POM, replacing the versions targeting older Java versions (which were causing this warning).

The bigger issue is that AWS Lambda needs to add support for multi-release JARs.

So I'll bet few if any people here knew that following the AWS docs and using Log4j with the Maven Shade Plugin was leaving everyone using Java 9+ with old pre-Java 9 versions of some Log4j classes, huh?

@driverpt
Copy link

driverpt commented Jun 6, 2023

@garretwilson , there's still an issue with that approach. It's opinionated. You're pretty much telling everyone that they would be forced to use Logback

@garretwilson
Copy link

garretwilson commented Jun 6, 2023

there's still an issue with that approach. It's opinionated. You're pretty much telling everyone that they would be forced to use Logback

@driverpt , I'm a little confused. With what approach? Are you talking about something I wrote, or something somebody else wrote? Right now PowerTools forces us to use Log4J. I don't think that the library should force the developer to use Log4J or to use Logback; in my opinion it should use the SLF4J API.

I merely provided one more reason not to force the user to use Log4j by recounting a drawback that nobody probably even realized of using Log4j. After more research I found a workaround to that drawback, so I reported it. Otherwise I would prefer that none of my libraries try to tell me what logging implementation I should use.

@jeromevdl
Copy link
Contributor

Hi @garretwilson @driverpt, thanks for your comments. I was like you and wanted to only use the SLF4J but we actually need an implementation in our code.

I've started working on this, you can have a look at this PR #995. Happy to get your thought on it. The idea is to let the developer choose between log4j / logback (eventually others)

@garretwilson
Copy link

we actually need an implementation in our code

What specifically do you need an implementation for? Wouldn't you just be collecting special AWS-related information, but and then sending that to the SLF4J APIs? What would require you to call a logging implementation? Is there somewhere this is explained in written form?

@garretwilson
Copy link

garretwilson commented Jun 6, 2023

Let me share something for brainstorming. (I'm not trying to convince you to use my own libraries—I'm suggesting an option, which at the least may give you some ideas.) I created a library Clogr which is a very thin layer on top of SLF4J. It basically just makes it even easier to get an SLF4J logger. But in addition it provides a few additional methods to do things like setting the log level in an implementation-agnostic way (as this is not something SLF4J provides).

If you come across things that you need to access at a low level in Log4j/Logback, if you think it's something that could be generalized, let me know. I might be interested in integrating an abstraction layer for it in Clogr. That way all the general code could be targeted to the SLF4J API, and the stuff that needs to know about an implementation could use the additional Clogr abstractions. (This is likely what you'll be doing here, but in a general library rather than as a one-off for AWS Lambda. Plus it could be maintained separately.)

Otherwise, if you roll it all yourself, you might considering using service loaders, which is the most correct way to do a lot of this dynamic, pluggable implementation stuff. That's what Clogr uses (and what SLF4J 2.x uses as well, which is why #1180 is so important).

@garretwilson
Copy link

garretwilson commented Jun 6, 2023

Ah! So your LoggingManager interface is exactly what I already provide in Clogr! See io.clogr.LoggingConcern. We had precisely the same need! (I first released Clogr back in 2016.)

Full disclosure: I never got around to adding the Log4J implementation of LoggingConcern, but that's easily done. But otherwise it looks like exactly the same thing. Clogr doesn't have a "reset log level" method. (Clogr allows some other benefits, such as a Clogged interface to allow you to get a logger on the fly simply by saying getLogger().)

Is there anything else other than setting the log level that you need to access the logging implementation for?

@garretwilson
Copy link

garretwilson commented Jun 6, 2023

I'll look more closely at the PR in a day or two. (I'm behind on my project because I spent half of yesterday digging into the Log4j warnings I mentioned above.)

Otherwise if you're interested in using the Clogr io.clogr.LoggingConcern, let me know and I'll release an update with log-level reset and a Log4j implementation. That way you wouldn't have to maintain the abstraction interface and service loader framework, since they are already written. And I would be happy to have an official user of Clogr. 😄

@jeromevdl jeromevdl added priority:2 High - core feature or affects 60% of the users and removed triage labels Jul 17, 2023
@scottgerring scottgerring moved this from Backlog to Working on it in Powertools for AWS Lambda (Java) Sep 28, 2023
@jeromevdl
Copy link
Contributor

@jeromevdl jeromevdl moved this from Working on it to Coming soon in Powertools for AWS Lambda (Java) Dec 11, 2023
@jeromevdl jeromevdl added the status/staged-major-release This change will go with the next major release label Dec 11, 2023
@scottgerring scottgerring added this to the v2 milestone Dec 21, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
priority:2 High - core feature or affects 60% of the users RFC status/staged-major-release This change will go with the next major release v2 Version 2
Projects
Status: Coming soon
Development

No branches or pull requests

8 participants