java_library() needs provided_deps attribute #1402
Comments
One problem with that is that the entire transitive closure of your binary has to do that consistently. If any library has a normal deps edge on it, it will be linked in after all. With neverlink, it's easier to have consistency, because the annotation is on the rule, not the edge.

Both proposals (neverlink and compile_deps) break down if you sometimes want to link a library and sometimes don't, for example because you have multiple binaries that you want to deploy to different environments that provide different libraries out of the box. Both proposals encourage duplicating the dependency graph in that case, which is very costly. Unfortunately, it's not clear from the feature request why you'd want such a feature and how you'd use it, which makes it harder for us to come up with a workable proposal.

Another possible solution would be to use neverlink in combination with select. That'd allow you to say on the build command line whether you want a certain library linked into a binary or not. However, the downside is that you can't build two libraries differently in the same build (at least, not right now), though we're working on that.

Yet another solution is the deploy_env attribute on java_binary, which we have internally. It's specified at the java_binary rule, and it allows performing classpath subtraction at the binary level. I.e., it takes the usual runtime classpath, also computes a runtime classpath for the things referenced through the deploy_env attribute, and then builds a binary that only contains those entries of the first classpath that aren't on the second. That avoids most of the problems above: you can have multiple binaries with different environments in the same build, and it's guaranteed to be consistent for each binary.

The downsides are that you have to specify it on each binary, which could be expensive (or easy to forget) if you have lots of binaries that want to use the same deploy_env. It could potentially also slow down builds. Also, you have to avoid any manual munging of classpath jars: if you post-process parts of the classpath with a genrule or by piping it through a java_binary deploy_jar, then the classpath entries don't match and the subtraction doesn't do what you might have expected.

We could easily add the deploy_env attribute to java_binary in Bazel, since all the code already exists. I'm not sure why this specific attribute was left out when we open sourced the Java rules. |
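The classpath subtraction described in the comment above can be sketched in a few lines of Python. This is a toy model under stated assumptions, not Bazel's implementation: the target labels, the jar names, and the helpers `runtime_classpath` and `deploy_jar_entries` are all made up for illustration, and classpath entries are compared by jar name only.

```python
# Toy model of deploy_env-style classpath subtraction; not Bazel code.
# Each hypothetical target maps to (its own jar, its direct deps).
TARGETS = {
    "//app:server": ("server.jar", ["//lib:core", "//third_party:servlet_api"]),
    "//lib:core": ("core.jar", ["//third_party:guava"]),
    "//third_party:guava": ("guava.jar", []),
    "//third_party:servlet_api": ("servlet-api.jar", []),
    # Stand-in for a java_binary describing what the container already provides.
    "//env:container": ("container.jar", ["//third_party:servlet_api"]),
}


def runtime_classpath(label):
    """Return the jars reachable from `label`, depth-first, without duplicates."""
    ordered, seen = [], set()

    def visit(current):
        if current in seen:
            return
        seen.add(current)
        jar, deps = TARGETS[current]
        ordered.append(jar)
        for dep in deps:
            visit(dep)

    visit(label)
    return ordered


def deploy_jar_entries(binary, deploy_env):
    """Keep the binary's classpath entries that no deploy_env classpath contains."""
    provided = set()
    for env in deploy_env:
        provided.update(runtime_classpath(env))
    return [jar for jar in runtime_classpath(binary) if jar not in provided]


with_env = deploy_jar_entries("//app:server", ["//env:container"])
without_env = deploy_jar_entries("//app:server", [])
```

In this sketch `with_env` keeps server.jar, core.jar and guava.jar and drops servlet-api.jar, while `without_env` keeps all four entries. Because the subtraction is computed per binary, two binaries in the same build can use different environments.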
For reference facebook/buck#63 is the Buck feature request. This is actually a very important feature for some Java "applications". Stepping back from a lot of the discussion around Gerrit, look at a Java servlet "application". When writing a servlet to be deployed in a servlet container my sources must compile against the servlet spec JAR, but must not include the servlet spec jar in its WEB-INF/lib directory. To compile and deploy a Java servlet, you must use a compile_deps/provided_deps type of strategy in the build system to ensure this particular dependency is visible at compile time for some sources, but is not caught into the transitive dependency graph when assembling all other necessary libraries for the application to run. Cryptography libraries also can fall into this category. E.g. in Gerrit we "work with" BouncyCastle if the end-user downloads and installs it next to Gerrit. If it is not present, we gracefully degrade and support a reduced set of functionality that is still useful. For simplicity in development, some source classes are compiled against BouncyCastle APIs, but we break the dependency graph to prevent BC from going into the release binary. I guess both of those cases might be workable by deploy_env telling the java_binary rule that it provides those things, and therefore should not be included in the transitive closure output. A downside of this approach is it forces us to construct a java_binary for something that maybe should be a java_library instead. E.g. when we package Gerrit, we don't shade everything into a single flat JAR; we collect the dependencies into unique JAR names under WEB-INF/lib and let the container deal with a pile of (mostly original) JARs in the classpath, rather than a single flattened JAR. 
It's also awkward for Gerrit plugin developers to specify the Gerrit API they are building against twice: once in the java_library as a dep, and again in their java_binary as a deploy_env to subtract out the very thing they had to add in as a dependency. It's somewhat cleaner to add the dep just once in the library target as a known part of the runtime environment and not have the build system force you to document it twice. If you look at the rest of the Java build tools space, I think every other system either has this, or has had to add it after the fact, because there are just too many corner cases where you need to clip the transitive dependency edge. |
java_binary does not generate a single flattened JAR by default, only if you request the _deploy.jar implicit output. However, it doesn't generate a WEB-INF/lib layout, but I have seen Skylark rules that do. For the gerrit plugin use case, (something like) neverlink seems like a better solution. You're suggesting to write something like this:
With neverlink, you'd instead write something like this:
And the gerrit-api target would specify neverlink (though note that neverlink currently only applies to a specific library, not to all libraries reachable through that library). This seems better because a) you can't accidentally get it wrong, and b) it's easier to automate with tools. Don't get me wrong, I'm not against adding more mechanisms to the Java rules to support such use cases, I'm just pointing out the advantages and disadvantages of different approaches. I don't think copying what someone else did is necessarily the right approach, but it's also not necessarily wrong. I prefer to make decisions based on technical arguments, not on the say-so from someone else. |
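The difference between annotating the edge and annotating the rule, raised earlier in the thread, can be illustrated with a small Python model. Everything here is hypothetical: the target names, the `linked_jars` helper, and the two keyword arguments standing in for an edge-level provided_deps and a rule-level neverlink. The walk keeps following the deps of a neverlink target; the thread does not settle that detail, so treat it as an assumption of the sketch.

```python
# Toy comparison of edge-level vs rule-level exclusion; not Bazel code.
def linked_jars(root, deps, provided_edges=frozenset(), neverlink=frozenset()):
    """Collect the targets that end up linked into `root`.

    provided_edges: (from, to) pairs that are skipped, modelling an edge annotation.
    neverlink: targets whose own jar is never linked, modelling a rule annotation.
    """
    linked, stack, seen = [], [root], set()
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        if current not in neverlink:
            linked.append(current)
        for dep in deps.get(current, []):
            if (current, dep) not in provided_edges:
                stack.append(dep)
    return sorted(linked)


DEPS = {
    "my_plugin": ["plugin_lib", "helper_lib"],
    "plugin_lib": ["gerrit-api"],
    "helper_lib": ["gerrit-api"],  # this library uses a normal deps edge
}

# Only one of the two edges to gerrit-api is marked as provided.
edge_based = linked_jars("my_plugin", DEPS, provided_edges={("plugin_lib", "gerrit-api")})
# The gerrit-api target itself is marked neverlink.
rule_based = linked_jars("my_plugin", DEPS, neverlink={"gerrit-api"})
```

In the edge-based run gerrit-api is still linked, because helper_lib reaches it through an unmarked edge. In the rule-based run it is excluded no matter how many libraries depend on it.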
So if we write:
and servlet-api and other-lib have neverlink=0, a "my_plugin" does not include gerrit-api, but does include servlet-api and other-lib? I do see the value in neverlink for this plugin-api case. It's certainly easier for the my_plugin author. |
Ok. We discussed offline and that sounds reasonable to cover that use case. Reprioritized accordingly. Hopefully we will get manpower to do it earlier. |
Currently too big files are published, because some unwanted transitive dependencies are included in the final artifacts. That will be fixed in follow-up change by using neverlink option in java_library rule or using provided_deps attribute that will be addded in future releases of Bazel: [1]. TEST PLAN: $ VERBOSE=1 tools/maven/api.sh install bazel $ VERBOSE=1 tools/maven/api.sh install buck * [1] bazelbuild/bazel#1402 Change-Id: Ie73d4ae34d96be7f97f6329c4c30c814f54688d5
Given that the feature request: [1], still wasn't implemented, expose neverlink artifact for every artifact per default. This is needed to support in Gerrit tree build for plugins, because every dependency can be potentially used from a plugin, in which case it must be used in neverlink form to avoid shipping it twice, in plugin artifact in addition to gerrit.war itself. With this change in place, we can write this build rule: gerrit_plugin( name = "verify-status", [...] provided_deps = [ "@commons_dbcp//jar:neverlink", ], [...] that would now work in both build mode: standalone (using bazlets) and in Gerrit tree mode. [1] bazelbuild/bazel#1402 Change-Id: I1240d25c576b13bd4d7450a0e5ba143df27a3d3a
Is this still going to be implemented? |
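I'd like to add another few use-cases where we need to be able to exclude specific targets from the final deployable jar. For example, we have a situation where we need to use our library with different runtimes, e.g. Hadoop. Some of those environments need to have a specific Jackson binary, others don't. This is why the neverlink solution doesn't work for us, as we need to have it excluded sometimes. Reading this thread, the deploy_env may be a good solution; unfortunately, as @ulfjack mentioned, it's an internal feature with no documentation to speak of. Our options right now are to either create a genrule that would crack open the produced deployable and manually remove the packages we'd like to remove (or use @johnynek's jarjar wrapper and use the zap rule for this purpose). This works, but feels non-bazelish, as we don't deal in the label/target level anymore, and have to manually provide a list of packages to remove. Another option is to use another top-level rule that would perform this classpath subtraction manually, then delegate it to the SingleJar utility that handles the rest. So instead of creating our own e.g.
    scala_assembly(
    name = "my-output-jar"
    for-target="//foo/bar/mybinary"
    exclude=["//com/quux/dependency"]
    )
it would be really useful to have the ability to specify an exclude_for_deployment list of targets on the java_binary level. cc @ittaiz |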
I think that, in addition to the style issue (non-bazelish), there are two
major challenges with using the solutions @hmemcpy describes:
1. People usually filter out jars that exist in the runtime, not
classes/packages. Choosing the available approach means people need to
“translate” between the paradigms.
2. If someone upgrades a dependency and the dependency changes or adds
packages, those might slip in, making upgrades more brittle.
@ulfjack,
Wdyt about the exclude_for_deployment attribute?
…On Thu, 19 Jul 2018 at 10:23 Igal Tabachnik ***@***.***> wrote:
I'd like to add another few use-cases where we need to be able to exclude
specific targets from the final deployable jar. For example, we have a
situation where we need to use our library with different runtimes, e.g.
Hadoop. Some of those environments need to have a specific Jackson binary,
others don't. This is why the neverlink solution doesn't work for us, as we
need to have it excluded *sometimes*.
Reading this thread, the deploy_env may be a good solution,
unfortunately, as @ulfjack <https://github.com/ulfjack> mentioned, it's
an internal feature with no documentation to speak of.
Our options right now are to either create a genrule that would crack open
the produced deployable, and manually remove the packages we'd like to
remove (or, use @johnynek <https://github.com/johnynek>'s jarjar wrapper
and use the zap rule for this purpose). This works, but feels
non-bazelish, as we don't deal in the label/target level anymore, and have
to manually provide a list of packages to remove.
Another option is to use another top-level rule that would perform this
classpath subtraction manually, then delegate it to the SingleJar utility
that handles the rest. So instead of creating our own e.g.
scala_assembly(
name = "my-output-jar"
for-target="//foo/bar/mybinary"
exclude=["//com/quux/dependency"]
)
it would be really useful to have the ability to specify
exclude_for_deployment list of targets on the java_binary level.
cc @ittaiz <https://github.com/ittaiz>
|
@ittaiz how's "exclude_for_deployment" different from the existing "deploy_env" attribute? It looks like the attribute isn't defined in Bazel, but the code is otherwise all there. I'm not sure who decided to exclude "deploy_env" from Bazel, but I generally think it's a mistake to introduce divergence between the rules. |
It may be enough. I couldn’t find it in Bazel.
Can I give it a list of labels and then the contents of the related jars
are filtered from the deploy jar?
If so that’s exactly what we want and we’ll be happy to use it.
…On Fri, 20 Jul 2018 at 13:19 Ulf Adams ***@***.***> wrote:
@ittaiz <https://github.com/ittaiz> how's "exclude_for_deployment"
different from the existing "deploy_env" attribute? It looks like the
attribute isn't defined in Bazel, but the code is otherwise all there. I'm
not sure who decided to exclude "deploy_env" from Bazel, but I generally
think it's a mistake to introduce divergence between the rules.
|
It's filtering by jar file, it does not unpack jars and filter their contents. |
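To make the "by jar file, not by contents" point concrete, here is a minimal Python sketch. The jar names, class names, and the helpers `subtract_jars` and `classes_on` are hypothetical and only illustrate the idea: whole classpath entries are removed, and a class bundled into a different jar is unaffected.

```python
# Toy model: each hypothetical jar maps to the class files it contains.
JARS = {
    "app.jar": {"org/example/App.class"},
    "jackson.jar": {"org/example/json/Mapper.class"},
    # A repackaged jar that happens to bundle a copy of the same class.
    "bundle.jar": {"org/example/Util.class", "org/example/json/Mapper.class"},
}


def subtract_jars(classpath, provided):
    """Drop whole classpath entries listed in `provided`; contents are never inspected."""
    return [jar for jar in classpath if jar not in provided]


def classes_on(classpath):
    """Union of the class files contained in the given jars."""
    found = set()
    for jar in classpath:
        found |= JARS[jar]
    return found


remaining = subtract_jars(["app.jar", "jackson.jar", "bundle.jar"], provided={"jackson.jar"})
still_present = "org/example/json/Mapper.class" in classes_on(remaining)
```

Here jackson.jar is gone from `remaining`, yet `still_present` is true because bundle.jar carries its own copy of the class. This is the same reason post-processed jars defeat the subtraction, as noted earlier in the thread.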
I was unclear. Does the attribute receive a list of labels?
Assuming it does, any idea what the diff would be for it to be in open-source Bazel?
…On Fri, 20 Jul 2018 at 14:15 Ulf Adams ***@***.***> wrote:
It's filtering by jar file, it does not unpack jars and filter their
contents.
|
See here: I haven't tried this (not even to compile), but my earlier code digging says it should just work. |
Hi @ulfjack, I'm trying to investigate whether bazel/src/main/java/com/google/devtools/build/lib/rules/java/JavaBinary.java Lines 121 to 122 in 4a20020
This returns an empty list. Is there anything I'm missing? How can I configure |
Are you referencing a java_binary in deploy_env? |
Oh wow, my bad! |
Ok, just to make sure I'm doing this correctly. Suppose I have a binary target like this:
My goal is to produce a deployable without the reference to
and adding After building However, it seems this will subtract all the transitive dependencies of |
Yes, that's right, unless you pass through a non-Java rule, like so:
In this case, original.jar ends up on the classpath, even though it is in the transitive closure of the deploy_env. |
I'm excited to find this issue, this is exactly what I need in order to use bazel to build jars that are plugins for a keycloak.org authentication server. |
@ulfjack the |
Note that another common use case for |
Hi, I'm trying to see how to add this, but an example is if you move Guava into Bazel google/guava#2850 under a certain WORKSPACE that contains other packages. Guava itself (guava-*.jar) is an OSGi bundle. Imagine there are several other packages in the workspace that are OSGi bundles, and they Though I'm not totally sure if this rule should be under java_library or there should be an osgi_bundle rule by itself. |
This adds support for `deploy_env`, an internal feature that supports transitive classpath subtraction to create deployable jars suited for different environments. This PR cherrypicks @ulfjack's commit, as described in #1402, and will help with scenarios describe there, as well as #5856. Closes #6013. PiperOrigin-RevId: 228690507
@s50600822 Did you build with the |
I don't think so. The jar name above (in the decompiler) also seems to say so (it's been a while). If you meant https://docs.bazel.build/versions/master/tutorial/java.html#package-a-java-target-for-deployment, I wouldn't have thought of it, since Java application deployment is usually not covered by the build tool (the logic isn't trivial). For example, in Maven the "deploy" semantic means adding the artifact to a remote repository (https://maven.apache.org/plugins/maven-deploy-plugin/). I think Bazel's strength is its incremental build behavior, and I was trying to use it to that extent. |
Well if you don't create a 'Fat Jar' with the |
From the beginning gerrit-acceptance-framework artifact shipped the same content as gerrit-plugin-api. Given that provided_deps attribute is not available in java_library rules, see issue: [1], there was no way to exclude libraries already shipped in gerrit-plugin-api. However with this commit: [2], that is available in recent Bazel versions, deploy_env attribute is exposed in java_binary rule. With it it is now possible to achieve transitive classpath subtraction to create deployable jars suited for different environments. Use this feature to subtract transitive dependencies already shipped in gerrit-plugin-api. As the consequence the size of the test framework artifact is reduced from ca. 70 MB to 6 MB. While already on it, remove some related TODO comments as well. [1] bazelbuild/bazel#1402 [2] bazelbuild/bazel@a92347e Change-Id: I0deada504648d27465f57021787885705635b8b4
Does deploy_env help in a case where you have a diamond dependency problem: A depends on B and C, B depends on C (possibly through a longer transitive chain), and I'd like to use deploy_env to remove B and its transitive dependencies from the *_deploy.jar, but not C (because A depends on C directly)? As far as I remember, regular Maven supported this case by marking B as provided while using C as a normal dependency. From what I can see with deploy_env (testing it), in this case it's not possible to use C from A. Example: building software that depends on scala.util.parsing (direct dependency) on a Hadoop platform. If I start including elements of the Hadoop platform in the java_binary that I use as deploy_env, then I cannot use scala.util.parsing any more in my software (I can compile, but I cannot run it, because scala.util.parsing gets cut out, as it's frequently a direct or transitive dependency of elements of the Hadoop platform). My gut feeling is that if I have a direct dependency, it should not be cut out with deploy_env. |
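The diamond described above can be reproduced with a short Python model. It is a sketch under assumptions, not Bazel code: the labels A, B, C come from the comment, "env" is a hypothetical java_binary used as deploy_env, and the `keep_direct` switch is a hypothetical variant modelling the behavior the commenter would prefer, which the thread does not say exists.

```python
# Toy model of the diamond A -> {B, C}, B -> C; not Bazel code.
DEPS = {
    "A": ["B", "C"],  # A depends on C directly
    "B": ["C"],       # and also reaches C through B
    "C": [],
    "env": ["B"],     # hypothetical java_binary listed in deploy_env
}


def closure(label):
    """Targets reachable from `label`, including itself, depth-first without duplicates."""
    ordered, seen = [], set()

    def visit(current):
        if current in seen:
            return
        seen.add(current)
        ordered.append(current)
        for dep in DEPS[current]:
            visit(dep)

    visit(label)
    return ordered


def deploy_contents(binary, deploy_env, keep_direct=False):
    """Subtract the deploy_env closures from the binary's closure.

    keep_direct=True is a hypothetical variant that never removes direct deps.
    """
    provided = set()
    for env in deploy_env:
        provided.update(closure(env))
    if keep_direct:
        provided -= set(DEPS[binary])
    return [target for target in closure(binary) if target not in provided]


observed = deploy_contents("A", ["env"])
wished_for = deploy_contents("A", ["env"], keep_direct=True)
```

With plain subtraction only A survives, because C is in the closure of the deploy_env and is removed even though A depends on it directly. Note that the hypothetical `keep_direct` variant would also keep B, which is a direct dependency of A, so "never cut direct dependencies" cannot by itself express "drop B but keep C".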
java_binary deploy_env should be used for this use case |
As discussed in https://gerrit-review.googlesource.com/77946 the java_library() rule needs to support a "provided_deps" or "compile_deps" attribute that does not include the provided libraries into the transitive dependency closure when the rule is used in another rule. E.g.:
When building app, it includes //third_party:foo, but not //third_party:ssl. However, lib is able to see //third_party:ssl symbols and its sources can compile against those.
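The requested semantics can be modelled in a few lines of Python. This is only a sketch of the proposal: provided_deps is the attribute being asked for in this issue, and the `TARGETS` table and the helpers `compile_classpath` and `packaged_closure` are hypothetical names used for illustration. The labels app, lib, //third_party:foo and //third_party:ssl are taken from the description above.

```python
# Toy model of the proposed provided_deps semantics; not Bazel code.
TARGETS = {
    "app": {"deps": ["lib"], "provided_deps": []},
    "lib": {"deps": ["//third_party:foo"], "provided_deps": ["//third_party:ssl"]},
    "//third_party:foo": {"deps": [], "provided_deps": []},
    "//third_party:ssl": {"deps": [], "provided_deps": []},
}


def compile_classpath(label):
    """Direct deps plus provided_deps (simplified: direct edges only)."""
    target = TARGETS[label]
    return target["deps"] + target["provided_deps"]


def packaged_closure(label):
    """Everything reachable through deps edges; provided_deps edges are not followed."""
    ordered, seen = [], set()

    def visit(current):
        if current in seen:
            return
        seen.add(current)
        ordered.append(current)
        for dep in TARGETS[current]["deps"]:
            visit(dep)

    visit(label)
    return ordered


lib_compile = compile_classpath("lib")
app_package = packaged_closure("app")
```

In this model lib compiles against both //third_party:foo and //third_party:ssl, while the closure packaged for app contains //third_party:foo but not //third_party:ssl.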