Renaming tensorflow-tools to ndarray #69
Conversation
I'd still recommend keeping the `tensorflow-` prefix.
Hi @saudet, the whole purpose of renaming `tensorflow-tools` to `ndarray` is to decouple this library from TensorFlow itself.
If that's the case, then the package name should also be changed.
Not necessarily, since the artifact is still published under the `org.tensorflow` group.
OK, this PR is ready to be reviewed and merged. Of course, scrutinizing all the impacted files would be a dreadful task, so I'm summarizing the changes here:
As explained previously in this thread, the purpose of this change is to decouple the previous `tensorflow-tools` module from the rest of the TensorFlow Java API.
The prefix thing is just a nice convention, nothing more. If you leave out the `tensorflow-` prefix and decide to create a submodule in it, what are you going to name it? We should think a bit further down the road about these things so we don't end up renaming everything all the time.
What about the following convention:
We'd get the following directory structure:
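A plausible shape for that layout, judging from the jar names mentioned just below (illustrative, not the exact listing from the comment):

```
tensorflow-java/
├── core/       → builds core.jar
└── ndarray/    → builds ndarray.jar
```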
I don't really like it because we end up with JAR files named "core.jar" and "ndarray.jar", but it's better than naming modules randomly without following a convention like you're proposing.
I don't agree: in your last example, removing the prefix loses that distinction. Logically, I see the project structure at a high level as something more like this:
Adding a second level of nesting makes this clearer, in my opinion. Now, when you talk about adding submodules under `ndarray`:
On the other hand, if we end up having a single artifact that contains everything we need, then the current proposal is fine (where `ndarray` sits at the top level).
The module still gets published under the `org.tensorflow` group anyway.
Hehe, yes, we discussed that as well, if you remember, but we need to check how feasible it is to move this code out now that it has been submitted to that repository. I agree that it is just a small change for now; still, I think it can attract more potential users than keeping its current name. I've added this topic to the agenda for this Friday, as I feel there is more to be discussed about this change as well.
TensorFlow is open source under Apache v2. We can fork it and do whatever we want with it; there are no issues at all. What would cause problems is if we wanted to move the code to an organization that requires owning the copyright on the code. That would be problematic, because everyone who contributes code to TensorFlow keeps their copyright. Though organizations like Eclipse these days don't require copyright transfer, so that's usually alright.

In any case, I think trying to distance ndarray from TensorFlow would be a mistake. As I mentioned at bytedeco/javacpp#391 (comment), this library really becomes useful to end users when it starts supporting GPU memory and things like that. And to support that, we need to code ops natively, at which point we might as well use the existing ops from the core of TensorFlow, something that, BTW, is going to look a lot like PyTorch: https://pytorch.org/tutorials/beginner/blitz/tensor_tutorial.html

Because of that, regardless of how we name things here, if ndarray ends up supporting GPUs, TensorFlow for Java will become known as the NumPy of Java anyway, so why try to stop that?
I disagree that the library needs to support GPU memory to be useful. Most DL inference happens on CPUs at the moment, and decent multidimensional array support is a precondition for deep learning. NumPy doesn't support GPUs, and I would say it's been pretty successful, even before GPUs and deep learning took off.

The main thing we need is for everyone who is using tensors on the JVM to agree on a common interface, the same way people in Python have agreed on NumPy as the interface. Having hundreds of ndarray classes, all of which support a different set of operations, makes it very difficult for people to switch between frameworks to exploit their various strengths. Plus, the cognitive overhead for developers when they have to learn yet another way of doing indexing and other operations to pick up a new library isn't helpful.
The reason the Python community came up with NumPy in the first place is that the language is a lot slower than C/C++. Fortran never needed anything like NumPy, and most of its users still don't use any similar library. It's (almost) the same for Java.
Oh, sure, I fully agree. For example, even if most users use ndarray on their laptop CPUs, that's still useful, because their code will be able to run on server GPUs and TPUs. That's the whole point, in my opinion: Having the option of acceleration. |
How about we create a little poll for this?
Sure, except that pushes the cost of writing it onto the users of the library. We can write fairly fast linear algebra on CPU in Java; C2 and GraalVM EE have pretty good autovectorisation, but it requires tight code to get the autovectorisation to trigger in the right way. Having a CPU tensor library that everyone agrees on is still beneficial, especially if it has NUMA support.
Agreed, but large server CPUs are also an acceleration over laptops (especially with AVX-512 or Epyc 2s).
So I was thinking that the best way to make sure we won't rename this library later is to check how it could actually scale by adding a linear algebra layer. I want to do a few experiments first, and I'll come back to you with a proposal.
Using GPUs? What are you proposing, exactly?
No, I mean adding basic linear algebra operations to this API, accelerated by TensorFlow when dealing with tensors.
If you're wondering about performance on CPU, try out this sample code:
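A minimal sketch of the kind of naive Java matrix multiply such a benchmark pits against native code (the matrix size and timing harness here are illustrative assumptions, not the original sample's):

```java
// Naive single-threaded matrix multiply over flat float arrays.
// N = 1024 is an illustrative size, not the original benchmark's.
public class MatMulBench {
    static final int N = 1024;

    public static void main(String[] args) {
        float[] a = new float[N * N], b = new float[N * N], c = new float[N * N];
        java.util.Random rnd = new java.util.Random(42);
        for (int i = 0; i < N * N; i++) { a[i] = rnd.nextFloat(); b[i] = rnd.nextFloat(); }

        long start = System.nanoTime();
        for (int i = 0; i < N; i++) {
            for (int k = 0; k < N; k++) {          // i-k-j order: better locality than i-j-k
                float aik = a[i * N + k];
                for (int j = 0; j < N; j++) {
                    c[i * N + j] += aik * b[k * N + j];
                }
            }
        }
        System.out.printf("naive matmul: %.2f s%n", (System.nanoTime() - start) / 1e9);
    }
}
```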
So, Java is basically 45x slower. We can optimize the Java code some more, but we can also optimize the native code some more, and that's what libraries like MKL do. Granted, this is only a single example of very large matrix multiplication, but it's a very important example. In any case, the point is that, in general, Java code just doesn't cut it for that kind of computation.
I was curious about MKL's performance, so I updated the sample code with a comparison against it, and it turns out to be fully 12x faster than LLVM with Polly and OpenMP:
That makes it over 550x faster than unoptimized Java (and that's on my dinky little laptop, which could probably also pull off 10x that with CUDA). Try as you might to optimize that code in Java, there's no chance in hell you'll get within 10x of MKL; see, for example, https://github.com/fommil/netlib-java#performance (which claims MKL is slower than Java for small matrices, but Intel has already solved that: https://software.intel.com/content/www/us/en/develop/articles/intel-math-kernel-library-improved-small-matrix-performance-using-just-in-time-jit-code.html). So it's really rather pointless to code that kind of thing in Java...
I think Karl is proposing to figure out what the API for linear algebra would look like on top of the ndarrays. It's not about picking a specific package or backend like MKL, but before embarking on it, it would be good to know what the API shape is. We can always get a pointer to the memory and hand it to MKL for the low-level operations.

Also, the pure Java matrix multiply you linked isn't a good test against MKL: it's not blocked, nor does it try any of the standard tricks used to accelerate matrix multiplies in any language, so it's not surprising that it's very slow by comparison. (See the sketch of blocking below.) Plus, C2 and GraalVM EE can occasionally be very fussy about what they autovectorise and how they do it, so you need to contort the Java code a little to make the compilers happy and get the highest performance.

I have some hope that the Vector API coming in Panama will let us get closer to MKL's performance in pure Java, but since we don't have control over register allocation in Java, we're not likely to get all the way there.
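For illustration, the kind of blocking being referred to, as a hedged sketch (the tile size is a guess to be tuned per CPU, and it assumes the dimension divides evenly by the tile):

```java
// Cache-blocked (tiled) matrix multiply: each T×T tile of B is reused while
// still hot in cache. T = 64 is an assumption; assumes n % T == 0 and that
// c starts zeroed, since it accumulates across kk tiles.
static void blockedMatMul(float[] a, float[] b, float[] c, int n) {
    final int T = 64;
    for (int ii = 0; ii < n; ii += T) {
        for (int kk = 0; kk < n; kk += T) {
            for (int jj = 0; jj < n; jj += T) {
                for (int i = ii; i < ii + T; i++) {
                    for (int k = kk; k < kk + T; k++) {
                        float aik = a[i * n + k];
                        for (int j = jj; j < jj + T; j++) {
                            c[i * n + j] += aik * b[k * n + j];
                        }
                    }
                }
            }
        }
    }
}
```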
Which is exactly my point: why bother optimizing it in Java in the first place? There was actually a blog post about that; I was trying to find it again, but I can't anymore... I think it's this one, but it doesn't give numbers: https://richardstartin.github.io/posts/vectorised-algorithms-in-java
That was 2 years ago, and there's been basically nothing since. I'm not saying it's impossible to do that kind of thing in Java, but no one is putting resources into it, and there are political issues with OpenJDK and/or Oracle not wanting to implement "fast math" in the JDK, which isn't helping, and is something you could help with since you work at Oracle, so please try to do something about it. :) Anyway, chip makers put a lot more money into things like CUDA and MKL, and they give them out for free, so, you know, let's use that. MKL-DNN/DNNL/oneDNN, or whatever it's named these days, is open source, and Intel is even allowing contributions from Arm now: https://github.com/oneapi-src/oneDNN/
So, you know, why fight this? Java is fine for prototyping and for implementing custom one-shot algorithms, but for widely used algorithms like matrix multiplication, it's not useful.
Intel and Arm are both contributing heavily to the Vector API. It's moving ahead, and the performance issues are being fixed (there are plenty of commits each week, as you can see by checking the Panama mailing list). It won't be all the way done until the JVM has inline types, as they are necessary to remove all the boxes and to give C2 a type wide enough to store the vectors in. I have ported several internal ML libraries to various versions of the Vector API, helped find bugs in it, and my code was faster with it. You're welcome to contribute to it as well.

There are some issues with fast math in Java, as the language spec doesn't allow things to be reordered if that changes the outcome. However, that doesn't apply to the Vector API, because by using it you've allowed the JVM to compile your operations in a way that improves performance.

I happen to quite like writing Java, and I'm excited to write more numerical code in it. I'm hopeful that the ability to fuse operations together by having them all in the same source language (plus allowing the compiler to inline across matrix multiplication and activation function boundaries) will make it possible to develop fast enough ML libraries without using native code. On the CPU, anyway; the accelerator story is still further out.

Anyway, I'm not arguing against basing things on top of MKL; in the short term that's the best choice for CPUs. Intel ones, at least: the last time I checked, it ran pretty badly on Epyc 2s, which is very disappointing given the theoretical performance of AMD's chips.
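For reference, a small sketch of what a dot product looks like with the incubating Vector API discussed here. The names below are from the `jdk.incubator.vector` module as shipped in later JDKs; the API was still shifting at the time of this thread, so treat this as illustrative rather than definitive:

```java
// Requires a JDK with the incubating module: --add-modules jdk.incubator.vector
import jdk.incubator.vector.FloatVector;
import jdk.incubator.vector.VectorOperators;
import jdk.incubator.vector.VectorSpecies;

public class VectorDot {
    static final VectorSpecies<Float> S = FloatVector.SPECIES_PREFERRED;

    static float dot(float[] a, float[] b) {
        FloatVector acc = FloatVector.zero(S);
        int i = 0;
        int bound = S.loopBound(a.length);
        for (; i < bound; i += S.length()) {
            FloatVector va = FloatVector.fromArray(S, a, i);
            FloatVector vb = FloatVector.fromArray(S, b, i);
            acc = va.fma(vb, acc);                 // lanewise a*b + acc
        }
        float sum = acc.reduceLanes(VectorOperators.ADD);
        for (; i < a.length; i++) {                // scalar tail
            sum += a[i] * b[i];
        }
        return sum;
    }
}
```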
So, after experimenting with a few things on my side, a functional API at the tensor level (including linear algebra operations) would probably fit better in the `tensorflow-core` modules than in `ndarray`.

The main reason for this is that it will allow us to chain operations using TensorFlow without fetching unnecessary tensor data from intermediate results. Also, TensorFlow operates on operands, not tensors, meaning that we would also need to convert a tensor automatically to a constant before each function call if we have this API at the `ndarray` level.

I won't get too deep into the details here (let me know if you are interested), but basically what this means is that having a generic linear algebra API in the `ndarray` module would be inefficient compared to doing it in `tensorflow-core`. Still, …

So I suggest that we merge this PR as proposed and move on to other improvements (e.g. I see a possibility to work exclusively with the rich `NdArray` interface, …)
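To illustrate the chaining argument, a hedged sketch using class names from a later tensorflow-java snapshot (`Graph`, `Session`, `Ops`; the exact packages were still moving at the time of this thread). Building both operations into one graph means the intermediate matMul result never has to cross into the JVM:

```java
import org.tensorflow.Graph;
import org.tensorflow.Operand;
import org.tensorflow.Session;
import org.tensorflow.op.Ops;
import org.tensorflow.types.TFloat32;

public class ChainingSketch {
    public static void main(String[] args) {
        try (Graph g = new Graph()) {
            Ops tf = Ops.create(g);
            Operand<TFloat32> a = tf.constant(new float[][] {{1f, 2f}, {3f, 4f}});
            Operand<TFloat32> b = tf.constant(new float[][] {{5f, 6f}, {7f, 8f}});
            // matMul and add are chained in one graph; no fetch in between
            Operand<TFloat32> y = tf.math.add(tf.linalg.matMul(a, b), b);
            try (Session s = new Session(g)) {
                s.runner().fetch(y).run();  // only the final tensor is fetched
            }
        }
    }
}
```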
I also happen to share this personal preference, but we have to be realistic...
Right, OpenBLAS is a better candidate as a foundation than MKL.
I'm OK with all you're saying. However, I'm seriously skeptical that a module using the `org.tensorflow` group and package can ever be perceived as independent from TensorFlow.
To give it another organization name, it would need to be moved out of the TensorFlow org as well. Let me check with Joana what is possible here.
Here's another idea. Swift is putting their Keras-like API in a separate repository, https://github.com/tensorflow/swift-apis, but we could go the other way around instead: putting stuff like ndarray in a repository like https://github.com/tensorflow/java-ndarray and clarifying in the README.md file that this is a 100% pure Java library that doesn't depend on the TensorFlow C/C++ runtime. I think that would put enough distance between it and the rest of the code, making it sufficiently clear, even if it remains in the TensorFlow organization.

BTW, TensorFlow.js (https://github.com/tensorflow/tfjs) has plenty of backends that don't depend on C/C++, and all their subdirectories start with "tfjs-"...
I like that idea, actually. I've already asked Joana about moving the code out of the TensorFlow org, but I will follow up with her on that suggestion as well. I guess it is a matter of how fast we can get this new repository ready.
Thinking about this some more, and from a strictly branding point of view, the fact that it's "TensorFlow" clearly isn't causing any issues for TensorFlow.js and its non-native backends, so I don't think that's the problem here either. It's more about emphasizing its status as "100% pure Java", so what about naming the module "tensorflow-java-ndarray"? That would be like the "Java NdArray" project from the TensorFlow organization. I feel it contrasts enough with the other modules too, even with everything in the same repository:
We could also have a …
Hey @saudet, sorry I didn't come back to you sooner on this; I was waiting for some feedback from Joana before continuing this discussion (maybe to be discussed at the meeting on Friday, though I won't be there). Personally, I think the best of your suggestions is to have a separate repository under the TensorFlow organization to host this code, i.e. `tensorflow/java-ndarray`.
From a branding perspective, putting it in another repository makes sense, I think, but from a technical perspective, if the API still isn't stable, it will entail an additional burden on developers whenever the API changes. Have you weighed this in your decision? Trust me, syncing multiple repositories manually all the time isn't pretty. You need to be sure about this.

BTW, I think what I like about "java-ndarray" is that it can refer to a project named "Java NdArray" at the TensorFlow organization. If we don't put "java" everywhere, it ends up being the "NdArray project at the TensorFlow organization", which doesn't have the same ring to it...

In any case, as I mentioned many times already, I don't think that a 100% pure Java NumPy-like library is going to be useful. We need native backends...
It might make things a bit more complex, yes, but I really see this project as something that can evolve at a different pace than the TensorFlow Java API itself, so yes, it makes sense again to have it in a separate repository once we have one.
Personally, I still prefer going with the `ndarray` name. So based on this, and I think @Craigacp also agreed on it, I'll rebase my current PR and merge it with this name, bumping the version to 0.2.0 to prevent bad surprises for users currently relying on our snapshots. Thanks a lot for all your advice.
Force-pushed from 87d4607 to 25abbbf.
Force-pushed from 9bc2e15 to de6058d.
I made some interesting discoveries with this PR: our current configuration for running unit tests in … is not working as expected. The reason is simple: we depend on the … I then tried to depend on … So, as a temporary solution before we find something cleaner to fix this issue (maybe out of scope for this PR), I suggest disabling the …
Build with …
@saudet, the problem is occurring in …
Maven profiles work with transitive dependencies, that's fine.
OK, you are right. In fact, I mixed things up a bit and realized that I got this error only locally, when I was not setting the …
Force-pushed from de6058d to 4465028.
(draft for now, to trigger compilation across platforms...)