
fix aliasing in matrix algebra constructor #692

Open
wants to merge 2 commits into master
Conversation

@martinra (Contributor) commented Dec 3, 2020

Previously diagonal elements of matrices converted from the base ring were aliased.
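To illustrate the failure mode (a Python analogy with made-up toy names — Poly, identity_aliased — not the actual AbstractAlgebra code): if a constructor reuses one mutable object for every diagonal entry, an in-place update of one entry silently changes the others.

```python
class Poly:
    """Toy mutable ring element: a polynomial stored as a coefficient list."""
    def __init__(self, coeffs):
        self.coeffs = list(coeffs)

def identity_aliased(n, one):
    # BUG: every diagonal entry is the *same* object `one`.
    return [[one if i == j else Poly([0]) for j in range(n)] for i in range(n)]

one = Poly([1])
M = identity_aliased(2, one)
M[0][0].coeffs[0] += 5      # mutate a single entry in place...
print(M[1][1].coeffs)       # ...and the other diagonal entry changed too: [6]
```

The fix in this PR makes the constructor give each diagonal entry its own copy, so the last line would print the untouched value instead.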

@rfourquet (Contributor)

I think we already discussed this problem somewhere, without reaching a conclusion on whether elements should avoid being aliased (I totally may be misremembering). Do you have a use-case for avoiding aliasing?

@martinra (Contributor, Author) commented Dec 3, 2020

Yes, absolutely. Thanks for the quick reaction.

I found the aliasing while working on my modular forms code. If you use in-place arithmetic and Hecke is loaded, then you get wrong results without the fix.

using Hecke

R,t = PolynomialRing(ZZ, "t")

# MatrixAlgebra(R,2) does not work, because Hecke does not overwrite mul! in
# that case.
M = MatrixSpace(R,2,2)

a = one(M)
b = M([1 0; 3 0])
c = M([1 1;0 0])

b*c == mul!(a,b,c)

From a performance perspective, I think Hecke is totally right here. For instance, I have an implementation of FareySymbols which is slower by a factor of 100 without in-place arithmetic. Obviously, one can hand-code this in that setting.

Was the aliasing allowed to save memory or to save the deepcopy, which I know can also be a huge source of slowness?

@wbhart (Contributor) commented Dec 4, 2020

I think if it fixes a bug and clearly isn't going to introduce new bugs we should merge this.

@thofma (Member) commented Dec 4, 2020

Yes, looks fine to me. Do we actually document somewhere which functions produce values that can safely be used as the first argument in mul! and friends?

@fieker (Contributor) commented Dec 4, 2020 via email

@wbhart (Contributor) commented Dec 4, 2020

Yes, it is documented on a ticket because we never got around to writing developer documentation:

Nemocas/Nemo.jl#278

The relevant rule seems to be:

"matrix functions with an exclamation mark should not mutate the objects that occur as entries of the output matrix, though should be allowed to arbitrarily replace/swap the entries that appear in the matrix. In other words, these functions should be interpreted as inplace operations, rather than operations that are allowed to mutate the actual entries themselves."
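In Python terms, with invented toy types (not the actual Julia API), the distinction the rule draws looks like this: rebinding entries of the output matrix is fine, but mutating the entry objects themselves is not, because other references may alias them.

```python
class Elem:
    """Toy mutable ring element."""
    def __init__(self, v):
        self.v = v
    def __add__(self, other):
        return Elem(self.v + other.v)

def add_bang_legal(out, A, B):
    # Allowed by the rule: replace the entries of `out` with fresh objects.
    for i in range(len(A)):
        for j in range(len(A[0])):
            out[i][j] = A[i][j] + B[i][j]
    return out

def add_bang_illegal(out, A, B):
    # Forbidden by the rule: mutate the entry objects in place.
    for i in range(len(A)):
        for j in range(len(A[0])):
            out[i][j].v = A[i][j].v + B[i][j].v
    return out

shared = Elem(1)                                  # aliased elsewhere
out = [[shared]]
add_bang_legal(out, [[Elem(2)]], [[Elem(3)]])
print(shared.v)                                   # 1: only the binding changed

shared2 = Elem(1)
out2 = [[shared2]]
add_bang_illegal(out2, [[Elem(2)]], [[Elem(3)]])
print(shared2.v)                                  # 5: the aliased object was mutated
```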

@thofma (Member) commented Dec 4, 2020

With this ticket in mind, we don't need the PR here, as !-functions are not allowed to mutate the entries of the output.

The problem is that the add! function we define in Hecke does add!(A[i, j], B[i, j], C[i, j]) which I think is illegal according to Nemocas/Nemo.jl#278. In particular according to the rule you cited.

@wbhart (Contributor) commented Dec 4, 2020

Ok then. I guess the question becomes whether Hecke can be modified to obey the rule and still give good performance in @martinra's case?

@thofma (Member) commented Dec 4, 2020

No, it cannot. It is fast because it mutates the entries of the output matrix.

The function works as long as you know that in your output matrix, none of the entries alias each other. At the moment the output of zero_matrix and identity_matrix satisfy this. We usually use those as the first argument to !-functions. But of course this works because of an implementation detail.

I am fully behind Nemocas/Nemo.jl#278, but maybe we need some more unsafe functions for matrices? Like add!! or something like that, where it is assumed that the entries of the output do not alias each other. I am not sure, but from our experience in the last 5 years I can tell that working with matrices is quite inefficient if we don't allow those entry-mutating functions.

@martinra (Contributor, Author) commented Dec 4, 2020

In my specific case, I was working with matrices over QQab, the abelian closure of QQ, which in the background relies on Antic. So if we now work with, say, mul!! as suggested, we formally have the same problem: in mul!!(::MatElem{nf_elem}...) you would want to call mul!(::nf_elem...), which may then assume that entries are unaliased per Nemocas/Nemo.jl#278, since nf_elem is essentially a polynomial. But that would not be true for the current version of the matrix constructors. Or to put it more abstractly: the total aliasing of elements in the MatElem constructor contradicts any nontrivial assertion of partially unaliased entries.

@thofma (Member) commented Dec 4, 2020

I hope I understood it correctly. I agree that the current one(M) (where M is a matrix space) or M(a) constructors yield elements which cannot be used with the proposed add!! (or the current add! from Hecke).

I meant that the current zero_matrix(R, n, m) and identity_matrix(R, n) work fine with add!! (or the current add! from Hecke), because they don't do any aliasing at all.

So my suggestion was to introduce a constructor which guarantees no aliasing, and which is safe to use with add!!.

P.S.: We have purged all MatrixSpace calls from Hecke, because the zero_matrix/identity_matrix/matrix(R, ...) constructors are faster and do no caching.

@martinra (Contributor, Author) commented Dec 5, 2020

I guess I was misunderstanding your suggestion. Having an additional constructor (or keyword) so that add!! works can be a solution.

Thanks for telling me about the MatrixSpace constructors in Hecke. I will consider making similar changes in my code.

One thing that occurred to me is that the proposed solution, and in fact Nemocas/Nemo.jl#278, would make it very difficult to write generic but efficient implementations for ring elements. For example, look at a hypothetical function power_by_squaring(a::RingElem, ::Int), which I actually have in my code and which is how I discovered this aliasing originally. Depending on whether a <: MatAlgElem or, say, a <: PolyElem, you need different implementations to get good performance. While mul! is semantically always correct, you can write the function in such a way that the much faster mul!! is also possible. Notice that this cannot be solved by providing two implementations power_by_squaring(a::RingElem, ::Int) and power_by_squaring(a::MatAlgElem, ::Int), since there might be types outside of AbstractAlgebra that are not subtypes of MatAlgElem but effectively wrap such a type. One example that comes to mind is ResElem{MatAlgElem{PolyElem}}, which can be used to provide a more memory-local implementation of some MatAlgElem{NumFieldElem}. In my code I have quite a few types modelling various kinds of finite-dimensional representations, which fall into the same category.
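For reference, here is such a power_by_squaring in its simplest form (a Python sketch over plain integers, not the actual generic code): the two multiplications in the loop are where a generic ring-element version would use mul!-style scratch buffers, which is exactly where the aliasing rules bite.

```python
def power_by_squaring(a, n):
    # Classic square-and-multiply: O(log n) multiplications. In a generic
    # ring-element version, `result * base` and `base * base` would instead
    # write into reused scratch objects via mul!/mul!!.
    result = 1
    base = a
    while n > 0:
        if n & 1:
            result = result * base
        n >>= 1
        if n:
            base = base * base
    return result

print(power_by_squaring(3, 5))   # 243
```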

I can for now think of at least two solutions to this, but neither is particularly appealing. One way is of course to revisit Nemocas/Nemo.jl#278, but I'm sure you had your reasons, so I understand that this might not be realistic. Yesterday evening I was also thinking about the rationale behind it, and I understand where these guidelines come from.

The second solution could be to provide a general fallback mul!!(as::RingElem...) = mul!(as...) and specialize it where necessary (relatively little code in AbstractAlgebra, and on my side no big deal). The constructor is more work: keyword arguments cannot be dispatched on, yet they would seem to me the preferred design here. It would mean providing an additional ignored keyword for every RingElem constructor. Quite a mess, in my opinion, and a lot of work. There is a hack in the style of

struct CompletelyUnaliased end
const completely_unaliased = CompletelyUnaliased

# generic fallback: ignore the tag and dispatch to the usual constructor
(a::Set)(::Type{CompletelyUnaliased}, args...) = a(args...)
# matrix algebras opt in to the unaliased constructor
(a::MatAlgebra)(::Type{CompletelyUnaliased}, args...) = a(args...; completely_unaliased = true)

The downside with this is that it violates calling conventions in AbstractAlgebra.

Obviously, one can simply not solve that problem and require that any object that wraps matrix algebra and requires efficiency provides a specialized implementation. While this violates encapsulation, it could be defended by the argument that whoever requires best performance will need to go into the details of their data structures anyway.

Summarizing, this is really a design question, and since I'm merely downstream of Oscar, I would leave it to all of you to make a decision. I've only tried to depict the ramifications of Nemocas/Nemo.jl#278. I can effectively live with any of these solutions, and many others that I haven't thought of now. The one thing that would be important to me is that we don't land in what I call the Sage trap of totally degraded performance.

@thofma (Member) commented Dec 5, 2020

Thanks @martinra for your input.

Because I had to recall our reasoning behind Nemocas/Nemo.jl#278, let me just write down some of the details here, as far as I remember. I think we thought of two solutions:

  1. Allow aliasing between matrix entries. Advantage is that A[i, j] = z does not have to copy and also M([a b; c d]) does not have to make copies. Disadvantage is that no function is allowed to mutate A[i, j] directly. Thus no efficient add! etc. functions.
  2. Disallow aliasing between matrix entries. Advantage is that we have fast entry-mutating functions like add!. Disadvantage is that M([a b; c d]) must internally do M(deepcopy([a b; c d])) and A[i, j] = z must internally do A[i, j] = deepcopy(z). To mitigate this, one can introduce setindex!(A, i, j, z, copy = false) and similar for the constructor.

I think at some point we tried out 2) but it was a lot of work to go over the codebase to introduce copy = false at the right places. (Or maybe there was also a problem with copy = false itself. I don't remember right now.) Thus we chose 1).
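For concreteness, option 2's opt-out flag could look roughly like this (a Python sketch with invented names — Mat, setindex, copy_entry — not the actual AbstractAlgebra interface):

```python
import copy

class Mat:
    """Toy matrix wrapper around nested lists of mutable entries."""
    def __init__(self, rows):
        self.rows = rows

    def setindex(self, i, j, z, copy_entry=True):
        # Safe default: store a deep copy, so later mutation of `z` cannot
        # leak into the matrix. Callers who know better pass copy_entry=False.
        self.rows[i][j] = copy.deepcopy(z) if copy_entry else z

z = [1, 0]                              # toy mutable ring element
A = Mat([[None]])
A.setindex(0, 0, z)                     # default: stored entry independent of z
z[0] = 99
print(A.rows[0][0])                     # [1, 0], unaffected by mutating z

B = Mat([[None]])
B.setindex(0, 0, z, copy_entry=False)   # fast path: shares the object
print(B.rows[0][0] is z)                # True
```

The cost thofma describes is that every call site which can safely skip the copy must be found and annotated by hand.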

I think we could have some completely unaliased constructors. Would the contract of add!!(z, a, b) be that z must be completely unaliased (a, b arbitrary) and z will stay completely unaliased? Also, if z is completely unaliased, then after f!(z, ...), z may lose this property?

@wbhart (Contributor) commented Dec 6, 2020

My opinion on this is that time-sensitive routines such as polynomial and matrix arithmetic over number fields should be implemented in C. We face numerous issues by not doing this: 1) performance is poor, as per the issues leading to this PR, 2) JIT compile times are massive for components using such arithmetic implemented in Julia, 3) we have had issues previously due to matrices over number fields being the only Nemo type without a C implementation (see the similar, zero_matrix and change_base_ring issues).

However, there are two reasons we haven't done such an implementation in C: 1) many algorithms over number fields rely on much more infrastructure, all of which is currently in Hecke. No one has any desire to duplicate all that work, 2) @fieker has argued that once the code is locked away in a "black box" in C, it is hard to work on, improve, experiment. Julia has some advantages after all!

I'd personally be interested in looking into whether there is some natural separation between fundamental algorithms for matrices and polynomials over number fields that could be implemented in Antic, with everything that is less fundamental that can be built on top of Antic remaining in Julia. If so, I believe that would be the best solution, to avoid getting bitten by this sort of thing from time to time.

In the meantime, I have no objection to using the mul!! idea in Nemo, so long as it doesn't propagate. I don't want the library littered with double-bang functions. I think it is a dangerous idea that can easily lead to an inconsistent system that will be hard to maintain. The whole issue with a different aliasing system for matrices vs polynomials arises due to Flint in the first place. Therefore it makes the most sense to keep such stuff in Flint and other related C libraries, in my opinion, rather than allow it to propagate to Julia.

Of course, what you do in Hecke is none of my business. So my comments apply only to Nemo, where I really want to be conservative when it comes to changes like this.

@fredrik-johansson (Contributor)

I'm also interested in having more algorithms for number fields implemented in Antic (I have done some things in Calcium, and that's a good place to do things where a numerical embedding is useful, but clearly more basic things belong in Antic).

@fieker (Contributor) commented Dec 8, 2020 via email

@thofma (Member) commented Dec 10, 2020

While implementing things in C may help with a special case, this still does not solve the issue here.

This is an issue in AbstractAlgebra, unrelated to Nemo or Flint. At the moment we have no way to do fast arithmetic with matrices in AbstractAlgebra. I think we have reasonable aliasing rules, which work quite well in practice. But in the matrix domain we are missing the tools that allow people to write code that is on par with our competitors. I mean not for playing around in the REPL, but for doing impressive stuff performance-wise. I think we should have this, even if it is very "dangerous" for the average user.

Small anecdote. I once helped @alexjbest with https://github.com/alexjbest/Coleman.jl. After some work, we were not doing too badly compared to Magma. But we were always lagging behind, and it got worse the larger the parameters (see the second-to-last page of https://alexjbest.github.io/papers/coleman-superelliptic.pdf). A mutating matrix multiplication/addition would have helped quite a lot, I think. I did not realise it back then, because I forgot that our in-place operations for matrices are not mutating.

@wbhart (Contributor) commented Dec 10, 2020

Yes, I see this is a more general issue than for one specific ring.

Did you reduce to matrix multiplication? This is usually why one falls behind other implementations. It's a little hard to see how to do this generically, as the crossovers are going to be so dependent on the ring.

I'm skeptical that making a more complex generic implementation is really the answer. But I'm sympathetic to the problem (competing with other implementations), obviously.
