tests and examples for the complex relative entropy cone #831
Conversation
I can give a preliminary pass over your questions, but Chris would need to add his input.
Thanks for the answers!
Could you rebase this branch on the current master, so that it works with Julia 1.10?
You're right, I already forgot about that :)
It should be OK to merge from master
🤦 Yes, of course, I can just rebase it myself, I don't know what I was thinking.
should be ok to merge in master again for the 1.10 fixes. you will have to Pkg add JuliaFormatter before dev'ing Hypatia and then run the formatter
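For anyone reproducing this locally, the steps might look roughly like the sketch below. This is illustrative, not Hypatia's documented workflow: the path handling is an assumption, and the repo's CI may invoke the formatter with project-specific options.

```julia
# Rough sketch of the setup described above (illustrative only):
using Pkg
Pkg.add("JuliaFormatter")    # install the formatter first
Pkg.develop("Hypatia")       # check out Hypatia for development under Pkg.devdir()

using JuliaFormatter
format(joinpath(Pkg.devdir(), "Hypatia"))  # reformat the checkout in place
```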
also i'm in the process of moving the repo ownership over to the jump-dev github organization, so that it can be a bit better maintained!
I understand, life after academia is tough. I've merged in master and run JuliaFormatter; let's see how it goes now.
thank you. the format checks now pass. the failing examples seem to just be related to the sparse PSD cone. I'll have a look into those failures later today.
Codecov Report

Additional details and impacted files:

@@            Coverage Diff             @@
##           master     #831      +/-   ##
==========================================
- Coverage   96.86%   96.83%   -0.04%
==========================================
  Files          56       56
  Lines        8976     9058      +82
==========================================
+ Hits         8695     8771      +76
- Misses        281      287       +6

View full report in Codecov by Sentry.
Oh. Somewhere a mess was made, but there is still a
Anyway, I think we should merge this as-is. Our default is to add a Hessian product oracle when we can do so. It has helped for numerics and, in some cases, speed. It has also caused some minor speed regressions in cases where numerics were already fine, but that hasn't stopped us in the past.
No, this PR's branch lives in my fork. The latest commit in Hypatia's

In any case, I'm ok with merging; I can easily disable the product oracle in my code to do more careful benchmarks, and perhaps find what is causing the slowdown. I just wanted your opinion on the first question I asked: is there a way to get rid of my hack and get smat_to_svec! and svec_to_smat! to dispatch correctly?

Also, there is another puzzling behaviour that I have observed: at some point I inadvertently added a redundant PSD constraint for one of the Hermitian matrices going into the relative entropy cone. To my surprise this did not make the solver go crazy, but instead made it faster.
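A minimal toy sketch of what such a redundant constraint looks like in JuMP (the model, dimension, and variable names are hypothetical; in the actual code the matrix also enters the relative entropy cone constraint, which already forces it to be PSD):

```julia
using JuMP, LinearAlgebra
import Hypatia

d = 3  # hypothetical dimension
model = Model(Hypatia.Optimizer)
@variable(model, X[1:d, 1:d], Hermitian)     # Hermitian matrix variable
# ... X would also appear in the relative entropy cone constraint ...
@constraint(model, X in HermitianPSDCone())  # mathematically redundant, yet observed to speed up the solve
```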
Completely unrelated: I think equation (60) of
More careful benchmarking shows that the product oracle is innocent. The great speedup I was seeing in the easy instances was due to changing Julia from 1.9 to 1.10; testing everything on 1.9, I get pretty much the same times there.

So there was a regression somewhere between aaf6669 and dd9978b that makes the harder instances drastically slower, whereas the product oracle actually made the situation a little bit better.
@araujoms thanks for the concrete analysis. could you try to narrow it down to the commit(s) that cause the performance degradation you see?
I did a binary search and found the offending commit: 676b055. That's where I added the asymptotic expansion for the central point, which delivered a closer approximation to it.

I am very confused. I was certain that there would be no noticeable difference in performance, as the nonlinear fit you were using already delivered a good approximation, and the difference from mine was only about 10^-4. If anything, I would expect the performance to be slightly better, not 2 times worse!

I've now tested providing an even better approximation, correct up to 10^-6. It did improve the performance with respect to my asymptotic expansion, but it was still significantly slower than your "bad" approximation.

I don't know what to do. If simply providing the best central point gave the optimal performance, I could just generate the central points for dimensions 2 to 1000 and add them as a file, leaving the rest to the asymptotic expansion. But since this is not the case, how on Earth can I find the central point that gives the optimal performance!?
That change you made is to the epirelentropy cone, not epitrrelentropytri. Was that intentional? I wasn't aware that you were using both cones. |
The central points of the matrix relative entropy cone are a simple function of the central points of the vector relative entropy cone, so the code just calls them from there. I don't use the actual vector relative entropy cone for anything. |
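Schematically, that relationship could look like the sketch below. This is illustrative only, not Hypatia's actual code; the tuple layout and function name are assumptions.

```julia
using LinearAlgebra

# Illustrative: if (u0, v0, w0) is the central point of the vector relative
# entropy cone, the matrix cone can reuse the same scalars, with the
# Hermitian blocks being multiples of the identity of side d.
function matrix_central_point(u0::Real, v0::Real, w0::Real, d::Int)
    V = Matrix{ComplexF64}(v0 * I, d, d)
    W = Matrix{ComplexF64}(w0 * I, d, d)
    return (u0, V, W)
end
```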
Got it, right. Sorry, I'm on mobile so I can't look more closely.

I think this situation is just a fluke: something about your specific examples seems to make the old central point value work slightly better for larger cases. This kind of thing happens... Really we need a broad diversity of examples generated from many different problems to show that certain changes to Hypatia help or hinder performance on average. Lacking that, we just use simple math heuristics and some guesswork.

I like that you derived asymptotic values for the central point (though I didn't verify them), and you can decide whether to leave them in the PR. Maybe you just want to increase the dimension threshold to something higher. In any case, modulo what you decide there, I think this PR is done.
You're right: I've tested with a couple of other problems, and the performance does indeed vary; sometimes it's better with a closer approximation to the central point, sometimes it's not. Since the performance depends so critically on the central point, I'd like to add the solutions for the first 1000 dimensions. I'm unsure about what to do about the file IO; normally I'd use JLD2, but I assume you're not keen on adding another dependency to Hypatia.
Our experience is that the performance impact of central point precision is fairly random as long as the point is somewhat close, which is why we used pretty simple approximations. I'm not keen for this cone to have 1000 points, and I don't think it would make any noticeable difference in general, so long as the approximations we already have are within the vague neighborhood of the central point. If you want, for your paper, you could experiment with that without merging it. But I'd like to keep the central points that we have in Hypatia simple and consistent across the different cones.
How about 100 points then? I could just save the literals in the source code.
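As a sketch, the lookup could be as simple as a hard-coded table with the asymptotic expansion as a fallback. All names and numbers below are placeholders, not actual central point values:

```julia
# Placeholder table of precomputed central points (u, v, w) per dimension:
const CENTRAL_POINTS = Dict{Int, NTuple{3, Float64}}(
    2 => (1.0, 1.0, 1.0),  # placeholder values
    3 => (1.0, 1.0, 1.0),  # placeholder values
    # ... literals up to the agreed cutoff
)

# Stand-in for the asymptotic expansion from commit 676b055:
asymptotic_central_point(d::Int) = (1.0, 1.0, 1.0)

# Look up a precomputed point, falling back to the expansion:
central_point(d::Int) =
    get(CENTRAL_POINTS, d) do
        asymptotic_central_point(d)
    end
```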
Sorry to be a pain, but we don't do that level of precision for other cones. We settled on the simple step-wise approximations we use because getting more precise than that wasn't really helpful on our benchmarks in general, and they let us keep the code concise and maintainable. Even if you could show convincing evidence that hard-coding 100 points is better on a variety of benchmarks from several different applications, I'd still be a littttttle reluctant. Why not just go back to the original initial points that we used before the commit that made things slower?
Also, the initial point should (I hope) be user-specifiable without modifying Hypatia source code, so you could drop down to that in the experiments you are running without needing to fork Hypatia.
Fine, I just set the asymptotic expansion to kick in after dimension 300, where your nonlinear fit really breaks down. It is of little relevance in practice, as dimension 300 is hardly solvable anyway.
@lkapelevich @odow @araujoms would you like to review this PR before merge?
I don't think I understand the code enough to competently review. |
Not before the merge. My PhD student will go over it and hopefully write the inverse Hessian oracle, but that will still take some time. |
Nothing from me |
Merged. I'll tag today |
Thanks everyone! |
New version 0.8.0 is tagged |
I got the tests to pass and the examples to run with the new cone (I get an error with "contraction", "semidefinitepoly", and "convexityparameter", but they are unrelated). There is one hack, in arrayutilities.jl: I couldn't figure out how to get the functions smat_to_svec! and svec_to_smat! to dispatch correctly with complex Hermitian JuMP variables. I hope you can tell me how to do that so I can fix it.
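For reference, a generic svec-style vectorization can be written by hand. The sketch below is a hypothetical helper, not Hypatia's smat_to_svec!, and its ordering and sign conventions are a guess that should be checked against arrayutilities.jl:

```julia
using LinearAlgebra

# Hypothetical helper: scaled (svec) vectorization of a Hermitian matrix,
# walking the lower triangle column by column, scaling off-diagonal entries
# by sqrt(2) and splitting them into real and imaginary parts. Because it
# only uses real/imag and arithmetic, it also accepts matrices of complex
# JuMP expressions.
function herm_to_svec(mat::AbstractMatrix)
    d = LinearAlgebra.checksquare(mat)
    rt2 = sqrt(2)
    out = []
    for j in 1:d
        for i in 1:(j - 1)
            push!(out, rt2 * real(mat[i, j]))
            push!(out, rt2 * imag(mat[i, j]))
        end
        push!(out, real(mat[j, j]))
    end
    return out
end
```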