Single precision variants #29

Open · dylanede opened this issue Jul 19, 2022 · 6 comments

@dylanede

What would it take to produce versions of the functions for single precision floating point? Presumably in most cases it means truncating the series and thus the tables of coefficients at lower indexes, but I'm not sure where to begin with adjusting the test cases and finding the new cutoff points. On another note, what guarantees do the existing functions have with respect to precision over their entire input range?

@Expander
Owner

Hi @dylanede,

That should be straightforward if one does what you're suggesting, namely truncating the approximations accordingly. (I guess you're asking for single-precision versions for performance reasons, right?)

To answer your questions:

  • According to the tests, the double-precision implementations have a relative precision of 2*std::pow(10.0, -std::numeric_limits<double>::digits10) = 2e-15 over the whole tested range. This is achieved by mapping the functions to some small interval and using an approximation with sufficient precision there. In this process there should be no loss of precision.
  • For single precision one could aim for 2*std::pow(10.0f, -std::numeric_limits<float>::digits10) = 2e-6.
  • The tables with the single-precision coefficients should be stored in the code with 9 significant decimal digits.
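
These target tolerances can be checked directly, e.g. (an illustrative snippet, not library code):

```cpp
#include <cmath>
#include <cstdio>
#include <limits>

int main()
{
   // target relative precisions quoted above
   const double tol_double = 2*std::pow(10.0, -std::numeric_limits<double>::digits10);
   const float  tol_float  = 2*std::pow(10.0f, -std::numeric_limits<float>::digits10);
   std::printf("double target: %g\n", tol_double); // prints 2e-15
   std::printf("float  target: %g\n", tol_float);  // prints 2e-06
}
```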

To begin, one could start with Li2(float):

  • One could adapt the TEST_CASE("test_real_fixed_values") accordingly.
  • Add Li2(float) to src/cpp/Li2.{hpp,cpp}.
  • To implement Li2(float) one could copy and adapt Li2(double) accordingly:
    • Convert all double-precision numbers to single precision, e.g. 0.5 -> 0.5f.
    • Determine the coefficients for the numerator and denominator using the Mathematica script test/math/dilogMinimaxApprox.m. Here one should reduce the number of coefficients in the numerator and denominator (otherwise there would be no performance gain) until one reaches the desired maximum relative error; see the sketch after this list.
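
For orientation, here is a minimal structural sketch of what such a Li2(float) could look like, assuming it mirrors the map-then-approximate structure of Li2(double). The coefficient arrays CP and CQ are placeholders, not real minimax values (they would have to come out of dilogMinimaxApprox.m), and most argument mappings are omitted:

```cpp
#include <cmath>

// Hypothetical sketch of Li2(float), assuming it mirrors the structure of
// the existing Li2(double): map x into [0, 1/2], evaluate a rational
// minimax approximation of Li2(y)/y there, and undo the mapping.
// CP and CQ are PLACEHOLDERS, not real minimax coefficients; the actual
// values must be generated with test/math/dilogMinimaxApprox.m.
inline float li2f_sketch(float x) noexcept
{
   const float PI = 3.14159265f;
   const float CP[] = { 1.0f, 0.0f, 0.0f };       // numerator (placeholder)
   const float CQ[] = { 1.0f, 0.0f, 0.0f, 0.0f }; // denominator (placeholder)

   float y = 0.0f, r = 0.0f, s = 1.0f;

   if (0.0f <= x && x < 0.5f) {       // already in the approximation interval
      y = x;
   } else if (0.5f <= x && x < 1.0f) {
      // reflection: Li2(x) = pi^2/6 - ln(x) ln(1-x) - Li2(1-x)
      y = 1.0f - x;
      r = PI*PI/6.0f - std::log(x)*std::log1p(-x);
      s = -1.0f;
   } else {
      // x < 0 and x >= 1 need further identities, as in Li2(double);
      // omitted in this sketch
      return 0.0f;
   }

   const float p = CP[0] + y*(CP[1] + y*CP[2]);
   const float q = CQ[0] + y*(CQ[1] + y*(CQ[2] + y*CQ[3]));

   return r + s*y*p/q;
}
```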

I could help with that, if you like. (So far I haven't seen an application for this, but it seems you have one.)

@dylanede
Author

dylanede commented Jul 20, 2022

Thanks for the detailed reply!

I have determined using that script that for the target of 2e-6 relative error, the numbers of coefficients for the numerator and denominator are two and three respectively (giving a relative error of approximately 8e-9). I'll see what I can do with the test cases.

By the way, the main functions I am interested in are the complex versions of Li2, Li3 and Li4, so if/when I submit a PR, it will likely be for those plus their dependencies.

@Expander
Owner

Many thanks in advance for preparing a PR!

> I have determined using that script that for the target of 2e-6 relative error, the numbers of coefficients for the numerator and denominator are two and three respectively (giving a relative error of approximately 8e-9).

That's pretty good in the sense that the number of coefficients is not very big. (I guess reducing the number of coefficients further would give a relative error of more than ~1e-6, right?)

> By the way, the main functions I am interested in are the complex versions of Li2, Li3 and Li4, so if/when I submit a PR, it will likely be for those plus their dependencies.

For the complex versions I've used the helper functions in test/math/polylog.m, which contains all the different expansions in analytic form. But I guess you won't need them, because for the complex versions I've used a simple Taylor expansion that you could chop off at a sufficient order.
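
For illustration, a chopped-off Taylor series could look like this (a sketch using the naive series in z, not necessarily the expansion variable the library uses; the truncation order N = 15 is an untested guess for single precision at |z| <= 1/2):

```cpp
#include <complex>

// Truncated Taylor series Li2(z) = sum_{k>=1} z^k / k^2, usable for small
// |z| only; general arguments must first be mapped into such a region.
// N = 15 is an assumed truncation order for single precision at
// |z| <= 1/2, not a verified bound.
std::complex<float> li2_series(const std::complex<float>& z, int N = 15)
{
   std::complex<float> sum(0.0f, 0.0f);
   std::complex<float> zn(1.0f, 0.0f); // holds z^k
   for (int k = 1; k <= N; ++k) {
      zn *= z;
      sum += zn/static_cast<float>(k*k);
   }
   return sum;
}
```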

Just out of curiosity, if you don't mind: What exactly is your application? Since you care about performance so much, I guess you're evaluating these polylogarithms millions of times, right?

@dylanede
Author

> That's pretty good in the sense that the number of coefficients is not very big. (I guess reducing the number of coefficients further would give a relative error of more than ~1e-6, right?)

Yes, though I did notice that for a given input number of coefficients, the number that comes out is one more than that for both the numerator and denominator, with the first coefficient of each being 1. So really it's three and four. The next step down in coefficient count produced a relative error of 4.4e-6.
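
For illustration, the resulting rational function would be evaluated with a Horner scheme along these lines (a sketch; p and q hold the non-trivial coefficients produced by dilogMinimaxApprox.m):

```cpp
// Horner evaluation of the (3,4)-coefficient rational approximation
// described above: the leading numerator and denominator coefficients
// are fixed to 1, so only 2 + 3 values remain to be stored.
inline float eval_rational(float y, const float p[2], const float q[3])
{
   const float num = 1.0f + y*(p[0] + y*p[1]);
   const float den = 1.0f + y*(q[0] + y*(q[1] + y*q[2]));
   return num/den;
}
```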

> Just out of curiosity, if you don't mind: What exactly is your application? Since you care about performance so much, I guess you're evaluating these polylogarithms millions of times, right?

Sure. The application relies heavily on parallel Monte-Carlo sampling of distributions proportional to the Green's functions for Laplace's equation on rectangular regions. These functions turn out to be expressible in terms of logarithms of the Jacobi theta functions, and the integrals needed to perform the sampling of them end up as quickly converging (within two or three terms) series involving complex polylogarithms.

I'm currently using the double precision versions simply adjusted to use floats instead, since double precision currently isn't necessary, but I thought I'd enquire about how much effort would be necessary to optimise the functions for floats, in case I find myself needing more performance. So I can't guarantee that the PR will be soon.

@Expander
Owner

> Yes, though I did notice that for a given input number of coefficients, the number that comes out is one more than that for both the numerator and denominator, with the first coefficient of each being 1. So really it's three and four. The next step down in coefficient count produced a relative error of 4.4e-6.

Some time ago I found this old paper [Morris], which gives several rational function approximations of different precisions. For double precision (index 0011, 16.2 decimal digits of precision) it uses 5 terms in the numerator and 6 in the denominator, one of which is equal to 1.

For single precision, where one may be satisfied with 6.7 decimal digits of precision, Morris uses 2 numerator and 2 denominator terms (index 0004), one of which is set to 1. If I understand correctly, this matches your observation.

> Sure. The application relies heavily on parallel Monte-Carlo sampling of distributions proportional to the Green's functions for Laplace's equation on rectangular regions. These functions turn out to be expressible in terms of logarithms of the Jacobi theta functions, and the integrals needed to perform the sampling of them end up as quickly converging (within two or three terms) series involving complex polylogarithms.
>
> I'm currently using the double precision versions simply adjusted to use floats instead, since double precision currently isn't necessary, but I thought I'd enquire about how much effort would be necessary to optimise the functions for floats, in case I find myself needing more performance. So I can't guarantee that the PR will be soon.

Very interesting, thanks!

No worries regarding the time scale. :)

@Expander
Owner

Expander commented Sep 3, 2023

I've drafted a single-precision version of the real dilogarithm li2(float) in the branch feature-li2-float. On my laptop it is faster by a factor of ~1.4. I'll see if I can optimize it further.
