do not ignore explicitly given mantissa width #868
I haven't yet understood the function completely, but it looks
Chez also runs out of memory with Your example, on the other hand, is an inexact number, so the final result won't take much space. The question is whether the calculation has to take that much space, especially in the less trivial case
Here's what I'm seeing:
I don't see why there's inherently a problem here. As long as the number written before the
(Please excuse the delayed answer; it was night here.) Have you tried
and much larger denominators for higher mantissa widths. It would be wrong to truncate the precision. The reason is that 1/10 has a period of length 4 in binary representation. In fact,
in binary representation. From this, we can deduce that
in binary representation, where I rounded to even. This means that larger and larger denominators (all powers of two) are needed the larger the mantissa width is.
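To make the growing denominators concrete, here is a minimal sketch in Python with exact rationals. It is not Chez Scheme and not the code in s/strnum.ss; `round_to_bits` is a hypothetical helper that rounds to `p` significant bits with ties to even. It rounds 1/10 to a few mantissa widths and prints the base-2 logarithm of the resulting denominator.

```python
from fractions import Fraction

def round_to_bits(x, p):
    """Round the exact rational x to p significant bits, ties to even."""
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Find e with 2**(e-1) <= x < 2**e.
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if x >= Fraction(2) ** e:
        e += 1
    scale = Fraction(2) ** (p - e)
    # round() on a Fraction rounds half to even and returns an int.
    return sign * round(x * scale) / scale

tenth = Fraction(1, 10)
for p in (4, 8, 16, 64):
    r = round_to_bits(tenth, p)
    # The denominator is a power of two; print its exponent.
    print(p, r.denominator.bit_length() - 1)
```

For these widths the exponent comes out as p + 3, so the exact result needs a denominator that grows with the requested width.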
Sorry, I misunderstood what you meant by "Chez also", and I was confused about fractions and binary representations. Thank you for the tutorial! It makes sense that It still seems like
Also, the results of
Oh, I see. Indeed, what I wrote wasn't very clear.
Consider the following number in binary notation (where N* means to repeat the following binary digit N times):
Let If N is sufficiently larger than p and q sufficiently larger than N, we have that
I do not have a full characterisation. But a denominator N means that the quotient has a period length of at most N - 1, so one should be able to reduce the case of an arbitrary mantissa width roughly to the case of a mantissa width <= 2*N. But I wonder whether it makes sense to spend the time getting the details right and writing extra code for huge mantissa widths. In practice, the largest mantissa widths may come up when using libraries like GNU MPFR. Do you think someone would use floats that use megabytes of memory?
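The bound on the period length can be checked with a small Python sketch (illustrative only; `binary_period` is a hypothetical name). It relies on the standard fact that the period of 1/N in binary is the multiplicative order of 2 modulo the odd part of N, and it assumes N >= 1.

```python
def binary_period(n):
    """Length of the repeating block of 1/n in binary (1 if it terminates)."""
    # Factors of two in n only lengthen the transient, so strip them.
    while n % 2 == 0:
        n //= 2
    if n == 1:
        return 1  # terminating expansion: the repetend is a single 0
    # The period is the multiplicative order of 2 modulo the odd part.
    k, r = 1, 2 % n
    while r != 1:
        r = (r * 2) % n
        k += 1
    return k

print(binary_period(10))  # 4, matching the period of 1/10 mentioned earlier
print(all(binary_period(n) <= n - 1 for n in range(2, 500)))
```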
Maybe it is not that complicated to actually implement mantissa truncation.
The estimates can possibly be off by 1 or 2, but one can code safely.
That sounds really great. As we've established, I'm not clear on the math, but it certainly sounds plausible. My experience with Scheme numbers is that these details end up being worthwhile, even though it means extra code, and even though the happy spaces often end up being complex (e.g., only power-of-two denominators). Unfortunately, my experience with Scheme numbers is also that I have to learn a lot of new things, and then I forget them soon afterward!
Hi @mnieper, I pushed a commit to add precision bounding in (I think) the way you describe. Does it look right? Do I understand correctly that this captures all of the cases where the end result number can be represented with about the same amount of memory as the number without a precision adjustment?
Thanks a lot, Matthew. I am going to take a look at it within the next few days. (I wanted to have come up with some code as well but haven't found the time.)
@mnieper Have you had a chance to take a look? (I have been thinking that it would make sense to create a v10.1.0 release soon, but waiting for this change.)
Sorry, this somehow fell off my radar. I have some comments (and, hopefully, an improvement of my earlier analysis), which I will post as soon as possible.
My original analysis of the inexact case (which needed the period length and which you didn't implement) was too complicated; the simpler analysis here should suffice (and leads to a simple formula).
s/strnum.ss
Outdated
[(= b (bitwise-arithmetic-shift-left 1 (- b-bits 1)))
 ;; no need for extra precision if the
 ;; denominator is a power of 2
 (min p (+ a-bits b-bits))]
If `b` is a power of two, the number of significant bits of `n` is just the number of significant bits of `a`. (In the issue, I wrote erroneously "denominator" instead of "numerator" at one point.) Thus, `(min p a-bits)` should also work.
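A quick way to see why the numerator's bit count suffices is the following Python sketch with exact rationals. It is an illustration, not the Chez code: `round_to_bits` is a hypothetical helper (round to `p` significant bits, ties to even), and the values of `a` and `k` are made up.

```python
from fractions import Fraction

def round_to_bits(x, p):
    """Round the exact rational x to p significant bits, ties to even."""
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Find e with 2**(e-1) <= x < 2**e.
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if x >= Fraction(2) ** e:
        e += 1
    scale = Fraction(2) ** (p - e)
    return sign * round(x * scale) / scale

a, k = 0b1011011, 40           # hypothetical example value a / 2**k
x = Fraction(a, 2 ** k)
a_bits = a.bit_length()

for p in (a_bits, 53, 10_000):
    # Any width of at least a_bits reproduces x exactly, so clamping
    # p to a_bits cannot change the result.
    assert round_to_bits(x, p) == round_to_bits(x, min(p, a_bits)) == x
```

Widths below `a_bits` are unaffected by the clamp, since `min` then returns `p` itself.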
s/strnum.ss
Outdated
;; bound p; we don't need a tight bound, and adding 2
;; extra bits over `double` precision to make sure
;; rounding will be right
(min p (+ (max a-bits b-bits) 53 2))]
We are in the case where a binary fraction `x` is first rounded to `p` significant bits and then to 53 significant bits (round to even in case of a tie). This is, in general, not the same as rounding directly to 53 significant bits. The case where it may differ looks like `X.1Y`. Here, `X` represents 53 significant binary digits, the dot is the point of rounding and `Y` are all further binary digits. The rounding of `X.1Y` to 53 binary digits depends on whether `Y` vanishes or not. If we first round `X.1Y` to `p` significant digits, it may happen that a nonzero `Y` becomes zero after rounding. Because of this, we must make sure that if we truncate `p`, the answer to the question of whether `Y` becomes zero or not won't change.
Every binary fraction is a transient (non-repeating digits, possibly trivial) followed by a repetend (repeating digits, possibly just zeros). We are allowed to truncate `p` when `p` points into the repetend and remains there after truncation (because this won't change whether `Y` becomes zero after rounding). So, for this, we have to estimate the length of the transient. This has at most `(+ a-bits b-bits)` bits. Together with the 53 bits for `X` and `2` bits for a safety measure to exclude possible edge cases, the formula should therefore be `(min p (+ a-bits b-bits 53 2))`.
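The double-rounding hazard can be reproduced with a short Python sketch using exact rationals. This is only an illustration of the argument, not the Chez implementation: `round_to_bits` is a hypothetical helper, and the concrete `X`, `Y`, and widths are chosen for the example.

```python
from fractions import Fraction

def round_to_bits(x, p):
    """Round the exact rational x to p significant bits, ties to even."""
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Find e with 2**(e-1) <= x < 2**e.
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if x >= Fraction(2) ** e:
        e += 1
    scale = Fraction(2) ** (p - e)
    return sign * round(x * scale) / scale

X = 2 ** 52                                    # 53 significant bits, even
x = X + Fraction(1, 2) + Fraction(1, 2 ** 20)  # X.1Y with a nonzero Y

direct = round_to_bits(x, 53)                    # sees Y, rounds up
via_60 = round_to_bits(round_to_bits(x, 60), 53) # Y is lost, tie goes to even
print(direct == X + 1, via_60 == X)              # True True

# Clamping a huge width with the bound from the analysis keeps the result.
a_bits = x.numerator.bit_length()
b_bits = x.denominator.bit_length()
bound = a_bits + b_bits + 53 + 2
p = 100_000
via_p = round_to_bits(round_to_bits(x, p), 53)
via_bound = round_to_bits(round_to_bits(x, min(p, bound)), 53)
print(via_p == via_bound)                        # True
```

A width of 60 is below the bound, so it is not clamped and keeps its (different) result; only widths above the bound are reduced.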
PS I didn't mention that the repetend may start with zero digits. The analysis remains correct when we count the zero digits as part of the transient.
Avoid running out of memory for a very large precision request when the number with adjusted precision should take about as much memory as the number without an adjustment.
@mnieper Thanks for the corrections! I've made adjustments and rebased. Can you check one last time to make sure I didn't mangle the change? I wasn't able to find any inputs that produce different results before and after the last change, trying millions of samples. That doesn't mean much, since my grasp of the arithmetic is weak. But my fuzzing script was able to find counterexamples within 10000 samples when I mangled the exact case to use
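For readers who want to try a similar check, here is a hypothetical fuzzing sketch in Python. It is not the script used for this PR and does not call Chez; it models the inexact case as "round to `p` bits, then to 53" on exact rationals and compares an unclamped width with the width clamped by `a-bits + b-bits + 53 + 2`. All names and sample ranges are assumptions.

```python
import random
from fractions import Fraction

def round_to_bits(x, p):
    """Round the exact rational x to p significant bits, ties to even."""
    if x == 0:
        return x
    sign = -1 if x < 0 else 1
    x = abs(x)
    # Find e with 2**(e-1) <= x < 2**e.
    e = x.numerator.bit_length() - x.denominator.bit_length()
    if x >= Fraction(2) ** e:
        e += 1
    scale = Fraction(2) ** (p - e)
    return sign * round(x * scale) / scale

def to_double_via(x, p):
    """Model of the inexact case: round to p bits, then to 53 bits."""
    return round_to_bits(round_to_bits(x, p), 53)

def fuzz(samples, seed=0):
    """Return the (a, b, p) triples where clamping p changes the result."""
    rng = random.Random(seed)
    mismatches = []
    for _ in range(samples):
        a = rng.getrandbits(rng.randint(1, 80)) + 1
        b = rng.getrandbits(rng.randint(1, 80)) + 1
        p = rng.randint(1, 400)
        bound = a.bit_length() + b.bit_length() + 53 + 2
        x = Fraction(a, b)
        if to_double_via(x, p) != to_double_via(x, min(p, bound)):
            mismatches.append((a, b, p))
    return mismatches

print(len(fuzz(10_000)))
```

An empty mismatch list is evidence rather than proof; swapping in a tighter bound inside `fuzz` is a cheap way to probe whether that bound is still safe.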
My analysis was very conservative, so it may very well be that there is no counterexample for your earlier code. I just want to make sure that we are on the safe side. We should improve the estimate once we have a formal proof that a tighter estimate also works. But I think this does not need to be done in the coming Chez version. I think I am
This fixes the issue raised in #866, making mantissa widths meaningful.