Implement shortcut for 0.0 and 1.0 percentile calculations, and add two new tests #382
@@ -906,11 +906,27 @@ else if (l1[0] == null)

    /**
     * Get a {@link Collector} that calculates the derived <code>PERCENTILE_DISC(percentile)</code> function given a specific ordering.
     *
     * @param function map the items in the streams into values
     * @param comparator comparator used for sorting the items
     * @return a collector that calculates the derived <code>PERCENTILE_DISC(percentile)</code> function
     */
    public static <T, U> Collector<T, ?, Optional<T>> percentileBy(double percentile, Function<? super T, ? extends U> function, Comparator<? super U> comparator) {
        if (percentile < 0.0 || percentile > 1.0)
            throw new IllegalArgumentException("Percentile must be between 0.0 and 1.0");

        // CS304 Issue link: https://github.com/jOOQ/jOOL/issues/376
I don't know what CS304 means. Some reference to some external tracking system? There's no need for this, I will remove it. The convention to track GitHub issues (if necessary) would be to use: // [#376] Rationale

        if (percentile == 0.0)
            // If percentile is 0, this is the same as taking the item with the minimum value.
            return minBy(function, comparator);
The comment says the same thing as the method call, so it isn't really necessary. I'll remove it again after merging.

        else if (percentile == 1.0)
            // If percentile is 1, this is the same as taking the item with the maximum value,
            // If there are multiple maxima, take the last one.
            return maxBy(function, (o1, o2) -> {
                int compareResult = comparator.compare(o1, o2);
                return compareResult == 0 ? -1 : compareResult;
This hack violates the […]

This seems to be the correct way to implement this: collectingAndThen(maxAllBy(function, comparator), s -> s.findLast())

Emmm, I think my solution is not incorrect, because the function […] It's perfectly fine to implement this with […]

It's incorrect. If […]
You shouldn't rely on such an implementation detail.
Famous last words :)
I'm open to other suggestions, but correctness always beats performance.

Yes, you are right. Thanks for your reply!

            });

        // At a later stage, we'll optimise this implementation in case that function is the identity function
        return Collector.of(
            () -> new ArrayList<Tuple2<T, U>>(),

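The review thread on the `percentile == 1.0` branch hinges on the `Comparator` contract. The Javadoc for `java.util.Comparator` requires `signum(compare(x, y)) == -signum(compare(y, x))` for all `x` and `y`, and a comparator that reports ties as -1 cannot satisfy this for equal values. The following is a minimal sketch using only the JDK, not jOOL's `maxBy` or `maxAllBy`. The names `LastMaxSketch`, `tiesAsLess`, `isSymmetric`, and `lastMaxBy` are hypothetical and introduced here only for illustration.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Optional;
import java.util.function.Function;

class LastMaxSketch {

    // Wraps a comparator the way the lambda in the diff does: ties are reported as "less than".
    static <U> Comparator<U> tiesAsLess(Comparator<? super U> comparator) {
        return (o1, o2) -> {
            int compareResult = comparator.compare(o1, o2);
            return compareResult == 0 ? -1 : compareResult;
        };
    }

    // Checks the symmetry rule from the Comparator Javadoc for a single pair of values.
    static <U> boolean isSymmetric(Comparator<? super U> comparator, U x, U y) {
        return Integer.signum(comparator.compare(x, y)) == -Integer.signum(comparator.compare(y, x));
    }

    // Hypothetical helper (not a jOOL API): returns the last item whose key is maximal,
    // leaving the comparator untouched. Items are assumed to be non-null.
    static <T, U> Optional<T> lastMaxBy(List<T> items, Function<? super T, ? extends U> function, Comparator<? super U> comparator) {
        T best = null;
        for (T item : items)
            // ">= 0" lets a later item with an equal key replace the earlier one
            if (best == null || comparator.compare(function.apply(item), function.apply(best)) >= 0)
                best = item;
        return Optional.ofNullable(best);
    }

    public static void main(String[] args) {
        Comparator<Integer> natural = Comparator.naturalOrder();

        // The natural ordering is symmetric for equal values; the wrapped one is not.
        System.out.println(isSymmetric(natural, 1, 1));
        System.out.println(isSymmetric(tiesAsLess(natural), 1, 1));

        // "bb" and "cc" share the maximum length; the later one is selected.
        System.out.println(lastMaxBy(Arrays.asList("a", "bb", "cc", "d"), String::length, natural));
    }
}
```

Expressing the tie-break in the selection step (`>= 0`) rather than in the comparator yields the last maximum without depending on the order in which a particular `maxBy` implementation passes its arguments.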
@@ -929,11 +945,6 @@ else if (size == 1)

                l.sort(Comparator.comparing(t -> t.v2, comparator));

                if (percentile == 0.0)
                    return Optional.of(l.get(0).v1);
                else if (percentile == 1.0)
                    return Optional.of(l.get(size - 1).v1);

                // x.5 should be rounded down
                return Optional.of(l.get((int) -Math.round(-(size * percentile + 0.5)) - 1).v1);
            }

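The index expression in the last `return` of this hunk is easier to follow with concrete numbers. `Math.round` rounds ties toward positive infinity, so negating both the argument and the result rounds ties down; in exact arithmetic the whole expression works out to `ceil(size * percentile) - 1`. The sketch below evaluates it for an assumed list size of 10. `PercentileIndexSketch` and `index` are hypothetical names used only for this illustration and are not part of jOOL.

```java
class PercentileIndexSketch {

    // Same expression as in the diff: round (size * percentile + 0.5) with ties going down,
    // then subtract 1 to obtain a 0-based list index.
    static int index(int size, double percentile) {
        return (int) -Math.round(-(size * percentile + 0.5)) - 1;
    }

    public static void main(String[] args) {
        int size = 10; // assumed list size, for illustration only
        for (double percentile : new double[] { 0.0, 0.25, 0.5, 0.75, 1.0 })
            System.out.println("percentile=" + percentile + " -> index " + index(size, percentile));
    }
}
```

For `size = 10` this gives the indexes -1, 2, 4, 7, and 9. The value -1 for percentile 0.0 is not a valid list index, which is why that case needs a branch of its own, whereas percentile 1.0 already maps to `size - 1`.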
It's probably a good idea to document these in general, but I prefer this be done in a separate task, for the entire API, not just this method. I've created a new issue to track this: #388.