-
Notifications
You must be signed in to change notification settings - Fork 1.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GeoTile and GeoHash Grid aggregations on GeoShapes. #5589
Conversation
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple minor comments, but as I'm not a subject matter expert here I unfortunately can't really give a good review on the core of this change.
modules/geo/src/internalClusterTest/java/org/opensearch/geo/GeoModulePluginIntegTestCase.java
Show resolved
Hide resolved
...rc/internalClusterTest/java/org/opensearch/geo/search/aggregations/bucket/GeoHashGridIT.java
Outdated
Show resolved
Hide resolved
* @return A list of the new registrar functions | ||
*/ | ||
@Override | ||
public List<Consumer<ValuesSourceRegistry.Builder>> getAggregationExtentions() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you please clarify why this registration has been removed? I do not 100% understand the consequences yet, but the comment documents the intention pretty well why it is needed: GEO_SHAPE
is defined in core.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was removed because I move it to GeoBoundsAggregatorFactory.java. When I was working on GeoBounds Aggregation I was thinking that to add new aggregations I need to put it in getAggregationExtentions(). But while I am working on GeoTile and GeoHash I realized that these aggregations can be added in *AggregatorFactory.class . The PR contains that change.
I hope this clarifies.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, yeah, I saw that, really unsure about the consequences but the change (AggregatorFactory
) looks good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested the changes via IT and also manual testing it works. Also, AggregatorFactory is used to register the aggregation. Given that this is same aggregation but on a different SourceValue type I don't see any issue. The only thing I can think of is, is this the right place from a code abstraction standpoint. I will wait for @nknize to comment on this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But as per my understanding of the Aggregation framework, I think this looks good, as aggregations are present in same module/plugin. So we are good. getAggregationExtentions : function is provided to extend the aggregations by the plugin for a different Source value type but having same aggregation name defined in core. Doc link: https://github.com/opensearch-project/OpenSearch/blob/main/server/src/main/java/org/opensearch/plugins/SearchPlugin.java#L174-L176
...c/main/java/org/opensearch/geo/search/aggregations/bucket/geogrid/util/GeoShapeHashUtil.java
Outdated
Show resolved
Hide resolved
Gradle Check (Jenkins) Run Completed with:
|
Gradle Check (Jenkins) Run Completed with:
|
Codecov Report
@@ Coverage Diff @@
## main #5589 +/- ##
============================================
+ Coverage 70.66% 70.72% +0.06%
- Complexity 58526 58687 +161
============================================
Files 4769 4772 +3
Lines 280686 280776 +90
Branches 40533 40547 +14
============================================
+ Hits 198333 198582 +249
+ Misses 66093 65878 -215
- Partials 16260 16316 +56
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here. |
@nknize can you please take a look on the PR? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Beautiful!!! One little nitpick.. can be done in a follow up PR. Thanks!
* methods which can be used across different bucket aggregations. If there is any common code that can be used | ||
* across other integration test too then this is not the class. Use {@link GeoModulePluginIntegTestCase} | ||
*/ | ||
public abstract class AbstractBucketAggregationIntegTest extends GeoModulePluginIntegTestCase { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: maybe AbstractGeoBucketAggregationIntegTest
to avoid confusion w/ other tests like AbstractBucketMetricsTestCase
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have fixed it. @nknize can you please merge in this PR after squashing the commits.
Signed-off-by: Navneet Verma <[email protected]>
…apes Signed-off-by: Navneet Verma <[email protected]>
…t index preparation code. Signed-off-by: Navneet Verma <[email protected]>
Gradle Check (Jenkins) Run Completed with:
|
The backport to
To backport manually, run these commands in your terminal: # Fetch latest updates from GitHub
git fetch
# Create a new working tree
git worktree add ../.worktrees/backport-2.x 2.x
# Navigate to the new working tree
pushd ../.worktrees/backport-2.x
# Create a new branch
git switch --create backport/backport-5589-to-2.x
# Cherry-pick the merged commit of this pull request and resolve the conflicts
git cherry-pick -x --mainline 1 ee58457bd5a7313bab201edd38195902f859027c
# Push it to GitHub
git push --set-upstream origin backport/backport-5589-to-2.x
# Go back to the original working tree
popd
# Delete the working tree
git worktree remove ../.worktrees/backport-2.x Then, create a pull request where the |
This PR is for 2.7 release. Will add this PR in 2.x branch once 2.6 is released |
@navneet1v Just curious, why can't you backport this now for the next release? Is the code here somehow incomplete? |
The code is not incomplete. This feature has a UI part also which is not completed. Hence I am not merging it in 2.x Once it is merged in 2.x then it has to go in next release which is 2.6 |
I understand we might hold of on announcing the feature until the UI part is ready, but if the server portion is complete why not merge it when it is ready? In general early and continuous integration is better and it is one less thing to keep track of. It is probably not a big deal in this case but may be a useful example as a precedent. |
Help me understand this, if a feature is required to be launched in 2.7 can i merge it early like before 2.6 launch in 2.x branch? |
From a purely technical point of view, if a backward compatible change is committed to main then it should be backported to 2.x. The risk of not doing so is that if another change comes in that touches the same part of code as this change did, then it won't be able to cleanly backport because of the divergent branches. All committed code should be releasable. I believe the code here is in fact releasable (technically), but is being held back for other reasons, and that is what I'm pushing on. As a thought experiment, if we decided to scrap the 2.6 release and instead release 3.0, what would you do? Would you revert this code? |
What you are saying about this is correct. If conflicting changes come it won't be a clean backport. But here is my question with the current branching strategy if we are working on a change that requires multiple commits are we not supposed to merge in main? Where each commit can be built separately? If I merge the changes for a release which is going to happen like 3 months down the road putting it in 2.x branch will make the cutting of 2.6 branches difficult. We need to revert some changes from 2.x which will add more burden on the release owner. The strategy I am implementing here is keep on merging changes in main branch and only backport a feature when all other parts are fully ready. If let's say I merge a big feature in 1 single commit, then approving that PR will be very difficult as it might contain like 100 files in it. I also cannot create a feature branch where I can put all the approved commits and then finally merge in main and 2.x because I don't have permission to create feature branch or keep it in sync with main. On scrapping of all next 2.x releases and releasing 3.0 using main case, if that happens and we cut a 3.x branch. I am hoping maintainers will provide enough time to revert the commits which are not intended to be released. Moreover, all this is my understanding based on what we do in our other repos that follow same branching strategy. I am adding @dblock to comment here. As he originally proposed the branching strategy. He might have some thoughts here. |
To be clear, the alternative I'm advocating here is that we should backport this PR and these server-side changes should be available to users in 2.6. Why is this not acceptable? I am absolutely not advocating for developing a feature in a single big commit. But the incremental commits along the way should be releasable. |
…roject#5589) Src files for GeoTile and GeoHash Aggregations on GeoShape with integration tests. Signed-off-by: Navneet Verma <[email protected]>
…roject#5589) Src files for GeoTile and GeoHash Aggregations on GeoShape with integration tests. Signed-off-by: Navneet Verma <[email protected]>
…roject#5589) Src files for GeoTile and GeoHash Aggregations on GeoShape with integration tests. Signed-off-by: Navneet Verma <[email protected]>
…roject#5589) Src files for GeoTile and GeoHash Aggregations on GeoShape with integration tests. Signed-off-by: Navneet Verma <[email protected]>
* Add GeoTile and GeoHash Grid aggregations on GeoShapes. (#5589) Src files for GeoTile and GeoHash Aggregations on GeoShape with integration tests. Signed-off-by: Navneet Verma <[email protected]> * [opensearch-project/geospatial#212] Fixing the IT for GeoTilesAggrega… (#6120) Fixing the IT for GeoTilesAggregation. Signed-off-by: Navneet Verma <[email protected]> * [#6187, #6222] Fixing the GeoShapes GeoHash and GeoTile Aggregations Integration tests. (#6242) Changes done: * Fixed the ArrayIndexOutOfBoundsException. * Reduced the precision for GeoShapes Aggregation IT testing. Signed-off-by: Navneet Verma <[email protected]> * [#7101] Fixing the GeoTileIT#testMultivaluedGeoPointsAggregation test case. (#7166) The issue was happening because we encode the GeoPoint as long and error comes in the precision due to that encoding. The error was not taken care while generating the exepected tiles count for execpected output. Signed-off-by: Navneet Verma <[email protected]> --------- Signed-off-by: Navneet Verma <[email protected]> Signed-off-by: Heemin Kim <[email protected]> Co-authored-by: Navneet Verma <[email protected]>
Description
Issues Resolved
#4072 , #4071
Check List
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.