Skip to content

Commit

Permalink
Adding approximate point range query
Browse files Browse the repository at this point in the history
Signed-off-by: Harsha Vamsi Kalluri <[email protected]>
  • Loading branch information
harshavamsi committed Aug 21, 2024
1 parent 95236d6 commit 2a6d7bb
Show file tree
Hide file tree
Showing 15 changed files with 1,254 additions and 607 deletions.
41 changes: 12 additions & 29 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,35 +5,18 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),

## [Unreleased 2.x]
### Added
- Constant Keyword Field ([#12285](https://github.com/opensearch-project/OpenSearch/pull/12285))
- Convert ingest processor supports ip type ([#12818](https://github.com/opensearch-project/OpenSearch/pull/12818))
- Add a counter to node stat api to track shard going from idle to non-idle ([#12768](https://github.com/opensearch-project/OpenSearch/pull/12768))
- Allow setting KEYSTORE_PASSWORD through env variable ([#12865](https://github.com/opensearch-project/OpenSearch/pull/12865))
- [Concurrent Segment Search] Perform buildAggregation concurrently and support Composite Aggregations ([#12697](https://github.com/opensearch-project/OpenSearch/pull/12697))
- [Concurrent Segment Search] Disable concurrent segment search for system indices and throttled requests ([#12954](https://github.com/opensearch-project/OpenSearch/pull/12954))
- Rename ingest processor supports overriding target field if exists ([#12990](https://github.com/opensearch-project/OpenSearch/pull/12990))
- [Tiered Caching] Make took time caching policy setting dynamic ([#13063](https://github.com/opensearch-project/OpenSearch/pull/13063))
- Derived fields support to derive field values at query time without indexing ([#12569](https://github.com/opensearch-project/OpenSearch/pull/12569))
- Detect breaking changes on pull requests ([#9044](https://github.com/opensearch-project/OpenSearch/pull/9044))
- Add cluster primary balance contraint for rebalancing with buffer ([#12656](https://github.com/opensearch-project/OpenSearch/pull/12656))
- [Remote Store] Make translog transfer timeout configurable ([#12704](https://github.com/opensearch-project/OpenSearch/pull/12704))
- Reject Resize index requests (i.e, split, shrink and clone), While DocRep to SegRep migration is in progress.([#12686](https://github.com/opensearch-project/OpenSearch/pull/12686))
- Add support for more than one protocol for transport ([#12967](https://github.com/opensearch-project/OpenSearch/pull/12967))
- [Tiered Caching] Add dimension-based stats to ICache implementations. ([#12531](https://github.com/opensearch-project/OpenSearch/pull/12531))
- Add changes for overriding remote store and replication settings during snapshot restore. ([#11868](https://github.com/opensearch-project/OpenSearch/pull/11868))
- Add an individual setting of rate limiter for segment replication ([#12959](https://github.com/opensearch-project/OpenSearch/pull/12959))
- [Tiered Caching] Expose new cache stats API ([#13237](https://github.com/opensearch-project/OpenSearch/pull/13237))
- [Streaming Indexing] Ensure support of the new transport by security plugin ([#13174](https://github.com/opensearch-project/OpenSearch/pull/13174))
- Add cluster setting to dynamically configure the buckets for filter rewrite optimization. ([#13179](https://github.com/opensearch-project/OpenSearch/pull/13179))
- [Tiered Caching] Gate new stats logic behind FeatureFlags.PLUGGABLE_CACHE ([#13238](https://github.com/opensearch-project/OpenSearch/pull/13238))
- [Tiered Caching] Add a dynamic setting to disable/enable disk cache. ([#13373](https://github.com/opensearch-project/OpenSearch/pull/13373))
- [Remote Store] Add capability of doing refresh as determined by the translog ([#12992](https://github.com/opensearch-project/OpenSearch/pull/12992))
- [Batch Ingestion] Add `batch_size` to `_bulk` API. ([#12457](https://github.com/opensearch-project/OpenSearch/issues/12457))
- [Tiered caching] Make Indices Request Cache Stale Key Mgmt Threshold setting dynamic ([#12941](https://github.com/opensearch-project/OpenSearch/pull/12941))
- Batch mode for async fetching shard information in GatewayAllocator for unassigned shards ([#8746](https://github.com/opensearch-project/OpenSearch/pull/8746))
- [Remote Store] Add settings for remote path type and hash algorithm ([#13225](https://github.com/opensearch-project/OpenSearch/pull/13225))
- [Remote Store] Upload remote paths during remote enabled index creation ([#13386](https://github.com/opensearch-project/OpenSearch/pull/13386))
- [Search Pipeline] Handle default pipeline for multiple indices ([#13276](https://github.com/opensearch-project/OpenSearch/pull/13276))
- Add fingerprint ingest processor ([#13724](https://github.com/opensearch-project/OpenSearch/pull/13724))
- [Remote Store] Rate limiter for remote store low priority uploads ([#14374](https://github.com/opensearch-project/OpenSearch/pull/14374/))
- Apply the date histogram rewrite optimization to range aggregation ([#13865](https://github.com/opensearch-project/OpenSearch/pull/13865))
- [Writable Warm] Add composite directory implementation and integrate it with FileCache ([12782](https://github.com/opensearch-project/OpenSearch/pull/12782))
- [Workload Management] Add QueryGroup schema ([13669](https://github.com/opensearch-project/OpenSearch/pull/13669))
- Add batching supported processor base type AbstractBatchingProcessor ([#14554](https://github.com/opensearch-project/OpenSearch/pull/14554))
- Fix race condition while parsing derived fields from search definition ([14445](https://github.com/opensearch-project/OpenSearch/pull/14445))
- Add `strict_allow_templates` dynamic mapping option ([#14555](https://github.com/opensearch-project/OpenSearch/pull/14555))
- Add allowlist setting for ingest-common and search-pipeline-common processors ([#14439](https://github.com/opensearch-project/OpenSearch/issues/14439))
- Create SystemIndexRegistry with helper method matchesSystemIndex ([#14415](https://github.com/opensearch-project/OpenSearch/pull/14415))
- Print reason why parent task was cancelled ([#14604](https://github.com/opensearch-project/OpenSearch/issues/14604))
- [Range Queries] Add new approximateable query framework to short-circuit range queries ([#13788](https://github.com/opensearch-project/OpenSearch/pull/13788))

### Dependencies
- Bump `org.apache.commons:commons-configuration2` from 2.10.0 to 2.10.1 ([#12896](https://github.com/opensearch-project/OpenSearch/pull/12896))
Expand Down
Original file line number Diff line number Diff line change
@@ -0,0 +1,72 @@
---
"search with approximate range":
- do:
indices.create:
index: test
body:
mappings:
properties:
date:
type: date
index: true
doc_values: true

- do:
bulk:
index: test
refresh: true
body:
- '{"index": {"_index": "test", "_id": "1" }}'
- '{ "date": "2018-10-29T12:12:12.987Z" }'
- '{ "index": { "_index": "test", "_id": "2" }}'
- '{ "date": "2020-10-29T12:12:12.987Z" }'
- '{ "index": { "_index": "test", "_id": "3" } }'
- '{ "date": "2024-10-29T12:12:12.987Z" }'

- do:
search:
rest_total_hits_as_int: true
index: test
body:
query:
range: {
date: {
gte: "2018-10-29T12:12:12.987Z"
},
}

- match: { hits.total: 3 }

- do:
search:
rest_total_hits_as_int: true
index: test
body:
sort: [{ date: asc }]
query:
range: {
date: {
gte: "2018-10-29T12:12:12.987Z"
},
}


- match: { hits.total: 3 }
- match: { hits.hits.0._id: "1" }

- do:
search:
rest_total_hits_as_int: true
index: test
body:
sort: [{ date: desc }]
query:
range: {
date: {
gte: "2018-10-29T12:12:12.987Z",
lte: "2020-10-29T12:12:12.987Z"
},
}

- match: { hits.total: 2 }
- match: { hits.hits.0._id: "2" }
Original file line number Diff line number Diff line change
Expand Up @@ -63,7 +63,7 @@
import org.opensearch.index.query.QueryShardContext;
import org.opensearch.search.DocValueFormat;
import org.opensearch.search.approximate.ApproximatePointRangeQuery;
import org.opensearch.search.approximate.ApproximateableQuery;
import org.opensearch.search.approximate.ApproximateScoreQuery;
import org.opensearch.search.lookup.SearchLookup;

import java.io.IOException;
Expand All @@ -82,8 +82,8 @@
import java.util.function.LongSupplier;
import java.util.function.Supplier;

import static org.apache.lucene.document.LongPoint.pack;
import static org.opensearch.common.time.DateUtils.toLong;
import static org.apache.lucene.document.LongPoint.pack;

/**
* A {@link FieldMapper} for dates.
Expand Down Expand Up @@ -461,32 +461,52 @@ public Query rangeQuery(
@Nullable DateMathParser forcedDateParser,
QueryShardContext context
) {
failIfNotIndexed();
failIfNotIndexedAndNoDocValues();
if (relation == ShapeRelation.DISJOINT) {
throw new IllegalArgumentException("Field [" + name() + "] of type [" + typeName() + "] does not support DISJOINT ranges");
}
DateMathParser parser = forcedDateParser == null ? dateMathParser : forcedDateParser;
return dateRangeQuery(lowerTerm, upperTerm, includeLower, includeUpper, timeZone, parser, context, resolution, (l, u) -> {
Query query = new ApproximateableQuery(new PointRangeQuery(name(), pack(new long[]{l}).bytes, pack(new long[]{u}).bytes, new long[]{l}.length) {
Query pointRangeQuery = isSearchable() ? createPointRangeQuery(l, u) : null;
Query dvQuery = hasDocValues() ? SortedNumericDocValuesField.newSlowRangeQuery(name(), l, u) : null;
if (isSearchable() && hasDocValues()) {
Query query = new IndexOrDocValuesQuery(pointRangeQuery, dvQuery);

if (context.indexSortedOnField(name())) {
query = new IndexSortSortedNumericDocValuesRangeQuery(name(), l, u, query);
}
return query;
}
if (hasDocValues()) {
Query query = SortedNumericDocValuesField.newSlowRangeQuery(name(), l, u);
if (context.indexSortedOnField(name())) {
query = new IndexSortSortedNumericDocValuesRangeQuery(name(), l, u, query);
}
return query;
}
return pointRangeQuery;
});
}

private Query createPointRangeQuery(long l, long u) {
return new ApproximateScoreQuery(
new PointRangeQuery(name(), pack(new long[] { l }).bytes, pack(new long[] { u }).bytes, new long[] { l }.length) {
protected String toString(int dimension, byte[] value) {
return Long.toString(LongPoint.decodeDimension(value, 0));
}
}, new ApproximatePointRangeQuery(name(), pack(new long[]{l}).bytes, pack(new long[]{u}).bytes, new long[]{l}.length) {
},
new ApproximatePointRangeQuery(
name(),
pack(new long[] { l }).bytes,
pack(new long[] { u }).bytes,
new long[] { l }.length
) {
@Override
protected String toString(int dimension, byte[] value) {
return Long.toString(LongPoint.decodeDimension(value, 0));
}
});
if (hasDocValues()) {
Query dvQuery = SortedNumericDocValuesField.newSlowRangeQuery(name(), l, u);
query = new IndexOrDocValuesQuery(query, dvQuery);

if (context.indexSortedOnField(name())) {
query = new IndexSortSortedNumericDocValuesRangeQuery(name(), l, u, query);
}
}
return query;
});
);
}

public static Query dateRangeQuery(
Expand Down
Loading

0 comments on commit 2a6d7bb

Please sign in to comment.