Commit d8163a5

[9.0] Support DATE_NANOS in LOOKUP JOIN (#127962) (#128958)
* Support DATE_NANOS in LOOKUP JOIN (#127962)

As reported in #127249, there is no support for DATE_NANOS in LOOKUP JOIN, even though DATETIME is supported. This PR attempts to fix that.

Date-time support in LOOKUP JOIN (and ENRICH) relied on `DateFieldMapper.DateFieldType.rangeQuery` (hidden behind the `termQuery` function), which internally takes our long values, casts them to Object, renders them to a string, parses that string back into an Instant (with a number of date-math checks that are unnecessary here), and then converts that Instant back into a long for the actual query. Parts of this process are precision-aware (i.e. they differentiate between ms and ns dates), but not the whole process. Simply dividing the original longs by 1_000_000 before passing them in actually works, but obviously loses precision. And the only reason it works at all is that the date parsing code will accept a string containing a simple number and interpret it as either ms since the epoch, or years if the number is short enough.

This does not work for nanosecond dates, and in fact is far from ideal for LOOKUP JOIN on dates, which does not need to re-parse the values at all. The round trip only makes sense in the Query DSL, where range values can come from all kinds of sources. It is needless for LOOKUP JOIN, where we always provide the join key from a LongBlock (the backing store of the DATE_TIME DataType, and of DATE_NANOS too).

So what we do here for DateNanos is add two new methods to `DateFieldType`:

  * `equalityQuery(Long, ...)` to replace `termQuery(Object, ...)`
  * `rangeQuery(Long, Long, ...)` to replace `rangeQuery(Object, Object, ...)`

This allows us to pass in already parsed `long` values and entirely skip the conversion to strings and the re-parsing logic. The new methods are based on the original methods, but considerably simplified due to the removal of the complex parsing logic.

The reason for having both `equalityQuery` and `rangeQuery` is that it mimics the pattern used by the old `termQuery`, which delegated directly down to `rangeQuery`. In addition, we hope to support range matching in `LOOKUP JOIN` in the near future.

* Use correct parameter name after backport

* [CI] Auto commit changes from spotless

---------

Co-authored-by: elasticsearchmachine <infra-root+elasticsearchmachine@elastic.co>
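To make the precision argument in the commit message concrete, here is a minimal standalone sketch (not code from this PR) of what happens when a nanosecond epoch value is divided by 1_000_000 to fit a millisecond-based path. The class name `DateNanosPrecision` and both helper methods are hypothetical and exist only for illustration.

```java
import java.time.Instant;

public class DateNanosPrecision {

    /** Nanoseconds since the epoch for an Instant; throws ArithmeticException on overflow. */
    public static long toEpochNanos(Instant instant) {
        long secondsAsNanos = Math.multiplyExact(instant.getEpochSecond(), 1_000_000_000L);
        return Math.addExact(secondsAsNanos, instant.getNano());
    }

    /** Hypothetical helper: the value that survives if nanos are squeezed through milliseconds. */
    public static long roundTripViaMillis(long epochNanos) {
        long millis = epochNanos / 1_000_000L; // the "simply divide" workaround
        return millis * 1_000_000L;            // scaling back cannot restore the dropped digits
    }

    public static void main(String[] args) {
        long nanos = toEpochNanos(Instant.parse("2025-01-01T00:00:00.123456789Z"));
        long truncated = roundTripViaMillis(nanos);
        // The sub-millisecond part (456789 ns here) is lost, so an equality
        // match on the original nanosecond key would no longer be exact.
        System.out.println("lost nanos: " + (nanos - truncated));
    }
}
```

Passing the `long` through unchanged, as the new methods do, avoids this loss because no unit conversion or string parsing takes place.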
1 parent 6f404b6 commit d8163a5

11 files changed

+440
-70
lines changed

docs/changelog/127962.yaml

Lines changed: 6 additions & 0 deletions
@@ -0,0 +1,6 @@
+pr: 127962
+summary: Support DATE_NANOS in LOOKUP JOIN
+area: ES|QL
+type: bug
+issues:
+ - 127249

server/src/main/java/org/elasticsearch/index/mapper/DateFieldMapper.java

Lines changed: 53 additions & 12 deletions
@@ -472,19 +472,12 @@ public DateFieldType(String name) {
             this(name, true, true, false, true, DEFAULT_DATE_TIME_FORMATTER, Resolution.MILLISECONDS, null, null, Collections.emptyMap());
         }

+        public DateFieldType(String name, boolean isIndexed, Resolution resolution) {
+            this(name, isIndexed, isIndexed, false, true, DEFAULT_DATE_TIME_FORMATTER, resolution, null, null, Collections.emptyMap());
+        }
+
         public DateFieldType(String name, boolean isIndexed) {
-            this(
-                name,
-                isIndexed,
-                isIndexed,
-                false,
-                true,
-                DEFAULT_DATE_TIME_FORMATTER,
-                Resolution.MILLISECONDS,
-                null,
-                null,
-                Collections.emptyMap()
-            );
+            this(name, isIndexed, Resolution.MILLISECONDS);
         }

         public DateFieldType(String name, DateFormatter dateFormatter) {
@@ -698,6 +691,54 @@ public static long parseToLong(
             return resolution.convert(dateParser.parse(BytesRefs.toString(value), now, roundUp, zone));
         }

+        /**
+         * Similar to the {@link DateFieldType#termQuery} method, but works on dates that are already parsed to a long
+         * in the same precision as the field mapper.
+         */
+        public Query equalityQuery(Long value, @Nullable SearchExecutionContext context) {
+            return rangeQuery(value, value, true, true, context);
+        }
+
+        /**
+         * Similar to the existing
+         * {@link DateFieldType#rangeQuery(Object, Object, boolean, boolean, ShapeRelation, ZoneId, DateMathParser, SearchExecutionContext)}
+         * method, but works on dates that are already parsed to a long in the same precision as the field mapper.
+         */
+        public Query rangeQuery(
+            Long lowerTerm,
+            Long upperTerm,
+            boolean includeLower,
+            boolean includeUpper,
+            SearchExecutionContext context
+        ) {
+            failIfNotIndexedNorDocValuesFallback(context);
+            long l, u;
+            if (lowerTerm == null) {
+                l = Long.MIN_VALUE;
+            } else {
+                l = (includeLower == false) ? lowerTerm + 1 : lowerTerm;
+            }
+            if (upperTerm == null) {
+                u = Long.MAX_VALUE;
+            } else {
+                u = (includeUpper == false) ? upperTerm - 1 : upperTerm;
+            }
+            Query query;
+            if (isIndexed()) {
+                query = LongPoint.newRangeQuery(name(), l, u);
+                if (hasDocValues()) {
+                    Query dvQuery = SortedNumericDocValuesField.newSlowRangeQuery(name(), l, u);
+                    query = new IndexOrDocValuesQuery(query, dvQuery);
+                }
+            } else {
+                query = SortedNumericDocValuesField.newSlowRangeQuery(name(), l, u);
+            }
+            if (hasDocValues() && context.indexSortedOnField(name())) {
+                query = new XIndexSortSortedNumericDocValuesRangeQuery(name(), l, u, query);
+            }
+            return query;
+        }
+
         @Override
         public Query distanceFeatureQuery(Object origin, String pivot, SearchExecutionContext context) {
             failIfNotIndexedNorDocValuesFallback(context);
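
The bound handling in the new `rangeQuery(Long, Long, ...)` can be illustrated without Lucene. The sketch below is an assumption-laden standalone rendering, not code from this commit: the class `InclusiveBounds` and its methods `resolve` and `matches` are hypothetical names. It mirrors only the step that turns optional, possibly exclusive bounds into the closed `[l, u]` interval that the diff then hands to `LongPoint.newRangeQuery`, and it shows why an equality query is just the range `[value, value]`.

```java
public class InclusiveBounds {

    /**
     * Resolves nullable, possibly exclusive bounds to an inclusive {lower, upper} pair.
     * Like the diff, this does not guard against overflow when an exclusive
     * bound is already Long.MAX_VALUE or Long.MIN_VALUE.
     */
    public static long[] resolve(Long lowerTerm, Long upperTerm, boolean includeLower, boolean includeUpper) {
        long l;
        if (lowerTerm == null) {
            l = Long.MIN_VALUE;                        // open-ended lower bound
        } else {
            l = includeLower ? lowerTerm : lowerTerm + 1;
        }
        long u;
        if (upperTerm == null) {
            u = Long.MAX_VALUE;                        // open-ended upper bound
        } else {
            u = includeUpper ? upperTerm : upperTerm - 1;
        }
        return new long[] { l, u };
    }

    /** True if value lies inside the resolved inclusive interval. */
    public static boolean matches(long[] bounds, long value) {
        return bounds[0] <= value && value <= bounds[1];
    }

    public static void main(String[] args) {
        long key = 1735689600123456789L;               // a nanosecond join key
        long[] equality = resolve(key, key, true, true);
        System.out.println(matches(equality, key));     // the key itself matches
        System.out.println(matches(equality, key + 1)); // a neighbour 1 ns away does not
    }
}
```

Because the values are integral longs, an exclusive bound can be converted to an inclusive one by stepping a single unit, which is what allows one closed-interval query to serve both the equality and the range case.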
