osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Review Request (CALCITE-2689). ElasticSearch Adapter. Grouping on date / number fields.


I haven't had a chance to review but saw that Elastic has the same issue
with aggregations.

https://github.com/elastic/elasticsearch/issues/35745

Kevin Risden

On Wed, Nov 28, 2018, 20:46 Andrei Sereda <andrei@xxxxxxxxx wrote:

> Greetings ,
> We have discovered an issue with ES aggregations when grouping on
> non-textual fields (date, long). Currently the following sql fails because
> for missing value
> <
> https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-terms-aggregation.html#_missing_value_13
> >
> we inject __MISSING__ sentinel which is not date / number parseable (it
> can’t be null either) :
>
> select max(amount), date from orders group by date -- special ES type
>
> The solution is to make sentinel type specific :
>
>    1. Integer.MIN_VALUE for integers
>    2. 9999-12-31 for dates etc.
>
> For low cardinality types like boolean, byte, short I’m not sure what to do
> since there is high probability of collision between missing field and
> actual value (eg. what value to choose for missing boolean?) :
>
> select max(amount), isActive from orders group by isActive -- boolean type
>
> Let me know if you solved this problem differently before. Composite
> aggregations
> <
> https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations-bucket-composite-aggregation.html
> >
> (available since 6.1) should help in future.
>
> PR: https://github.com/apache/calcite/pull/946
> JIRA: https://issues.apache.org/jira/browse/CALCITE-2689
>
> Many Thanks in Advance,
> Andrei.
>