osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: 回复: count(distinct col) can't recognize distinct


The plan is correct for the following query:

select channel, count(distinct version) from daming group by channel
76:SolrAggregate(group=[{0}], EXPR$1=[COUNT($1)])
  74:SolrAggregate(group=[{4, 7}])
    0:SolrTableScan(table=[[test03.loc:2181/dmsolr, daming]])

It will first get group by channel and version, then group by channel. the count will be the number of distinct versions. It is an optimization for single aggregation with count distinct.


Thanks ~
Haisheng Yuan
------------------------------------------------------------------
发件人:大明<zteny@xxxxxxxxxxx>
日 期:2018年11月14日 11:20:41
收件人:dev<dev@xxxxxxxxxxxxxxxxxx>
主 题:回复: count(distinct col) can't recognize distinct

Maybe I need to show more detail about this.
I printed the explaination using `input.explain(pw)`.


select channel, count(distinct version), sum(amount) from daming group by channel
500:SolrAggregate(group=[{4}], EXPR$1=[COUNT(DISTINCT $7)], EXPR$2=[SUM($0)])
  0:SolrTableScan(table=[[test03.loc:2181/dmsolr, daming]])


select channel, count(distinct version) from daming group by channel
76:SolrAggregate(group=[{0}], EXPR$1=[COUNT($1)])
  74:SolrAggregate(group=[{4, 7}])
    0:SolrTableScan(table=[[test03.loc:2181/dmsolr, daming]])



Thanks!


Best Regards
haochao.zhuang


------------------ 原始邮件 ------------------
发件人: "大明"<zteny@xxxxxxxxxxx>;
发送时间: 2018年11月13日(星期二) 晚上10:43
收件人: "dev"<dev@xxxxxxxxxxxxxxxxxx>;

主题: 回复: count(distinct col) can't recognize distinct







Dears


i catch it in the solr adaptor. it is implemented by solr as elasticsearch adaptor.


case a: correct
select a, count(distinct b) from tbl group by a;  


case b: incorrect
1. select a, count(distinct b) from tbl where c = 'c' group by a;

2. select a, count(distinct b), max(c)  from tbl group by a; 




attached solr's code


public void implement(Implementor implementor) {
    implementor.visitChild(0, getInput());


    final List<String> inNames = SolrRules.solrFieldNames(getInput().getRowType());
    for(Pair<AggregateCall, String> namedAggCall : getNamedAggCalls()) {


      AggregateCall aggCall = namedAggCall.getKey();


      Pair<String, String> metric = toSolrMetric(implementor, aggCall, inNames);
      implementor.addReverseAggMapping(namedAggCall.getValue(), metric.getKey().toLowerCase(Locale.ROOT)+"("+metric.getValue()+")");
      implementor.addMetricPair(namedAggCall.getValue(), metric.getKey(), metric.getValue());
    }
    ...
}


Best Regards
Haochao Zhuang






------------------ 原始邮件 ------------------
发件人: "Michael Mior"<mmior@xxxxxxxxxx>;
发送时间: 2018年11月13日(星期二) 晚上9:35
收件人: "dev"<dev@xxxxxxxxxxxxxxxxxx>;

主题: Re: count(distinct col) can't recognize distinct



Can you give a specific example of a query that doesn't do what you expect
it to do?

--
Michael Mior
mmior@xxxxxxxxxx


Le mar. 13 nov. 2018 à 07:56, 大明 <zteny@xxxxxxxxxxx> a écrit :

> Hi Guys:
>
>
> I got a question why it can't recognize DISTINCT tag on COUNT(DISTINCT
> col) aggregation function sometimes.
>
>
> but it is not problem when my sql query statement have WHEN clause or one
> more aggregation functions.
>
>
> Best Regards
> Haochao Zhuang