
Re: Percentile calculations with Druid


Hi,
Thank you all for your responses. Is there a comparative analysis of the
possible solutions (approximate histograms, data sketches, etc.) that shows
which algorithm fits which use case? In one use case at my company,
aggregation runs over about 1 to 2 billion events (one week of data), and
calculating the 95th and 99th percentiles with approxHistogram gives highly
variable, unreliable results for the same query.
*Sample*:
"percentile95_total_page_load_time" : 3744.591,
VS
"percentile95_total_page_load_time" : 1.62524006E9,
when the number of buckets is kept at 2.
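
To make the kind of swing above concrete, here is a small toy sketch in
Python. It is NOT Druid's approxHistogram algorithm; it is a plain
equal-width histogram with linear interpolation, and the data, the outlier
value, and the function names are all made up for illustration. It only
shows one way a 2-bucket summary can report a 95th percentile far from the
exact one when a single extreme value stretches the range.

```python
import random


def exact_percentile(values, p):
    """Nearest-rank percentile on the fully sorted data (exact baseline)."""
    ordered = sorted(values)
    rank = (len(ordered) * p + 99) // 100  # integer ceil(n * p / 100)
    return ordered[max(rank, 1) - 1]


def histogram_percentile(values, p, buckets=2):
    """Percentile estimated from an equal-width histogram.

    Toy model only: assumes the values are not all identical and
    interpolates linearly inside the bucket that contains the target rank.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / buckets
    counts = [0] * buckets
    for v in values:
        idx = min(int((v - lo) / width), buckets - 1)
        counts[idx] += 1
    target = len(values) * p / 100
    seen = 0
    for i, count in enumerate(counts):
        if count and seen + count >= target:
            frac = (target - seen) / count
            return lo + (i + frac) * width
        seen += count
    return hi


random.seed(0)
# Hypothetical page load times in milliseconds.
load_times = [random.uniform(100, 4000) for _ in range(10000)]
# Same data plus one extreme value (for example, a bad timestamp-like event).
with_outlier = load_times + [1.6e9]

print("exact p95, clean data:      ", exact_percentile(load_times, 95))
print("exact p95, with outlier:    ", exact_percentile(with_outlier, 95))
print("2-bucket p95, clean data:   ", histogram_percentile(load_times, 95))
print("2-bucket p95, with outlier: ", histogram_percentile(with_outlier, 95))
```

In this toy model the exact percentile barely moves when the outlier is
added, while the 2-bucket estimate jumps by several orders of magnitude,
because almost all events fall into one very wide bucket.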


So I would like to understand whether data sketches (or any other solution)
are a better alternative, and in which scenarios.
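
For comparison, this is roughly what a query using the quantiles sketch
aggregator from the druid-datasketches extension might look like, written
as a small Python helper that builds the JSON body. The data source name,
the interval, and the input column name are placeholders (assumptions, not
our real schema), and k=128 is only a starting point, since a larger k
trades memory for accuracy.

```python
import json


def build_percentile_query(data_source, column, interval, fractions, k=128):
    """Build a Druid timeseries query that estimates quantiles of `column`.

    Uses the quantilesDoublesSketch aggregator plus one
    quantilesDoublesSketchToQuantile post-aggregator per requested fraction.
    Assumes the druid-datasketches extension is loaded on the cluster.
    """
    sketch_name = column + "_sketch"
    post_aggregations = [
        {
            "type": "quantilesDoublesSketchToQuantile",
            "name": "percentile%d_%s" % (round(f * 100), column),
            "field": {"type": "fieldAccess", "fieldName": sketch_name},
            "fraction": f,
        }
        for f in fractions
    ]
    return {
        "queryType": "timeseries",
        "dataSource": data_source,
        "granularity": "all",
        "intervals": [interval],
        "aggregations": [
            {
                "type": "quantilesDoublesSketch",
                "name": sketch_name,
                "fieldName": column,
                "k": k,  # must be a power of 2; larger is more accurate
            }
        ],
        "postAggregations": post_aggregations,
    }


# Placeholder names: "page_events" and the interval are hypothetical.
query = build_percentile_query(
    data_source="page_events",
    column="total_page_load_time",
    interval="2018-08-25/2018-09-01",
    fractions=[0.95, 0.99],
)
print(json.dumps(query, indent=2))
```

The printed JSON would be sent to the broker as the query body. I have not
yet verified the accuracy of this approach on our data.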

Thanks a lot


On Sat, Sep 1, 2018 at 3:22 AM Samarth Jain <samarth@xxxxxxxxxx> wrote:

> At my work, we have been using t-digest and yahoo quantile sketches for
> percentiles and building sketches. I am working on contributing the modules
> to the community.
>
> On Fri, Aug 31, 2018 at 2:50 PM eyal.yurman@xxxxxxxxx <eyal.yurman@xxxxxxxxx> wrote:
>
> > Hi,
> >
> > I would look at data sketches as potentially a better alternative, with
> > the recent release of "Numeric quantiles sketch aggregator" (As noted in
> > the release notes:
> > https://github.com/apache/incubator-druid/releases/druid-0.12.0).
> >
> > Unfortunately, the documentation wasn't ready for that release, so it is
> > not available on the public website yet. It is scheduled for release with
> > the next major milestone (0.13.0) and is available on the Druid master
> > branch:
> > https://github.com/apache/incubator-druid/blob/master/docs/content/development/extensions-core/datasketches-quantiles.md
> >
> >
> > On 2018/08/31 10:13:56, Abhishek Kaushik <akaushik079@xxxxxxxxx> wrote:
> > > Hi,
> > >
> > > I wanted to know which algorithm(s) Druid uses to compute percentile
> > > values. I came across this doc:
> > > http://druid.io/docs/latest/development/extensions-core/approximate-histograms.html
> > > but it mentions that there are "no formal error bounds on the
> > > approximation". So I just want to know what options are available if
> > > one wishes to compute percentiles.
> > >
> > > Thanks in advance.
> > >
> >