Re: Question about sketches aggregation in druid
This is also my understanding. But even in a single-writer-single-reader scenario, removing the lock can increase the throughput of accesses to the object.
If the union is only used to produce the result at query time, then removing the lock would not affect ingestion throughput, but could decrease query latency. However, I don't understand why the union object is read before the result is ready.
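To make the single-writer-single-reader point concrete, here is a minimal sketch of how such an object can avoid a lock altogether. It is a toy that counts distinct values exactly, not the theta sketch or any sketches-core API; the class name SingleWriterDistinctCount and its methods are hypothetical. It only works under the assumption that exactly one thread ever calls update().

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical toy "union" that counts distinct values exactly.
// Assumption: exactly one writer thread calls update(); any thread may call getEstimate().
final class SingleWriterDistinctCount {
    // Touched only by the single writer thread, so it needs no lock.
    private final Set<Long> seen = new HashSet<>();

    // Written by the writer, read by the reader; volatile makes each write visible.
    private volatile long published = 0;

    // Writer thread only: update private state, then publish the new count.
    void update(long value) {
        if (seen.add(value)) {
            published = seen.size();
        }
    }

    // Reader thread: no lock is taken; returns the most recently published count.
    long getEstimate() {
        return published;
    }
}
```

The reader may see a slightly stale count, but it never blocks the writer and the writer never blocks it, which is where the throughput gain over a lock on every access would come from.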
On Tuesday, July 10, 2018, 8:13:36 PM GMT+3, Gian Merlino <gian@xxxxxxxxxx> wrote:
To my knowledge, in the Druid Aggregator and BufferAggregator interfaces,
the main place where concurrency happens is that "aggregate" and "get" may
be called simultaneously during realtime ingestion. So if there would be a
benefit from improving concurrency it would probably end up in that area.
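As a rough illustration of the pattern described above, the following sketch runs an ingestion thread that calls aggregate while a query thread calls get on the same object. LockedSumAggregator and RealtimeDemo are hypothetical stand-ins, not Druid's Aggregator or BufferAggregator interfaces, and the synchronized methods are just one simple way to make the two calls safe.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for an aggregator; not Druid's actual interface.
final class LockedSumAggregator {
    private long sum = 0;

    // Called by the ingestion thread for every incoming row.
    synchronized void aggregate(long value) {
        sum += value;
    }

    // Called by a query thread, possibly while aggregate() is in progress.
    synchronized long get() {
        return sum;
    }
}

final class RealtimeDemo {
    // Runs one ingestion thread and one query thread concurrently,
    // then returns the final aggregated value.
    static long run(int rows) {
        LockedSumAggregator agg = new LockedSumAggregator();
        AtomicBoolean done = new AtomicBoolean(false);

        Thread ingest = new Thread(() -> {
            for (int i = 0; i < rows; i++) {
                agg.aggregate(1);
            }
            done.set(true);
        });
        Thread query = new Thread(() -> {
            // Poll intermediate results until ingestion has finished.
            while (!done.get()) {
                agg.get();
            }
        });

        ingest.start();
        query.start();
        try {
            ingest.join();
            query.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new RuntimeException(e);
        }
        return agg.get();
    }
}
```

Every call on either side acquires the same monitor here, so this is the contention point that a concurrent implementation would aim to relieve.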
On Tue, Jul 10, 2018 at 2:10 AM Eshcar Hillel <firstname.lastname@example.org> wrote:
> Hi All,
> My name is Eshcar Hillel from Oath research. I'm currently working with
> Lee Rhodes on committing a new concurrent implementation of the theta
> sketch to the sketches-core library. I was wondering whether this
> implementation can help boost the union operation that is applied to
> multiple sketches at query time in Druid. From what I see in the code, the
> sketch aggregator uses the SynchronizedUnion implementation, which
> basically uses a lock at every single access (update/read) of the union
> operation. We believe a thread-safe implementation of the union operation
> can help decrease the inherent overhead of the lock.
> I will be happy to join the meeting today and briefly discuss this option.