Re: execution time and planning time from Calcite


Hi Stamatis,

Thank you so much for your help and I appreciate your prompt response.

I think I muddled our conversation by bringing in the TPC-H queries. On the
tracing part: I was able to create a timing tracer using the classes
*CalciteTimingTracer* and *CalciteTrace*, and with it I could determine the
execution time of a query.

For my research, I have to perform a benchmark comparison between different
database systems; in my case, *Calcite* and *PostgreSQL*. I thought the
TPC-H queries were the right place to start. I tried running TpchTest (
https://github.com/apache/calcite/blob/master/plus/src/test/java/org/apache/calcite/adapter/tpch/TpchTest.java)
with the *CalciteTimingTracer* added to the JUnit tests to determine the
execution time. The execution time in Calcite turned out to be
significantly higher than in PostgreSQL. On further investigation, I saw
that the tests generate the data required for these queries (around
150,000 rows for some tables), so I was under the impression that most of
the time was spent on data generation and that the query execution itself
could be faster. I therefore modified the relevant schema class (
https://github.com/apache/calcite/blob/master/plus/src/main/java/org/apache/calcite/adapter/tpch/TpchSchema.java)
to perform the data generation and the query execution separately, and then
traced the time taken by query execution alone. Even then, there was a
significant difference from PostgreSQL.
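
The separation described above (generate the data once, then time only the query step) can be sketched in plain Java as follows. This is a minimal illustration and not Calcite code: the class `PhaseTimer` and its methods are hypothetical names, the list of integers merely stands in for generated TPC-H rows, and the summing loop stands in for a query. The warm-up runs and the median are an added assumption about how to reduce JIT and cache noise, not something the thread prescribes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: times a setup phase and a measured phase separately. */
public class PhaseTimer {

    /** Runs the task once and returns the elapsed wall-clock time in nanoseconds. */
    public static long timeNanos(Runnable task) {
        long start = System.nanoTime();
        task.run();
        return System.nanoTime() - start;
    }

    /** Returns the median of the samples (upper median for even lengths). */
    public static long median(long[] samples) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[sorted.length / 2];
    }

    /** Runs discarded warm-up iterations, then returns the median of the measured runs. */
    public static long medianNanos(Runnable task, int warmups, int runs) {
        for (int i = 0; i < warmups; i++) {
            task.run(); // warm-up: result and timing are ignored
        }
        long[] samples = new long[runs];
        for (int i = 0; i < runs; i++) {
            samples[i] = timeNanos(task);
        }
        return median(samples);
    }

    public static void main(String[] args) {
        // Stand-in for data generation (assumption: 150,000 rows, as mentioned above).
        List<Integer> rows = new ArrayList<>();
        long generationNanos = timeNanos(() -> {
            for (int i = 0; i < 150_000; i++) {
                rows.add(i);
            }
        });

        // Stand-in for query execution over the already generated data.
        long[] sink = new long[1]; // keeps the result alive so the loop is not optimized away
        long queryNanos = medianNanos(() -> {
            long sum = 0;
            for (int r : rows) {
                sum += r;
            }
            sink[0] = sum;
        }, 5, 11);

        System.out.println("generation: " + generationNanos / 1_000_000.0 + " ms");
        System.out.println("query (median of 11 runs): " + queryNanos / 1_000_000.0 + " ms");
    }
}
```

Reporting the two numbers separately makes it clear whether a gap against PostgreSQL comes from data generation or from the query itself.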

So, basically, I would like to know whether it is a known issue that Calcite
takes such a long time to execute the TPC-H queries.

As a matter of personal preference, I would like to continue my research
with Calcite because of its simplicity and extensibility. But if I fail to
present a good case study in favour of Calcite, I am afraid that I could
lose the opportunity to work with you.

Thanks and Regards

Lekshmi B.G
Email: lekshmibg09@xxxxxxxxx




On Fri, Dec 28, 2018 at 8:28 AM Stamatis Zampetakis <zabetak@xxxxxxxxx>
wrote:

> Hi Lekshmi,
>
> Basically, I think you just need to add some log information inside Prepare
> <https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/prepare/Prepare.java>.
> Possibly it suffices to replace calls to timingTracer
> <https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/prepare/Prepare.java#L96>
> with calls to an actual logger.
> If you need more help I will try to provide a small patch.
>
> Best,
> Stamatis
>
> On Thu, Dec 27, 2018 at 1:09 PM Lekshmi <lekshmibg09@xxxxxxxxx> wrote:
>
> > Hi Stamatis,
> > I was working on getting the planning and execution times of the TPC-H
> > queries using Calcite's TPC-H test cases, but the times I measured were
> > significantly larger than PostgreSQL's. When I analysed the code, I
> > came to suppose that the measured time covers both generating the TPC-H
> > data and executing the queries on it. I would like to compare the
> > performance of Calcite with PostgreSQL on the TPC-H queries. For that, I
> > would like to get at least the planning time of the TPC-H queries on
> > Calcite. Can you please give me any pointers on how to get it?
> > Thanks and Regards
> >
> > Lekshmi B.G
> > Email: lekshmibg09@xxxxxxxxx
> >
> >
> >
> >
> > On Sat, Dec 1, 2018 at 10:28 AM Stamatis Zampetakis <zabetak@xxxxxxxxx>
> > wrote:
> >
> > > Hi Lekshmi,
> > >
> > > I don't think that you can obtain this information easily. The current
> > > implementation of explain does not seem to provide this information.
> > > If you are willing to dig a bit into the code, you might find the
> > > following entry points useful regarding the timing of queries:
> > > CalciteTimingTracer
> > > <https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/util/trace/CalciteTimingTracer.java>,
> > > CalciteTrace
> > > <https://github.com/apache/calcite/blob/439ca73b8a335213b5f2764514b14e17c9d3c216/core/src/main/java/org/apache/calcite/util/trace/CalciteTrace.java#L103>.
> > >
> > > Best,
> > > Stamatis
> > >
> > > On Fri, Nov 30, 2018 at 11:33 PM Lekshmi <lekshmibg09@xxxxxxxxx> wrote:
> > >
> > > > Hi,
> > > > I was comparing the performance of PostgreSQL and Calcite. I can
> > > > collect the planning and execution time from PostgreSQL using "explain
> > > > analyze", but from Calcite I didn't get all the information using
> > > > "explain all attributes". I would like to know how we can collect the
> > > > execution time and planning time of a query from Calcite. It would be
> > > > great if anyone could help me.
> > > >
> > > > Thanks and Regards
> > > >
> > > > Lekshmi B.G
> > > > Email: lekshmibg09@xxxxxxxxx
> > > >
> > >
> >
>