Question Regarding The Benchmark of Calcite Compared To Conventional Database System (Related to CALCITE-2169)


Hello Folks,

For my research, I am trying to benchmark Calcite against other database
systems. As an initial step, I am comparing *Calcite* and *PostgreSQL*, and
the TPC-H queries seemed like the right place to start. I ran the TpchTest (
https://github.com/apache/calcite/blob/master/plus/src/test/java/org/apache/calcite/adapter/tpch/TpchTest.java)
with a *CalciteTimingTracer* added to the JUnit tests to measure the
execution time. The execution time in Calcite turned out to be
significantly higher than in PostgreSQL. On further investigation, I saw
that the test generates the data required for these queries (around
150,000 rows for some tables), so I assumed that most of the time was
spent on data generation and that the query execution itself could be
faster. So, I modified the relevant schema class (
https://github.com/apache/calcite/blob/master/plus/src/main/java/org/apache/calcite/adapter/tpch/TpchSchema.java)
to perform the data generation and the query execution separately, and
then measured the time taken for query execution alone. Even then, there
was a significant difference from PostgreSQL.

I also set *log4j.rootLogger* to *TRACE* to find the time spent in the
sql2rel and optimization phases of the class Prepare
<
https://github.com/apache/calcite/blob/master/core/src/main/java/org/apache/calcite/prepare/Prepare.java>.
To my surprise, Calcite took 355ms for sql2rel and 352ms for optimization
in the JUnit test *testQuery01*. By contrast, the same query had a
planning time of 0.163ms in Postgres.
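One thing that may matter for numbers like these: a single JUnit run times the first (cold) execution on the JVM, which includes class loading and JIT compilation, while the Postgres figure comes from an already running server. Below is a minimal sketch of timing a phase after some warm-up iterations and reporting the median. The class name `PhaseTimer` and the sorting workload are hypothetical stand-ins, not part of Calcite; in an actual test the `Runnable` would wrap the phase being measured (for example, preparing the statement over the JDBC connection).

```java
import java.util.Arrays;

/** Hypothetical timing helper; not part of Calcite. */
public class PhaseTimer {

  /**
   * Runs the task `warmups` times without timing it, then `runs` times
   * with timing. Returns the elapsed times in nanoseconds, sorted ascending.
   */
  public static long[] measure(Runnable task, int warmups, int runs) {
    for (int i = 0; i < warmups; i++) {
      task.run();
    }
    long[] elapsed = new long[runs];
    for (int i = 0; i < runs; i++) {
      long start = System.nanoTime();
      task.run();
      elapsed[i] = System.nanoTime() - start;
    }
    Arrays.sort(elapsed);
    return elapsed;
  }

  /** Middle element of an already sorted, non-empty array. */
  public static long median(long[] sorted) {
    return sorted[sorted.length / 2];
  }

  public static void main(String[] args) {
    // Stand-in workload (assumption): replace with the phase under test.
    Runnable task = () -> {
      int[] data = new int[200_000];
      for (int i = 0; i < data.length; i++) {
        data[i] = data.length - i;
      }
      Arrays.sort(data);
      if (data[0] != 1) {
        throw new IllegalStateException("unexpected sort result");
      }
    };

    // First run on a fresh JVM, timed on its own.
    long coldStart = System.nanoTime();
    task.run();
    long cold = System.nanoTime() - coldStart;

    // Later runs, after warm-up iterations.
    long[] warm = measure(task, 20, 11);

    System.out.printf("cold: %.3f ms, warm median: %.3f ms%n",
        cold / 1e6, median(warm) / 1e6);
  }
}
```

Reporting the cold and the warm figures side by side should at least show how much of the 355ms and 352ms is one-time startup cost rather than steady-state planning time.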

I would like to know whether this is the right way to test the performance
of TPC-H queries using Apache Calcite. Can anyone let me know if there is a
better way to do it?

Also, while searching through JIRA, I found a ticket,
https://issues.apache.org/jira/browse/CALCITE-2169, which was created by
Edmon Begoli for performing a comparative performance study of the Calcite
framework. I think it is related to my current problem, but I have no idea
about the status of the ticket. It would be really great if someone
could help me with some information on it.

On a personal note, I would like to continue my research with Calcite
because of its simplicity and extensibility. But if I fail to present a
good case study in favour of Calcite, I am afraid that I could lose the
opportunity to work with it.

Thanks and Regards

Lekshmi B.G
Email: lekshmibg09@xxxxxxxxx