Which approach should we use for exposing metrics through Virtual tables?
I would like to start working on exposing the metrics through virtual
tables in CASSANDRA-14537.
We already had a long discussion in CASSANDRA-7622 about which schema to
use to expose the metrics. Unfortunately, in the end I was not truly
convinced by any solution (including my own).
I would like to lay out the possible solutions with their limitations and
advantages, to find out which solution people prefer or to see
if somebody can come up with another one.
In CASSANDRA-7622, Chris Lohfink proposed to expose the table metrics using
the following schema:
VIRTUAL TABLE table_stats (
PRIMARY KEY( keyspace_name, table_name , metric));
This approach has some advantages:
- It is easy to use for all the metric categories that we have (http://
- The number of columns is relatively small and fits in the cqlsh console.
The main disadvantage that I see with that approach is that it might not
always be very readable. A Gauge or a Counter metric will have data for only
one column and will return NULL for all the others. If you know precisely
which type each metric is and you only target one type of metric, you can
build your query in such a way that the output is nicely formatted.
Unfortunately, I do not expect every user to know which metric is what.
The output format can also be problematic for monitoring tools as they
might have to use some extra logic to determine how to process each metric.
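To make the NULL problem concrete, here is a minimal Python sketch of the
row-per-metric layout. The value columns (`value`, `count`, `p99`) and the
metric names are hypothetical placeholders, not the columns proposed in
CASSANDRA-7622. The only point is that each metric type fills a different
subset of the columns, so a client has to know the type to read the row.

```python
# Hypothetical value columns; the real proposal may use different ones.
VALUE_COLUMNS = ("value", "count", "p99")


def make_row(keyspace, table, metric, **values):
    """Build one row of a row-per-metric table; unset columns stay None (NULL)."""
    row = {"keyspace_name": keyspace, "table_name": table, "metric": metric}
    for column in VALUE_COLUMNS:
        row[column] = values.get(column)
    return row


def null_columns(row):
    """Return the value columns that are NULL for this row."""
    return [column for column in VALUE_COLUMNS if row[column] is None]


rows = [
    # A gauge only has a current value.
    make_row("ks", "t", "example_gauge", value=42),
    # A counter only has a count.
    make_row("ks", "t", "example_counter", count=7),
    # A histogram has a count and percentiles, but no single value.
    make_row("ks", "t", "example_histogram", count=100, p99=1.5),
]

for row in rows:
    print(row["metric"], "NULL columns:", null_columns(row))
```

A monitoring tool reading such rows would need a per-type branch like
`null_columns` above before it could decide which column to chart.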
My preferred approach was to use metrics as columns. For example, for the
threadpool metrics it would give the following schema:
VIRTUAL TABLE threadpool_metrics (
PRIMARY KEY( pool_name )
That approach provides an output similar to that of nodetool
tpstats, which is, in my opinion, more readable than the previous one.
Unfortunately, it also has several serious drawbacks:
- It works for small sets of metrics but does not work well for the
table or keyspace metrics, where we have more than 63 metrics. If you
split the histograms, meters and timers into multiple columns you easily
reach more than a hundred columns. As Chris pointed out in CASSANDRA-7622,
it makes the whole thing unusable.
- It also does not work properly for sets of metrics like the commit log
metrics, because there is no natural primary key and you would have to
somehow create a fake one.
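The column blow-up in the first drawback can be estimated with a short
sketch. The per-type column counts and the split of 63 metrics across types
below are made-up numbers for illustration only, not the actual C* table
metrics. They just show how quickly a wide table grows once histograms,
meters and timers are split into several columns each.

```python
# Hypothetical number of columns needed to expose one metric of each type.
COLUMNS_PER_TYPE = {
    "gauge": 1,
    "counter": 1,
    "meter": 4,
    "histogram": 6,
    "timer": 8,
}


def total_columns(metric_counts):
    """Sum the columns needed when every metric becomes one or more columns."""
    return sum(COLUMNS_PER_TYPE[kind] * n for kind, n in metric_counts.items())


# Hypothetical split of 63 metrics across the metric types.
example = {"gauge": 30, "counter": 13, "meter": 5, "histogram": 10, "timer": 5}

print("metrics:", sum(example.values()))
print("columns:", total_columns(example))
```

With these assumed numbers the table ends up well above a hundred columns,
which is the situation described in the list above.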
Nodetool solved the table and keyspace metric problems by splitting them
into subsets (e.g. tablestats, tablehistograms). We could take a similar
approach: group the metrics into meaningful sub-groups and expose them using
the second approach.
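A minimal sketch of that grouping idea, in Python: metric-per-row samples
are partitioned into named sub-groups, and each sub-group is pivoted into one
wide row per table. The group names and metric names are hypothetical and
only stand in for whatever grouping we would agree on.

```python
from collections import defaultdict

# Hypothetical assignment of metrics to sub-groups (one virtual table each).
GROUPS = {
    "read_latency": "table_latencies",
    "write_latency": "table_latencies",
    "live_sstables": "table_storage",
    "disk_space_used": "table_storage",
}


def pivot_by_group(samples):
    """Pivot (keyspace, table, metric, value) samples into wide rows.

    Returns {group: {(keyspace, table): {metric: value}}}.
    """
    tables = defaultdict(lambda: defaultdict(dict))
    for keyspace, table, metric, value in samples:
        group = GROUPS[metric]
        tables[group][(keyspace, table)][metric] = value
    return tables


samples = [
    ("ks", "t", "read_latency", 1.2),
    ("ks", "t", "write_latency", 0.4),
    ("ks", "t", "live_sstables", 3),
    ("ks", "t", "disk_space_used", 1024),
]

result = pivot_by_group(samples)
for group, rows in result.items():
    for key, columns in rows.items():
        print(group, key, columns)
```

Each sub-group keeps a natural primary key (keyspace and table here) and a
small number of columns, at the cost of having to choose the groups up front.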
I tried to put myself in the shoes of a user who has limited knowledge
of the C* metrics, but at the end of the day I am certainly not the best
person to figure out what the best solution is here. So I would like to
have your feedback on that problem.
Chris, if I was wrong on some part or forgot some stuff, feel free to correct me.