We track each metric by the value of
MetricGroup::getMetricIdentifier, which returns the
fully qualified metric name. The query we use to
monitor metrics filters for metric IDs that
match '%Status.JVM.Memory%'. As long as the new metrics
come online via the MetricReporter interface, I think
the chart would be continuous; we would just see the old
JVM memory metrics cycle into new ones.
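
To make the filter concrete, here is a minimal sketch of what the '%Status.JVM.Memory%' pattern amounts to: a plain substring match on the fully qualified identifier. The class name, method names, and the sample identifiers (including the task manager IDs tm-old and tm-new) are made up for illustration; the identifiers only assume a layout of the form <host>.taskmanager.<tm_id>.<metric name>, and this is not our actual monitoring query.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class JvmMemoryFilter {

    // Equivalent of the SQL-style pattern '%Status.JVM.Memory%':
    // the identifier matches if it contains the substring anywhere.
    static boolean matchesJvmMemory(String metricIdentifier) {
        return metricIdentifier.contains("Status.JVM.Memory");
    }

    // Keep only the identifiers that the pattern would select.
    static List<String> filter(List<String> metricIdentifiers) {
        return metricIdentifiers.stream()
                .filter(JvmMemoryFilter::matchesJvmMemory)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical identifiers before and after a restart.
        List<String> ids = Arrays.asList(
                "host1.taskmanager.tm-old.Status.JVM.Memory.Heap.Used",
                "host1.taskmanager.tm-new.Status.JVM.Memory.Heap.Used",
                "host1.taskmanager.tm-new.Status.JVM.CPU.Load");
        filter(ids).forEach(System.out::println);
    }
}
```

Both the old and the new heap metric pass the filter, which is why I would expect the chart to keep going as long as the new identifiers are actually reported.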
On Wed, May 30, 2018 at 5:30 PM, Ajay Tripathy <ajayt@xxxxxxxx> wrote:
How are your metrics
dimensionalized/named? Task managers often
have UIDs generated for them. The task id
dimension will change on restart. If you name
your metric based on this 'task_id' there
would be a discontinuity with the old metric.
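
As a rough illustration of the discontinuity, the sketch below shows two identifiers that differ only in the generated task manager ID, and one way to derive a series key that survives a restart by replacing that ID with a wildcard. Everything here is hypothetical: the class and method names, the sample IDs, and the assumption that identifiers look like <host>.taskmanager.<tm_id>.<metric name> with the ID as the component right after "taskmanager".

```java
public class StableMetricKey {

    // Replace the task manager ID component with "*" so that the
    // same logical metric maps to the same key across restarts.
    // Identifiers that do not follow the assumed layout are returned unchanged.
    static String stableKey(String metricIdentifier) {
        String marker = ".taskmanager.";
        int markerPos = metricIdentifier.indexOf(marker);
        if (markerPos < 0) {
            return metricIdentifier;
        }
        int idStart = markerPos + marker.length();
        int idEnd = metricIdentifier.indexOf('.', idStart);
        if (idEnd < 0) {
            return metricIdentifier;
        }
        return metricIdentifier.substring(0, idStart) + "*"
                + metricIdentifier.substring(idEnd);
    }

    public static void main(String[] args) {
        String before = "host1.taskmanager.tm-old.Status.JVM.Memory.Heap.Used";
        String after = "host1.taskmanager.tm-new.Status.JVM.Memory.Heap.Used";

        // The raw names differ, so a chart keyed on them starts a new series.
        System.out.println(before.equals(after));
        // The derived keys are equal, so the series would line up.
        System.out.println(stableKey(before).equals(stableKey(after)));
    }
}
```

Whether a wildcard key like this is appropriate depends on how the monitoring backend groups series; the point is only that a name containing a generated ID changes on every restart.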
We are seeing our task manager JVM
metrics disappear over time. This
last time we correlated it to our
job crashing and restarting. I
wasn't able to grab the failing
exception to share. Any thoughts?
We track metrics through the
MetricReporter interface. As far
as I can tell this more or less
only affects the JVM metrics. I.e.
most / all other metrics continue
reporting fine as the job is