Re: Issue with segments not loading/taking a long time
I have a 6-node historical test cluster. 3 nodes are at ~80% and the other
two at ~60 and ~50% disk utilization.
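For anyone who wants to compare numbers, here is a rough sketch of how per-historical utilization can be computed from the coordinator's server listing. It assumes the coordinator serves `/druid/coordinator/v1/servers?simple` and that each entry carries `host`, `type`, `currSize` and `maxSize` fields, which is my recollection of the API rather than something confirmed in this thread. The base URL and the sample data are made up for illustration.

```python
import json
from urllib.request import urlopen


def utilization(servers):
    """Map each historical's host to its disk utilization in percent."""
    result = {}
    for s in servers:
        # Skip non-historical entries and servers that report no capacity.
        if s.get("type") != "historical" or not s.get("maxSize"):
            continue
        result[s["host"]] = 100.0 * s["currSize"] / s["maxSize"]
    return result


def fetch_servers(coordinator):
    # coordinator is a hypothetical base URL, e.g. "http://localhost:8081".
    # Endpoint path and response shape are assumptions; check them against
    # the docs for your Druid version.
    with urlopen(coordinator + "/druid/coordinator/v1/servers?simple") as resp:
        return json.load(resp)


# Made-up sample shaped like the assumed response (sizes in arbitrary units).
sample = [
    {"host": "h1:7103", "type": "historical", "currSize": 80, "maxSize": 100},
    {"host": "h2:7103", "type": "historical", "currSize": 50, "maxSize": 100},
]
usage = utilization(sample)
# Gap between the fullest and emptiest historical, in percentage points.
spread = max(usage.values()) - min(usage.values())
```

Against a live cluster, `utilization(fetch_servers(...))` would replace the sample list. A large `spread` is only a hint of imbalance, not proof that the balancer is misbehaving.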
The interesting thing is that the 6th node ended up hitting a zk timeout
(because of a large GC pause) and is no longer part of the cluster (which is
a separate issue I am trying to figure out).
On this 6th node, I see that it is busy loading segments. However, once it
is done downloading, I am not sure if it will report back to zk as being
On Thu, Jul 19, 2018 at 12:58 PM, Jihoon Son <ghoonson@xxxxxxxxx> wrote:
> Hi Samarth,
> have you had a chance to check the segment balancing status of your
> Do you see any significant imbalance between historicals?
> On Thu, Jul 19, 2018 at 12:28 PM Samarth Jain <samarth.jain@xxxxxxxxx>
> > I am working on upgrading our internal cluster to the 0.12.1 release and
> > that a few data sources fail to load. Looking at coordinator logs, I am
> > seeing messages like this for the datasource:
> > @400000005b50dbc637061cec 2018-07-19T18:43:08,923 INFO
> > [Coordinator-Exec--0] io.druid.server.coordinator.CuratorLoadQueuePeon -
> > Asking server peon[/druid-test--001/loadQueue/127.0.0.1:7103] to drop
> > segment[*datasource*
> > _2015-09-03T00:00:00.000Z_2015-09-04T00:00:00.000Z_2018-
> > @400000005b50dbc637391f84 2018-07-19T18:43:08,926 WARN
> > [Coordinator-Exec--0] io.druid.server.coordinator.rules.LoadRule - No
> > available [_default_tier] servers or node capacity to assign primary
> > segment[*datasource*-08-10T00:00:00.000Z_2015-08-11T00:00:
> > Expected Replicants
> > The datasource failed to load for a long time and then eventually was
> > loaded successfully. Has anyone else seen this? I see a few fixes around
> > segment loading and coordination in 0.12.2 (which I am hoping will be out
> > soon) but I am not sure if they address this issue.