
[GitHub] erankor edited a comment on issue #5979: Kafka Indexing Service lagging every hour

URL: https://github.com/apache/incubator-druid/issues/5979#issuecomment-404050959
   @jihoonson, a couple of additional questions about the MySQL tables:
   1. Following this issue, we started tracking slow queries on the MySQL DB, and I'm seeing queries like the following repeat often:
   ``SELECT payload FROM druid_segments WHERE used = true AND dataSource = 'player-events-historical' AND ((start <= '2018-07-11T04:00:00.000Z' AND `end` >= '2018-07-11T03:00:00.000Z'));``
     a. Is it possible to delete rows with `used = false` from this table? I saw there is already an index on `used`, but keeping the table small would probably help. In our case, only ~1% of the rows have `used = true`; I'm guessing that's because KIS tasks create partitioned segments that we later merge.
     b. I'm thinking about replacing the existing (used) index with an index on (used, end), since these queries always seem to have end >= (some recent timestamp). Does that make sense?
   2. Going over the tables one by one, I saw that druid_tasks and druid_tasklogs also accumulate rows. Does it make sense to clean those up? From what I saw, there is no date column on druid_tasklogs, but I can get a list of old tasks from druid_tasks and then delete both the tasks and their logs by id.
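   To make the questions above concrete, here is roughly what I have in mind, as an untested MySQL sketch and not something I know to be safe. The index name `idx_druid_segments_used_end` and the cutoff date are made up for illustration. The `task_id` column on druid_tasklogs and the `created_date` and `active` columns on druid_tasks are my assumptions about the schema and would need to be checked first.

```sql
-- Untested sketch (MySQL). Assumptions, not verified against the real schema:
--   * druid_tasklogs.task_id references druid_tasks.id
--   * druid_tasks has created_date (ISO 8601 string) and active columns
--   * the index name and the cutoff date are illustrative only

-- 1a. Remove segment rows that are no longer used
--     (only if it turns out nothing else still needs these rows).
DELETE FROM druid_segments
WHERE used = false;

-- 1b. Composite index matching the filters of the slow query.
CREATE INDEX idx_druid_segments_used_end
    ON druid_segments (used, `end`);
-- The existing single-column index on used could be dropped afterwards;
-- SHOW INDEX FROM druid_segments; lists its actual name.

-- 2. Delete the logs of old, inactive tasks first, then the tasks themselves,
--    so the join can still find the task rows.
DELETE l
FROM druid_tasklogs AS l
JOIN druid_tasks AS t ON t.id = l.task_id
WHERE t.active = false
  AND t.created_date < '2018-06-01T00:00:00.000Z';

DELETE FROM druid_tasks
WHERE active = false
  AND created_date < '2018-06-01T00:00:00.000Z';
```

   The log rows are deleted before the task rows because the task list in druid_tasks is the only place I can get the age of a task from.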
   Thank you!

