Questions about TWCS

In Jeff Jirsa's C* 2016 Summit presentation, "TimeWindowCompactionStrategy for Time Series Workloads", there is a slide about optimizations. It says to align partition keys to your TWCS windows. Is it generally the case that calendar/date-based partitions align nicely with TWCS windows, such that we end up with one SSTable per partition after the major compaction runs?

http://thelastpickle.com/blog/2016/12/08/TWCS-part1.html says to aim for < 50 buckets per table based on TTL. Are there any recommendations on a range to stay within for the number of buckets? What are some of the tradeoffs of a smaller vs. larger number of buckets? For example, I know that a smaller number of buckets means more SSTables to compact during the major compaction that runs when we get past a given window.
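
To make the bucket arithmetic concrete, here is a rough sketch of how I am counting buckets: the table's TTL divided by the window size, rounded up. The function name and the example TTL and window values are my own illustrations, not taken from the blog post or from Cassandra.

```python
from datetime import timedelta


def bucket_count(ttl: timedelta, window: timedelta) -> int:
    """Approximate number of live TWCS windows (buckets) for a table.

    Assumption: every row is written with the same TTL, so the number of
    buckets holding live data is roughly TTL / window size, rounded up.
    """
    ttl_s = int(ttl.total_seconds())
    window_s = int(window.total_seconds())
    if window_s <= 0:
        raise ValueError("window must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-ttl_s // window_s)


# Hypothetical table with a 30-day TTL:
daily = bucket_count(timedelta(days=30), timedelta(days=1))    # 30 buckets
hourly = bucket_count(timedelta(days=30), timedelta(hours=1))  # 720 buckets
```

With these made-up numbers, a one-day window stays under the suggested 50 buckets, while a one-hour window does not.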

Are tombstone compactions disabled by default?

Can you ever wind up in a situation where the major compaction that is supposed to run at the end of a window does not run? I am not sure if this is realistic, but consider this scenario: suppose compaction falls behind such that there are 5 windows for which the major compactions have not run. Will TWCS run the major compactions for those windows serially, oldest to newest?

With respect to TWCS, would a write be considered out of order when it arrives after its window has already finished? If I am using a window size of one day, it is currently 02:00 AM Tuesday, and I receive a write for 11:45 PM Monday, should I consider that out of order?
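
To pin down what I mean by "out of order", here is a minimal sketch of the scenario. It assumes windows are aligned to the Unix epoch in UTC and that a write counts as out of order when its timestamp falls in a window earlier than the current one. Both are my working assumptions, and the helper names and dates are hypothetical.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def window_start(ts: datetime, window: timedelta) -> datetime:
    """Round ts down to the start of its window.

    Assumption: windows are aligned to the Unix epoch in UTC.
    """
    elapsed = ts - EPOCH
    return EPOCH + (elapsed // window) * window


def is_out_of_order(write_ts: datetime, now: datetime, window: timedelta) -> bool:
    """True if the write's timestamp belongs to a window before the current one."""
    return window_start(write_ts, window) < window_start(now, window)


one_day = timedelta(days=1)
now = datetime(2017, 4, 4, 2, 0, tzinfo=timezone.utc)        # Tuesday 02:00 AM
write_ts = datetime(2017, 4, 3, 23, 45, tzinfo=timezone.utc)  # Monday 11:45 PM

late = is_out_of_order(write_ts, now, one_day)
```

Under this definition `late` is `True`, because the Monday write lands in the window that closed at midnight.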


- John