osdir.com

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: Built in trigger: double-write for app migration


Could be done with CDC
Could be done with triggers
(Could be done with vtables — double writes or double reads — if they were extended to be user facing)

Would be very hard to generalize properly, especially handling failure cases (write succeeds in one cluster/table but not the other) which are often app specific


-- 
Jeff Jirsa


> On Oct 18, 2018, at 6:47 PM, Jonathan Ellis <jbellis@xxxxxxxxx> wrote:
> 
> Isn't this what CDC was designed for?
> 
> https://issues.apache.org/jira/browse/CASSANDRA-8844
> 
> On Thu, Oct 18, 2018 at 10:54 AM Carl Mueller
> <carl.mueller@xxxxxxxxxxxxxxx.invalid> wrote:
> 
>> tl;dr: a generic trigger on TABLES that will mirror all writes to
>> facilitate data migrations between clusters or systems. What is necessary
>> to ensure full write mirroring/coherency?
>> 
>> When cassandra clusters have several "apps" aka keyspaces serving
>> applications colocated on them, but the app/keyspace bandwidth and size
>> demands begin impacting other keyspaces/apps, then one strategy is to
>> migrate the keyspace to its own dedicated cluster.
>> 
>> With backups/sstableloading, this will entail a delay and therefore a
>> "coherency" shortfall between the clusters. So typically one would employ a
>> "double write, read once":
>> 
>> - all updates are mirrored to both clusters
>> - writes come from the current most coherent.
>> 
>> Often two sstable loads are done:
>> 
>> 1) first load
>> 2) turn on double writes/write mirroring
>> 3) a second load is done to finalize coherency
>> 4) switch the app to point to the new cluster now that it is coherent
>> 
>> The double writes and read is the sticking point. We could do it at the app
>> layer, but if the app wasn't written with that, it is a lot of testing and
>> customization specific to the framework.
>> 
>> We could theoretically do some sort of proxying of the java-driver somehow,
>> but all the async structures and complex interfaces/apis would be difficult
>> to proxy. Maybe there is a lower level in the java-driver that is possible.
>> This also would only apply to the java-driver, and not
>> python/go/javascript/other drivers.
>> 
>> Finally, I suppose we could do a trigger on the tables. It would be really
>> nice if we could add to the cassandra toolbox the basics of a write
>> mirroring trigger that could be activated "fairly easily"... now I know
>> there are the complexities of inter-cluster access, and if we are even
>> using cassandra as the target mirror system (for example there is an
>> article on triggers write-mirroring to kafka:
>> https://dzone.com/articles/cassandra-to-kafka-data-pipeline-part-1).
>> 
>> And this starts to get into the complexities of hinted handoff as well. But
>> fundamentally this seems something that would be a very nice feature
>> (especially when you NEED it) to have in the core of cassandra.
>> 
>> Finally, is the mutation hook in triggers sufficient to track all incoming
>> mutations (outside of "shudder" other triggers generating data)
>> 
> 
> 
> -- 
> Jonathan Ellis
> co-founder, http://www.datastax.com
> @spyced

---------------------------------------------------------------------
To unsubscribe, e-mail: dev-unsubscribe@xxxxxxxxxxxxxxxxxxxx
For additional commands, e-mail: dev-help@xxxxxxxxxxxxxxxxxxxx