
Re: DELETE/SELECT with multi-column PK and IN


Ok now I REALLY got it :) 
Thanks Sylvain!

2017-02-09 11:42 GMT+01:00 Sylvain Lebresne <sylvain@xxxxxxxxxxxx>:
On Thu, Feb 9, 2017 at 10:52 AM, Benjamin Roth <benjamin.roth@xxxxxxxxx> wrote:
Ok got it.

But it's interesting that this is supported:
DELETE/SELECT FROM ks.cf WHERE (pk1) IN ((1), (2), (3));

This is technically mostly the same (Token awareness, coordination/routing, read performance, ...), right?

It is. That's what I meant by "there is something to be said for the consistency of the CQL language in general". In other words, don't look for an external logical reason for this being unsupported; it's unsupported simply because of how the CQL code evolved. But as I said, we didn't fix that inconsistency because we're all busy and it's not really that important in practice. The project of course welcomes any contributions though :)
 

2017-02-09 10:43 GMT+01:00 Sylvain Lebresne <sylvain@xxxxxxxxxxxx>:
This is a statement on multiple partitions, and there is really no optimization the code does internally for that. In fact, I strongly advise you not to use a batch but rather simply do a for loop client side and send the statements individually. That way, your driver will be able to use proper token-awareness for each request (whereas if you send a batch, one coordinator will be picked and will have to forward most of the statements, doing more network hops at the end of the day). The only case where using a batch is indeed legitimate is if you care about all the statements being atomic, but in that case it's a logged batch you want.

That's btw more or less why we never bothered implementing that: it's totally doable technically, but it's not really such a good idea performance-wise in practice most of the time, and you can easily work around it with a batch if you need atomicity.

Which is not to say it will never be or shouldn't be supported btw; there is something to be said for the consistency of the CQL language in general. But it's why no one has taken the time to do it so far.
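
[Editor's sketch] The client-side loop recommended above could look roughly like the following in Python. It assumes a `session` object that behaves like the DataStax Python driver's `Session` (`prepare`, and `execute_async` returning a future with `result()`), and hypothetical column names `pk1` and `pk2` for the two partition key columns. Whether each request is actually routed token-aware depends on the load balancing policy configured on the cluster, which is not shown here.

```python
from typing import Any, Iterable, List, Tuple

# Hypothetical table and column names, taken from the examples in this thread.
DELETE_CQL = "DELETE FROM ks.cf WHERE pk1 = ? AND pk2 = ?"


def delete_by_keys(session: Any, keys: Iterable[Tuple[int, int]]) -> List[Any]:
    """Send one prepared DELETE per partition-key tuple and wait for all of them."""
    # Prepare once and reuse the prepared statement for every key.
    prepared = session.prepare(DELETE_CQL)
    # Issue the requests asynchronously so they are in flight concurrently
    # instead of paying one full round trip after another.
    futures = [session.execute_async(prepared, key) for key in keys]
    # result() blocks until the request completes and raises if it failed.
    return [future.result() for future in futures]
```

With a connected session this would be called as `delete_by_keys(session, [(1, 2), (1, 3), (2, 3), (3, 4)])`. The statements are independent, so this gives no atomicity across keys; for that, a logged batch is the tool mentioned above.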

On Thu, Feb 9, 2017 at 10:36 AM, Benjamin Roth <benjamin.roth@xxxxxxxxx> wrote:
Yes, that's the workaround - I'll try that.

Would you agree it would be better for internal optimizations to process this within a single statement?

2017-02-09 10:32 GMT+01:00 Ben Slater <ben.slater@xxxxxxxxxxxxxxx>:
Yep, that makes it clear. I think an unlogged batch of prepared statements with one statement per PK tuple would be roughly equivalent? And probably no more complex to generate in the client?

On Thu, 9 Feb 2017 at 20:22 Benjamin Roth <benjamin.roth@xxxxxxxxx> wrote:
Maybe that makes it clear:

DELETE FROM ks.cf WHERE (partitionkey1, partitionkey2) IN ((1, 2), (1, 3), (2, 3), (3, 4));

I want to delete or select a bunch of records identified by their multi-partitionkey tuples.
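
[Editor's sketch] To illustrate why the tuple form is not the same as restricting each column with its own IN list, here is a small Python comparison. It only does set arithmetic on the key tuples from the DELETE above; nothing in it talks to Cassandra, and the variable names are made up for the example.

```python
from itertools import product

# The exact partition-key tuples that should be matched.
wanted = {(1, 2), (1, 3), (2, 3), (3, 4)}

# What per-column IN lists would have to contain to cover those tuples.
pk1_values = sorted({pk1 for pk1, _ in wanted})  # [1, 2, 3]
pk2_values = sorted({pk2 for _, pk2 in wanted})  # [2, 3, 4]

# Per-column IN lists match every combination of the two lists.
cross = set(product(pk1_values, pk2_values))

# Keys that the per-column form would match in addition to the wanted ones.
extra = sorted(cross - wanted)
print(len(cross), extra)
```

The cross product covers nine keys, five of which were never asked for, so the per-column form would touch partitions outside the intended set.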

2017-02-09 10:18 GMT+01:00 Ben Slater <ben.slater@xxxxxxxxxxxxxxx>:
Are you looking for this to be equivalent to (PK1=1 AND PK2=2), or are you looking for (PK1 IN (1,2) AND PK2 IN (1,2)), or something else?

Cheers
Ben

On Thu, 9 Feb 2017 at 20:09 Benjamin Roth <benjamin.roth@xxxxxxxxx> wrote:
Hi Guys,

CQL says this is not allowed:

DELETE FROM ks.cf WHERE (pk1, pk2) IN ((1, 2));


1. Is there a reason for it? There shouldn't be a performance penalty: it is a PK lookup, and the same thing works with a single PK column.
2. Is there a known workaround for it?

It would be a big help to have it for daily business. IMHO it's a waste of resources to run multiple queries just to fetch a bunch of records by PK.

Thanks in advance for any reply

--
Benjamin Roth
Prokurist

Jaumo GmbH · www.jaumo.com
Wehrstraße 46 · 73035 Göppingen · Germany
Phone +49 7161 304880-6 · Fax +49 7161 304880-1
AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
--
————————
Ben Slater
Chief Product Officer
Instaclustr: Cassandra + Spark - Managed | Consulting | Support


