
Re: AWS r5.xlarge vs i3.xlarge

Thanks for the feedback.

By "small" I mean that I currently have 6 m1.xlarge instances running Cassandra 3.0.17. The total amount of data is around 1.5TB, spread across a couple of keyspaces with RF:3.

Over time a few things happened or became clear, including:
  • The amount of ingested data increased.
  • m1.xlarge instances are somewhat outdated. We noticed that one of them is underperforming compared to the others, networking is not always stable or reliable, and so on.
  • Upgrading from 3.0.6 to 3.0.17 emphasized the need for better hardware even more (in my opinion).
Given this, I believe that i3/r5d are already a much better option than what we have, at a comparable price.

About EBS: yes, I am aware that its performance is related to its size (and type). That is why I was looking into a 600/900GB drive, which is already a much better option than our RAID0 of spinning disks. Both i3 and r5d are EBS-optimized.
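To make the size dependence concrete, here is a rough sketch of the gp2 baseline IOPS calculation. It assumes the gp2 figures as I understand them from the AWS documentation (3 IOPS per GiB, a floor of 100 and a cap of 16,000); the constant and function names are made up for illustration, so please check the current AWS docs before relying on the numbers.

```python
# Assumed gp2 figures (verify against current AWS documentation).
GP2_IOPS_PER_GIB = 3      # baseline IOPS granted per provisioned GiB
GP2_MIN_IOPS = 100        # floor for very small volumes
GP2_MAX_IOPS = 16_000     # cap for very large volumes


def gp2_baseline_iops(size_gib: int) -> int:
    """Return the baseline IOPS of a gp2 volume of the given size."""
    scaled = GP2_IOPS_PER_GIB * size_gib
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, scaled))


if __name__ == "__main__":
    # The two volume sizes considered in this thread.
    for size in (600, 900):
        print(f"{size} GiB gp2 -> {gp2_baseline_iops(size)} baseline IOPS")
```

Under these assumptions a 600GB volume gets a baseline of 1800 IOPS and a 900GB volume gets 2700 IOPS. Volumes under 1000 GiB can also burst above their baseline for a limited time, which this sketch ignores.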


On Mon, Dec 10, 2018 at 2:38 PM Oleksandr Shulgin <oleksandr.shulgin@xxxxxxxxxx> wrote:
On Mon, Dec 10, 2018 at 12:20 PM Riccardo Ferrari <ferrarir@xxxxxxxxx> wrote:
I am wondering what instance type is best for a small cassandra cluster on AWS.

Define "small" :-D
Actually I'd like to compare, or have your opinion about the following instances:
  • r5d.xlarge (4vCPU, 19ecu, 32GB ram and 1 NVMe instance store 150GB)
    • Need to attach a 600/900GB EBS volume
  • i3.xlarge (4vCPU, 13ecu, 30.5GB ram and 950GB NVMe instance store)
Both have up to 10Gb networking.
I see AWS markets i3 as the NoSQL DB instances; nevertheless, r5d seems a bit better CPU-wise. With a decently sized gp2 EBS volume I should have enough IOPS, especially since we plan to put the commitlog and such on the 150GB NVMe storage.
About the workload: mostly TWCS inserts and upserts on LCS.

So there are a number of trade-offs:

1. With EBS you have more flexibility when it comes to scaling compute power: you don't have to rebuild data directory from scratch.  At the same time, EBS performance can be limited by the volume itself (it depends on volume type *and* size), and it can also be limited by instance type.  You might not be able to reach max throughput of a big volume with a small instance attached.

2. I didn't try to run Cassandra with i2 or i3 instances.  These are optimized for a lot of random IO, though with Cassandra what you should be seeing is mostly sequential IO, so I'm not sure you're going to utilize the NVMes fully.  Some AWS features, like auto-recovery, only work with instances using EBS-backed storage exclusively.
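To illustrate point 1: the throughput you actually get is bounded by whichever of the two limits is lower, the volume's or the instance's. The following is only a sketch; the function names are hypothetical and the numbers in the example are placeholders, not real limits for any particular volume or instance type.

```python
def effective_throughput(volume_limit: float, instance_limit: float) -> float:
    """Achievable EBS throughput is capped by the lower of the two limits."""
    return min(volume_limit, instance_limit)


def bottleneck(volume_limit: float, instance_limit: float) -> str:
    """Name the component that caps throughput."""
    if volume_limit < instance_limit:
        return "volume"
    if instance_limit < volume_limit:
        return "instance"
    return "both"


if __name__ == "__main__":
    # Placeholder values in MiB/s, chosen only to show the effect:
    # a big volume attached to a small instance.
    volume_mib_s, instance_mib_s = 250.0, 100.0
    print(effective_throughput(volume_mib_s, instance_mib_s),
          bottleneck(volume_mib_s, instance_mib_s))
```

With these placeholder values the instance is the bottleneck, so growing the volume further would not help until the instance type is changed.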