I am wondering which instance type is best for a small Cassandra cluster on AWS.
Define "small" :-D
Actually, I'd like to compare the following instances, or get your opinion on them:
- r5d.xlarge (4 vCPU, 19 ECU, 32GB RAM and one 150GB NVMe instance store)
- Need to attach a 600/900GB EBS volume
- i3.xlarge (4 vCPU, 13 ECU, 30.5GB RAM and one 950GB NVMe instance store)
Both have up to 10Gb networking.
I see AWS markets i3 as the instance family for NoSQL databases; nevertheless, r5d seems a bit better CPU-wise. With a decently sized gp2 EBS volume I should have enough IOPS, especially since we plan to put the commitlog and such on the 150GB NVMe storage.
About the workload: mostly inserts into TWCS tables and upserts into LCS tables.
So there are a number of trade-offs:
1. With EBS you have more flexibility when it comes to scaling compute power: you don't have to rebuild the data directory from scratch. At the same time, EBS performance can be limited by the volume itself (it depends on volume type *and* size), and it can also be limited by the instance type. You might not be able to reach the maximum throughput of a big volume with a small instance attached.
2. I haven't tried running Cassandra on i2 or i3 instances. These are optimized for a lot of random IO, though with Cassandra what you should be seeing is mostly sequential IO, so I'm not sure you're going to utilize the NVMe drives fully. Also, some AWS features, like auto-recovery, only work with instances that use EBS-backed storage exclusively.
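
To put rough numbers on point 1, here is a minimal sketch of how gp2 baseline IOPS scale with volume size. It assumes the gp2 rule as I remember it from the AWS docs (3 IOPS per GiB, with a floor of 100 and a cap of 16,000), so please check the current EBS documentation before relying on it. The function names and the `instance_iops_limit` parameter are mine, made up for illustration; the real per-instance EBS limit has to be looked up for your instance type.

```python
# Assumed gp2 constants (verify against the current AWS EBS documentation).
GP2_IOPS_PER_GIB = 3
GP2_MIN_IOPS = 100
GP2_MAX_IOPS = 16_000


def gp2_baseline_iops(size_gib: int) -> int:
    """Baseline IOPS of a gp2 volume under the assumed 3 IOPS/GiB rule."""
    return max(GP2_MIN_IOPS, min(GP2_MAX_IOPS, GP2_IOPS_PER_GIB * size_gib))


def effective_iops(size_gib: int, instance_iops_limit: int) -> int:
    """IOPS you can actually expect: the lower of the volume and instance limits.

    instance_iops_limit is a hypothetical input; it is not looked up here.
    """
    return min(gp2_baseline_iops(size_gib), instance_iops_limit)


# The two volume sizes from the question, treated as GiB for simplicity.
for size in (600, 900):
    print(f"{size} GiB gp2 volume: {gp2_baseline_iops(size)} baseline IOPS")
```

Under these assumptions the 600 and 900 volumes come out at 1800 and 2700 baseline IOPS respectively (burst credits aside), which is the number to compare against what the commitlog and compaction traffic actually needs.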