Cassandra In Production: Things We Learned

Cassandra’s memory footprint depends more on the number of column families than on the size of the data set. Cassandra scales well horizontally for storage and I/O, but not for memory: each node's footprint is tied to your schema and your cache settings, regardless of the size of your cluster. Planning for the smallest possible number of column families reduces the memory footprint and leaves more memory for caching.
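As a rough illustration of why the schema matters more than the data volume, here is a back-of-envelope sketch. It assumes each column family holds its own in-memory write buffer (memtable) on every node. The function names and all the numbers are hypothetical placeholders, not measurements or Cassandra settings.

```python
def memtable_heap_mb(num_column_families, memtable_mb_per_cf):
    """Heap consumed by memtables on one node (assumed: one memtable per column family)."""
    return num_column_families * memtable_mb_per_cf


def cache_budget_mb(heap_mb, num_column_families, memtable_mb_per_cf, base_overhead_mb):
    """Heap left over for caches after fixed overhead and memtables are accounted for."""
    used = base_overhead_mb + memtable_heap_mb(num_column_families, memtable_mb_per_cf)
    return heap_mb - used


# Hypothetical node: 8 GB heap, 1 GB fixed overhead, 64 MB memtable per column family.
# Note that cluster size does not appear anywhere in the calculation.
many_cfs = cache_budget_mb(8192, num_column_families=40, memtable_mb_per_cf=64, base_overhead_mb=1024)
few_cfs = cache_budget_mb(8192, num_column_families=10, memtable_mb_per_cf=64, base_overhead_mb=1024)
print(many_cfs, few_cfs)
```

Under these assumed numbers, going from 40 column families to 10 frees roughly 1.9 GB of heap per node for caching, and adding nodes to the cluster would not change either figure.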

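Cassandra runs most efficiently with data that is written once. Data that is frequently deleted or updated puts more pressure on compaction, because every update or delete leaves obsolete versions or tombstones behind in older SSTables that compaction must later merge away. Cassandra’s compaction tends to be CPU intensive, and heavy compaction loads can cause nodes to fall out of the ring.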
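The toy model below sketches why updates and deletes create extra compaction work. It is not Cassandra's actual compaction algorithm: "SSTables" are plain dictionaries, TOMBSTONE is a hypothetical marker object, and the flush size is arbitrary. It only counts how many stored cells a merge has to read and then throw away.

```python
TOMBSTONE = object()  # hypothetical marker standing in for a delete


def flush_batches(writes, batch_size):
    """Split a stream of (key, value) writes into immutable 'SSTables' (dicts)."""
    sstables = []
    for start in range(0, len(writes), batch_size):
        table = {}
        for key, value in writes[start:start + batch_size]:
            table[key] = value  # within one batch, the later write wins
        sstables.append(table)
    return sstables


def compact(sstables):
    """Merge oldest-to-newest; return (live rows, number of cells discarded)."""
    merged = {}
    for table in sstables:
        merged.update(table)  # newer tables overwrite older versions
    live = {k: v for k, v in merged.items() if v is not TOMBSTONE}
    cells_read = sum(len(t) for t in sstables)
    return live, cells_read - len(live)


# Write-once workload: 8 distinct keys, nothing to throw away.
write_once = [(f"k{i}", i) for i in range(8)]
_, wasted_once = compact(flush_batches(write_once, batch_size=4))

# Update/delete workload: 4 keys rewritten once, then 2 of them deleted.
churn = [(f"k{i % 4}", i) for i in range(8)] + [("k0", TOMBSTONE), ("k1", TOMBSTONE)]
live_churn, wasted_churn = compact(flush_batches(churn, batch_size=4))

print(wasted_once, wasted_churn, sorted(live_churn))
```

In this sketch the write-once stream discards nothing during the merge, while the churn stream reads ten cells to keep two. Real compaction is more involved, but the extra merge work scales in the same direction.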