Cassandra In Production: Things We Learned

Cassandra’s memory footprint depends more on the number of column families than on the size of the data set. Cassandra scales well horizontally for storage and I/O, but not for memory footprint, which is tied to your schema and cache settings regardless of cluster size. Planning for the fewest column families possible reduces the memory footprint and leaves more memory for caching.

.. Cassandra runs most efficiently with data that is written once. Data that is frequently deleted or updated puts more pressure on compaction. Cassandra’s compaction tends to be CPU-intensive, and heavy compaction loads can cause nodes to fall out of the ring.

Does everyone hate MongoDB?

MongoDB uses unsafe writes by default, in the sense that the driver does not know whether a write has succeeded without a further call to getLastError. This is because one of the often-cited use cases for MongoDB is fast writes, which are achieved with fire-and-forget queries.
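The fire-and-forget behaviour described above can be illustrated with a toy model. This is a minimal sketch, not MongoDB driver code: `ToyServer`, `insert`, `get_last_error`, and `safe_insert` are hypothetical names invented for the illustration, and the only failure modelled is a duplicate `_id`.

```python
class ToyServer:
    """Hypothetical in-memory stand-in for a database server (not a MongoDB API)."""

    def __init__(self):
        self.docs = {}
        self.last_error = None

    def insert(self, doc):
        # Fire and forget: the outcome is recorded on the server side,
        # but nothing is returned to the caller.
        if doc["_id"] in self.docs:
            self.last_error = "duplicate key: %r" % (doc["_id"],)
        else:
            self.docs[doc["_id"]] = doc
            self.last_error = None

    def get_last_error(self):
        # The separate round trip that reports the previous write's outcome.
        return self.last_error


def safe_insert(server, doc):
    """Insert, then ask for the outcome and raise if the write failed."""
    server.insert(doc)
    err = server.get_last_error()
    if err is not None:
        raise RuntimeError(err)


server = ToyServer()
server.insert({"_id": 1})
server.insert({"_id": 1})        # fails, but the caller is not told
print(server.get_last_error())   # the failure only shows up when asked for
```

The trade-off is the one the excerpt names: skipping the second call saves a round trip per write, at the cost of not noticing failed writes.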

.. Getting your working set in memory is one of the most difficult things to calculate and plan for with MongoDB.

.. A general guideline is to provide enough RAM to fit all your data plus indexes or, if that’s not possible, at least your indexes.
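As a rough illustration of that guideline, the check below classifies a machine by which tier it meets. It is only a sketch under stated assumptions: the function name `ram_fit` and its labels are hypothetical, sizes are taken as plain byte counts you have measured yourself, and memory needed by the operating system and other processes is ignored.

```python
def ram_fit(ram_bytes, data_bytes, index_bytes):
    """Classify available RAM against the data-plus-indexes guideline.

    Hypothetical helper: ignores OS and per-connection overhead.
    """
    if ram_bytes >= data_bytes + index_bytes:
        return "data+indexes"   # everything fits in memory
    if ram_bytes >= index_bytes:
        return "indexes-only"   # the fallback tier of the guideline
    return "neither"            # even the indexes do not fit


GIB = 1024 ** 3
print(ram_fit(64 * GIB, 40 * GIB, 8 * GIB))   # data+indexes
print(ram_fit(16 * GIB, 40 * GIB, 8 * GIB))   # indexes-only
print(ram_fit(4 * GIB, 40 * GIB, 8 * GIB))    # neither
```

This says nothing about the actual working set, which depends on access patterns and is the hard part to estimate.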