Shannen Wehrman

Author Archives: Shannen Wehrman

Cassandra NoSQL Data Model Design

We at Instaclustr recently published a blog post on the most common data modelling mistakes that we see with Cassandra. This post was very popular and led me to think about what advice we could provide on how to approach designing your Cassandra data model so as to come up with a quality design that avoids the traps.

There are a number of good articles around that with rules and patterns to fit your data model into: 6 Step Guide to Apache Cassandra Data Modelling and Data Modelling Recommended Practices.

However, we haven’t found a step by step guide to analysing your data to determine how to fit in these rules and patterns. This white paper is a quick attempt at filling that gap.

Phase 1: Understand the data

This phase has two distinct steps that are both designed to gain a good understanding of the data that you are modelling and the access patterns required.

Define the data domain

The first step is to get a good understanding of your data domain. As someone very familiar with relation data modelling, I tend to sketch (or at least think) ER diagrams to understand the entities, their keys and relationships. However, Continue reading