Relationships: Tying It Together

Practical Data Modeling Mar 12, 2026

Here is Chapter 7, all about relationships (the data kind, not the human kind). What’s funny is I wrote the original version of this around two years ago. It’s strangley alien when you encounter your writing from a couple of years ago. I guess it’s like reviewing your old code, and you ask, “What idiot wrote this?” I think the original was good, but this is substantially better, more current, and fits the MMA theme.

Editing and revising it, along with the previous two chapters on entities and attributes, has been quite the chore. Same problem of reviewing ancient manuscripts from my past self. Thankfully, this trio - core to the book series - is in good shape.

Coming up next are the revised chapters on grains, time, and so forth. I think the editing work will be a lot easier going forward because these chapters are more recent and haven’t had as much time to gather dust and cruft.

Also, a note on artwork: I use Google Draw or Claude to generate the images. This is mostly because it’s a quick way to prototype images and move on with the writing. When you’re writing and editing, the last thing you want to do is get hung up on non-writing stuff. I will be coming back and revising the images, so please don’t think these are the final ones we’ll use in the book.

Thanks,

Joe


Grappling day. The coach has everyone working positional drills—start in guard, work a sweep, land in mount, hunt the submission. Reset and do it again. The more experienced grapplers in the gym make it look effortless, and between rounds, one of them shows you why: “Nothing in jiu-jitsu exists by itself. Guard connects to sweeps. Sweeps connect to the mount. Mount connects to submissions. Skip a link and the whole chain breaks.”

The coach comes over to correct your technique. “Stop muscling the sweep. Your underhook creates the angle. The angle gives you the sweep. The sweep gives you the top position. Each thing only works because of the thing before it.”

Jiu-jitsu is the closest thing I’ve experienced to “human chess.” You’re exploring a continuous flow of recalculation and figuring out next moves and setups, trying to expose a weakness in your opponent’s game. Of course, your opponent is doing the same to you. And in mixed martial arts, it gets more complicated, adding the dimension of striking alongside the dangerous submissions in jiu-jitsu. Learning the relationships between grappling and striking (and vice versa) allows you to get very creative, and it was one of my favorite things to geek out on back when I trained.

Data modeling various types of relationships allows for similar creative expression, and while thankfully not as dangerous as mixed martial arts, you’ll find it can be a lot of fun.

Five Camps. One Relationship. Five Expressions.

In data modeling, a relationship is a logical or physical connection between two or more entities. It defines how data interacts, depends on, or references other data. Without relationships, data is a collection of isolated facts. With them, it becomes a coherent ecosystem.

In Mixed Model Arts, we establish relationships for two distinct reasons. First, to map reality—developing a shared understanding between technicians, domain experts, and the business of how things actually relate in the real world. Second, to serve a specific intent—just as a mixed martial arts fighter chooses a technique based on whether they want to keep the fight standing or take it to the ground — a data professional builds the same relationship differently depending on what they’re trying to do.

To see this in action, let’s track a single business event across all five camps.

As I write, it’s Spring in Salt Lake City, and it’s a great time to get outside. I need a new pair of trail-running shoes, so I go online to purchase some. Let’s look at one relationship (me buying a pair of trail running shoes), with five completely different expressions across use cases.

The Relational Camp: State and Constraints

In the Relational Camp, the relationship is a strict contract. My customer profile and shoe purchase live in separate, normalized tables connected by a foreign key. The database guarantees that an order cannot reference a customer who doesn’t exist, and a product line item cannot reference a product that was never in the catalog. Every database write is validated when it happens, and every read is as accurate as the last write. Here, the goal is a predictable and consistent state of the data, along with constraints to guard against problems such as data update anomalies. I’d really hate to think my order was completed when it actually wasn’t, due to a database glitch.

Apply the Relational Camp to the wrong workload, and you’ll feel it. Run complex historical queries across a fully normalized relational schema at scale, and you’ll probably wait a long time, if the query ever completes. I heard stories from friends who worked at Yahoo in the early 2000s. They had to use expensive relational databases and data models for data growing at volumes where queries would never complete. To analyze this massive dataset, they had to invent a new system to distribute queries across commodity hardware. Hadoop was born. This team realized that the relational model, while great for certain things, was insufficient for the type of problem they were working on. The same joins that protect your data integrity in transactions become expensive anchors at scale. At a smaller scale, I’ve seen many teams that try to run their data warehouse on the same OLTP database they use for transactions. When the pain becomes unbearable, they almost always move to something more purpose-built. It happens a lot.

The Analytical Camp: Context and History

When the goal shifts from recording transactions to understanding what something means, the relationship is refactored into a form better suited to analytics. The metrics of my purchase (like order amount and quantity) become facts, and I become a dimension related to those facts. The relationship connects the event not just to my purchase, but to the Q2 2026 marketing campaign that acquired me, the product category I bought from, and the exact state of my demographic profile at the time of purchase—Salt Lake City, outdoor enthusiast, first-time buyer for this brand of trail-running shoes.

The Analytical Camp likes to aggregate behavior over time. Often, this means deliberately denormalizing the data to serve exhaustive read patterns. Historical context is the priority. You’re answering questions like “which customer segments drove Q2 2026 revenue”—questions that would require dozens of joins in a relational schema but resolve in seconds here.

The failure mode is treating analytical models as a source of truth for transactions. They’re a snapshot optimized for reading, not a system optimized for writing. Same as above, teams that try to treat their analytical data model the same as their application model end up with a model and system that’s slow for analytics and unreliable for operations. Pick a lane.

The Application Camp: Access Patterns and Velocity

Read more

link to the original content