
In Data Modeling is Dead Again (Part 1), we looked at various arguments I’ve seen for why data modeling is finally, truly dead. “Because AI.” On the surface, the argument is compelling. Between massive context windows, increasingly capable AI agents, and ubiquitous LLMs, why not just dump raw data into a pile and let the machines sort it out? As a bonus, there’s no need to talk to people. This seems way faster, cheaper, and easier than the archaic ways data has been modeled over the decades.
The first draft of this article was a point-by-point rebuttal of the arguments raised in the Part 1 article. After more thought, I decided to start over, as a point-by-point analysis is a game of whack-a-mole. I think the argument can be summed up very neatly by Joel Spolsky’s Law of Leaky Abstractions. Because data modeling is perceived as hard and time-consuming, we look for every possible way to avoid it, but unfortunately, the hard work doesn’t go away.
For example, you might prompt an agent to “generate a schema for our order management system.” It will confidently spit out valid SQL: primary keys, foreign keys, tidy 3NF structures. It’s syntactically perfect. But the agent doesn’t know that your logistics partner splits shipments in a way that defies standard logic, or that your internal definition of a “customer” is legally distinct in the EU versus the US. The agent abstracted away the code, but since it lacked context about your arcane operations, it couldn’t abstract away the reality. The data model looks convincing, but it’s trash.
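To make the leak concrete, here’s a sketch of the kind of DDL such an agent might produce. The table and column names are hypothetical, not from any real agent transcript; the point is that the leaks live in the comments, not in anything the SQL itself could catch.

```sql
-- Plausible agent output: valid SQL, tidy 3NF, confidently wrong.
CREATE TABLE customer (
    customer_id BIGINT PRIMARY KEY,
    email       VARCHAR(255) NOT NULL UNIQUE
    -- Leak: assumes one global definition of "customer". If EU and
    -- US customers are legally distinct, one table with one consent
    -- and retention policy can't represent that.
);

CREATE TABLE customer_order (
    order_id    BIGINT PRIMARY KEY,
    customer_id BIGINT NOT NULL REFERENCES customer (customer_id),
    ordered_at  TIMESTAMP NOT NULL
);

CREATE TABLE shipment (
    shipment_id BIGINT PRIMARY KEY,
    order_id    BIGINT NOT NULL REFERENCES customer_order (order_id)
    -- Leak: assumes each shipment belongs to exactly one order. A
    -- logistics partner that splits and merges shipments across
    -- orders breaks this foreign key, not the syntax.
);
```

Every statement above would run cleanly on most relational databases; nothing in the DDL reveals that the model is wrong. That’s the leak.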
As Joel points out in his article, adding abstractions doesn’t remove complexity. When an abstraction leaks, you’re forced to understand the layer underneath it even more deeply than before. That’s the paradox of the Law of Leaky Abstractions: there is no easy button that is actually easy.
Can we do away with data modeling once and for all? It’s certainly tempting. You can finally move faster, a luxury that didn’t exist in the Before Times (I’d put the cutoff at 2025, when the AI models started getting really good). Today’s AI tools are extremely powerful, and I can only imagine what tomorrow brings.
I alluded to this in my recent post on whether the traditional levels of data modeling (Conceptual, Logical, Physical) are relevant today. The short answer is yes. Having a strong mental model of data modeling is more important than ever.
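As a rough, hypothetical illustration of what those levels buy you (my example, not one from that post):

```sql
-- Conceptual: a business statement, no technology at all:
--   "A Customer places Orders."
-- Logical: entities, attributes, and relationships, still engine-neutral:
--   Customer(customer_id, name) 1..* Order(order_id, customer_id, placed_at)
-- Physical: only here do we commit to an engine, types, and constraints.
CREATE TABLE customer (
    customer_id BIGINT PRIMARY KEY,
    name        VARCHAR(200) NOT NULL
);

CREATE TABLE customer_order (
    order_id    BIGINT PRIMARY KEY,
    customer_id BIGINT NOT NULL REFERENCES customer (customer_id),
    placed_at   TIMESTAMP NOT NULL
);
```

An agent can generate the physical layer in seconds. The conceptual and logical layers are where the conversations with people happen, which is exactly the part being skipped.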
Exceptions exist, as always. If you’re building a prototype over a weekend, why not? But if you’re building a production system responsible for operations, revenue, customers, people’s lives, or other critical uses, agents and crappy data models don’t mix. If you think throwing data into a giant pile, sight unseen, is a good idea, consider a different profession.
Hiding complexity with AI doesn’t remove the complexity; all you’re doing is obscuring the failure points. It’s the scene in a slasher movie where the victim hides in a hallway closet, hoping the villain will go away. We all know how that scene ends: bloody and messy. Strip away the hype, and the “no data modeling” approach runs headlong into a similar fate. I’ve never seen it end well.
