
This is likely the final section of a short series on ways to speed up data modeling. Here, we talk about data modeling in a just-in-time manner. This differs quite a bit from how data modeling is traditionally taught and historically practiced, which is usually some form of Big Design Up Front or waterfall. These had a time and place, but things move way too fast for those approaches to be first considerations.

Read on for some tips on data modeling in an extremely fast and iterative way that matches today’s cadence of business.

Coming up - analytical data modeling (everyone’s been asking about this), graphs, and more.

Thanks,
Joe

Ellie makes data modeling as easy as sketching on a whiteboard, so even business stakeholders can contribute effortlessly. By skipping redraws, rework, and forgotten context, and by keeping all dependencies in sync, teams report saving up to 78% of modeling time.

Bridge reality with Data!


Thanks to Ellie.ai for sponsoring this newsletter.

The world moves very fast, often far too fast for data modelers to keep up. Stakeholders want answers yesterday. Engineering teams are juggling feature requests, bug fixes, and AI experiments. In this environment, the old-school idea of “let’s spend months or years modeling everything first” doesn’t hold up.

Enter Just-In-Time Data Modeling (JITDM). As with the other suggestions in this section for speeding up data modeling, the idea is to be agile, responsive, and focused on doing just enough to solve today’s problem. Similar to just-in-time manufacturing, with JITDM you deliver only what the user needs, when they need it.

This contrasts with Big Design Up Front (BDUF), which we’ve discussed earlier. With JITDM, we’re taking almost the opposite approach: we ignore everything except what the user needs from their data model. At the risk of acronym overload, we’re adopting the philosophy of You Ain’t Gonna Need It (YAGNI). Think of it like overpacking for a trip. For a 3-day trip, I’m not taking 15 pairs of socks. I’m a one-pair-of-socks kind of guy. Two, max. Stick to the bare minimum of what you need to get the job done. Or, as John Giles says [1], “There is a difference between ‘you ain’t gonna need it ever’ versus ‘you don’t need it yet.’” Unlike BDUF, with JITDM, YAGNI. Okay, maybe we don’t need so many acronyms either. I’ll stop.

What does JITDM look like in practice? You might get a request from your boss or product manager to improve the functionality of an app, answer a new question, or create or tweak an ML/AI model to perform an action. First, sketch the request, maybe just a quick drawing on a whiteboard, on paper, or in a modern data modeling tool. Identify the entities and attributes you need, and connect the dots in terms of how they relate to each other and to what might already exist. Next, build the working model in code, SQL, notebooks, or whatever you’re using. Finally, deploy the model for feedback (ideally in a development or test branch). Does the new model meet the requester’s needs? If so, great; if not, iterate. If this model gets reused or becomes critical, you can push it to production, evolve it, document it, and fold it into your broader ecosystem.
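To make that loop concrete, here is a minimal sketch in Python using the standard-library sqlite3 module. Everything in it is a made-up example rather than a prescribed method: the request ("how much has each customer spent?"), the table and column names, and the sample rows are all assumptions for illustration. The point is that the model contains only the two entities and handful of attributes the question needs, and nothing else yet.

```python
import sqlite3

# Hypothetical request: "How much has each customer spent?"
# Model only the entities and attributes that question needs.
# Table and column names are illustrative assumptions.
DDL = """
CREATE TABLE customer (
    customer_id   INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL
);
CREATE TABLE customer_order (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
    order_total REAL NOT NULL
);
"""


def build_prototype():
    """Create the throwaway model in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(DDL)
    return conn


def spend_per_customer(conn):
    """Answer the request: total spend per customer, ordered by name."""
    return conn.execute(
        """
        SELECT c.customer_name, SUM(o.order_total) AS total_spend
        FROM customer AS c
        JOIN customer_order AS o ON o.customer_id = c.customer_id
        GROUP BY c.customer_id, c.customer_name
        ORDER BY c.customer_name
        """
    ).fetchall()


# Load a few made-up rows and show the result to the requester for feedback.
conn = build_prototype()
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [(1, "Ada"), (2, "Grace")],
)
conn.executemany(
    "INSERT INTO customer_order VALUES (?, ?, ?)",
    [(10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0)],
)
print(spend_per_customer(conn))
```

If the requester comes back wanting, say, spend by month, that is the moment to add an order date, not before. An in-memory database is used here only because it is disposable; in practice this would live in a development or test branch of whatever platform you already use.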

The JITDM approach works well for collaborative situations where the request is ad hoc (“I need an answer to X.”), schemas might evolve, or you’re testing out new functionality. The downsides are potential duplication and data redundancy, model drift, and technical debt. JITDM works when there’s a balance between moving fast in a lightweight way and applying just the right amount of discipline and rigor to make sure things work in production to meet the user's needs.

Here are some tips to be effective with JITDM:

Standardize your naming conventions. If you add or change fields, maintain consistency to prevent your team from being bogged down in unnecessary work.
If a model is used more than a few times, consider promoting it to a shared dataset or standard data model.
Write comments so you and others understand the intent of the data model.
Use version control and work in branches. Don’t work in the main branch for prototyping.
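
The naming tip is the easiest of these to automate. As a rough sketch, assuming your team's convention is snake_case (that choice, the regular expression, and the function name are my assumptions, not a standard), a few lines of Python can flag fields that drift from it before they land in a shared branch:

```python
import re

# Assumed convention: lowercase snake_case, starting with a letter.
SNAKE_CASE = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def nonconforming_names(names):
    """Return the names that do not follow the assumed snake_case convention."""
    return [name for name in names if not SNAKE_CASE.fullmatch(name)]


# Hypothetical column names from a quick prototype.
columns = ["customer_id", "OrderTotal", "order_total", "created-at"]
print(nonconforming_names(columns))
```

A check like this could run as part of a review on the prototype branch, so consistency costs nothing extra as the model iterates.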

Today’s world moves extremely fast. JITDM is not a rejection of traditional modeling, but rather a response to ever-increasing velocity. Balance speed with awareness. You’re not abandoning good practice. You’re adapting it to fit the tempo of modern data work. Not everything needs a heavyweight model. Sometimes, the best model is the one that gets built just in time to deliver value.

[1] John Giles, The Nimble Elephant, p. 110.
