Espresso #5: Design docs, dbt at scale, and how to move up the ladder
Make yourself an espresso and join me for a short break on a Monday afternoon ☕
Hello fellow data enthusiasts,
In this edition, we will talk about design docs for data pipelines, setting standards and foundations for scalable dbt usage, and the core principle to keep in mind when working towards your next promotion. So without further ado, let’s talk data engineering while the espresso is still hot.
Writing design docs for data pipelines
Over the past few years, adopting software engineering best practices has become a common theme within the data engineering space. From dbt’s software-engineering-inspired capabilities to the rise of data observability, data engineers are getting increasingly accustomed to the software engineer’s toolset and principles.
This shift has had a major impact on how we design and build data pipelines. It made data pipelines more robust (since we moved away from hard-coded business logic and complex SQL queries to modular dbt models and macros) and drastically reduced the number of “Hey can you check this table?” Slack messages (via automated data quality monitoring and alerting).
These changes helped move the industry in the right direction, but there are still areas in which we have much to learn from our software engineering counterparts.
Based on recent data released by dbt Labs, around 20% of dbt projects have more than 1,000 models (and 5% have more than 5,000). These numbers highlight one fundamental problem in our data pipelines: we’re not intentional enough with how we design them. We stack layers upon layers of ad-hoc models and use-case-specific transformations and end up with ten models that represent the same logical entity “with subtle differences”.
In my latest article, I discuss one artifact that can help us design (and build) robust foundations for our data platforms: design docs.
Fresh off the press: our dbt journey at Zendesk
At Zendesk, we started our dbt journey more than a year ago (and have been running it in production for over six months) - and today, more than half of our data pipelines are dbt jobs.
The most important pillar to which we owe the success of our approach is setting the right foundations from day one and defining standards that make sense within the context of our use cases and architecture.
Last month, I published an article in the Zendesk engineering blog that discusses the core foundations of our dbt setup (and more) - it’s intended to be the first article of a three-part series, so stay tuned for more technical articles on our dbt implementation.
Out of the comfort zone: What got you here won’t get you there.
Last year I participated in the fantastic Ignite tech lead mentoring program (which I highly recommend), and one of the key learnings that stuck with me is “What got you here won’t get you there.” This is a very elegant way to phrase a principle that I’ve been applying throughout my career, and I think it’s important to keep it in mind as you progress through yours.
Every vertical (or horizontal) transition in your career introduces you to a new job requiring a skill set that differs from that of your previous role. Senior, Staff, Tech Lead, and Engineering Manager are all very different hats that require some preparation before wearing them. Some companies rely solely on the “years of experience” metric to determine your readiness to change hats, but what really matters is how comfortable you are with the skill set of the new job.
As you think about your career goals, consider the role that you currently aim to achieve and the skills that it requires, then use this information as a map to define the areas on which you want to focus in your development and the experiences you want to gain.
A sound to code to
Ever since the Succession series finale, I’ve been mostly relying on Nicholas Britell’s beautiful, grandiose, and melancholic season 4 score as my coding playlist - with a focus on Andante Risoluto (unsurprisingly).
If you enjoyed this issue of Data Espresso, feel free to recommend the newsletter to people in your network.
Your feedback is also very welcome, and I’d be happy to discuss one of this issue’s topics in detail and hear your thoughts on it.
Stay safe and caffeinated ☕