Let me describe a common scene from data engineering circa 2019.
You've just finished a complex transformation pipeline. It works. It's been tested in dev, promoted to staging, ready for prod. You submit the PR.
Someone comments: "Missing unit tests." Someone else: "No documentation for the new tables." A third: "Can you add a data flow diagram?"
You groan. You know they're right. But you're already onto the next thing, and these artifacts are going to take another half day.
So you write a quick README, skip the unit tests ("I'll add them later"), and move on.
We all did this. The result was technical debt, onboarding nightmares, and pipelines that only their authors understood.
Gen AI Changed This Completely
Today, when I finish a data pipeline, I ask gen AI to:
- Write unit tests based on the transformation logic
- Generate inline code comments explaining the non-obvious parts
- Create a data dictionary for any new tables or fields
- Produce a data flow diagram from the DDL and transformation logic
- Draft a README that explains what the pipeline does, how to run it, and what depends on it
This takes about ten minutes. The first pass is usually about 80% of the way there, and after a quick review and a few edits, it's production quality.
The artifacts that used to be skipped are now just part of the workflow.
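To make the unit-test item concrete, here is a minimal sketch of the kind of tests a first pass might produce. The transformation `dedupe_latest` and its field names (`customer_id`, `updated_at`, `email`) are hypothetical stand-ins invented for illustration, not taken from any real pipeline.

```python
from datetime import date


def dedupe_latest(rows):
    """Keep the most recent row per customer_id (hypothetical transformation)."""
    latest = {}
    for row in rows:
        key = row["customer_id"]
        # Strict comparison: on a tie, the first row seen is kept.
        if key not in latest or row["updated_at"] > latest[key]["updated_at"]:
            latest[key] = row
    return sorted(latest.values(), key=lambda r: r["customer_id"])


def test_keeps_latest_row_per_customer():
    rows = [
        {"customer_id": 1, "updated_at": date(2024, 1, 1), "email": "old@example.com"},
        {"customer_id": 1, "updated_at": date(2024, 3, 1), "email": "new@example.com"},
        {"customer_id": 2, "updated_at": date(2024, 2, 1), "email": "b@example.com"},
    ]
    result = dedupe_latest(rows)
    assert [r["customer_id"] for r in result] == [1, 2]
    assert result[0]["email"] == "new@example.com"


def test_empty_input_returns_empty_list():
    assert dedupe_latest([]) == []


def test_tie_keeps_first_seen_row():
    rows = [
        {"customer_id": 1, "updated_at": date(2024, 1, 1), "email": "first@example.com"},
        {"customer_id": 1, "updated_at": date(2024, 1, 1), "email": "second@example.com"},
    ]
    assert dedupe_latest(rows)[0]["email"] == "first@example.com"
```

Saved to a file, these run under pytest as written. The tie-breaking test is the sort of case that deserves the quick review: it pins down a behavior the author may never have decided on deliberately.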
The Organizational Impact
When documentation is consistent and current, onboarding new engineers gets faster. When unit tests are present, refactoring is less scary. When diagrams are maintained, architecture reviews are more productive.
The chronic under-investment in documentation that has plagued data teams for years is being solved, not by cultural change, but by making it cheap to do right.
What You Should Be Doing Right Now
- Code comments: Paste a complex function and ask for inline comments on the non-obvious logic
- Unit tests: Describe the expected input and output, then let gen AI draft the test cases
- Documentation: Ask for a README pre-filled with your project's details
- Diagrams: Paste your schema and ask for a Mermaid.js or PlantUML diagram
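For the diagram item, it helps to know what you are asking for. The sketch below builds a Mermaid `erDiagram` from a small schema so you can see the shape of the output. The helper `schema_to_mermaid`, the table names, and the column types are all assumptions made up for this example; in practice you would paste your real DDL into the prompt.

```python
def schema_to_mermaid(tables, relationships):
    """Render a Mermaid erDiagram from a dict of tables (hypothetical helper).

    tables: {table_name: {column_name: column_type}}
    relationships: list of (parent, child, label) one-to-many pairs
    """
    lines = ["erDiagram"]
    for table, columns in tables.items():
        lines.append(f"    {table} {{")
        for name, col_type in columns.items():
            # Mermaid entity attributes are written as "type name".
            lines.append(f"        {col_type} {name}")
        lines.append("    }")
    for parent, child, label in relationships:
        # "||--o{" means exactly one parent to zero or more children.
        lines.append(f"    {parent} ||--o{{ {child} : {label}")
    return "\n".join(lines)


# Illustrative schema, not from a real project.
tables = {
    "CUSTOMERS": {"customer_id": "int", "email": "string"},
    "ORDERS": {"order_id": "int", "customer_id": "int", "ordered_at": "date"},
}
relationships = [("CUSTOMERS", "ORDERS", "places")]

diagram = schema_to_mermaid(tables, relationships)
print(diagram)
```

The printed text can be pasted into any Mermaid renderer. Because the diagram is plain text, it lives in the repo next to the code and shows up in diffs like everything else.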
The time you save on the boring stuff is time you can spend on the interesting stuff.
And the interesting stuff is where you build expertise that's harder to automate.