F R A N K I E
H O M E W O O D
P R O J E C T
AutomateDV dbt Package
A dbt package that generates Data Vault 2.0 tables from model metadata, streamlining the DV2 build process.
B A C K G R O U N D
At the time, AutomateDV had a number of gaps in functionality that made it a more limited tool than it is today. It had the core Data Vault table structures: hubs, links and satellites. However, a fully versatile Data Vault 2.0 implementation needed additional table types.
I was responsible for delivering a number of these direct-to-customer features for the AutomateDV package. My contributions included a table type with a more complex use case: the extended tracking satellite, designed by Patrick Cuba.
This expanded the range of situations the package could be applied to and, as a result, allowed us to answer "Yes, we can do that" to more customer requests.
T E C H N O L O G I E S
C H A L L E N G E S
Adding features to an established code base brings the challenge of integrating with the existing environment. That was particularly true here: all of the new features had to fit together cleanly and match the initial design and approach of the package.
Implementing an extended tracking satellite load script can be a challenge for even the most competent data engineers. This was a step beyond that, because the ETL scripts had to be templated and configurable enough to handle any given customer's requirements.
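As a rough illustration of what metadata-driven templating means, the sketch below renders a simplified hub load statement from a small metadata object. It is written in Python rather than dbt's Jinja, and every name in it (HubMetadata, render_hub_load, the table and column names) is hypothetical. It is not AutomateDV's macro code, and a real extended tracking satellite load is considerably more involved.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HubMetadata:
    """Hypothetical metadata for one hub; not AutomateDV's real configuration."""
    target: str        # hub table to load
    source: str        # staging table to read from
    hash_key: str      # hashed business key column
    business_key: str  # natural key column
    load_date: str = "LOAD_DATETIME"
    record_source: str = "RECORD_SOURCE"


def render_hub_load(meta: HubMetadata) -> str:
    """Render a simplified insert-only load: add keys missing from the target."""
    columns = ", ".join(
        [meta.hash_key, meta.business_key, meta.load_date, meta.record_source]
    )
    return (
        f"INSERT INTO {meta.target} ({columns})\n"
        f"SELECT DISTINCT {columns}\n"
        f"FROM {meta.source} AS stg\n"
        f"WHERE NOT EXISTS (\n"
        f"    SELECT 1 FROM {meta.target} AS hub\n"
        f"    WHERE hub.{meta.hash_key} = stg.{meta.hash_key}\n"
        f")"
    )


# The same function serves any customer's hub: only the metadata changes.
customer_hub = HubMetadata(
    target="hub_customer",
    source="stg_customer",
    hash_key="CUSTOMER_HK",
    business_key="CUSTOMER_ID",
)
print(render_hub_load(customer_hub))
```

The point of the design is that the SQL shape is written once and the customer-specific details live entirely in the metadata.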
P R O J E C T
O U T C O M E
The AutomateDV package has been a great success, with over 1,000 downloads per month and a growing community of users.
It is now used by a wide range of companies, from small startups to large enterprises, and has helped to accelerate the development of Data Vault 2.0 solutions.
It has become a key part of many projects' data engineering toolkits.
P E R S O N A L
G R O W T H
My time working on AutomateDV has allowed me to develop a deep appreciation for proper management of public software, including best practices for documentation, testing, and community engagement.
Working on publicly available software, I learned the risk of not only untested code but also what I describe as post-tested code: code whose development is already complete, with tests then written around what is already known to work.
It is much more important to set out the acceptance criteria in tests before any development takes place. This is known in the industry as Test-Driven Development.
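A minimal sketch of that test-first order, in Python: the acceptance criterion is written as a test before the function it exercises. The function records_to_load and the sample data are hypothetical, chosen only to resemble a satellite-style change check, and are not taken from AutomateDV.

```python
# Step 1: the acceptance criterion, written before any implementation exists.
def test_only_new_or_changed_records_are_loaded():
    latest = {"cust-1": "aaa", "cust-2": "bbb"}  # key -> hashdiff already loaded
    staged = [
        {"key": "cust-1", "hashdiff": "aaa"},  # unchanged: must be skipped
        {"key": "cust-2", "hashdiff": "ccc"},  # changed: must be loaded
        {"key": "cust-3", "hashdiff": "ddd"},  # new key: must be loaded
    ]
    loaded = records_to_load(staged, latest)
    assert [r["key"] for r in loaded] == ["cust-2", "cust-3"]


# Step 2: the simplest implementation that satisfies the test above.
def records_to_load(staged, latest):
    """Keep staged records whose hashdiff differs from the latest known one."""
    return [r for r in staged if latest.get(r["key"]) != r["hashdiff"]]


# Step 3: run the test; it fails until step 2 is written correctly.
test_only_new_or_changed_records_are_loaded()
```

Written in this order, the test encodes what the feature must do rather than what the finished code happens to do.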