Analytics Engineering on the Lakehouse Using dbt & Databricks (Part 2)
In the Part 1 article we set up the local environment to develop dbt models and connect them to the Databricks platform. In this article we jump into action and build our transformation pipeline.
Use case: we will model a dummy sales process with the following entities:
- customer: customer information
- orders: order information and status
- payments: payment details with their order IDs
The raw data is stored in the bronze layer, partitioned by the ingestion-date column ing_date.
1. Defining sources: a source is the starting point of any dbt transformation and generally represents a bronze dataset ingested by a data-ingestion tool such as ADF. A sketch of a source definition for the customer table follows.
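Roughly, the customer source could be defined as below using the dbt-external-tables package (which provides the stage_external_sources operation used in the next step). The schema name, storage location, and file format are illustrative assumptions, not taken from the original project.

```yaml
# models/sources/bronze.yml -- a minimal sketch, assuming a bronze schema
# and an ADLS landing path; adjust names and locations to your setup.
version: 2

sources:
  - name: bronze
    schema: bronze
    tables:
      - name: customer
        description: "Raw customer data ingested by ADF"
        external:
          # hypothetical storage path
          location: "abfss://bronze@mystorageaccount.dfs.core.windows.net/customer/"
          using: parquet
          partitions:
            - name: ing_date
              data_type: int
```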
We then create the external tables for these sources by running the stage_external_sources operation:

dbt run-operation stage_external_sources --profiles-dir ./profiles/
Similarly, we create the payments and orders sources.
2. Transforming source data into the silver layer:
We keep the configuration of the target table in <model>.yml. This config file describes the table at the silver layer and its columns, and generic test cases can also be configured here; pay attention to the customer_id column config in the sketch below.
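As an illustration, a customer.yml along these lines would declare the silver table and attach unique and not_null generic tests to customer_id; the descriptions are placeholder assumptions.

```yaml
# models/silver/customer.yml -- a minimal sketch; descriptions are
# illustrative. Note the generic tests on the customer_id column.
version: 2

models:
  - name: customer
    description: "Customer table at the silver layer"
    columns:
      - name: customer_id
        description: "Primary key of a customer"
        tests:
          - unique
          - not_null
      - name: ing_date
        description: "Ingestion-date partition column"
```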
The transformation logic is kept in the <model>.sql file, along with some table-specific config, as sketched below.
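A rough version of the customer model is shown here; the selected columns are assumptions about the bronze customer schema, and the ing_date variable matches the one passed on the command line in the next step.

```sql
-- models/silver/customer.sql -- a minimal sketch; the selected columns
-- are assumptions about the bronze customer schema.
{{
  config(
    materialized = 'table',
    file_format = 'delta'
  )
}}

select
    customer_id,
    first_name,
    last_name,
    ing_date
from {{ source('bronze', 'customer') }}
-- process only the requested ingestion-date partition
where ing_date = {{ var('ing_date') }}
```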
Finally, we run the model, passing the ingestion date as a variable:
dbt run --vars '{ing_date: 20221216}' --profiles-dir ./profiles
The customer table in Unity Catalog