Analytics Engineering on the Lakehouse using DBT & Databricks (Part 2)

Anuj Sen
3 min read · Mar 26, 2023

In the Part 1 article we set up the local environment to develop dbt models and connect them to the Databricks platform. In this article we will jump into action and transform our data pipeline.

Use case: we will use a dummy sales process with the following entities:

  1. customer: customer information
  2. orders: order information and status
  3. payments: payment and order ID details

The data is stored in the bronze layer, partitioned by date on the ing_date column.

  1. Defining sources: A source is the starting point for any dbt transformation and generally represents bronze-layer data ingested by a data ingestion tool such as ADF.
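A minimal sketch of what such a source definition might look like, assuming the dbt-external-tables package (which provides the stage_external_sources operation used below) and a hypothetical ADLS path; the schema, location, and column names are illustrative assumptions, not the article's actual config:

```yaml
# models/staging/sources.yml -- illustrative sketch; names and paths are assumptions
version: 2

sources:
  - name: bronze
    schema: bronze
    tables:
      - name: customer
        external:
          # hypothetical bronze location written by the ingestion tool (e.g. ADF)
          location: "abfss://bronze@<storage_account>.dfs.core.windows.net/customer"
          using: parquet
          partitions:
            - name: ing_date
              data_type: int
```

With a file like this in place, the run-operation below registers the external table so it can be referenced with `source('bronze', 'customer')`.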
dbt run-operation stage_external_sources --profiles-dir ./profiles/

Similarly, we define sources for the payments and orders tables.

2. Transforming source data into Silver:

We keep the configuration for the target table in <model>.yml. This config file describes the table at the silver layer and its columns; we can also configure generic test cases here. Pay attention to the customer_id column config.
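A sketch of what the model's YAML config might look like, assuming generic `unique` and `not_null` tests on customer_id; the column list is an assumption based on the dummy sales schema:

```yaml
# models/silver/customer.yml -- illustrative sketch
version: 2

models:
  - name: customer
    description: "Customer dimension in the silver layer"
    columns:
      - name: customer_id
        description: "Primary key for the customer"
        tests:          # generic tests, executed with `dbt test`
          - unique
          - not_null
      - name: first_name
      - name: last_name
```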

The transformation logic is kept in the <model>.sql file, along with some table-specific config.
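A hedged sketch of such a model file, assuming an incremental Delta materialization and the ing_date variable passed in at run time (shown in the next step); the column names are assumptions:

```sql
-- models/silver/customer.sql -- illustrative sketch; columns are assumptions
{{
  config(
    materialized='incremental',
    file_format='delta'
  )
}}

select
    customer_id,
    first_name,
    last_name,
    ing_date
from {{ source('bronze', 'customer') }}
{% if is_incremental() %}
  -- on incremental runs, load only the partition passed via --vars
  where ing_date = {{ var('ing_date') }}
{% endif %}
```

The `is_incremental()` guard means a full refresh rebuilds the whole table, while a normal run processes only the requested bronze partition.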

Running the model with a variable value:

dbt run --vars '{ing_date: 20221216}' --profiles-dir ./profiles

Customer table in Unity Catalog

