Job Summary
The Lead Data Engineer will play a central role in the buildout of CMG's next-generation data platform. This is a high-ownership role on a small, senior team, working directly with the SVP of Data & AI to design and implement a scalable lakehouse architecture on Google Cloud Storage (GCS) and Databricks, spanning bronze, silver, and gold layers. The role emphasizes domain-driven design, data contracts, and proactive communication with both internal stakeholders and external vendors.
Responsibilities
- Lead the technical design and implementation of CMG's Medallion 2.0 lakehouse architecture — bronze ingestion, silver transformation, and gold domain layers — built on GCS and Databricks (Delta Lake), with clear data contracts at each boundary
- Design and manage data pipelines using Astro (Airflow), PySpark, and Delta Live Tables, ensuring reliability and scalability across ingestion and transformation layers
- Govern the lakehouse using Databricks Unity Catalog — managing access controls, data lineage, and schema enforcement across domains
- Apply domain-driven design principles to partition and model data domains (e.g., royalty, asset, artist, distribution)
- Collaborate with the analytics team to ensure the gold layer reflects real business needs, reducing the need for downstream workarounds
- Coordinate with external vendors (e.g., DataArt) and internal stakeholders across DevOps, product, and analytics
- Proactively identify architectural risks, data quality issues, and dependency blockers, and propose resolutions
- Maintain clear, impact-first documentation and status updates for both technical and non-technical stakeholders
- Other duties as assigned
Qualifications
- 4+ years of data engineering experience, including 1–2 years focused on data platform or lakehouse architecture
- Hands-on experience with Databricks — including Delta Lake, PySpark, and ideally Unity Catalog
- Experience with GCS or equivalent cloud object storage as a lakehouse foundation layer
- Hands-on experience with domain-driven design applied to data modeling
- Strong command of SQL and at least one transformation framework (dbt preferred)
- Experience with medallion or lakehouse architectures (bronze/silver/gold or equivalent)
- Familiarity with GCP-native tooling — Pub/Sub, Dataflow, or Dataplex a plus
- Excellent written communication — able to write design docs non-engineers can understand and status updates executives can act on
- Demonstrated ability to work independently in ambiguous environments
- Track record of flagging risks early and proposing solutions
Nice to have: Experience in music/media/entertainment data; familiarity with data contracts or schema validation (Unity Catalog, Great Expectations, dbt tests); experience with external dev vendors
Pay Scale
- $120,000 - $150,000 CAD per year
- The final compensation within this range will be determined based on the candidate’s experience, skills, and overall fit for the role.