Role Overview
We are seeking a highly skilled Data Engineer with strong experience in Azure and Databricks who will play a critical role in designing, transforming, and operationalizing data pipelines within a modern Lakehouse architecture.
The role primarily focuses on transforming data from the Bronze layer into curated, analytics-ready datasets, building automated CI/CD pipelines, and developing high-quality Python and PySpark-based data solutions. The engineer will also collaborate closely with Data Scientists and Software Engineers and should be open to contributing to data-driven UI/UX initiatives.
Data Engineering & Transformation
Design, develop, and maintain scalable data transformation pipelines using Python (with tools like PySpark and ADF) and SQL in Azure Databricks
Implement transformation logic to move data from Bronze to Silver/Gold layers, following data engineering best practices
Apply strong data engineering principles to ensure data reliability, quality, performance, and reusability
Work with structured and semi-structured data at scale
Databricks, Azure & Cloud ETL
Build and manage Databricks notebooks, jobs, Delta Lake tables, and orchestrated workflows
Hands-on experience with cloud-based ETL platforms
(Preferred: Microsoft Azure, Databricks, Synapse, Azure Functions; otherwise AWS or Google Cloud)
Optimize data pipelines for performance, scalability, and cost efficiency
Python Applications, APIs & Automation
Design, develop, and maintain Python applications, scripts, and APIs for data processing and automation
Write production-grade Python code with a strong focus on readability, maintainability, and testing
Leverage Python for orchestration, validation, and integration with downstream systems
Collaboration with Data Science & Engineering Teams
Collaborate closely with Data Scientists and Data Analysts to understand data, analytical models, and consumption requirements
Enable and support advanced analytics and data science workflows by preparing high-quality feature datasets
Translate analytical needs into scalable data engineering solutions
CI/CD, DevOps & Platform Engineering
Build and maintain automated CI/CD pipelines for data and Databricks workloads
Hands-on experience with DevOps tools and practices, including Git-based version control
Exposure to containerization and orchestration platforms such as Kubernetes / OpenShift
Ensure smooth promotion of code and pipelines across environments (Dev/Test/Prod)
Data Modeling & Querying
Design and implement robust data models optimized for analytics and reporting
Strong hands-on knowledge of SQL and exposure to KQL or other query languages
Apply best practices in data structures, indexing, and performance tuning
UI / UX & Data Applications (Additional Advantage)
Open to contributing to data-driven UI/UX components, dashboards, or lightweight data applications
Work with analytics and business teams to improve data usability and customer experience
Required Skills & Qualifications
Must-Have
Strong hands-on expertise in Python (with frameworks like PySpark)
Solid foundation in Data Engineering principles and large-scale data processing
Experience with Azure Databricks and cloud-based ETL platforms
Strong knowledge of SQL and data querying techniques
Experience with CI/CD pipelines and DevOps practices
Experience in pipeline monitoring and alerting
Ability to design efficient, scalable solutions to complex data problems
Good-to-Have
Experience with Azure Synapse and Azure Functions
Exposure to AWS or Google Cloud data platforms
Hands-on experience with OpenShift
Knowledge of data science concepts and workflows
Familiarity with analytics platforms, dashboards, and UI/UX considerations