Data Engineer (Azure & Databricks)

Job Location: Bengaluru, India

Monthly Salary: Not Disclosed

Posted on: 23 days ago

Vacancies: 1

Job Summary

Role Overview

We are seeking a highly skilled Data Engineer with strong experience in Azure and Databricks who will play a critical role in designing, transforming, and operationalizing data pipelines within a modern Lakehouse architecture.

The role primarily focuses on transforming data from the Bronze layer into curated, analytics-ready datasets, building automated CI/CD pipelines, and developing high-quality Python and PySpark-based data solutions. The engineer will also collaborate closely with Data Scientists and Software Engineers and should be open to contributing to data-driven UI/UX initiatives.

Data Engineering & Transformation

Design, develop, and maintain scalable data transformation pipelines using Python (with tools like PySpark and ADF) and SQL in Azure Databricks
Implement transformation logic to move data from Bronze to Silver/Gold layers following data engineering best practices
Apply strong data engineering principles to ensure data reliability, quality, performance, and reusability
Work with structured and semi-structured data at scale

Databricks, Azure & Cloud ETL

Build and manage Databricks notebooks, jobs, Delta Lake tables, and orchestrated workflows
Hands-on experience with cloud-based ETL platforms (preferred: Microsoft Azure, Databricks, Synapse, Azure Functions; otherwise AWS or Google Cloud)
Optimize data pipelines for performance, scalability, and cost efficiency

Python Applications, APIs & Automation

Design, develop, and maintain Python applications, scripts, and APIs for data processing and automation
Write production-grade Python code with a strong focus on readability, maintainability, and testing
Leverage Python for orchestration, validation, and integration with downstream systems

Collaboration with Data Science & Engineering Teams

Collaborate closely with Data Scientists and Data Analysts to understand data, analytical models, and consumption requirements
Enable and support advanced analytics and data science workflows by preparing high-quality feature datasets
Translate analytical needs into scalable data engineering solutions

CI/CD, DevOps & Platform Engineering

Build and maintain automated CI/CD pipelines for data and Databricks workloads
Hands-on experience with DevOps tools and practices, including Git-based version control
Exposure to containerization and orchestration platforms such as Kubernetes / OpenShift
Ensure smooth promotion of code and pipelines across environments (Dev/Test/Prod)

Data Modeling & Querying

Design and implement robust data models optimized for analytics and reporting
Strong hands-on knowledge of SQL and exposure to KQL or other query languages
Apply best practices in data structures, indexing, and performance tuning

UI/UX & Data Applications (Additional Advantage)

Open to contributing to data-driven UI/UX components, dashboards, or lightweight data applications
Work with analytics and business teams to improve data usability and customer experience

Required Skills & Qualifications

Must-Have

Strong hands-on expertise in Python (with frameworks like PySpark)
Solid foundation in Data Engineering principles and large-scale data processing
Experience with Azure Databricks and cloud-based ETL platforms
Strong knowledge of SQL and data querying techniques
Experience with CI/CD pipelines and DevOps practices
Experience in pipeline monitoring and alerting
Ability to design efficient, scalable solutions to complex data problems

Good-to-Have

Experience with Azure Synapse and Azure Functions
Exposure to AWS or Google Cloud data platforms
Hands-on experience with OpenShift
Knowledge of data science concepts and workflows
Familiarity with analytics platforms, dashboards, and UI/UX considerations
