Role Overview: We are seeking a highly skilled Big Data Engineer with strong experience in Apache Spark, the Hadoop ecosystem, and Apache Ozone. The ideal candidate will design, develop, and optimize large-scale data processing systems, ensuring high performance, scalability, and reliability for enterprise-level applications.
Key Responsibilities:
Design and implement distributed data processing solutions using Apache Spark, Hadoop, and Flink.
Develop and maintain Spark applications for data transformation, aggregation, and ETL processes using Scala, Java, or Python.
Utilize Apache Ozone for storing large-scale datasets, ensuring efficient data access and management in a distributed environment.
Manage and optimize HDFS, Apache Ozone, and Kafka for scalable and fault-tolerant storage.
Develop ETL pipelines for batch and real-time data ingestion and transformation.
Implement and ensure data validation, security, integrity, and compliance across big data platforms.
Monitor and troubleshoot performance issues in large-scale clusters.
Collaborate with data scientists, analysts, and application teams to deliver high-quality data solutions.
Automate workflows and improve operational efficiency using scripting and orchestration tools.
Required Skills & Qualifications:
Strong expertise in Apache Spark (Core, SQL, Streaming).
Hands-on experience with the Hadoop ecosystem (HDFS, YARN, MapReduce).
Proficiency in Apache Ozone for object storage and integration with Hadoop.
Solid programming skills in Java, Scala, or Python.
Experience with Hive, HBase, and Kafka is a plus.
Knowledge of cluster management and resource optimization.
Familiarity with Linux/Unix environments and shell scripting.
Understanding of data security, governance, and compliance standards.
Experience with cloud-based big data platforms.
Exposure to containerization (Docker, Kubernetes) for big data workloads.
Knowledge of CI/CD pipelines for data engineering projects.
Behavioral Skills:
Good communication skills
Work from office 5 days a week at Berkeley Heights, NJ
Team player
Ability to work in a changing environment
Strong problem-solving and analytical skills
Ability to work independently or within a team
Manage day-to-day challenges and communicate developmental risks with the technical team
Qualifications:
Bachelor's degree in Computer Science, Software Engineering, or a related field.
Proficiency in business process modeling and documentation to
Job Title: Data Engineer (Spark, Hadoop, OzoneCH, Flink)
Location: Berkeley Heights, NJ