At Apple, we believe that innovation flourishes in an environment where ideas are challenged, collaboration is encouraged, and technology is pushed to its limits. This environment is only possible when diverse minds come together, bringing unique perspectives and experiences. Our people and their ideas inspire innovation in everything we do. Imagine what you could accomplish here! Join Apple and help us make the world a better place. As a principal contributor and technical lead in our Apple Data Platform (ADP) SRE organization, you will apply SRE principles as you mentor and partner with our engineers and partner teams, ensuring large-scale analytics infrastructure runs reliably and efficiently. This role focuses on driving reliability standards, architectural consistency, and engineering excellence across peer SRE teams and partner engineering organizations spanning the Hadoop, HBase, Spark, Data Lakes, and Airflow ecosystems, through technical leadership, cross-functional alignment, and the development of platform-wide tooling, observability, and operational practices that raise the reliability bar for all of ADP. This role includes production on-call responsibilities.
Apple Service Engineering (ASE) teams build and scale the platforms and infrastructure behind many of Apple's services, including iCloud, iTunes, Siri, and Maps. We are the foundation on which Apple's software developers build the products that our customers love. We are looking for a passionate and dedicated Technical Lead to drive SRE standards and engineering excellence across the entire Apple Data Platform organization. The Apple Data Platform (ADP) SRE Technical Lead partners with multiple SRE and engineering teams across the data platform, including teams responsible for Hadoop and HBase infrastructure, Spark, S3-compatible storage, and Airflow-orchestrated pipelines. Rather than owning a single vertical, this role sets the technical direction for how reliability is practiced across ADP: defining SLOs, establishing architectural review processes, developing shared tooling and automation, and ensuring that SRE principles are applied consistently as the platform scales. You will be a force multiplier, making every team around you more effective.
Serve as the SRE Technical Lead across ADP, partnering with vertical SRE teams and software engineering organizations to ensure reliability standards are consistently applied across the full data platform
Define and drive adoption of SLO frameworks, error budget policies, and incident management practices across ADP services
Provide architectural review and reliability guidance for new services and major platform changes, identifying risks and influencing design before they reach production
Lead the development of shared observability, automation, and infrastructure-as-code tooling that benefits multiple ADP teams simultaneously
Identify and eliminate systemic sources of toil and instability across the platform; advocate for and deliver platform-wide reliability improvements
Mentor and grow SRE engineers across teams, establishing a culture of engineering excellence and continuous improvement
Represent ADP SRE in cross-organizational forums, communicating technical strategy and reliability posture to ASE and Apple leadership
Programming in Python and Golang, supported by Generative AI tooling, to accelerate development of mission-critical shared automation and tools
Production on-call and incident management responsibilities, including leading response for high-severity cross-platform incidents
BS/MS in Computer Science or equivalent
12 years of experience in Site Reliability Engineering managing infrastructure and services at scale
5 years of experience in technical leadership roles, with demonstrated ability to lead horizontally across teams without direct authority
Broad expertise across the data platform stack: Hadoop (HDFS, YARN), HBase, Apache Spark, Data Lake architectures, S3-compatible storage solutions, and Apache Airflow
History of defining and driving SLO/error budget frameworks and reliability practices across multiple teams or services
Demonstrable programming skills to develop shared tooling, lead code reviews, and set engineering standards
Strong written and verbal communication skills; able to present technical strategy to both engineers and leadership
Advanced knowledge of Linux, networking, and distributed systems fundamentals
15 years of experience in SRE or related work managing infrastructure at scale
Experience with Ceph object storage operations
Kubernetes cluster operations experience, particularly running stateful data workloads
Experience with scale testing, disaster recovery, and capacity planning across distributed data systems
Experience driving multi-year platform migrations or large-scale architectural transitions
Ability to define the technical roadmap for a data platform organization and drive cross-functional alignment on architectural standards and best practices
Background in data security, access control, or compliance-sensitive data environments
Required Experience:
Senior IC
Ask Siri to name the most successful company in the world and it might respond: Apple. And it's not just out of familial pride. Apple consistently ranks highly in profit, revenue, market capitalization, and consumer cachet. In 2018, the company became the first to reach a trillion dollar