Director, Core Infrastructure Engineering
Seattle, OR - USA
Job Summary
Leads multiple teams to implement strategies for the architecture and delivery of interdependent, scalable distributed systems that meet organizational and customer demands. Orchestrates cross-group optimization for high-throughput, large-scale data processing; aligns stakeholders on scalability requirements; and oversees elastic designs and effective use of data plane platforms. Provides strategic oversight for fault-tolerant, in-service-upgradable architectures; sets direction for partition-aware design choices; and leads initiatives to harden networks via load shedding, throttling, and rate limiting. Establishes expectations for formal verification and peer reviews, and sets SLO-aligned durability and availability standards across the department. Drives KPI and telemetry strategies; directs creation of complex dashboards and alerting for proactive health assurance; and ensures functional/correctness validation, data replication, and synchronization meet organizational needs. Guides organization-wide incident management and operational readiness, eliminating customer maintenance windows and ensuring consistent SOPs. Provides strategic security guidance (encryption, access controls); oversees remediation and compliance documentation; and sponsors automation (IaC) and change-management alignment so systems can be safely patched, updated, and rolled back at scale.
Responsibilities
Key Responsibilities
System Design & Architecture - System Scalability:
Implements strategies across multiple teams or groups for the architecture and design of interdependent, scalable distributed systems, including the use of distributed state management tools, ensuring organizational and system demands are met.
Spearheads code and/or system optimization initiatives for large-scale data processing and high-throughput requirements across multiple areas, driving improvements that support hyper-scale systems.
Facilitates collaborations to define system scalability requirements, ensuring the defined requirements meet customer expectations.
Oversees the design of interdependent systems to scale with elasticity (e.g., effectively scaling both up and down).
Drives the effective use and implementation of data plane platforms for large-scale data operations.
System Design & Architecture - System Reliability Design:
Provides strategic oversight for the architecture of fault-tolerant, interdependent systems capable of withstanding in-service updates by overseeing implementation, across teams, of redundancy, replication, and automatic failover mechanisms.
Influences and sets direction for designing systems to effectively handle service disruptions (e.g., network partitions) by prioritizing consistency, availability, or partition tolerance.
Leads strategic optimization initiatives for handling network unreliability, including directing the design of load-shedding, throttling, and rate-limiting techniques.
Holds teams accountable for leveraging formal verification techniques to verify system designs and for conducting peer reviews across teams.
Drives the design of systems that are durable and adhere to service level objectives (SLOs), developing standards for availability and durability of other computing services across the department.
System Design & Architecture - System Reliability Performance:
Drives strategies for defining key performance indicators (KPIs) and telemetry to identify risks, gaps, or cyclical dependencies in running systems, ensuring alignment with organizational goals.
Directs the creation and customization of complex dashboards, telemetry systems, and alerting mechanisms that proactively monitor and ensure optimal system health across teams.
System Design & Architecture - Correctness / Availability:
Implements strategies to effectively determine if systems are meeting functional and correctness requirements and encourages teams to identify improvement opportunities.
Provides thought leadership on processes for formally verifying complex features to ensure system design correctness.
Oversees the implementation of data replication and synchronization techniques, ensuring data integrity and availability across the organization.
Operational Troubleshooting & Incident Management:
Provides strategic oversight for diagnosing, debugging, and resolving issues in active systems to support ongoing operation.
Directs strategies within teams to prevent interruptions, ensuring no maintenance windows are required for customers and users when resolving issues.
Drives alignment across teams for operational readiness protocols and standard operating procedures.
Provides expert guidance for complex incident response and root cause investigations.
Compliance & Security:
Provides strategic guidance in architecting robust security measures to protect data and applications in multi-tenant environments, ensuring encryption techniques and access controls are implemented.
Oversees execution of remediation plans to address identified security gaps, promoting significant improvements and continuous advancement of security measures.
Drives documentation efforts and ensures cloud infrastructure compliance with industry standards and regulations.
Automation & Change Management:
Provides strategic guidance across teams on developing and maintaining automation scripts and tools (e.g., Infrastructure as Code (IaC)) to manage cloud infrastructure.
Drives strategic alignment of change management plans for patching, updating, and rolling back applications, and ensures that system designs allow for automation of these processes.
Core Responsibilities
Planning & Execution:
Oversees and guides multiple teams on managing complex projects or initiatives, monitoring timelines, deliverables, and budgets (when applicable) to ensure strategic objectives are met.
Serves as a role model for appropriately delegating work, setting priorities, and ensuring alignment with business needs.
Coaches others on adjusting resources or project timelines in anticipation of business changes.
Collaboration & Partnership:
Role models leading cross-functional collaborative efforts to ensure alignment of expectations and strategic objectives.
Empowers teams to build and maintain partnerships with business leaders, stakeholders, and/or customers to address barriers and contribute to organizational success.
Drives transparency and inclusivity by actively seeking, listening to, and leveraging diverse perspectives.
Problem Solving:
Shares problem-solving strategies across teams, providing oversight on complex operational and/or technical issues as needed.
Coaches teams on analyzing highly complex data and/or information to identify solutions to ambiguous issues.
Provides direction on identifying root causes to prevent recurrence of issues.
Continuous Learning:
Pursues strategic learning opportunities to maintain expertise and apply best practices at the organizational level.
Creates opportunities for team members and leaders to build their expertise in new areas, coaching them to build innovative skills.
Identifies skill gap trends across the organization and upholds a culture that places significant emphasis on sharing knowledge and pursuing learning opportunities that advance the organization.
Evaluates the efficiency of learning strategies and recommends adjustments as needed.
Continuous Improvement:
Empowers teams to own the development and implementation of ideas that increase the efficiency and effectiveness of processes, protocols, and workflows across the department.
Coaches teams to gain buy-in for ideas and to seek feedback on approaches and methods for continued improvement.
Prioritizes and reviews the roadmap of improvement initiatives to ensure alignment with strategic direction and maximize return on investments.
Performance and Development:
Serves as a role model for driving performance across teams through tailored feedback and coaching in alignment with performance management processes, guidelines, and expectations.
Drives consistency in the application of talent development procedures and socializes performance expectations across the organization.
Ensures that individual development goals are aligned with organizational strategic initiatives.
Collaborates with HR to implement talent strategy through hiring and promotion processes.
Qualifications
Disclaimer:
Certain U.S. based or U.S. customer or client-facing roles may be required to comply with applicable requirements such as immunization/occupational health mandates and/or drug testing requirements.
Range and benefit information provided in this posting are specific to the stated locations only.
US: Hiring Range in USD from: $169,800 to $355,400 per annum. May be eligible for bonus, equity, and compensation deferral.
Oracle maintains broad salary ranges for its roles in order to account for variations in knowledge, skills, experience, market conditions, and locations, as well as to reflect Oracle's differing products, industries, and lines of business.
Candidates are typically placed into the range based on the preceding factors as well as internal peer equity.
Oracle US offers a comprehensive benefits package which includes the following:
1. Medical, dental, and vision insurance, including expert medical opinion
2. Short-term disability and long-term disability
3. Life insurance and AD&D
4. Supplemental life insurance (Employee/Spouse/Child)
5. Health care and dependent care Flexible Spending Accounts
6. Pre-tax commuter and parking benefits
7. 401(k) Savings and Investment Plan with company match
8. Paid time off: Flexible Vacation is provided to all eligible employees assigned to a salaried (non-overtime eligible) position. Accrued Vacation is provided to all other employees eligible for vacation benefits. For employees working at least 35 hours per week, the vacation accrual rate is 13 days annually for the first three years of employment and 18 days annually for subsequent years of employment. Vacation accrual is prorated for employees working between 20 and 34 hours per week. Employees working fewer than 20 hours per week are not eligible for vacation.
9. 11 paid holidays
10. Paid sick leave: 72 hours of paid sick leave upon date of hire. Refreshes each calendar year. Unused balance will carry over each year up to a maximum cap of 112 hours.
11. Paid parental leave
12. Adoption assistance
13. Employee Stock Purchase Plan
14. Financial planning and group legal
15. Voluntary benefits including auto, homeowner, and pet insurance
The role will generally accept applications for at least three calendar days from the posting date or as long as the job remains posted.
Career Level - M4
Required Experience:
Director
About Company
As a world leader in cloud solutions, Oracle uses tomorrow’s technology to tackle today’s challenges. We’ve partnered with industry leaders in almost every sector and continue to thrive after 40+ years of change by operating with integrity. We know that true innovation starts when eve ...