DevOps Engineer

Test Triangle

Job Location:

London - UK

Salary: Not Disclosed

Experience Required: 5 years

Posted on: 21 days ago

Vacancies: 1 Vacancy

Job Summary

DevOps Engineer, EMEIA Infrastructure

ROLE OVERVIEW

We are looking for a skilled and pragmatic DevOps Engineer to own and evolve our infrastructure across the EMEIA region. This is a dual-horizon role: you will keep our existing VM-based systems healthy while leading a greenfield effort to design and build the managed environment that those solutions will migrate onto.

A significant proportion of what we build is produced rapidly using AI-assisted structured development. That means our solutions can move from idea to deployment faster than ever, and our infrastructure needs to keep pace. We need someone who thrives in a fast-moving, ambiguous environment, can absorb change quickly, and treats adaptability as a core part of the job rather than an occasional demand.

The new managed environment is most likely to be based on Kube, Apple's internal Kubernetes (EKS) deployment, though the final architecture will be a team decision, and a more directly controlled alternative remains an option for workloads requiring greater control. You will help inform that decision and then own the build-out, regardless of which direction is chosen.

You will work closely with data engineers, developers, and analysts, acting as the infrastructure backbone for a team that moves quickly and expects you to move with it. The role also involves working directly with third-party vendors who support some of the tools being deployed, and collaborating with teams outside EMEIA, including WorldWide, to align on standards, share solutions, and resolve cross-regional dependencies.

KEY RESPONSIBILITIES

Platform Migration & Environment Design

Lead the design and build-out of a new managed container environment to replace existing VM-based infrastructure; the most likely candidate is Kube (Apple's internal Kubernetes/EKS cluster), but the final decision will be made collaboratively as a team
Contribute meaningfully to the environment selection decision: weigh trade-offs between managed solutions (Kube) and more directly controlled alternatives, considering maintenance overhead, operational control, and team capability
Own the migration of existing VM-based workloads onto the new platform, managing sequencing, risk, and continuity of service throughout
Establish and maintain the standard workflow for deploying solutions: build locally, containerise, publish to Kube, and configure connectivity to Apple internal system dependencies

Apple Internal Networking & Connectivity

Configure and maintain networking between Kube and Apple's internal systems, including Shield, Snowflake, Appleconnect, Floodgate, and any other platform dependencies the team relies on
Own namespace and compute provisioning on the shared Kube cluster, ensuring workloads are appropriately isolated and correctly configured
Manage credentials, service accounts, and access controls across the full connectivity chain, from container to downstream service
Act as the go-to expert on how things connect within Apple's internal network topology

Infrastructure Management

Own and manage cloud infrastructure across EMEIA using internal cloud tooling (and connected systems, including Shield)
Manage certificates, firewalls, resource pools, networking, and access controls
Ensure infrastructure is appropriately sized, resilient, and cost-efficient
Maintain accurate documentation of infrastructure topology and configuration

VM Provisioning & Automation (Existing Estate)

Maintain and operate existing virtual machines, primarily on RHEL, while migration to the new environment is in progress
Build and maintain standardised, repeatable provisioning processes (e.g. via Ansible, Terraform, or equivalent IaC tooling)
Manage package deployment, software repositories, databases, and web servers
Own the patching and update lifecycle for managed systems

Monitoring & Reliability

Implement and maintain monitoring, alerting, and observability across both the existing VM estate and the new container environment
Proactively identify risks, bottlenecks, and failure patterns before they impact users
Define and track appropriate SLIs/SLOs for critical services
Conduct post-incident reviews and drive lasting improvements

Supporting AI-Augmented Development

A large proportion of the solutions you will support are built rapidly using structured AI-assisted development. You must be comfortable working with codebases and configurations that evolve quickly, may not have deep documentation histories, and may have been substantially generated with AI tooling
Provide the infrastructure scaffold that allows AI-assisted solutions to move from local development to production reliably and safely
Be a pragmatic partner to developers: unblock deployment quickly, catch infrastructure-level risks early, and help establish patterns that make rapid iteration safe at scale
Actively use AI tools (e.g. Claude, Copilot, or similar) to accelerate your own work: writing scripts, diagnosing issues, generating runbooks, and reviewing configurations

Diagnosis & Incident Response

Take ownership of vague or ambiguous production issues (e.g. "it's running slow", "the server keeps falling over") and drive them through to resolution
Deliver short-term fixes rapidly to restore service while tracking and delivering long-term root cause resolutions
Maintain a pragmatic balance between speed-of-recovery and quality-of-fix

SKILLS & EXPERIENCE

Essential

Proven experience in a DevOps, infrastructure, or platform engineering role
Hands-on experience with Kubernetes: deploying, configuring, and operating workloads in a shared or managed cluster environment
Experience containerising applications: writing Dockerfiles, managing images, publishing to a registry, and debugging container-level issues
Strong networking fundamentals: DNS, TLS/SSL certificates, firewall rules, load balancing, VPNs, and service-to-service connectivity
Comfort operating in environments where the architecture is still being defined: able to contribute to the decision, then execute once direction is set
Hands-on experience with RHEL (or equivalent enterprise Linux): provisioning, hardening, package management (yum/dnf), and systemd services
Experience managing cloud infrastructure, ideally in an enterprise private/hybrid cloud environment
Experience with infrastructure-as-code or configuration management tooling (e.g. Terraform, Ansible, Puppet, or similar)
Solid scripting ability in Bash and at least one higher-level language (Python preferred)
Experience with monitoring and observability tooling (e.g. Prometheus, Grafana, Datadog, or similar)
Strong incident diagnosis skills: able to work from vague symptoms to root cause using logs, metrics, and reasoning
Comfortable working with AI-generated or AI-assisted codebases: reading, extending, and debugging solutions without a full traditional authorship history
Clear written and verbal communication: able to translate infrastructure complexity for non-technical stakeholders

Desirable

Experience with AWS, particularly EKS
Familiarity with Apple's internal platform tooling: Kube, Shield, Appleconnect, Floodgate, or similar
Experience integrating with Snowflake, including managing drivers, credentials, and network access
Experience with CI/CD pipelines (GitLab CI, Jenkins, GitHub Actions, or similar)
Exposure to security tooling, vulnerability scanning, or compliance frameworks (e.g. CIS Benchmarks)
Familiarity with secrets management tooling (Vault, CyberArk, or similar)
Experience working in a regulated or enterprise environment with change management processes

WAYS OF WORKING

You are comfortable with genuine ambiguity, including at the architectural level, and can make progress and contribute to decisions without waiting for everything to be resolved
You default to automation: if you do something twice, you script it; if you do it three times, you build a process
You adapt quickly: the tools, environments, and solutions you support can change fast, and you treat that as normal rather than exceptional
You are pragmatic under pressure: you know when to stop the bleeding first and fix it properly later
You are self-directed and comfortable owning problems end-to-end with minimal hand-holding
You are a willing partner to developers who move fast: you keep up, add guardrails where they matter, and don't become a bottleneck

WHAT SUCCESS LOOKS LIKE

A new managed container environment is designed, built, and running, with existing VM-based workloads migrated onto it in a controlled, sequenced way
The standard deployment path (build, containerise, publish, connect) is well-established, documented, and easy for the team to use
Connectivity from the new environment to Apple internal systems (Snowflake, Appleconnect, Shield, Floodgate, etc.) is reliable, well-understood, and correctly secured
Teams are unblocked quickly when they need new integrations, access, or capabilities, even when the solutions they are deploying have been built at speed
Production issues are resolved rapidly, with lasting fixes following close behind
Monitoring catches issues before users do
The infrastructure estate, both old and new, is well-documented, well-understood, and in a known-good state

Required Skills:

infrastructure
