Location: Reading, PA (Hybrid, 2-3 days a week in the office)
Position Type: CTH
We are looking for a highly technical Lead Platform Engineer to architect the observability, cost governance, and security framework for our enterprise AI agent ecosystem. You will be responsible for ensuring that our agentic workflows, built on AWS Bedrock AgentCore and MCP servers, are scalable, observable, and cost-efficient.
The ideal candidate bridges the gap between traditional DevOps and the emerging world of LLMOps, with a deep focus on distributed tracing for non-deterministic AI workloads.
Requirements
Experience: 8 years in Platform Engineering, DevOps, or Site Reliability Engineering (SRE).
Cloud Expertise: Deep proficiency in AWS (IAM, CloudWatch, Lambda).
Observability Tools: Proven experience with Dynatrace, Jaeger, or Honeycomb, and with distributed tracing standards.
AI/LLM Interest: Familiarity with the LLM lifecycle, including prompt execution, token usage, and frameworks like LangChain or AgentCore.
Automation: Advanced experience with Terraform and CI/CD pipeline design.
Collaboration: Experience working in an Agile environment with integrated tools like Microsoft Teams and Confluence.
Job Responsibilities
Observability
Assess CloudWatch, X-Ray, Bedrock logging, and AgentCore traces against agentic workflow requirements; produce a gap analysis
Set up observability in Dynatrace
Design a post-deployment validation pipeline for agents & MCP servers (deployment health, tool registration checks)
Integrate alert notifications to Microsoft Teams channels and email; route by resource ownership tags
Author runbooks linked to every alert; publish in Confluence for developer self-service resolution
Evaluate AWS-native vs. third-party monitoring stack; deliver recommendation aligned to observability architecture
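To give candidates a feel for the tag-based alert routing mentioned above, here is a minimal Python sketch. It is an illustration only, not our implementation: the `owner` tag key, the channel names, and the `Alert` type are all hypothetical, and actual delivery to Microsoft Teams or email is out of scope.

```python
from dataclasses import dataclass, field

# Hypothetical routing table: ownership tag value -> notification channel.
CHANNEL_BY_OWNER = {
    "payments-agents": "teams:payments-agents-alerts",
    "search-agents": "teams:search-agents-alerts",
}
# Hypothetical catch-all for resources with a missing or unknown owner tag.
FALLBACK_CHANNEL = "teams:platform-oncall"


@dataclass
class Alert:
    resource_arn: str
    message: str
    tags: dict = field(default_factory=dict)


def route_alert(alert: Alert, owner_tag_key: str = "owner") -> str:
    """Pick the destination channel from the resource's ownership tag."""
    owner = alert.tags.get(owner_tag_key)
    return CHANNEL_BY_OWNER.get(owner, FALLBACK_CHANNEL)
```

Sending untagged resources to a fallback channel, rather than dropping the alert, is one possible design choice; it keeps gaps in tagging visible to the platform team.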
Security & Access Control
Assess current IAM tagging approach for multi-team isolation; identify scalability gaps and risks
Evaluate Cedar policy engine (AgentCore) for fine-grained tool access control; document enterprise-scale gaps
Design scalable ABAC-based identity model for multi-team isolation without IAM policy sprawl; deliver Terraform modules
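For context on the ABAC approach referenced above, the core idea is that access is granted when a tag on the caller matches the same tag on the resource, so one rule covers every team. The Python sketch below illustrates only that matching logic; the `team` tag key and the function name are hypothetical. In AWS IAM the same idea is typically expressed as a policy condition comparing `aws:ResourceTag/<key>` with `${aws:PrincipalTag/<key>}`.

```python
def is_access_allowed(principal_tags: dict, resource_tags: dict, key: str = "team") -> bool:
    """Allow only when both sides carry the tag and the values match.

    The "team" key is an assumed example; a missing tag on either side denies.
    """
    principal_value = principal_tags.get(key)
    if principal_value is None:
        return False
    return principal_value == resource_tags.get(key)
```

Denying when the tag is absent is deliberate in this sketch: an untagged principal or resource should not match by accident.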
Required Skills:
AWS
Role: Senior AWS Agentcore Platform Engineer