AI Cloud Support Engineer
Job Summary
Role Overview
This is an offshore support role providing day-to-day operational support and L2/L3 troubleshooting for the Microsoft Azure cloud platform and AI services that underpin an enterprise Agentic AI solution. The role keeps the live environment healthy across non-production and production by handling incidents, service requests, deployments, and routine operations, and works closely with platform and MLOps engineering within a follow-the-sun delivery model.
Key Responsibilities
Provide L2/L3 operational support for Azure platform services, including Service Bus, Managed Redis, Cosmos DB, API Management, Azure Functions, Key Vault, and Storage.
Monitor platform and AI service health using Application Insights, Azure Monitor, and Dynatrace, and respond to alerts using KQL-based log analysis.
Triage, diagnose, and resolve incidents, perform root cause analysis, and escalate to engineering when required.
Support the AI agent runtime (Azure Functions / orchestration) and the consumption of a centrally managed AI foundry / Cognitive Services offering.
Support MLOps pipelines and AI model/prompt deployment operations, and assist with CI/CD pipeline runs using GitHub Actions.
Assist with Terraform-based configuration and platform changes under change management.
Support the operational health of integration endpoints such as Kafka consumers and API Management.
Maintain operational runbooks and knowledge-base articles, and track incident and request metrics against SLAs.
Provide follow-the-sun / on-call coverage through a rotation schedule.
Qualifications & Experience
4 years of experience in cloud operations or application support on Microsoft Azure.
Hands-on experience with Azure PaaS services and platform diagnostics.
Familiarity with AI/LLM services such as Azure OpenAI or Cognitive Services.
Experience with monitoring tools such as Application Insights, Azure Monitor, and Dynatrace, and with KQL.
Exposure to CI/CD pipelines and Infrastructure as Code (Terraform).
Proficiency in scripting with PowerShell, Python, or Bash.
Experience with ITIL-based incident, problem, and change management.
Strong communication and stakeholder engagement skills.
Bachelor's degree in Computer Science, Information Technology, or a related discipline.
Preferred Skills
Microsoft Azure certification, such as AZ-104.
Exposure to MLOps pipelines or AI/LLM platform operations.
Familiarity with Terraform and GitHub Actions.
Experience supporting agentic AI or LLM platforms.
Experience working within follow-the-sun support models.
Required Experience:
IC
About Company
At Virtusa, we are builders, makers, and doers. Digital engineering is in our DNA. It’s at the heart of everything we do.