Staff Software Engineer, Forecast Engine
Santa Clara County, CA - USA
Job Summary
Employees can work remotely
Job Description
Team
Join the Global Cloud Services organization's FinOps Tools team, which is building ServiceNow's next-generation analytics and financial governance platform. Our team owns the full modern data stack: Trino for distributed queries, dbt for transformations, Iceberg for lakehouse architecture, Lightdash for business intelligence, and Argo Workflows for orchestration. You will own the Forecast Engine, the system that turns ServiceNow's cloud capacity and cost actuals into forward-looking forecasts, then automatically tracks those forecasts against plan and budget and alerts the right people when reality diverges. The Forecast Engine also feeds directly into our Future Capacity Reservation (FCR) automation: its forecast of fleet growth and workload migration timing is the signal that drives how much hyperscaler capacity to reserve, in which providers and regions, and when, against the lead-time windows FinOps and Cloud Operations plan around.
Role
The Forecast Engine is the simulation and automation core behind FinOps capacity and cost planning. It reads forecasting actuals from the lakehouse and runs a deterministic, multi-period simulation of fleet growth, workload migration, placement, and sizing. It validates each result against hard invariants and publishes forecasts that data scientists, analysts, and FinOps engineers consume in Lightdash. Today it is a fast, single-binary Rust core with a streaming Trino read and an Iceberg publish path. The next chapter is to turn that engine into an automated, always-on forecasting service.
As our Staff Software Engineer for the Forecast Engine, you will design and build the automation layer around the engine: scheduled forecast runs; variance and budget tracking against plan; anomaly and threshold alerting; first-class integration with planning systems, Splunk, and the broader observability stack; and the handoff that turns forecasts into Future Capacity Reservation (FCR) recommendations. You will make the forecast a living signal: recomputed on a cadence, reconciled against actuals, and translated into the capacity reservations that keep hyperscaler supply ahead of demand.
This role demands speed and high velocity. You will take a proven simulation core and rapidly make it a dependable, observable, self-monitoring product that the organization plans against, shipping working increments fast and iterating in tight loops. The automation layer around the engine is greenfield: you will build it from the ground up. We operate like a small startup, and this is the operating mode of the role and the department: we move quickly, deliver early, keep process light, and keep momentum.
What You'll Do: Core Responsibilities
- Design and develop scalable, maintainable, and reusable software components with a strong emphasis on performance, determinism, and reliability.
- Collaborate with product managers and FinOps partners to translate planning and budgeting requirements into well-architected solutions, owning features from design through delivery.
- Build intuitive and extensible interfaces for forecast consumption (Lightdash models, alert payloads, and APIs), ensuring flexibility for finance and capacity-planning use cases.
- Contribute to the design and implementation of new Forecast Engine capabilities while enhancing existing simulation, validation, and publish paths.
- Integrate automated testing into development workflows to ensure consistent quality across releases, including determinism (byte-identical output) and forecast-accuracy regression checks.
- Participate in design and code reviews, ensuring best practices in performance, maintainability, and testability.
- Develop comprehensive test strategies covering functional, regression, integration, and accuracy aspects (period-over-period identity, backtest grading against real actuals).
- Foster a culture of continuous learning and improvement by sharing best practices in engineering and quality.
- Promote a culture of engineering craftsmanship, knowledge-sharing, and thoughtful quality practices across the team.
Technical Leadership & Architecture
- Own the architecture of the Forecast Engine and the automation layer around it: scheduled runs, variance/budget tracking, and alerting.
- Lead technical decision-making on forecast cadence, reconciliation against actuals, alert routing, and the contract between the simulation core and downstream consumers.
- Establish best practices for forecast automation: idempotent scheduled runs, deterministic reproducibility, fail-loud data contracts, and no silent fallbacks.
- Define how forecast signals (variance, budget breach, capacity headroom, migration drift) are computed, thresholded, and surfaced.
- Drive innovation in forecasting and planning automation, including the responsible use of AI/ML tooling to accelerate development and analysis.
Hands-On Development
- Build the automation that runs the Forecast Engine on a schedule via Argo Workflows, with retries, alerting on failure, and run-to-run reproducibility.
- Develop variance and budget tracking: reconcile each forecast against plan and against the latest actuals, compute deltas at the grains that matter (provider, region, pod, workload), and persist a queryable variance history.
- Implement alerting that fires on budget breach, forecast drift, capacity thresholds, and pipeline health, routed to Splunk and the team's notification channels.
- Integrate with planning systems so plan/budget targets flow into the engine and forecast outputs flow back out to the planning surface.
- Drive the Future Capacity Reservation (FCR) handoff: translate the forecast of fleet growth and migration timing into reservation recommendations (how much capacity, which providers/regions/pods, and by when), aligned to hyperscaler procurement lead-time windows and reconciled with Cloud Operations so the same capacity is never reserved twice.
- Build and extend the Rust simulation core (period loop, growth, migration, routing, packing, sizing, validation) and its streaming Trino read and Iceberg publish paths.
- Create and maintain the Lightdash forecast and variance marts (standard dbt models on the published tables) that finance and capacity partners consume.
Platform Foundation
- Design the forecast data contract (the upstream view the engine reads) so data-quality problems halt loudly and are fixed at the source, never papered over downstream.
- Implement scheduled, observable forecast runs with full run lineage: inputs, seed, config, output location, and metrics for every run.
- Build observability and monitoring for the Forecast Engine: run success rates, forecast latency, memory ceilings, accuracy drift, and alert-delivery health, emitted to Splunk and the observability stack.
- Establish an automation foundation that scales from a handful of scheduled scenarios to a broad, multi-scenario forecasting program.
Forecast Automation & Alerting
- Create scheduled, parameterized forecast scenarios with opinionated structure: pinned config, deterministic seeds, validated inputs, and published outputs.
- Build tooling for one-command scenario runs and for promoting a scenario from ad-hoc to scheduled with minimal manual intervention.
- Establish guardrails: input data contracts, resource/memory ceilings, and loud halts that surface real problems instead of producing wrong-but-quiet numbers.
- Collaborate closely with FinOps analysts and capacity planners to rapidly iterate on variance definitions, alert thresholds, and the signals that matter, without over-engineering.
- Prioritize forecast reliability, accuracy tracking, and clear alerting over feature breadth.
AI-Augmented Development
- Use modern AI development tools (e.g., Claude Code, Cursor, GitHub Copilot) to accelerate development, testing, and analysis, and help the team adopt effective, well-validated AI-assisted practices.
Collaboration & Integration
- Work autonomously with guidance from Engineering and FinOps leadership.
- Collaborate with DevOps and platform teams on scheduling infrastructure, CI/CD pipelines, and Splunk/observability integration.
- Partner with FinOps Tools team members working on Trino, dbt, Lightdash, and Iceberg to ensure seamless integrations.
- Partner with finance and capacity-planning stakeholders to ensure forecasts, variance, and alerts map to how they actually plan and budget.
Qualifications:
Required Experience
- Experience in leveraging or critically thinking about how to integrate AI into work processes, decision-making, or problem-solving. This may include using AI-powered tools, automating workflows, analyzing AI-driven insights, or exploring AI's potential impact on the function or industry.
- 8 years of experience in software engineering, with a track record of delivering high-quality products and deep expertise in backend systems and cloud-native, data-intensive architecture, with a Bachelor's degree; or 6 years and a Master's degree; or a PhD with 3 years of experience, in Computer Science, Engineering, or a related technical field; or equivalent experience.
- Strong skills in a systems or backend language (Rust, Go, Java, C, or similar) and in Python for data tooling, automation, and analysis.
- Proven track record building automated, scheduled data or forecasting pipelines that run reliably in production.
- Demonstrated ability to deliver at high velocity: shipping production-quality software fast, in tight iteration loops, without sacrificing reliability.
- Proven track record of greenfield development and building from scratch in environments with evolving requirements. We operate like a small startup, and this role thrives on that: short paths from idea to shipped, minimal process, and high ownership.
- Hands-on experience building variance/anomaly detection, budget or SLA tracking, or alerting systems at scale.
- Experience integrating with observability and logging platforms (Splunk, Datadog, Prometheus/Grafana, or similar).
- Experience with workflow orchestration systems (Argo, Airflow, or similar) and with the modern data stack.
- Strong knowledge of data structures, algorithms, object-oriented and data-oriented design, design patterns, and performance optimization.
- Familiarity with automated testing frameworks and integrating tests into CI/CD pipelines.
- Understanding of software quality principles, including reliability, determinism, observability, and production readiness.
- Ability to troubleshoot complex systems and optimize performance and memory across the stack.
- Experience validating data correctness: reconciling pipeline outputs against ground-truth actuals and catching silent regressions.
- Comfort with development tools such as IDEs, debuggers, profilers, source control, and Unix-based systems.
- Full professional proficiency in English.
Technical Expertise
- Forecasting & simulation: time-series or simulation-based forecasting, scenario modeling, and reconciliation of forecasts against actuals.
- Variance & alerting: budget vs. actual tracking, anomaly/threshold detection, alert routing, and noise control (deduplication, suppression, severity).
- Observability: Splunk (search, dashboards, alerts) and metrics/logging integration for pipeline and forecast health.
- Orchestration: Argo Workflows or similar: scheduled runs, retries, idempotency, failure alerting.
- Modern data stack: Trino, dbt, Iceberg, Lightdash, or similar lakehouse and BI technologies.
- Systems engineering: streaming/bounded-memory data processing, deterministic and reproducible computation, and config-driven design (no hardcoded business constants).
- Data contracts & quality: fail-loud ingestion, upstream contract views, and correctness invariants enforced in code.
- API & integration design: RESTful services, authentication (OAuth/SAML), and webhook/notification integrations.
For positions in this location, we offer a base pay of $166,500 - $291,400, plus equity (when applicable), variable/incentive compensation, and benefits. Sales positions generally offer a competitive On Target Earnings (OTE) incentive compensation structure. Please note that the base pay shown is a guideline, and individual total compensation will vary based on factors such as qualifications, skill level, competencies, and work location. We also offer health plans, including flexible spending accounts, a 401(k) Plan with company match, ESPP, matching donations, a flexible time away plan, and family leave programs. Compensation is based on the geographic location in which the role is located and is subject to change based on work location.
Additional Information:
Work Personas
We approach our distributed world of work with flexibility and trust. Work personas (flexible, remote, or required in office) are categories that are assigned to ServiceNow employees depending on the nature of their work and their assigned work location. To determine eligibility for a work persona, ServiceNow may confirm the distance between your primary residence and the closest ServiceNow office using a third-party service.
Equal Opportunity Employer
ServiceNow is an equal opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, sex, sexual orientation, national origin, age, disability, gender identity, veteran status, or any other category protected by law. In addition, all qualified applicants with arrest or conviction records will be considered for employment in accordance with legal requirements.
Accommodations
We strive to create an accessible and inclusive experience for all candidates. If you require a reasonable accommodation to complete any part of the application process, or are unable to use this online application and need an alternative method to apply, please contact for assistance.
Export Control Regulations
For positions requiring access to controlled technology subject to export control regulations, including the U.S. Export Administration Regulations (EAR), ServiceNow may be required to obtain export control approval from government authorities for certain individuals. All employment is contingent upon ServiceNow obtaining any export license or other approval that may be required by relevant export control authorities.
From Fortune. ©2026 Fortune Media IP Limited. All rights reserved. Used under license.
Remote Work:
Yes
Employment Type:
Full-time
About Company
Learn here. Grow here. Make a difference here. At ServiceNow, our cloud-based platform and solutions deliver digital workflows that create great experiences and unlock productivity for employees and enterprises. We're growing fast, innovating even faster, and making an impact on our c ...