Sr. Data Architect

Job Location:

Vienna, VA - USA

Monthly Salary: Not Disclosed

Posted on: 17 hours ago

Vacancies: 1 Vacancy

Job Summary

SteerBridge Strategies is a modern technology company delivering innovative, mission-focused solutions to the U.S. Government and private sector. Leveraging deep expertise in federal acquisition, digital transformation, and emerging technologies, we deliver agile, commercial-grade capabilities that accelerate operational effectiveness and drive measurable mission success.

At the core of SteerBridge are our people, especially the veterans whose leadership, problem-solving mindset, and commitment to excellence elevate every project we support. We don't simply hire exceptional talent; we cultivate it, creating meaningful career pathways for veterans, military spouses, and professionals who share our passion for advancing technology and strengthening the missions we serve.

We are seeking a Senior Data Architect to lead the design and evolution of enterprise-level data ecosystems. You will be responsible for architecting scalable, secure, and high-performance data infrastructures that support mission-critical aviation sustainment. This is a player-coach role that requires high-level strategic planning alongside hands-on engineering execution.

Benefits

Health insurance
Dental insurance
Vision insurance
Life Insurance
401(k) Retirement Plan with matching
Paid Time Off
Paid Federal Holidays

Key Responsibilities

Architecture & Design: Design conceptual, logical, and physical data models for complex federal environments. Lead the transition from legacy on-premises systems to modern cloud-native (AWS/GCP) data platforms.

Pipeline Development: Architect and oversee the build of automated ETL/ELT pipelines using Python, SQL, and PySpark to ingest and transform unstructured and structured data.

Cloud Data Warehousing: Implement and optimize enterprise data warehouses using tools like AWS Redshift, Google BigQuery, AWS Glue, and Databricks.

Governance & Compliance: Establish data governance frameworks, metadata management, and data lineage in alignment with federal standards (HIPAA, FHIR, NIST).

Performance Optimization: Conduct index/partition design, query tuning, and sharding strategies to ensure high availability and scalability for real-time analytics.

AI/ML Support: Design data architectures that facilitate AI/ML initiatives, including model training pipelines and real-time inference in production environments.

Leadership: Mentor a team of data engineers, enforce software engineering best practices (CI/CD, unit testing, documentation), and serve as a technical bridge between stakeholders and delivery teams.

Required Qualifications

Must be a U.S. Citizen.
Master's degree or above in Systems Engineering, Computer Science, or a related field.
An active security clearance or the ability to obtain one is required.
Minimum 6 years of experience to include:
- Experience in data management using advanced analytics tools, platforms, and Python.
- Experience with data warehousing consulting/engineering or related technologies (Redshift, Databricks, BigQuery, OADW, Apache Hive, Apache Lucene).
- Experience in scripting, tooling, and automating large-scale computing environments.
- Extensive experience with major tools such as Python, Pandas, PySpark, NumPy, SciPy, SQL, and Git; minor experience with TensorFlow, PyTorch, and Scikit-learn.
- Compliance: Deep understanding of data security and federal compliance requirements.

PROFESSIONAL EXPERIENCE / QUALIFICATIONS

Data Architecture and Design
- Skills:
  - Data modeling (conceptual, logical, and physical)
  - Database schema design
  - Understanding of different database paradigms (relational, NoSQL, graph databases, etc.)
  - ETL (Extract, Transform, Load) processes and tools
  - Experience with modern data warehousing solutions (e.g., Redshift, Snowflake, BigQuery)
  - Understanding of dimensional modeling (star/snowflake schemas) and data vault techniques.
  - Experience designing for both OLTP and OLAP workloads.
  - Familiarity with metadata-driven design and schema evolution in data systems.
  - Experience defining data SLAs and lifecycle management policies.
  - Project Experience: Designing and implementing scalable data architectures that support business intelligence, analytics, and machine learning workflows.

Data Pipeline Development
- Skills:
  - Proficiency in tools like Apache Kafka, Airflow, Spark, Flink, or NiFi
  - Experience with cloud-based data services (AWS Glue, Google Cloud Dataflow, Azure Data Factory)
  - Real-time and batch data processing
  - Automation and monitoring of data pipelines
  - Strong understanding of incremental processing, idempotency, and backfill strategies.
  - Knowledge of workflow dependency management, retries, and alerting.
  - Experience writing modular, testable, and reusable Python-based ETL code.
  - Project Experience: Leading the development of highly available, fault-tolerant, and scalable data pipelines, integrating multiple data sources and ensuring data quality.

Cloud Platforms and Services
- Skills:
  - Expertise in cloud environments (AWS, GCP, Azure)
  - Understanding of cloud-based storage (S3, Blob Storage), databases (RDS, DynamoDB), and compute resources
  - Implementing cloud-native data solutions (Data Lake, Data Warehouse, Data Mesh)
  - Experience with cost monitoring and optimization for data workloads.
  - Familiarity with hybrid and multi-cloud architectures.
  - Understanding of serverless data patterns (e.g., Lambda, S3, Athena, Cloud Functions, BigQuery).
  - Project Experience: Migrating legacy data infrastructure to the cloud or developing new data platforms using cloud services, with a focus on cost efficiency and scalability.

Big Data Technologies
- Skills:
  - Experience with big data ecosystems (Hadoop, HDFS, Hive, Spark)
  - Distributed computing, parallel processing, and handling petabyte-scale data
  - Tools for querying large datasets (Presto, Athena)
  - Understanding of lakehouse frameworks (Delta Lake, Iceberg, Hudi).
  - Familiarity with data compaction, schema evolution, and ACID guarantees in distributed storage
  - Project Experience: Building and managing big data platforms to enable large-scale analytics, often incorporating structured and unstructured data.

Database Administration and Optimization
- Skills:
  - Expertise in database technologies (SQL, NoSQL, GraphDBs)
  - Query optimization, indexing, and partitioning strategies
  - Backup, replication, and disaster recovery planning
  - Understanding of query execution plans, cost-based optimization, and caching strategies.
  - Experience performing index and partition design based on query patterns.
  - Familiarity with data versioning and temporal tables.
  - Experience profiling and optimizing application code interacting with databases.
  - Project Experience: Performance tuning for complex queries, and implementing database replication and sharding strategies to support high availability and scalability.

Data Governance and Security
- Skills:
  - Data privacy, encryption, and compliance with regulations (GDPR, CCPA)
  - Implementing data governance frameworks (data lineage, cataloging, metadata management)
  - Role-based access control and user management for sensitive data
  - Experience with automated policy enforcement and data lineage visualization tools (e.g., DataHub, Collibra, Alation).
  - Knowledge of data quality frameworks integrated into CI/CD pipelines.
  - Familiarity with data contract testing between producer and consumer teams.
  - Project Experience: Developing and implementing data governance policies and security controls across the organization's data assets, ensuring compliance with industry standards.

Programming and Scripting Languages
- Skills:
  - Proficiency in Python and SQL
  - Experience with version control (Git) and CI/CD for data engineering (GitLab, Jenkins, CircleCI)
  - API design and integration (Postman)
  - Strong understanding of object-oriented programming (OOP) principles and design patterns in Python.
  - Familiarity with software engineering best practices (modularity, testing, documentation, linting).
  - Understanding of algorithmic complexity (Big O notation) and ability to optimize code for scale.
  - Experience with parallel and distributed computation frameworks (Spark, Dask, Ray).
  - Ability to profile and debug performance bottlenecks in data workflows.
  - Use of type hinting, logging frameworks, and automated testing frameworks (pytest, unittest)

AI/ML Pipeline Support and Analytics
- Skills:
  - Experience in supporting data scientists with feature engineering, data wrangling, and model deployment
  - Knowledge of ML orchestration tools (MLflow, Kubeflow)
  - Hands-on experience with analytics tools (e.g., Tableau, Power BI)
  - Familiarity with feature store design and model feature lineage tracking.
  - Understanding of data versioning and reproducibility for ML workflows.
  - Experience supporting real-time model inference pipelines.
  - Project Experience: Designing architectures that support AI/ML initiatives, enabling scalable data pipelines for training models and supporting experimentation in the production environment.

Leadership and Mentorship
- Skills:
  - Leading data engineering teams; cross-functional collaboration with data scientists, analysts, and business units
  - Project management (Agile, Scrum, Kanban) and stakeholder communication
  - Experience with mentorship and growing junior data engineers
  - Experience establishing data architecture standards and best practices.
  - Ability to review and approve technical designs for consistency and scalability.
  - Proven success in mentoring engineers in code quality, modeling, and system design.
  - Project Experience: Leading the technical direction for large-scale data initiatives, such as enterprise data lake implementations or the creation of a unified data platform.

$155,000 - $180,000 a year

SteerBridge Strategies is proud to be an Equal Opportunity Employer. We are committed to creating a diverse and inclusive workplace where all qualified applicants and employees are treated with respect and dignity, regardless of race, color, gender, age, religion, national origin, ancestry, disability, veteran status, genetic information, sexual orientation, or any other characteristic protected by law.

We also provide reasonable accommodations for individuals with disabilities in accordance with applicable laws. If you require assistance during the application process, we encourage you to reach out so we can support your needs.

If you would like information about how your application is processed, please contact us.

Required Experience:

Senior IC
