Sr. Data Architect
Vienna, VA - USA
Job Summary
We are seeking a Senior Data Architect to lead the design and evolution of enterprise-level data ecosystems. You will be responsible for architecting scalable, secure, and high-performance data infrastructures that support mission-critical aviation sustainment. This is a player-coach role that requires high-level strategic planning alongside hands-on engineering execution.
Benefits
- Health insurance
- Dental insurance
- Vision insurance
- Life insurance
- 401(k) Retirement Plan with matching
- Paid Time Off
- Paid Federal Holidays
Key Responsibilities
Architecture & Design: Design conceptual, logical, and physical data models for complex federal environments. Lead the transition from legacy on-premises systems to modern cloud-native (AWS/GCP) data platforms.
Pipeline Development: Architect and oversee the build of automated ETL/ELT pipelines using Python, SQL, and PySpark to ingest and transform structured and unstructured data.
Cloud Data Warehousing: Implement and optimize enterprise data warehouses using tools like AWS Redshift, Google BigQuery, AWS Glue, and Databricks.
Governance & Compliance: Establish data governance frameworks, metadata management, and data lineage in alignment with federal standards (HIPAA, FHIR, NIST).
Performance Optimization: Apply index/partition design, query tuning, and sharding strategies to ensure high availability and scalability for real-time analytics.
AI/ML Support: Design data architectures that facilitate AI/ML initiatives, including model training pipelines and real-time inference in production environments.
Leadership: Mentor a team of data engineers, enforce software engineering best practices (CI/CD, unit testing, documentation), and serve as a technical bridge between stakeholders and delivery teams.
Required Qualifications
- Must be a U.S. Citizen.
- Master's degree or above in Systems Engineering, Computer Science, or a related field.
- An active security clearance or the ability to obtain one is required.
- Minimum of 6 years of experience, to include:
- Experience in data management using advanced analytics tools, platforms, and Python.
- Experience with Data Warehousing consulting/engineering or related technologies (Redshift, Databricks, BigQuery, OADW, Apache Hive, Apache Lucene).
- Experience in scripting, tooling, and automating large-scale computing environments.
- Extensive experience with major tools such as Python, Pandas, PySpark, NumPy, SciPy, SQL, and Git; minor experience with TensorFlow, PyTorch, and Scikit-learn.
- Compliance: Deep understanding of data security and federal compliance requirements.
PROFESSIONAL EXPERIENCE / QUALIFICATIONS
- Data Architecture and Design
- Skills:
- Data modeling (conceptual, logical, and physical)
- Database schema design
- Understanding of different database paradigms (relational, NoSQL, graph databases, etc.)
- ETL (Extract, Transform, Load) processes and tools
- Experience with modern data warehousing solutions (e.g., Redshift, Snowflake, BigQuery)
- Understanding of dimensional modeling (star/snowflake schemas) and data vault techniques.
- Experience designing for both OLTP and OLAP workloads.
- Familiarity with metadata-driven design and schema evolution in data systems.
- Experience defining data SLAs and lifecycle management policies.
- Project Experience: Designing and implementing scalable data architectures that support business intelligence, analytics, and machine learning workflows.
- Data Pipeline Development
- Skills:
- Proficiency in tools like Apache Kafka, Airflow, Spark, Flink, or NiFi
- Experience with cloud-based data services (AWS Glue, Google Cloud Dataflow, Azure Data Factory)
- Real-time and batch data processing
- Automation and monitoring of data pipelines
- Strong understanding of incremental processing, idempotency, and backfill strategies.
- Knowledge of workflow dependency management, retries, and alerting.
- Experience writing modular, testable, and reusable Python-based ETL code.
- Project Experience: Leading the development of highly available, fault-tolerant, and scalable data pipelines that integrate multiple data sources and ensure data quality.
- Cloud Platforms and Services
- Skills:
- Expertise in cloud environments (AWS, GCP, Azure)
- Understanding of cloud-based storage (S3, Blob Storage), databases (RDS, DynamoDB), and compute resources
- Implementing cloud-native data solutions (Data Lake, Data Warehouse, Data Mesh)
- Experience with cost monitoring and optimization for data workloads.
- Familiarity with hybrid and multi-cloud architectures.
- Understanding of serverless data patterns (e.g., Lambda, S3, Athena, Cloud Functions, BigQuery).
- Project Experience: Migrating legacy data infrastructure to the cloud or developing new data platforms using cloud services, with a focus on cost efficiency and scalability.
- Big Data Technologies
- Skills:
- Experience with big data ecosystems (Hadoop, HDFS, Hive, Spark)
- Distributed computing, parallel processing, and handling petabyte-scale data
- Tools for querying large datasets (Presto, Athena)
- Understanding of lakehouse frameworks (Delta Lake, Iceberg, Hudi).
- Familiarity with data compaction, schema evolution, and ACID guarantees in distributed storage
- Project Experience: Building and managing big data platforms to enable large-scale analytics, often incorporating structured and unstructured data.
- Database Administration and Optimization
- Skills:
- Expertise in database technologies (SQL, NoSQL, GraphDBs)
- Query optimization, indexing, and partitioning strategies
- Backup, replication, and disaster recovery planning
- Understanding of query execution plans, cost-based optimization, and caching strategies.
- Experience performing index and partition design based on query patterns.
- Familiarity with data versioning and temporal tables.
- Experience profiling and optimizing application code interacting with databases.
- Project Experience: Performance tuning for complex queries, and implementing database replication and sharding strategies to support high availability and scalability.
- Data Governance and Security
- Skills:
- Data privacy, encryption, and compliance with regulations (GDPR, CCPA)
- Implementing data governance frameworks (data lineage, cataloging, metadata management)
- Role-based access control and user management for sensitive data
- Experience with automated policy enforcement and data lineage visualization tools (e.g., DataHub, Collibra, Alation).
- Knowledge of data quality frameworks integrated into CI/CD pipelines.
- Familiarity with data contract testing between producer and consumer teams.
- Project Experience: Developing and implementing data governance policies and security controls across the organization's data assets, ensuring compliance with industry standards.
- Programming and Scripting Languages
- Skills:
- Proficiency in Python and SQL
- Experience with version control (Git) and CI/CD for data engineering (GitLab, Jenkins, CircleCI)
- API design and integration (Postman)
- Strong understanding of object-oriented programming (OOP) principles and design patterns in Python.
- Familiarity with software engineering best practices (modularity, testing, documentation, linting).
- Understanding of algorithmic complexity (Big O notation) and ability to optimize code for scale.
- Experience with parallel and distributed computation frameworks (Spark, Dask, Ray).
- Ability to profile and debug performance bottlenecks in data workflows.
- Use of type hinting, logging frameworks, and automated testing frameworks (pytest, unittest)
- AI/ML Pipeline Support and Analytics
- Skills:
- Experience in supporting data scientists with feature engineering, data wrangling, and model deployment
- Knowledge of ML orchestration tools (MLflow, Kubeflow)
- Hands-on experience with analytics tools (e.g., Tableau, Power BI)
- Familiarity with feature store design and model feature lineage tracking.
- Understanding of data versioning and reproducibility for ML workflows.
- Experience supporting real-time model inference pipelines.
- Project Experience: Designing architectures that support AI/ML initiatives, enabling scalable data pipelines for training models and supporting experimentation in the production environment.
- Leadership and Mentorship
- Skills:
- Leading data engineering teams and collaborating cross-functionally with data scientists, analysts, and business units
- Project management (Agile, Scrum, Kanban) and stakeholder communication
- Experience with mentorship and growing junior data engineers
- Experience establishing data architecture standards and best practices.
- Ability to review and approve technical designs for consistency and scalability.
- Proven success in mentoring engineers in code quality, modeling, and system design.
- Project Experience: Leading the technical direction for large-scale data initiatives, such as enterprise data lake implementations or the creation of a unified data platform.
Required Experience:
Senior IC
About Company
SteerBridge Strategies is proud to be an Equal Opportunity Employer. We are committed to creating a diverse and inclusive workplace where all qualified applicants and employees are treated with respect and dignity, regardless of race, color, gender, age, religion, national origin, ance ...