About the Role
RECRUITERS MUST RUN CHECKLISTS. KEYWORDS UNDERLINED.
We are building a platform that converts unstructured financial data (emails, corporate actions, index announcements) into high-quality structured datasets used by financial institutions. This is not a typical LLM wrapper role. You will work on systems that:
Extract data from noisy, inconsistent sources
Validate and reconcile outputs across multiple inputs
Ensure correctness, traceability, and auditability
The challenge is not just applying LLMs; it's making them reliable in production for financial workflows.
What You'll Work On
Designing pipelines that process high-volume financial documents (batch, near real-time)
Building LLM-powered extraction workflows (classification, parsing, summarization)
Implementing validation layers (rule-based, model-based) to reduce hallucinations
Developing retrieval systems using embeddings and vector search
Ensuring data quality, observability, and fault tolerance
Collaborating with product to turn messy data into usable financial intelligence
Core Requirements
Strong Python and backend/data engineering experience
Experience building production data pipelines (ETL, streaming, or async systems)
Solid understanding of distributed systems and failure modes
Experience working with LLM-based systems in production:
Prompt design
Output validation
Retry/fallback strategies
Evaluation and monitoring
Experience with data storage systems (SQL, NoSQL)
Familiarity with cloud infrastructure (AWS or similar)
Preferred Experience
Experience with RAG / vector search systems
Background in financial data or capital markets
Experience with streaming systems (Kafka, etc.)
Experience building multi-step or agent-style workflows
What Makes This Role Interesting
Work on high-accuracy AI systems where correctness matters
Solve real problems around:
LLM reliability and hallucination mitigation
Data consistency across conflicting sources
Real-time vs correctness tradeoffs
Build systems used in financial decision-making workflows
High ownership over core architecture in an early-stage environment
Nice to Know (but not required)
Experience with orchestration tools (Airflow, etc.)
Exposure to evaluation frameworks for LLMs
Experience working with large-scale document processing
Tech Stack (Representative, not exhaustive)
Python, APIs, async processing
LLM APIs, embeddings
SQL / NoSQL databases
Cloud infrastructure (AWS)
Data pipelines and streaming systems
Vector Databases
* If they have 6-8 years of software development/engineering with AI and data engineering experience
* If they have worked in the investment management / investment banking area, processing FINANCIAL MARKET DATA pipelines, RAG, vector databases
* If they are fluent with Python, API development, and streaming systems like Kafka or similar
* Prefer people who have worked at BlackRock, Fidelity Investments, Vanguard, State Street Global Advisors, E*TRADE, Charles Schwab, etc.
at Vanguard Group, an investment management company that deals with mutual funds, index funds, ETFs, etc. Candidates must come from this business domain or they won't understand what to do.