Machine Learning Engineer, Visual Agents, Special Projects

Job Location:

Cupertino, CA - USA

Monthly Salary: Not Disclosed

Posted on: Yesterday

Vacancies: 1 Vacancy

Job Summary

Apple is where individual imaginations gather together, committing to the values that lead to great work. Every new product we build, service we create, or experience we deliver is the result of us making each other's ideas stronger. The diversity of our people and their thinking inspires the innovation that runs through everything we do. When we bring everybody in, we can do the best work of our lives. Here, you'll do more than join something; you'll add something.

The Special Projects team at Apple is developing novel experiences powered by state-of-the-art agentic vision-language models that incorporate visual context into conversational interaction. We are looking for a Machine Learning Engineer to help us build, fine-tune, and rigorously evaluate these systems. A successful candidate has hands-on experience with vision-language models, knows how to translate ambiguous product requirements into measurable evaluation criteria, and is excited to work at the intersection of multimodal modeling and agentic AI.

- Build and evaluate vision-language agents that perceive real-world scenes and incorporate that context into conversational models
- Curate, annotate, and build multimodal datasets to support model training and evaluation
- Develop automated evaluation pipelines, including LLM-as-judge frameworks, human evaluation protocols, and domain-specific benchmarks
- Fine-tune Large Language Models (LLMs) and Visual-Language Models (VLMs) to improve performance for specific use cases
- Work closely with other ML Researchers to define evaluation criteria and methodology to systematically evaluate foundation models
- Design controlled experiments to measure model capabilities, identify failure modes, and drive iterative model improvements
- Conduct robust statistical analysis to identify model deficiencies, failure modes, and performance gaps

- BA or Master's degree in Computer Science or Machine Learning
- 2 years of hands-on experience building and evaluating generative AI or multimodal models
- Experience working with vision-language models or multimodal systems
- Proficiency in Python and ML frameworks (PyTorch or TensorFlow)

- PhD in Computer Science, Machine Learning, Statistics, or other STEM field
- Prior industry internship or research experience applying ML to product use cases
- Experience with video understanding, temporal reasoning, or activity recognition
- Familiarity with agentic system design, including tool use, grounding, or perceive-act loops
- Experience building or working with large-scale multimodal data and annotation pipelines
- Proficiency in training, fine-tuning, and evaluation of foundation models and frameworks
- Publications or technical presentations in Machine Learning journals or conferences
- Excellent communication skills and cross-functional collaboration

Required Experience:

Unclear Seniority

About Company

Apple

Ask Siri to name the most successful company in the world and it might respond: Apple. And it's not just out of familial pride. Apple consistently ranks highly in profit, revenue, market capitalization, and consumer cachet. In 2018, the company became the first to reach a trillion dollar ...
