
Lead Forward Deployed Engineer, AI Evaluation Platform

Apple
3 days ago
On-site
Seattle, Washington, United States
AI systems are only as trustworthy as the methods used to evaluate them. At Apple, where AI powers experiences for billions of people, getting evaluation right is not a support function. It is foundational. Join Apple Services Engineering to build the next generation of AI evaluation systems. We are building the scientific foundation and self-service tools for how AI evaluation is done at scale, spanning LLMs, agentic systems, and human-AI interaction.

We are looking for a Lead Forward Deployed Engineer (FDE) to lead the solutions and adoption strategy for our organization. In this highly strategic, hybrid role, you will act as the connective tissue between our science, platform engineering, and partner teams, transforming complex workflows into intuitive, developer-first platforms.

As the Lead FDE, you will balance deep technical advocacy with organizational execution. You will partner directly with Science, Platform Engineering, and Product Management to ensure we are building and adopting the right solutions. You will serve as a technical bridge between the research organization and the broader engineering ecosystem, ensuring our tools integrate seamlessly with existing ML infrastructure and developer workflows.

You will split your time equally between internal alignment and external engagement. You will be "boots on the ground" with ML practitioners across Apple, working hand-in-hand with researchers and developers to operationalize sophisticated measurement techniques. You will then bring those insights back to the team, representing the voice of the developer to help Product and Engineering leadership refine the roadmap. If you thrive in the ambiguity of new initiatives, are passionate about democratizing AI evaluation, and want to be the strategic bridge for a rapidly growing AI organization, this is the role for you.

Strategic Alignment & Execution: Partner closely with Science, Platform Engineering, and Product Management to connect high-level strategy to our day-to-day execution plan.
Solutions & Integration: Partner directly with feature teams to integrate evaluation primitives and SDKs into their unique workflows. Champion the transition from a consulting model to a self-service model.
Voice of the Developer: Work with external stakeholders on their current evaluation gaps and needs. Gather deep, actionable signal on our API quality and developer experience, and funnel these insights back to the team to inform roadmap timing and implementation.
Ecosystem & Community Building: Build demos, tutorials, and reference implementations showcasing evaluation best practices. Present a consistent branding and voice for the platform to foster a thriving internal community of ML developers.
Operationalizing Science: Partner closely with applied scientists and engineers to translate novel metrics and scoring algorithms into scalable, production-grade services. Help define what a "world-class" developer experience looks like for our platform.

5+ years of experience in Solutions Architecture, Forward-Deployed Engineering, Developer Advocacy, Technical Program Management, or a related highly technical, cross-functional role.
Customer Obsession & Product Thinking: Experience acting as a technical partner to internal customers. You can translate vague requirements from other teams into concrete engineering specifications.
Functional literacy in AI/ML concepts: You understand the fundamental lifecycle of machine learning (datasets, training vs. inference, evaluation metrics) and can discuss the engineering challenges involved.
Demonstrated experience partnering with Applied Scientists or Researchers: You have the ability to navigate the ambiguity of research workflows and operationalize scientific code.
Exceptional communication skills, with the ability to represent the platform to executive leadership, partner teams, and the broader engineering community.
Demonstrated ability to navigate extreme ambiguity, define roadmaps where none existed, and influence without direct authority.

Deep familiarity with AI Evaluation Frameworks: You have used or contributed to modern evaluation tools like DeepEval, Ragas, TruLens, or LangSmith.
Experience designing research or tools with self-service adoption as a first-class constraint.
Previous experience operating in a "Chief of Staff" or strategic proxy capacity for a technical organization.
A background in bridging research-heavy environments with production engineering teams.