AI Product Manager (Evaluation) - Flow

About the Team
Flow is ByteDance's AI-native innovation business team, focusing on developing products in emerging fields such as LLMs. Our product portfolio includes Doubao (China) and Dola (Global). The Model Evaluation Team plays a key role within Flow, providing accurate and timely assessment signals for product iterations. We are responsible for the entire evaluation process, from designing methodology to executing evaluations, identifying issues, and proposing feasible improvements. Join us to explore innovative opportunities in AI, grow in a dynamic team environment, and create real value for users!

Responsibilities
- Evaluation Framework Establishment: Design evaluation frameworks for the Dola App, including standards, test cases, and workflows. Create metrics that measure performance across key dimensions and enable data-driven decisions in existing and upcoming use scenarios.
- Human-Machine Evaluation Integration: Build workflows that combine human expert evaluation with automated evaluation. Ensure automated processes align with human judgment, identify performance gaps, and increase testing coverage to supplement human evaluation.
- Evaluation Results Analysis: Interpret and analyze evaluation findings in depth, transforming nuanced case analysis into actionable insights. Develop data production and product design strategies aimed at enhancing model capabilities and/or user experience.
- Cross-functional Collaboration: Align product, engineering, and evaluation teams across cultural and professional boundaries. Facilitate communication, resolve differences, and build consensus around overall evaluation methodologies.
- Evaluation Methodology Research: Stay at the forefront of LLM evaluation by researching industry-leading assessment methodologies and reports. Integrate insights with practical applications to continuously improve our evaluation framework.