Training Operation Coding Analyst - Seed Global Data
About the team

Seed Global Data is a team focused on producing international data for LLMs. For the training of large models, data is the lifeline of model quality, and the Global Data team works closely with technical, product, and operations teams to ensure effective data production strategies and execution management.

As a key member of our LLM Global Data Team, the LLM Training Operations Analyst will play a pivotal role in managing the intricate processes involved in training large language models (LLMs) with diverse coding datasets. This role focuses on overseeing and improving operational workflows, primarily for safety-related projects, ensuring they are delivered with high quality and efficiency.

Job Responsibilities

- Driving complex, fast-paced, cross-functional projects from incubation to execution. You will be responsible for designing and managing multiple Large Language Model (LLM) training projects (mostly coding-based, but possibly involving other STEM-related projects).
- Coordinating across functions (including product managers, engineers, and internal or external content experts), planning workflows, tracking progress, identifying risks, and taking necessary corrective actions to ensure high-quality, timely project delivery.
- Working closely with your leads, product managers, and engineers to design, test, and optimize operational workflows, including model training strategies, quality assurance processes, and productivity enhancements.
- Analyzing operational and model training or performance data to provide actionable insights through reports and presentations to stakeholders, driving future model training directions or adjustments.
- Designing and implementing robust data analysis strategies to systematically evaluate the quality of training and validation sets.
- Leading or supporting cross-domain operational improvement initiatives to optimize processes, share transferable learnings, and scale the generation of high-quality data.