Backend Engineer - ARK Large Model Platform (Singapore)
About the Team
The Applied Machine Learning (AML) - Ark team provides machine learning platform products on VolcanoEngine. These include a cloud-native resource scheduling system that intelligently orchestrates tasks and jobs to minimise the cost of every experiment and maximise resource utilisation; rich modelling tools, including customised machine learning tasks and a web IDE; and multi-framework, high-performance model inference services. In 2021, we released this machine learning infrastructure to the public through VolcanoEngine, to give more enterprises lower computing costs, lower barriers to machine learning engineering, and deeper development of AI capabilities.

Responsibilities
- Design and develop core pipeline components for the Ark MaaS (Model as a Service) platform, supporting API capabilities such as text conversations, multimodal understanding, and multimodal generation.
- Participate deeply in the evolution of cloud-native architectures, including Service Mesh, load balancing (LB), and intelligent routing. Design and implement high-availability solutions such as full-link canary releases, traffic degradation, circuit breaking, and rate limiting.
- Optimize system performance for large-model multimodal inference scenarios involving long-lived connections, high throughput, streaming output, and low-latency requirements.
- Ensure system stability under large-scale model invocation scenarios and resolve architectural bottlenecks caused by sudden traffic surges.