Tech Lead (SRE) - Cloud Infrastructure
Team Introduction

The Site Reliability Engineering (SRE) team combines software and systems engineering techniques to design and operate large-scale, highly distributed, and resilient systems. Within Infrastructure SRE, our primary focus is to ensure that the reliability and uptime of our infrastructure services meet the needs of our users and support rapid iterative improvement. Our software development efforts centre on optimising existing systems, building essential infrastructure, and streamlining operations through automation.

The Role

As a Tech Lead, you will build and guide a team of software and systems engineers, drawing on strong technical leadership skills. You will establish efficient processes for project execution and promote sound engineering practices. You will also maintain regular coordination and communication with other infrastructure teams and our user community.

What you will be doing:

1. Establish and oversee the SRE team, including recruitment, training of new talent, system operation and maintenance, coordination, and fostering a cohesive team culture;
2. Oversee the acquisition and development of software systems across organisational units. Establish a comprehensive long-term technical strategy with well-defined implementation steps and milestones to continually enhance the team's competitiveness and technological capabilities;
3. Oversee the development of proofs of concept and solutions, and provide technical expertise on the development of software and platform features, ensuring that appropriate security and risk factors are considered;
4. Create protocols and strategies for critical aspects of the operating platform, including access management, configuration, disaster recovery, and fault handling;
5. Devise and implement software platforms and monitoring frameworks that promote efficient, automated, and intelligent governance within a service-oriented architecture (SOA);
6. Collaborate closely with the system development team to ensure the reliability of systems from initial design through to launch. Continuously advance automated operations and maintenance facilities and platforms;
7. Foster improved communication and collaboration with business teams, enhance cross-team coordination, and continually refine and optimise business processes. Drive the evolution of business architecture design.