Data Center Facilities Manager

Ashburn·Infrastructure·ops

The Datacenter Facility Operation team supports the company's hyper-scale growth by operating, maintaining, and optimizing our critical infrastructure. We ensure 100% uptime, maximum energy efficiency, and operational excellence across our global data center footprint. The team focuses on scaling critical infrastructure (Power and Cooling) through rigorous standard operating procedures, innovation, and a culture of safety.

As a Data Center Facility Manager, you will be responsible for the overall operational excellence, reliability, and financial management of critical infrastructure within your assigned data center site(s). You will transition from tactical hands-on troubleshooting to strategic leadership, working with cross-functional teams, driving colocation partner governance, and maintaining ultimate accountability for site uptime, safety, and efficiency. You will bridge the gap between high-level business strategy and ground-level execution, ensuring that our infrastructure scales seamlessly to meet the demands of our server fleet.

Responsibilities

People Leadership & Talent Development:
- Lead, mentor, and develop a high-performing team of data center facility operation engineers and technicians; build a culture of accountability, safety, and continuous improvement.
- Manage shift planning, resource allocation, and succession planning to ensure 24/7 technical coverage.

Operational Excellence & SLA Governance:
- Be accountable for site uptime and strict adherence to Service Level Agreements (SLAs). Serve as the escalation point for major site incidents.
- Establish, audit, and govern the maintenance programs of colocation partners to ensure high-quality execution of preventative and corrective maintenance.
- Oversee Root Cause Analysis (RCA) and Corrective and Preventive Actions (CAPA) for critical infrastructure failures, ensuring lessons learned are institutionalized globally.

Vendor Strategy & Contract Management:
- Manage critical colocation and vendor partnerships, driving Key Performance Indicators (KPIs) and operational governance.
- Lead Quarterly Business Reviews (QBRs) and operational audits with partners and critical equipment vendors (Generators, UPS, Chillers).

Financial & Asset Management (CapEx/OpEx):
- Own the site operational budget (OpEx) and forecast lifecycle capital improvement projects (CapEx).
- Identify opportunities for infrastructure optimization, energy efficiency (PUE reduction), and cost-saving initiatives without compromising reliability.

Risk & Change Management:
- Serve as the assigned site authority for Critical Environment Work Authorizations (CEWA) and high-risk Methods of Procedure (MOPs).
- Enforce a zero-injury safety culture, ensuring compliance with global and local environmental, health, and safety (EHS) regulations.

Deployment, Commissioning & Lifecycle Support:
- Partner with Design, Construction, and Global Commissioning teams to oversee data hall fit-outs, capacity expansion, commissioning tests, and facility audits.
- Ensure seamless handovers from construction to operations (Operational Readiness), including updated single-line diagrams, SOPs, and EOPs.
