Meta
Singapore, Singapore
(on-site)
Posted
13 hours ago
Industry Categories
Internet / E-Commerce
Job Function
Other
Data Center Production Operations Engineer
The insights provided are generated by AI and may contain inaccuracies. Please independently verify any critical information before relying on it.
Description
We seek an IT professional with advanced, hands-on technical skills in server hardware and Linux, ideally in a Data Center environment. Broad knowledge of server administration and experience participating in projects in a large-scale distributed data center environment are core competencies for this role. You should also have working knowledge and experience in a few of the following core areas: Hardware repair, OS management, Tooling and Automation, Networking, or Technical Project Management.
Data Center Production Operations Engineer Responsibilities:
- Support platform health by successfully resolving and closing tickets, while addressing the overall issue (i.e. addressing root cause) including, but not limited to, remote troubleshooting and physical inspection of services in data halls
- Participate in in-depth exploration and root cause analysis of highly technical issues within the data center, ranging from automated tooling to hardware failures and network issues
- Collaborate with cross-functional teams on projects and initiatives related to topics such as process, hardware and automation
- Serve as point of contact for the introduction of new platforms and hardware to the site, in collaboration with partners and global resources, accelerating the time it takes to bring these products to sustained mass production
- Use tools and data analysis effectively to identify issues. Take actions to communicate with all stakeholders appropriately and manage or escalate as needed
- Identify corrective actions for hardware issues, work with internal teams and vendors, and influence future design changes to ensure ease of serviceability
- Solve systemic hardware and/or software issues at scale using scripting, automation, and tooling to drive global resolution
- Continuously evaluate and identify areas for improvement in processes, tools, and systems to optimize efficiency and quality of repairs
- Use data analytics to drive maximum server up-time and utilization rates, understanding hardware failure rates and service level agreements
- Support and train team members to evaluate and identify better ways to resolve issues, and define updates to tools and processes
- Provide engineering support and be a go-to technical resource for the team, leadership, and cross-functional teams in operating and maintaining data center servers
- Maintain and update documentation, e.g. procedures, runbooks, and guides
- Build cross functional relationships and influence policies and procedures that improve global data center operations
- Participate in 24/7 on-call rotation
- Ability to travel up to 15% of the time
Minimum Qualifications:
- BS, BA or BEng in technical field or commensurate experience
- 5+ years of technical IT experience within an infrastructure environment, in a role such as Systems Administrator, DevOps Engineer, or Site Reliability Engineer
- Intermediate-level understanding of Linux (or equivalent OS) in a complex IT environment with the ability to triage, debug, and troubleshoot server issues
- Hands-on experience and knowledge of server hardware and components, including storage
- Intermediate-level knowledge of the interdependencies of data center functions and technologies including electrical, cooling, structured cabling, security, and network
- Experience managing technical issues and driving to the root cause
- Experience participating in technical projects related to areas such as process improvement, technology, and/or automation
- Ability to communicate effectively, in a clear and concise manner, appropriately tailoring messages to the audience
- Intermediate-level knowledge of technologies such as HTTP, DNS, RAID, and DHCP
- Experience in providing technical guidance to external vendors
- Experience debugging, modifying, and developing code in at least one of these commonly used scripting or programming languages: Bash, PHP, Python, SQL, Rust, Go, or Perl
- Knowledge of out-of-band/lights-out server communication methods, such as IPMI and serial console
- Experience using data and metrics to drive decisions
Preferred Qualifications:
- Experience in fostering growth in others, and driving influence across all organizational levels
- Experience in a large-scale data center environment
- Experience with large-scale AI implementations
- Six Sigma knowledge/certification
- Demonstrated ability to integrate AI tools to optimize/redesign workflows and drive measurable impact (e.g., efficiency gains, quality improvements)
- Experience adhering to and implementing responsible, ethical AI practices (e.g., risk assessment, bias mitigation, quality and accuracy reviews)
- Demonstrated ongoing AI skill development (e.g., prompt/context engineering, agent orchestration) and staying current with emerging AI technologies
About Meta:
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today, beyond the constraints of screens, the limits of distance, and even the rules of physics.
Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.
Job ID: 84842198
Explore Location
- Median Salary (net salary per month): $4,282
- Cost of Living Index: 92/100
- Median Apartment Rent in City Center (1-3 bedroom): $4,302 (range $2,787 - $5,817)
- Safety Index: 78/100
- Utilities, Basic (electricity, heating, cooling, water, garbage for 915 sq ft apartment): $162 (range $92 - $253)
- Utilities, High-Speed Internet: $27 (range $23 - $39)
- Transportation, Gasoline (1 gallon): $9.27
- Transportation, Taxi Ride (1 mile): $1.26
Data is collected and updated regularly using reputable sources, including corporate websites and governmental reporting institutions.
