Meta
Menlo Park, California, United States
(on-site)
Posted
14 hours ago
Industry Categories
Internet / E-Commerce
Job Function
Other
Production Systems Engineer, AI Systems
Description
Meta is seeking a Hardware Systems Engineer to join our Release to Production (RTP) team. As a key member of the RTP team, you will be responsible for driving the end-to-end hardware lifecycle of Meta's servers, from prototyping and pre-production to production-ready system monitoring, automated provisioning, and remediation of issues. You will work closely with cross-functional teams, including hardware designers, networking teams, system manufacturers, component vendors, capacity engineering, production engineering, production services, and data center operations teams to enable new systems that will be deployed in our production data centers.
Production Systems Engineer, AI Systems Responsibilities:
- Drive and execute comprehensive end-to-end system validation strategy (hardware and software) for various AI/HPC hardware systems in datacenter applications
- Lead the bring-up, validation, and deployment of cutting-edge hardware systems in large-scale deployment with active hands-on participation
- Explore new use cases with customer teams and identify related test methodologies/test cases accordingly
- Investigate and troubleshoot complex failures potentially related to hardware systems with cross-functional teams
- Triage failures and continue root-causing while driving project development work forward
- Identify gaps and opportunities to improve the test process and test methodologies across the New Product Introduction (NPI) space
- Guide automation efforts and data analysis for New Product Introduction projects through engagement with related cross-functional teams
- Communicate project progress and assessments to the related internal and external teams
- Interface with external vendors and internal hardware, mechanical, power, thermal, manufacturing, and software engineers to understand the system's architecture
- Develop visibility through data visualization and implement systemic solutions to hardware health issues
- Proactively create experiments and tooling to detect and diagnose hardware/firmware/software health issues
Minimum Qualifications:
- 8+ years of experience in hands-on software, firmware or hardware engineering to build any of the following products (AI silicon, GPUs, TPUs, Autonomous cars, AI servers)
- Experience in one or more domains such as: ASIC development (silicon design, bringup, characterization, validation), board-level debug, firmware validation, system validation
- Knowledge of architecture and components on one of the following products: server/PC/Laptop
- Development or debug experience in one or more of the following areas: hardware fault management, error reporting, error handling on hardware products
Preferred Qualifications:
- 6+ years of experience in the networking space: switches, network interface cards (NICs), DPUs, etc.
- Knowledge of TCP/IP and experience using tools like iperf/uperf
- Experience working with RDMA/RoCE, including scale-out networks
- Experience working with AI server systems
- Experience working with large scale deployments
About Meta:
Meta builds technologies that help people connect, find communities, and grow businesses. When Facebook launched in 2004, it changed the way people connect. Apps like Messenger, Instagram and WhatsApp further empowered billions around the world. Now, Meta is moving beyond 2D screens toward immersive experiences like augmented and virtual reality to help build the next evolution in social technology. People who choose to build their careers by building with us at Meta help shape a future that will take us beyond what digital connection makes possible today, beyond the constraints of screens, the limits of distance, and even the rules of physics.
Meta is proud to be an Equal Employment Opportunity and Affirmative Action employer. We do not discriminate based upon race, religion, color, national origin, sex (including pregnancy, childbirth, or related medical conditions), sexual orientation, gender, gender identity, gender expression, transgender status, sexual stereotypes, age, status as a protected veteran, status as an individual with a disability, or other applicable legally protected characteristics. We also consider qualified applicants with criminal histories, consistent with applicable federal, state and local law. Meta participates in the E-Verify program in certain locations, as required by law. Please note that Meta may leverage artificial intelligence and machine learning technologies in connection with applications for employment.
Meta is committed to providing reasonable accommodations for candidates with disabilities in our recruiting process. If you need any assistance or accommodations due to a disability, please let us know at [email protected].
$173,000/year to $245,000/year + bonus + equity + benefits
Individual compensation is determined by skills, qualifications, experience, and location. Compensation details listed in this posting reflect the base hourly rate, monthly rate, or annual salary only, and do not include bonus, equity or sales incentives, if applicable. In addition to base compensation, Meta offers benefits. Learn more about benefits at Meta.
Job ID: 84932398
