Member of Technical Staff - Post-Training and RL

Palo Alto, CA·Model·other

<div class="content-intro"><h3><strong><span style="font-family: arial, helvetica, sans-serif;">ABOUT xAI</span></strong></h3> <p><span style="font-family: arial, helvetica, sans-serif;">xAI’s mission is to create AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. </span><span style="font-family: arial, helvetica, sans-serif;">Our team is small, highly motivated, and focused on engineering excellence. This organization is for individuals who appreciate challenging themselves and thrive on curiosity. </span><span style="font-family: arial, helvetica, sans-serif;">We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company’s mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important. </span><span style="font-family: arial, helvetica, sans-serif;">All employees are expected to have strong communication skills. 
They should be able to concisely and accurately share knowledge with their teammates.</span></p></div><h3 data-pm-slice="1 1 []"><span style="font-family: arial, helvetica, sans-serif;">ABOUT THE ROLE:</span></h3> <ul> <li>You will work on the most critical post-training and reinforcement learning challenges at any given time, including reward modeling, preference optimization (RLHF/DPO), and RL for improving reasoning, truthfulness, and real-world capabilities.</li> <li>You will get clarity on your first project before an offer.</li> </ul> <h3><span style="font-family: arial, helvetica, sans-serif;">BASIC QUALIFICATIONS:</span></h3> <ul> <li>You believe truth-seeking AI is the most important and challenging problem.</li> <li>You are obsessed with building incredibly useful models through post-training and RL techniques.</li> <li>You are a power user of AI models and eager to push the boundaries of what’s possible with reinforcement learning and alignment methods.</li> <li>If you have previously worked on post-training or RLHF, or trained models used by millions of people, it’s a big plus, but relevant experience is not required.</li> <li>You take pride in your work and thrive in meritocratic environments.</li> </ul> <h3><span style="font-family: arial, helvetica, sans-serif;"><strong>COMPENSATION AND BENEFITS:</strong></span></h3> <p>$180,000 - $600,000 USD</p> <p>Base salary is just one part of our total rewards package at xAI, which also includes equity, comprehensive medical, vision, and dental coverage, access to a 401(k) retirement plan, short- and long-term disability insurance, life insurance, and various other discounts and perks.</p><div class="content-conclusion"><p><em>xAI is an equal opportunity employer. For details on data processing, view our </em><em><a href="https://x.ai/legal/recruitment-privacy-notice" target="_blank">Recruitment Privacy Notice</a>.</em></p></div>
